i2 Analyze Deployment Tooling

    Ingesting development data into the Information Store

    When you are developing a configuration, ingest a small amount of representative test data into the system to ensure that the schema is suitable for your data and that you can configure i2 Analyze to meet your requirements. For more information about ingesting data, see Ingesting data into the Information Store.

    Ensure that the DEPLOYMENT_PATTERN variable in the <config_name>/utils/variables.conf file is set to a pattern that includes the Information Store.
    For example:

    DEPLOYMENT_PATTERN="i2c_istore"
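
    The check above can be scripted. The following is a minimal sketch rather than part of the tooling: the function names are hypothetical, and it assumes that every pattern that includes the Information Store has "istore" in its name, as the i2c_istore example does. Confirm that assumption against the list of supported deployment patterns.

```shell
# Sketch only: these helper functions are hypothetical, not part of the tooling.

# Prints the DEPLOYMENT_PATTERN value set in the given variables.conf file.
get_deployment_pattern() {
  local conf_file="$1"
  # Take the last uncommented assignment and strip optional double quotes.
  grep -E '^[[:space:]]*DEPLOYMENT_PATTERN=' "$conf_file" \
    | tail -n 1 \
    | sed -E 's/^[[:space:]]*DEPLOYMENT_PATTERN=//; s/^"//; s/"[[:space:]]*$//'
}

# Succeeds (exit status 0) when the pattern name contains "istore".
# ASSUMPTION: patterns that include the Information Store contain "istore".
pattern_includes_istore() {
  local pattern
  pattern="$(get_deployment_pattern "$1")"
  case "$pattern" in
    *istore*) return 0 ;;
    *) return 1 ;;
  esac
}
```

    For example, pattern_includes_istore "<config_name>/utils/variables.conf" returns a non-zero status when the configured pattern does not match, which a wrapper script could use to stop before attempting ingestion.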
    

    Process overview:

    • Provide a data set
    • Create the ingestion sources
    • Provide and run scripts to complete the ingestion process

    If you have deployed with the law enforcement schema, complete the steps in Example ingestion process to ingest example data into your Information Store.

    Data sets

    The /i2a-data directory contains the data sets that you ingest into the Information Store of a deployment. The data that you ingest must conform to the Information Store schema. Data sets and configs do not map one-to-one: you can ingest the same data set with different configs. Each data set must contain at least one ingestion script. This script contains the functions that populate the staging tables with your data, and it calls the ETL toolkit tools that ingest the data.

    The expected directory structure is as follows:

    - i2a-data
        - <data_set>
            - scripts
                - <script1>
                - <script2>
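
    As an illustration, the layout above could be created with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, and you pass in the location of your i2a-data directory yourself.

```shell
# Sketch only: create_data_set_skeleton is a hypothetical helper.
# Creates <i2a_data_dir>/<data_set>/scripts and prints the scripts path.
create_data_set_skeleton() {
  local i2a_data_dir="$1"
  local data_set="$2"
  local scripts_dir="${i2a_data_dir}/${data_set}/scripts"
  # -p creates missing parent directories and is a no-op if they exist.
  mkdir -p "$scripts_dir"
  printf '%s\n' "$scripts_dir"
}
```

    You would then place your ingestion scripts in the directory that the function prints.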
    

    Ingesting data into the config dev environment

    The manage-data command is used to manage the ingestion process in the dev environment.

    To ingest data into the Information Store, you must create scripts that call the ETL tools that complete the actions i2 Analyze requires to ingest data.
    For more information, see:

    • ETL tools
    • Ingesting data into the Information Store

    • You can find example scripts in the examples/ingestion/scripts directory. These scripts demonstrate how to create staging tables, populate them, and ingest the data into the Information Store.

    To run scripts, the manage-data command is called as follows:

    manage-data -c <config_name> -t ingest -d <data_set> -s <script_name>
    

    Where:

    • <config_name> is the name of the config that is currently deployed and running in the config dev environment
    • <data_set> is the name of a directory in i2a-data
    • <script_name> is the name of a script in the directory specified for <data_set>
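
    Because the command depends on the data set directory and the script both existing, a pre-flight check can catch typing mistakes early. The sketch below uses a hypothetical function name and assumes that scripts live in the scripts subdirectory of the data set, as in the expected directory structure. It only prints the command line so that you can review it before running it.

```shell
# Sketch only: build_ingest_command is a hypothetical helper.
# Prints the manage-data ingest command after checking that the script exists.
# ASSUMPTION: scripts are stored in <i2a_data_dir>/<data_set>/scripts/.
build_ingest_command() {
  local i2a_data_dir="$1"
  local config_name="$2"
  local data_set="$3"
  local script_name="$4"
  local script_path="${i2a_data_dir}/${data_set}/scripts/${script_name}"
  if [ ! -f "$script_path" ]; then
    echo "Script not found: ${script_path}" >&2
    return 1
  fi
  printf 'manage-data -c %s -t ingest -d %s -s %s\n' \
    "$config_name" "$data_set" "$script_name"
}
```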

    Creating ingestion sources

    The ingestion sources for a config are defined in the configuration, in the <config_name>/configuration/ingestion/scripts/create-ingestion-sources script.

    1. Copy the examples/ingestion/scripts/create-ingestion-sources file to the <config_name>/configuration/ingestion/scripts/ directory.

    In the script, the INGESTION_SOURCES array contains the name and description of 2 example sources.

    INGESTION_SOURCES=(
        [Example Ingestion Source 1]=EXAMPLE_1
        [Example Ingestion Source 2]=EXAMPLE_2
    )
    

    You can modify or add to the array of ingestion sources.
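
    As a hedged illustration of adding an entry, the sketch below declares an array of the same shape with one extra source and lists its contents. The third entry and the list_ingestion_sources function are hypothetical, and the explicit declare -A (which needs Bash 4 or later) is an assumption about how the array is declared in the real script.

```shell
# Sketch only: a stand-alone copy of the array with one hypothetical addition.
declare -A INGESTION_SOURCES=(
  ["Example Ingestion Source 1"]=EXAMPLE_1
  ["Example Ingestion Source 2"]=EXAMPLE_2
  # Hypothetical extra source, added for illustration.
  ["My Test Source"]=MY_TEST_SOURCE
)

# Prints one "key=value" line per entry, sorted for stable output.
list_ingestion_sources() {
  local key
  for key in "${!INGESTION_SOURCES[@]}"; do
    printf '%s=%s\n' "$key" "${INGESTION_SOURCES[$key]}"
  done | sort
}
```

    Listing the entries before you run the sources task is a quick way to confirm that an edit did not break the array syntax.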

    To create the ingestion sources in the array, the manage-data command is called as follows:

    manage-data -c <config_name> -t sources
    

    Example ingestion process

    The i2 Analyze minimal toolkit contains the example law-enforcement-data-set-1 data set, which can be ingested when the example law enforcement schema (law-enforcement-schema.xml) is deployed. The data set consists of a number of CSV files that contain the data, and a mapping.xml file. For more information about the mapping file, see Ingestion mapping files.

    Before you can ingest the law enforcement example data, complete the following steps to provide the data set and scripts:

    1. Copy the pre-reqs/i2analyze/toolkit/examples/data/law-enforcement-data-set-1 directory to the i2a-data directory.
    2. Copy the examples/ingestion/scripts directory to the i2a-data/law-enforcement-data-set-1 directory.
      The directory structure is as follows:

      - i2a-data
          - law-enforcement-data-set-1
              - scripts
                  - ingest-law-enforcement-data-set-1
                  - create-staging-tables
      
    3. Copy the examples/ingestion/scripts/create-ingestion-sources file to the <config_name>/configuration/ingestion/scripts/ directory.

    4. Use the manage-data command to create the ingestion sources defined in the example create-ingestion-sources script.
      For example:

      manage-data -c config-development -t sources
      
    5. The example scripts separate the creation of the staging tables from the ingestion of data. To ingest the example data into the config-development config, run the following commands:

      manage-data -c config-development -t ingest -d law-enforcement-data-set-1 -s create-staging-tables
      manage-data -c config-development -t ingest -d law-enforcement-data-set-1 -s ingest-law-enforcement-data-set-1
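
    The three manage-data calls in steps 4 and 5 must run in order, and a later step is not useful if an earlier one fails. One way to chain them is sketched below. The function name and the MANAGE_DATA override variable are hypothetical; the override exists only so that the sequence can be previewed without touching a deployment.

```shell
# Sketch only: ingest_example_data and MANAGE_DATA are hypothetical names.
# Set MANAGE_DATA="echo manage-data" to print the commands instead of
# running them.
MANAGE_DATA="${MANAGE_DATA:-manage-data}"

# Creates the ingestion sources, then the staging tables, then ingests the
# example data. Stops at the first step that fails.
ingest_example_data() {
  local config_name="$1"
  local data_set="law-enforcement-data-set-1"
  # $MANAGE_DATA is deliberately unquoted so that an override such as
  # "echo manage-data" splits into a command and its first argument.
  $MANAGE_DATA -c "$config_name" -t sources || return 1
  $MANAGE_DATA -c "$config_name" -t ingest -d "$data_set" \
    -s create-staging-tables || return 1
  $MANAGE_DATA -c "$config_name" -t ingest -d "$data_set" \
    -s ingest-law-enforcement-data-set-1 || return 1
}
```

    For example, MANAGE_DATA="echo manage-data" followed by ingest_example_data config-development prints the three commands in the order they would run.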
      

    The manage-data script

    The scripts/manage-data script is used to manage data in an environment. It can be used to run scripts that use the ETL toolkit tools, or to remove all data from the Information Store.

    The following usage and help is provided for the manage-data command:

    Usage:
      manage-data -c <config_name> -t {ingest} -d <data_set> -s <script_name> [-v]
      manage-data -c <config_name> -t {sources} [-s <script_name>] [-v]
      manage-data -c <config_name> -t {delete} [-v]
      manage-data -h
    
    Options:
      -c <config_name>             Name of the config to use.
      -t {delete|ingest|sources}   The task to run. Either delete or ingest data, or add ingestion sources. Delete permanently removes all data from the database.
      -d <data_set>                Name of the data set to ingest.
      -s <script_name>             Name of the ingestion script file.
      -v                           Verbose output.
      -h                           Display the help.
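
    Because the delete task permanently removes all data from the database, you might wrap it in a confirmation prompt. The following sketch is not part of the tooling: the function name and the MANAGE_DATA override variable are hypothetical, and the prompt wording is illustrative.

```shell
# Sketch only: delete_all_data and MANAGE_DATA are hypothetical names.
# Set MANAGE_DATA="echo manage-data" to print the command instead of
# running it.
MANAGE_DATA="${MANAGE_DATA:-manage-data}"

# Runs the delete task only after the user retypes the config name.
delete_all_data() {
  local config_name="$1"
  local answer
  printf 'This permanently removes all data for "%s".\n' "$config_name" >&2
  printf 'Type the config name to confirm: ' >&2
  IFS= read -r answer || return 1
  if [ "$answer" != "$config_name" ]; then
    echo "Aborted: confirmation did not match." >&2
    return 1
  fi
  $MANAGE_DATA -c "$config_name" -t delete
}
```

    Requiring the config name, rather than a plain "yes", makes it harder to confirm a delete against the wrong config by habit.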
    

    After you add data to your environment, you can configure the rest of the configuration.

    © N. Harris Computer Corporation