i2 Analyze Deployment Tooling

    ETL Tools

    This topic describes how to perform ETL tasks by using the ETL toolkit in a containerized deployment of i2 Analyze.

    The run_etl_toolkit_tool_as_i2_etl client function runs the ETL tools described in this topic as the i2ETL user. For more information about this client function, see run_etl_toolkit_tool_as_i2_etl.

    Building an ETL Client image

    The ETL client image is built from the Dockerfile in images/etl_client.

    The following docker build command builds the configured image:

    docker build -t "etlclient_redhat:4.4.4" "images/etl_client"
    

    Add Information Store ingestion source

    The addInformationStoreIngestionSource tool defines an ingestion source in the Information Store. For more information about ingestion sources in the Information Store, see Defining an ingestion source.

    You must provide the following arguments to the tool:

    Argument    Description                                                                      Maximum characters
    n           A unique name for the ingestion source                                           30
    d           A description of the ingestion source that might appear in the user interface    100

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/addInformationStoreIngestionSource \
            -n <> \
            -d <>"
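
The length limits in the argument table can be checked before the tool is called. The following is a minimal sketch rather than part of the deployment tooling: validate_ingestion_source_args is a hypothetical helper name, and the sketch assumes that the limits are counted in characters as the current shell locale sees them.

```shell
# Hypothetical helper (not part of the ETL toolkit): checks the documented
# limits of 30 characters for the name and 100 characters for the description.
validate_ingestion_source_args() {
  local name="$1"
  local description="$2"

  # ${#var} expands to the length of the value in characters.
  if [ "${#name}" -gt 30 ]; then
    echo "Ingestion source name exceeds 30 characters" >&2
    return 1
  fi
  if [ "${#description}" -gt 100 ]; then
    echo "Ingestion source description exceeds 100 characters" >&2
    return 1
  fi
  return 0
}
```

The helper returns a non-zero status when a limit is exceeded, so a script can call it and run addInformationStoreIngestionSource only when it succeeds.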
    

    Drop Information Store error tables

    The dropInformationStoreErrorTables tool removes the _ERROR and _REJECT tables from the Information Store.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/dropInformationStoreErrorTables"
    

    Clear Information Store staging schema

    The clearInformationStoreStagingSchema tool clears all the tables in the Information Store staging schema.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/clearInformationStoreStagingSchema"
    

    Create Information Store staging table

    The createInformationStoreStagingTable tool creates the staging tables that you can use to ingest data into the Information Store. For more information about creating the tables, see Creating the staging tables.

    You must provide the following arguments to the tool:

    Argument    Description
    stid        The schema type identifier of the item type to create the staging table for
    sn          The name of the database schema to create the staging table in
    tn          The name of the staging table to create

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/createInformationStoreStagingTable \
            -stid <> \
            -sn <> \
            -tn <>"
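
When several item types need staging tables, the call can be repeated in a loop. The following is a minimal sketch under stated assumptions: create_staging_tables and the ETL_RUNNER override are hypothetical names introduced for illustration, and the schema type identifiers and table names in the usage note are placeholders, not values from a real schema.

```shell
# Hypothetical helper: creates one staging table per "<type id>=<table name>" pair.
# ETL_RUNNER is an assumed override so the commands can be previewed (for
# example with ETL_RUNNER=echo); by default the documented client function is used.
create_staging_tables() {
  local schema_name="$1"
  shift
  local runner="${ETL_RUNNER:-run_etl_toolkit_tool_as_i2_etl}"
  local pair type_id table_name

  for pair in "$@"; do
    type_id="${pair%%=*}"    # text before the first "="
    table_name="${pair#*=}"  # text after the first "="
    "${runner}" bash -c "/opt/i2/etltoolkit/createInformationStoreStagingTable -stid ${type_id} -sn ${schema_name} -tn ${table_name}" || return 1
  done
}
```

For example, `ETL_RUNNER=echo create_staging_tables IS_Staging ET1=E_Person ET2=E_Vehicle` prints the two commands that would be run instead of running them. The loop stops at the first command that fails.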
    

    Ingest Information Store records

    The ingestInformationStoreRecords tool ingests data into the Information Store. For more information about ingesting data into the Information Store, see The ingestInformationStoreRecords toolkit task.

    You can use the following arguments with the tool:

    Argument    Description
    imf         The full path to the ingestion mapping file.
    imid        The identifier, in the ingestion mapping file, of the mapping to use.
    im          Optional: The import mode to use. Possible values are STANDARD, VALIDATE, BULK, DELETE, BULK_DELETE, or DELETE_PREVIEW. The default is STANDARD.
    icf         Optional: The full path to an ingestion settings file.
    il          Optional: A label for the ingestion that you can use to refer to it later.
    lcl         Optional: Whether (true or false) to log the links that are deleted or affected as a result of deleting entities.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/ingestInformationStoreRecords \
            -imf <> \
            -imid <> \
            -im <>"
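
Because the import mode is passed as plain text, a script might check it against the documented values before it starts an ingestion. The following is a minimal sketch: is_valid_import_mode and resolve_import_mode are hypothetical helpers, not part of the ETL toolkit, and treating the values as case-sensitive is an assumption.

```shell
# Hypothetical helper: succeeds only for the import modes listed in the
# argument table. Matching is case-sensitive (an assumption).
is_valid_import_mode() {
  case "$1" in
    STANDARD|VALIDATE|BULK|DELETE|BULK_DELETE|DELETE_PREVIEW) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints the mode to use, falling back to the
# documented default (STANDARD) when no mode is supplied.
resolve_import_mode() {
  local mode="${1:-STANDARD}"
  if is_valid_import_mode "${mode}"; then
    printf '%s\n' "${mode}"
  else
    echo "Unsupported import mode: ${mode}" >&2
    return 1
  fi
}
```

The printed value can then be passed to the -im argument, for example by capturing it with `mode="$(resolve_import_mode "$1")"` in a wrapper script.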
    

    Sync Information Store records

    The syncInformationStoreCorrelation tool is used after an error occurs during correlation. It synchronizes the data in the Information Store with the data in the Solr index so that the data returns to a usable state.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/syncInformationStoreCorrelation"
    

    Duplicate provenance check

    The duplicateProvenanceCheck tool identifies records in the Information Store that have duplicate origin identifiers. Any provenance that has a duplicated origin identifier is added to a staging table in the Information Store.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/duplicateProvenanceCheck"
    

    Duplicate provenance delete

    The duplicateProvenanceDelete tool deletes entity and link provenance that has duplicated origin identifiers from the Information Store. The provenance to delete is identified in the staging tables created by the duplicateProvenanceCheck tool.

    You can provide the following argument to the tool:

    Argument    Description
    stn         The name of the staging table that contains the origin identifiers to delete.

    If no arguments are provided, duplicate origin identifiers are deleted from all staging tables.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/duplicateProvenanceDelete"
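
Because the stn argument is optional, a wrapper can add it only when a staging table name is supplied. The following is a minimal sketch: delete_duplicate_provenance and the ETL_RUNNER override are hypothetical names introduced for illustration, and the sketch assumes that the argument is passed as -stn, in the same style as the arguments of the other tools in this topic.

```shell
# Hypothetical helper: runs duplicateProvenanceDelete, adding -stn only when
# a staging table name is given. ETL_RUNNER is an assumed override for
# previewing the command; by default the documented client function is used.
delete_duplicate_provenance() {
  local staging_table="${1:-}"
  local runner="${ETL_RUNNER:-run_etl_toolkit_tool_as_i2_etl}"
  local tool_command="/opt/i2/etltoolkit/duplicateProvenanceDelete"

  if [ -n "${staging_table}" ]; then
    tool_command="${tool_command} -stn ${staging_table}"
  fi
  "${runner}" bash -c "${tool_command}"
}
```

Running `ETL_RUNNER=echo delete_duplicate_provenance` prints the command without the -stn argument, which corresponds to the case where all staging tables are processed.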
    

    Generate Information Store index creation scripts

    The generateInformationStoreIndexCreationScript tool generates the scripts that create the indexes for each item type in the Information Store. For more information, see Database index management.

    You must provide the following arguments to the tool:

    Argument    Description
    stid        The schema type identifier of the item type to create the index creation scripts for.
    op          The location to create the scripts.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/generateInformationStoreIndexCreationScript \
            -op <> \
            -stid <>"
    

    Generate Information Store index drop scripts

    The generateInformationStoreIndexDropScript tool generates the scripts that drop the indexes for each item type in the Information Store. For more information, see Database index management.

    You must provide the following arguments to the tool:

    Argument    Description
    stid        The schema type identifier of the item type to create the index drop scripts for.
    op          The location to create the scripts.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/generateInformationStoreIndexDropScript \
            -op <> \
            -stid <>"
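
The index creation and index drop script generators take the same two arguments, so both can be run for a list of item types in one pass. The following is a minimal sketch: generate_index_scripts and the ETL_RUNNER override are hypothetical names introduced for illustration, and the output location and schema type identifiers in the usage note are placeholders.

```shell
# Hypothetical helper: generates the index creation script and the index drop
# script for each schema type identifier that is passed after the output location.
# ETL_RUNNER is an assumed override for previewing the commands.
generate_index_scripts() {
  local output_dir="$1"
  shift
  local runner="${ETL_RUNNER:-run_etl_toolkit_tool_as_i2_etl}"
  local type_id tool

  for type_id in "$@"; do
    for tool in generateInformationStoreIndexCreationScript generateInformationStoreIndexDropScript; do
      "${runner}" bash -c "/opt/i2/etltoolkit/${tool} -op ${output_dir} -stid ${type_id}" || return 1
    done
  done
}
```

For example, `ETL_RUNNER=echo generate_index_scripts /tmp/index_scripts ET1 LT1` prints four commands, two for each item type, without running them.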
    

    Delete orphaned database objects

    The deleteOrphanedDatabaseObjects tool deletes entity and link database objects that are not associated with an i2 Analyze record from the Information Store.

    You can provide the following arguments to the tool:

    Argument    Description
    iti         Optional: The schema type identifier of the item type to delete orphaned database objects for.

    If no schema type identifier is provided, orphaned objects for all item types are removed.

    Use the run_etl_toolkit_tool_as_i2_etl client function to run the tool. For example:

    run_etl_toolkit_tool_as_i2_etl \
        bash -c "/opt/i2/etltoolkit/deleteOrphanedDatabaseObjects \
            -iti <>"
    

    Disable merged property values

    The disableMergedPropertyValues tool removes the database views used to define the property values of merged i2 Analyze records.

    You can provide the following arguments to the tool:

    Argument    Description
    etd         The location of the root of the ETL toolkit.
    stid        The schema type identifier to disable the views for.

    If no schema type identifier is provided, the views for all of the item types are removed.

    Use the run_etl_toolkit_tool_as_dba client function to run the tool as the database administrator. For example:

    run_etl_toolkit_tool_as_dba \
        bash -c "/opt/i2/etltoolkit/disableMergedPropertyValues \
            -etd <> \
            -stid <>"
    

    For more information about correlation, see Information Store data correlation.

    Enable merged property values

    The enableMergedPropertyValues tool creates the database views used to define the property values of merged i2 Analyze records.

    You can provide the following arguments to the tool:

    Argument    Description
    etd         The location of the root of the ETL toolkit.
    stid        The schema type identifier to create the views for.

    If no schema type identifier is provided, the views for all of the item types are generated. If the views already exist, they are overwritten.

    Use the run_etl_toolkit_tool_as_dba client function to run the tool as the database administrator. For example:

    run_etl_toolkit_tool_as_dba \
        bash -c "/opt/i2/etltoolkit/enableMergedPropertyValues \
            -etd <> \
            -stid <>"
    

    For more information about correlation, see Information Store data correlation.

    © N. Harris Computer Corporation