Deploying with an external database instance
You can deploy analyze-deployment-tooling to connect to a database instance that is not managed by analyze-deployment-tooling. When you deploy analyze-deployment-tooling it can create and manage the database, but it does not manage the database instance.
The configuration that determines if an external database instance is per config.
The prerequisites for an external database instance:
- The external database must be able to communicate with the analyze-deployment-tooling containers on the
eiaDocker network and the containers must be able to communicate with it.
Process overview:
- Generate and run the database creation scripts.
- Create a dba user.
- Deploy analyze-deployment-tooling with the external database instance.
Generating and running the database creation scripts
If you are deploying with an external database, you must configure the external database and generate the scripts to create the database.
Update the analyze-deployment-tooling config with the details of your external database system:
In the
/configs/<config-name>/configuration/InfoStoreNames<db_dialect>.propertiesfile, specify the name of the tablespace to use in your external system: For example:# The default tablespace for the Information Store database. DatabaseTablespace=istore # The tablespace for big data tables. BigDataTablespace=istore # The tablespace for indexes on big data tables. BigIndexTablespace=istore # The tablespace for small data tables. SmallDataTablespace=istore # The tablespace for indexes on small data tables. SmallIndexTablespace=istore # Tablespace Locations istoreTablespaceLocationDir=/var/lib/postgresql/data/istoreNOTE: You will need to create the tablespace in your external database system before executing database creation scripts.
In the
/configs/<config-name>/utils/variables.conffile specify values for the following variables:Where:
DB_USER_MANAGEDspecifies whether the config uses an external database. The possible values aretrueorfalse.DB_NAMEspecifies the name of the Information Store database. By default the value isISTORE.DB_FQDNis the fully-qualified domain name of the external database server.DB_PORTis the port number to connect to the external database.DB_SSL_CONNECTIONspecifies whether the connection to the external database is secure. The possible values aretrueorfalse. When set to true, you must provide the CA's public certificate in theconfigs/<config-name>/configuration/secrets/<db-dialect>/directory to ensure trust.For example:
DB_USER_MANAGED="true" DB_NAME="ISTORE" DB_FQDN="postgres.eia" DB_PORT="5432" DB_SSL_CONNECTION="false"
To generate the scripts for your config, run the following command:
deploy -t generate-db-scriptsThe scripts are generated in the
configs/<config-name>/database-scripts/generated/staticdirectory. Copy the generated scripts to a location where you can run them against your external database system. Run all the scripts that are listed inrunDatabaseCreationScripts.shagainst the external database system. You can find the script in theconfigs/<config-name>/database-scripts/generateddirectory.
Creating the dba user
To create the dba user, run the create_dba_login_and_user.sh script against your external database system.
The script is generated in the configs/<config-name>/database-scripts/generated/static directory.
NOTE:
- The script is generated with a placeholder for the dba password, please replace the placeholder with the actual password before executing the script.
- You can set the
DBA_USERNAMEin theconfigs/<config-name>/utils/variables.conf, to overwrite the default value ofdba_${DB_NAME}.
Deploying analyze-deployment-tooling with an external database instance
Provide the external database dba user's password.
Create the directory and file to store the password secret.
- For PostgreSQL:
configs/<config-name>/configuration/secrets/postgres/dba_PASSWORD For SQL Server:
configs/<config-name>/configuration/secrets/sqlserver/dba_PASSWORDImportant: If you changed the default dba username, you must update the file name to match the new username. E.g.
DBA_USERNAME="dba_istore"would require the fileconfigs/<config-name>/configuration/secrets/postgres/dba_istore_PASSWORD.
- For PostgreSQL:
In the password file, provide the user's password.
The password is stored in plain text, do not store this file in source control.Edit the
configs/<config-name>/utils/variables.conffile to provide the required information.Variable Description DB_USER_MANAGEDCan be set to trueorfalse. If set totrue, analyze-deployment-tooling does not manage the database instance.DB_FQDNThe fully qualified domain name of the database instance. DB_PORTThe port number to connect to the database instance. DB_SSL_CONNECTIONCan be set to trueorfalse. If set tofalsethen traffic to/from the instance will be sent in clear text. If set totruethen traffic to/from the instance will be encrypted via SSL.DB_LOCATION_DIRThe location in the external database instance to create the database. DBA_USERNAMEThe dba username to connect to the database instance. An example PostgreSQL configuration:
DB_USER_MANAGED="true" DB_FQDN="external-postgres.eia" DB_PORT="5432" DB_SSL_CONNECTION="true" DB_LOCATION_DIR="/var/lib/postgresql/data" DBA_USERNAME="dba_istore"An example SQL Server configuration:
DB_USER_MANAGED="true" DB_FQDN="external-sqlserver.eia" DB_PORT="1433" DB_SSL_CONNECTION="true" DB_LOCATION_DIR="/var/opt/mssql/data" DBA_USERNAME="dba_istore"Note:
- If your external database is hosted in a Docker container on a different network to your config development environment, connect your external database container to the
eianetwork. For example, rundocker network connect eia <external-database-container>. - If your database is running on your host machine, use the
host.docker.internalas theDB_FQDNvalue.
- If your external database is hosted in a Docker container on a different network to your config development environment, connect your external database container to the
Configure SSL.
If your external database uses SSL (you have
DB_SSL_CONNECTION="true") then the certificate must be one that the deployment "trusts". For security reasons, certificate verification is not optional; there is no "ignore verification and automatically trust all certificates" option.If your database's SSL certificate is not issued by a "well known" issuer (e.g. it is a self-signed certificate, or signed by a corporate internal CA rather than a public commercial provider) then you will need to provide the certificate(s) to the deployment so that the deployment knows to "trust" it:
- For a certificate that has been signed by an untrusted (self-signed or private) CA, add the (public part of the) CA certificate to
configs/<config-name>/configuration/secrets/additional-trust-certificates.cer. - This file is a text file in OpenSSL PEM format, i.e. certificates should start "
-----BEGIN CERTIFICATE-----". - Do NOT include the private key to the deployment. That should be private to your database (and only your database).
- For a certificate that has been signed by an untrusted (self-signed or private) CA, add the (public part of the) CA certificate to
Deploy the config using the
deploycommand in the usual way. For example:deployIf you have multiple configs, specify the config name with
-c <config-name>.
Additional PostgreSQL database configuration
- Configure permissions for data ingestion.
If you are using PostgreSQL as your external database, you must configure the permissions for data ingestion. You will need to execute the following SQL commands on your external database as a superuser:
Where ${ETL_USERNAME} is the username that you have configured in the variables.conf file, or is default to etl_${DB_NAME}.
GRANT pg_read_server_files TO ${ETL_USERNAME}_role;
- Configure pg_cron extension.
If you are using PostgreSQL as your external database, you must configure the pg_cron extension. This extension is a job scheduling extension for PostgreSQL, which runs inside the database and is used to schedule SQL commands.
To enable the pg_cron extension and grant usage to the DBA user, you can use the following commands in you local database:
CREATE EXTENSION IF NOT EXISTS pg_cron;
GRANT USAGE ON SCHEMA cron TO ${DBA_USERNAME}_role;
If you are using AWS RDS, you can enable the pg_cron extension by adding it to the shared_preload_libraries parameter in your RDS DB Parameter Group:
RdsDbParameterGroup:
Type: AWS::RDS::DBParameterGroup
Properties:
Parameters:
shared_preload_libraries: pg_cron
For other cloud providers, refer to their documentation on how to enable the pg_cron extension.