Request
Currently, the clp and clp-s binaries read database credentials and connection parameters from YAML configuration files. This approach has several drawbacks:
- Credentials are stored in plain text files, which can be a security risk
- It's not consistent with modern practices of using environment variables for sensitive information
- It makes deployment in containerized environments more complex
The goal is to modify these binaries to:
- Read database credentials (username, password) from environment variables
- Accept other database connection parameters (host, port, database name, etc.) as command-line arguments
- Remove support for YAML configuration files for database credentials
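As a rough illustration of this split, here is a minimal Python sketch of how a script could take connection parameters from the command line and credentials from the environment. Only CLP_DB_USERNAME and CLP_DB_PASSWORD come from this issue; the flag names (--db-host, --db-port, --db-name), the defaults, and the DbConfig / load_db_config names are assumptions made for the example.

```python
import argparse
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class DbConfig:
    """Hypothetical container for the resolved DB settings."""
    host: str
    port: int
    name: str
    username: str
    password: str


def load_db_config(argv: Sequence[str], environ: Mapping[str, str]) -> DbConfig:
    # Non-sensitive connection parameters come from the command line.
    # Flag names and defaults are assumptions, not an agreed interface.
    parser = argparse.ArgumentParser(description="DB connection options (sketch)")
    parser.add_argument("--db-host", default="localhost")
    parser.add_argument("--db-port", type=int, default=3306)
    parser.add_argument("--db-name", required=True)
    args = parser.parse_args(argv)

    # Credentials are read only from the environment, never from argv or a file.
    required_vars = ("CLP_DB_USERNAME", "CLP_DB_PASSWORD")
    missing = [var for var in required_vars if not environ.get(var)]
    if missing:
        raise ValueError(
            "Missing required environment variable(s): " + ", ".join(missing)
        )

    return DbConfig(
        host=args.db_host,
        port=args.db_port,
        name=args.db_name,
        username=environ["CLP_DB_USERNAME"],
        password=environ["CLP_DB_PASSWORD"],
    )
```

A caller would typically pass sys.argv[1:] and os.environ. Taking both as parameters keeps the function easy to exercise in tests without touching the real process environment.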
Possible implementation
- Enhance GlobalMetadataDBConfig class:
  - Remove the parse_config_file method
  - Add methods to initialize from command-line arguments and environment variables:
    - Add a method to register command-line options with boost::program_options
    - Add a method to initialize the config from parsed command-line options
  - Add methods to read credentials from environment variables:
    - CLP_DB_USERNAME for database username
    - CLP_DB_PASSWORD for database password
  - Add methods to validate that all required parameters are provided
- clp: Update command-line argument parsing in clp/CommandLineArguments.cpp to:
  - Use GlobalMetadataDBConfig to register database command-line options
  - Initialize GlobalMetadataDBConfig with parsed arguments and environment variables
  - Remove any code related to parsing database config files
- clp-s: Update command-line argument parsing in clp_s/CommandLineArguments.cpp to:
  - Use GlobalMetadataDBConfig to register database command-line options
  - Initialize database configuration with parsed arguments and environment variables
  - Remove any code related to parsing database config files
- clp-s/indexer: Update command-line argument parsing in clp_s/indexer/CommandLineArguments.cpp to:
  - Use GlobalMetadataDBConfig to register database command-line options
  - Initialize database configuration with parsed arguments and environment variables
  - Remove any code related to parsing database config files
- Update the manual db migration script: components/core/tools/scripts/db/init-db.py
  - Accept DB config via command line and environment variables
- Update wrapper scripts: Update components/job-orchestration/job_orchestration/executor/compress/compression_task.py:
  - Pass DB config via command line and environment variables
- Dependency cleanup:
  - Since we're removing YAML configuration file support, we should also remove the yaml-cpp dependency from components that no longer need it
- Documentation updates:
  - Remove components/core/config/metadata-db.yml as it's no longer needed
  - Update relevant documentation in docs/src to reflect the new way of configuring database connections
  - Update any references to the old YAML-based configuration approach
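For the wrapper-script item above, the following is a minimal Python sketch of how a caller such as compression_task.py might hand the DB config to a binary: connection parameters go on the command line and credentials go into the child process's environment. The function name build_clp_invocation and the --db-host, --db-port, and --db-name flags are hypothetical; only the CLP_DB_USERNAME and CLP_DB_PASSWORD variable names come from this issue.

```python
from typing import Dict, List, Mapping, Tuple


def build_clp_invocation(
    binary: str,
    db_host: str,
    db_port: int,
    db_name: str,
    username: str,
    password: str,
    base_env: Mapping[str, str],
) -> Tuple[List[str], Dict[str, str]]:
    """Return (cmd, env) for launching a binary with the DB config applied."""
    # Connection parameters are plain command-line arguments.
    # These flag names are assumptions for illustration only.
    cmd = [
        binary,
        "--db-host", db_host,
        "--db-port", str(db_port),
        "--db-name", db_name,
    ]

    # Credentials are passed through the environment so that they do not
    # appear in the argument list or in a config file on disk.
    env = dict(base_env)  # copy, so the caller's mapping is not mutated
    env["CLP_DB_USERNAME"] = username
    env["CLP_DB_PASSWORD"] = password
    return cmd, env
```

The returned pair could then be passed to subprocess.run(cmd, env=env, check=True), with os.environ as base_env so the child still inherits PATH and similar variables.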
Testing Strategy:
Add unit tests for GlobalMetadataDBConfig:
- Test initialization from command-line arguments
- Test reading credentials from environment variables
- Test validation of required parameters