Modify clp/clp-s binaries to read database credentials from environment variables and connection parameters from command line #1146

@junhaoliao

Request

Currently, the clp and clp-s binaries read database credentials and connection parameters from YAML configuration files. This approach has several drawbacks:

  1. Credentials are stored in plain-text files, which is a security risk.
  2. It is inconsistent with the common practice of passing sensitive information through environment variables.
  3. It complicates deployment in containerized environments.

The goal is to modify these binaries to:

  1. Read database credentials (username, password) from environment variables
  2. Accept other database connection parameters (host, port, database name, etc.) as command-line arguments
  3. Remove support for YAML configuration files for database credentials and connection parameters
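
As a rough illustration of the split described above, here is a minimal Python sketch in the style of the repository's helper scripts. It is not the proposed C++ implementation. The flag names (--db-host, --db-port, --db-name), the DbConfig class, the load_db_config function, and the default port are all hypothetical; only the CLP_DB_USERNAME and CLP_DB_PASSWORD variable names come from this issue.

```python
import argparse
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class DbConfig:
    """Hypothetical container for the resolved database configuration."""

    host: str
    port: int
    name: str
    username: str
    # Keep the password out of repr() so it does not leak into logs.
    password: str = field(repr=False)


def load_db_config(
    argv: Sequence[str], environ: Optional[Mapping[str, str]] = None
) -> DbConfig:
    """Read connection parameters from argv and credentials from the environment."""
    environ = os.environ if environ is None else environ

    parser = argparse.ArgumentParser(description="DB config sketch")
    # Flag names are assumptions for illustration, not existing clp options.
    parser.add_argument("--db-host", required=True)
    parser.add_argument("--db-port", type=int, default=3306)  # assumed default
    parser.add_argument("--db-name", required=True)
    args = parser.parse_args(argv)

    # Credentials never appear on the command line; fail early if they are unset.
    missing = [
        var for var in ("CLP_DB_USERNAME", "CLP_DB_PASSWORD") if not environ.get(var)
    ]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))

    return DbConfig(
        host=args.db_host,
        port=args.db_port,
        name=args.db_name,
        username=environ["CLP_DB_USERNAME"],
        password=environ["CLP_DB_PASSWORD"],
    )
```

Passing the environment as a parameter keeps the function easy to test without mutating os.environ.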

Possible implementation

  1. Enhance GlobalMetadataDBConfig class:

    • Remove the parse_config_file method
    • Add methods to initialize from command-line arguments and environment variables:
      • Add a method to register command-line options with boost::program_options
      • Add a method to initialize the config from parsed command-line options
      • Add methods to read credentials from environment variables:
        • CLP_DB_USERNAME for database username
        • CLP_DB_PASSWORD for database password
    • Add methods to validate that all required parameters are provided
  2. clp: Update command-line argument parsing in clp/CommandLineArguments.cpp to:

    • Use GlobalMetadataDBConfig to register database command-line options
    • Initialize GlobalMetadataDBConfig with parsed arguments and environment variables
    • Remove any code related to parsing database config files
  3. clp-s: Update command-line argument parsing in clp_s/CommandLineArguments.cpp to:

    • Use GlobalMetadataDBConfig to register database command-line options
    • Initialize database configuration with parsed arguments and environment variables
    • Remove any code related to parsing database config files
  4. clp-s/indexer: Update command-line argument parsing in clp_s/indexer/CommandLineArguments.cpp to:

    • Use GlobalMetadataDBConfig to register database command-line options
    • Initialize database configuration with parsed arguments and environment variables
    • Remove any code related to parsing database config files
  5. Update the manual DB migration script: components/core/tools/scripts/db/init-db.py

    • Accept DB config via command line and environment variables
  6. Update wrapper scripts: components/job-orchestration/job_orchestration/executor/compress/compression_task.py

    • Pass DB config via command line and environment variables
  7. Dependency cleanup:

    • Since we're removing YAML configuration file support, we should also remove the yaml-cpp dependency from components that no longer need it
  8. Documentation updates:

    • Remove components/core/config/metadata-db.yml as it's no longer needed
    • Update relevant documentation in docs/src to reflect the new way of configuring database connections
    • Update any references to the old YAML-based configuration approach
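
For the wrapper-script step, the following is a minimal sketch of how a Python caller might hand the configuration to a child process: connection parameters go on the command line, credentials go in the child's environment. The function name build_db_invocation and the --db-* flags are hypothetical, since the issue does not fix the option names; only CLP_DB_USERNAME and CLP_DB_PASSWORD are taken from the list above.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple


def build_db_invocation(
    binary: str,
    db_host: str,
    db_port: int,
    db_name: str,
    username: str,
    password: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (command, environment) for launching a binary with DB config."""
    # Non-sensitive connection parameters travel as command-line arguments.
    # The flag names below are assumptions for illustration.
    cmd = [
        binary,
        "--db-host", db_host,
        "--db-port", str(db_port),
        "--db-name", db_name,
    ]

    # Copy the parent environment so the caller's mapping is not mutated,
    # then add the credentials for the child process only.
    env = dict(os.environ if base_env is None else base_env)
    env["CLP_DB_USERNAME"] = username
    env["CLP_DB_PASSWORD"] = password
    return cmd, env
```

The returned pair can be passed to the standard library as subprocess.run(cmd, env=env, check=True). Keeping the password out of cmd means it does not show up in process listings or in logged command lines.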

Testing strategy

Add unit tests for GlobalMetadataDBConfig:

  • Test initialization from command-line arguments
  • Test reading credentials from environment variables
  • Test validation of required parameters

Labels: dependencies, documentation, enhancement