Migrate start_clp.py to Docker Compose #1177

@junhaoliao

Request

Migrate the functionality of components/clp-package-utils/clp_package_utils/scripts/start_clp.py to use Docker Compose.

Currently, start_clp.py orchestrates the startup of multiple CLP components by:

  1. Reading and validating CLP configuration files
  2. Starting various services (database, queue, redis, results cache, schedulers, workers, etc.)
  3. Performing health checks to ensure services are ready
  4. Setting up appropriate Docker mounts and environment variables for each service

The goal is to replace this Python orchestration script with a Docker Compose file that provides the same functionality, plus an init script that handles configuration validation and setup tasks.

Possible implementation

1. Initialization script enhancement

Implement a dedicated initialization script to handle pre-startup configuration tasks:

Responsibilities:

  • Configuration validation: Verify system settings and credential integrity
  • Directory management: Create required directories and set appropriate permissions
  • Configuration generation: Generate component-specific configuration files and write to disk
    • UI configuration: Update on-disk web interface settings
  • Port validation: Validate port availability (potentially redundant if Docker Compose handles this)

Implementation approach:
The initial version will follow the current start_clp.py pattern and run directly on the host machine for simplicity. However, we should plan to move this functionality into a dedicated init container within the Docker Compose setup. That migration would remove the Python dependency from the host environment and improve isolation.
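As a rough illustration of the responsibilities listed above, a host-side init script could look something like the sketch below. The config keys (`data_dir`, `logs_dir`, `webui_port`) and the function names are hypothetical and do not reflect the actual CLP config schema or the existing start_clp.py code; configuration generation is left out.

```python
import socket
from pathlib import Path

# Hypothetical config keys; the real CLP config schema is not shown in this issue.
REQUIRED_KEYS = ("data_dir", "logs_dir", "webui_port")


def validate_config(config: dict) -> None:
    """Raise ValueError if a required key is missing or the port is invalid."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"Missing config keys: {', '.join(missing)}")
    port = config["webui_port"]
    if not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError(f"Invalid port: {port!r}")


def create_directories(config: dict) -> list:
    """Create the data and log directories (and parents) with mode 0o755."""
    created = []
    for key in ("data_dir", "logs_dir"):
        path = Path(config[key])
        path.mkdir(parents=True, exist_ok=True)
        path.chmod(0o755)
        created.append(path)
    return created


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def run_init(config: dict) -> None:
    """Run the pre-startup steps in order and fail fast on the first problem."""
    validate_config(config)
    create_directories(config)
    if not is_port_free(config["webui_port"]):
        raise RuntimeError(f"Port {config['webui_port']} is already in use")
```

The port check is inherently racy (another process can take the port between the check and container startup), which is one reason it may be redundant once Docker Compose reports bind failures itself.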

2. Health checks and service dependencies

Implement comprehensive Docker Compose dependency management using three distinct conditions:

Dependency conditions:

  • service_started: Container has started (readiness is not checked)
  • service_healthy: Health checks pass (mysqladmin ping, rabbitmq-diagnostics, redis-cli ping, mongosh ping)
  • service_completed_successfully: One-time initialization jobs completed

Structured dependency hierarchy:

  1. Infrastructure services (service_healthy):

    • database
    • queue
    • redis
    • results_cache
  2. Initialization jobs (service_completed_successfully):

    • db_table_creator
    • results_cache_indices_creator
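  3. Application services (mixed conditions, e.g. service_healthy on infrastructure services and service_completed_successfully on initialization jobs):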

    • compression_scheduler
    • query_scheduler
    • compression_worker
    • query_worker
    • reducer
    • webui
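To make the hierarchy concrete, the sketch below builds the `services` section of a Compose file as a Python dict and prints it as JSON (which is also valid YAML). The service names come from the list above; everything else is an assumption for illustration: the exact health-check arguments, the interval and retry values, and in particular which application service depends on which infrastructure service. Required keys such as `image` and `command` are omitted.

```python
import json

# Health-check commands named in this issue; exact flags/credentials are assumptions.
HEALTHCHECKS = {
    "database": ["CMD", "mysqladmin", "ping"],
    "queue": ["CMD", "rabbitmq-diagnostics", "ping"],
    "redis": ["CMD", "redis-cli", "ping"],
    "results_cache": ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"],
}


def depends_on(healthy=(), completed=(), started=()):
    """Build a long-form Compose `depends_on` mapping from three name lists."""
    deps = {}
    for name in healthy:
        deps[name] = {"condition": "service_healthy"}
    for name in completed:
        deps[name] = {"condition": "service_completed_successfully"}
    for name in started:
        deps[name] = {"condition": "service_started"}
    return deps


def build_compose():
    services = {}
    # 1. Infrastructure services expose a health check.
    for name, test in HEALTHCHECKS.items():
        services[name] = {
            "healthcheck": {"test": test, "interval": "5s", "timeout": "3s", "retries": 10}
        }
    # 2. One-shot initialization jobs wait for their backing store to be healthy.
    services["db_table_creator"] = {"depends_on": depends_on(healthy=["database"])}
    services["results_cache_indices_creator"] = {
        "depends_on": depends_on(healthy=["results_cache"])
    }
    # 3. Application services (dependency choices below are hypothetical).
    for scheduler in ("compression_scheduler", "query_scheduler"):
        services[scheduler] = {
            "depends_on": depends_on(healthy=["queue", "redis"], completed=["db_table_creator"])
        }
    for worker in ("compression_worker", "query_worker"):
        services[worker] = {"depends_on": depends_on(healthy=["queue", "redis"])}
    services["reducer"] = {"depends_on": depends_on(started=["query_scheduler"])}
    services["webui"] = {
        "depends_on": depends_on(
            healthy=["database", "results_cache"],
            completed=["db_table_creator", "results_cache_indices_creator"],
        )
    }
    return {"services": services}


if __name__ == "__main__":
    print(json.dumps(build_compose(), indent=2))
```

In practice this would more likely be written directly as a static YAML file; generating it from Python is only a compact way to show all three conditions side by side.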

3. Network isolation

Establish a contained network environment with the following principles:

  • Isolated service communication: Services communicate exclusively through Docker-managed networks
  • Minimal host exposure: Only user-facing ports are mapped to the host system
  • Service discovery: Services reach each other by their Compose service names, which Docker resolves on the shared network, rather than through host networking
  • Security benefits: Reduced attack surface through network isolation

This approach ensures proper service separation while maintaining accessibility for user interfaces.
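A minimal sketch of these principles, again expressed as a Python dict that mirrors a Compose file: every service joins one user-defined bridge network, and only the user-facing service publishes a port. The network name `clp_net` and the `4000:4000` port mapping are placeholders, not values taken from CLP.

```python
import json

ALL_SERVICES = [
    "database", "queue", "redis", "results_cache",
    "compression_scheduler", "query_scheduler",
    "compression_worker", "query_worker", "reducer", "webui",
]

# Hypothetical mapping of user-facing services to published ports.
USER_FACING_PORTS = {"webui": ["4000:4000"]}


def build_network_config(service_names, user_facing_ports, network="clp_net"):
    """Attach every service to one bridge network; publish ports only where listed."""
    services = {}
    for name in service_names:
        service = {"networks": [network]}
        if name in user_facing_ports:
            service["ports"] = list(user_facing_ports[name])
        services[name] = service
    return {"services": services, "networks": {network: {"driver": "bridge"}}}


def exposed_services(compose):
    """Return the sorted names of services that publish ports to the host."""
    return sorted(name for name, svc in compose["services"].items() if "ports" in svc)


if __name__ == "__main__":
    compose = build_network_config(ALL_SERVICES, USER_FACING_PORTS)
    print(json.dumps(compose, indent=2))
    print("Published to host:", exposed_services(compose))
```

With a layout like this, a scheduler would connect to the database as `database:<port>` over the internal network instead of a host-mapped port, and a check such as `exposed_services` could serve as a simple guard against accidentally publishing an internal service.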

Labels: enhancement
