Skip to content

refactor(deployment): Refactor Package Docker Compose files using service-level compositions.#1522

Merged
sitaowang1998 merged 8 commits into
y-scope:mainfrom
sitaowang1998:compose-extends
Oct 30, 2025
Merged

refactor(deployment): Refactor Package Docker Compose files using service-level compositions.#1522
sitaowang1998 merged 8 commits into
y-scope:mainfrom
sitaowang1998:compose-extends

Conversation

@sitaowang1998

@sitaowang1998 sitaowang1998 commented Oct 29, 2025

Copy link
Copy Markdown
Contributor

Description

This PR improves docker compose file extensibility by:

  • Moving all services' default definitions into a docker-compose-all.yaml.
  • Using extends to include default services in the docker compose files.

This PR also includes minor improvements:

  • Renames docker-compose.base.yaml to docker-compose-base.yaml.
  • Uses a separate function in controller.py to find the docker file name.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Compression and search work on full deployment.
  • Containers up successfully on base deployment.
  • GitHub workflows pass.

Summary by CodeRabbit

  • New Features

    • Added query scheduler and query worker services for distributed query processing capabilities.
    • Added reducer service for result aggregation and processing.
    • Added MCP server service for enhanced protocol integrations.
  • Refactor

    • Restructured deployment configuration for improved maintainability and modularity across services.

@sitaowang1998 sitaowang1998 requested a review from a team as a code owner October 29, 2025 21:20
@coderabbitai

coderabbitai Bot commented Oct 29, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

A refactoring of the Docker Compose configuration structure introduces a hierarchical composition pattern. A new helper method in the controller selects between deployment-type-specific compose files. Service definitions are centralized in a new docker-compose-all.yaml file, with docker-compose-base.yaml and docker-compose.yaml extending from it. Four new services (query-scheduler, query-worker, reducer, mcp-server) are added alongside a shared utilities file for common volumes.

Changes

Cohort / File(s) Summary
Controller helper method
components/clp-package-utils/clp_package_utils/controller.py
Added private method _get_docker_file_name(self) -> str to DockerComposeController that returns docker-compose-base.yaml for BASE deployment type, otherwise docker-compose.yaml. The start() method refactored to use this helper instead of inline conditional logic.
Centralized service definitions
tools/deployment/package/docker-compose-all.yaml
New file containing all service definitions. Project renamed from clp-package-base to clp-package-all. Removed top-level empty volumes block. Added four new services: query-scheduler (with broker/Redis, healthcheck, scheduling entrypoint), query-worker (with Celery worker configuration), reducer (with concurrency and upsert settings), and mcp-server (with MCP profile and host port mapping).
Shared utilities
tools/deployment/package/docker-compose.utils.yaml
New file defining a tmpfs-backed empty volume for use when bind mounts are not desired.
Base and default deployment configurations
tools/deployment/package/docker-compose-base.yaml, tools/deployment/package/docker-compose.yaml
docker-compose-base.yaml added as a new file that includes docker-compose.utils.yaml and defines base services extending from docker-compose-all.yaml. docker-compose.yaml refactored to include docker-compose.utils.yaml and uses extends references to docker-compose-all.yaml for all services instead of inline configurations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Controller method: Straightforward addition with clear logic; verify that deployment_type comparison handles all expected values.
  • docker-compose-all.yaml: Review new service configurations (query-scheduler, query-worker, reducer, mcp-server) for correctness of entrypoints, dependencies, healthchecks, and resource settings. Verify that removed volumes block does not break existing functionality.
  • docker-compose-base.yaml and docker-compose.yaml: Verify that all service extends references are correctly structured and that the inheritance chain from docker-compose-all.yaml produces the intended effective configurations. Ensure no configuration gaps or unintended overrides occur.
  • docker-compose.utils.yaml: Confirm that the tmpfs volume definition is correctly specified and that all services that require it reference it appropriately.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "refactor(deployment): Refactor Package Docker Compose files using service-level compositions" accurately describes the primary objective of the changeset. The main work involves refactoring Docker Compose configuration files to centralize service definitions in docker-compose-all.yaml and use the extends mechanism across service-level compose files, which directly aligns with the title's emphasis on "service-level compositions." The title is concise, uses proper conventional commit formatting, and provides sufficient specificity for a developer scanning history to understand the core refactoring effort.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76aa39e and bf47641.

📒 Files selected for processing (3)
  • tools/deployment/package/docker-compose-base.yaml (1 hunks)
  • tools/deployment/package/docker-compose.utils.yaml (1 hunks)
  • tools/deployment/package/docker-compose.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
PR: y-scope/clp#1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • tools/deployment/package/docker-compose.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (3)
tools/deployment/package/docker-compose-base.yaml (1)

1-55: Solid service composition structure for base deployment.

The docker-compose-base.yaml file correctly centralizes the base service definitions through the extends mechanism from docker-compose-all.yaml. The include of docker-compose.utils.yaml provides access to shared utilities. All 10 base services follow a consistent pattern.

tools/deployment/package/docker-compose.yaml (2)

3-4: Appropriate include of shared utilities.

The include of docker-compose.utils.yaml enables access to the shared volumes section across FULL deployment services, aligning with the new composition structure.


7-75: Consistent extends pattern across all services.

All 14 services (including the four new ones: query-scheduler, query-worker, reducer, mcp-server) correctly extend their definitions from docker-compose-all.yaml using a uniform pattern. This aligns well with the refactoring goal of centralizing service definitions.

Please verify that all 14 services referenced here are defined in docker-compose-all.yaml with matching service names and that no mismatches exist between the extends declarations and the actual service definitions.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56ebca1 and 85474f0.

📒 Files selected for processing (4)
  • components/clp-package-utils/clp_package_utils/controller.py (2 hunks)
  • tools/deployment/package/docker-compose-all.yaml (1 hunks)
  • tools/deployment/package/docker-compose-base.yaml (1 hunks)
  • tools/deployment/package/docker-compose.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • tools/deployment/package/docker-compose-base.yaml
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
PR: y-scope/clp#1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • tools/deployment/package/docker-compose-base.yaml
  • tools/deployment/package/docker-compose.yaml
🧬 Code graph analysis (1)
components/clp-package-utils/clp_package_utils/controller.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
  • get_deployment_type (777-781)
  • DeploymentType (90-92)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: package-image
🔇 Additional comments (9)
components/clp-package-utils/clp_package_utils/controller.py (2)

797-804: LGTM! Clean separation of compose file selection logic.

The method correctly maps deployment types to their respective compose files, making the logic explicit and maintainable.


746-746: LGTM! Proper integration of dynamic file selection.

The change correctly integrates the new method to select the appropriate compose file based on deployment type.

tools/deployment/package/docker-compose-all.yaml (5)

362-399: LGTM! Query scheduler service is well-configured.

The service definition follows established patterns from the compression-scheduler, with appropriate dependencies, environment variables, and health checks.


401-433: LGTM! Query worker configuration mirrors compression worker pattern.

The service appropriately adapts the compression-worker pattern for query workloads, with correct volume mounts for archive and stream access.


435-458: LGTM! Reducer service is properly configured.

The service correctly depends on the query-scheduler being healthy and has appropriate volume mounts for its operational needs.


460-496: LGTM! MCP server service follows established patterns.

The service is properly configured with conditional deployment via the "mcp" profile, matching the controller logic at line 747-748 of controller.py.


489-496: No issues found – curl is installed in the container image.

Verification confirms that curl is explicitly installed in the clp-package container image via the setup script (./tools/docker-images/clp-package/setup-scripts/install-prebuilt-packages.sh). The healthcheck at lines 489–496 will function correctly.

tools/deployment/package/docker-compose.yaml (1)

12-80: LGTM! Extends pattern correctly eliminates service definition duplication.

The use of extends to reference service definitions from docker-compose-all.yaml follows Docker Compose best practices and successfully removes redundant configuration.

tools/deployment/package/docker-compose-base.yaml (1)

1-60: LGTM! Base deployment configuration is correct.

The package name clp-package-presto is appropriate for BASE deployment (Presto), and the service selection correctly excludes query-related services that are not needed for Presto-based deployments. The extends pattern successfully reduces duplication while maintaining deployment-specific service composition.

Comment thread tools/deployment/package/docker-compose.yaml Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for these services, can we label with a profile in the "all.yaml"? like profiles: ["base"]

Then we don't need this "base.yaml". To launch bases services, we can just run docker compose -f docker-compose.all.yaml --profile base up

We should still keep the docker-compose.yaml file, there we reset the profiles of the services with the !reset [] syntax, so we can just use docker compose up to launch the default set of services

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sitaowang1998 sitaowang1998 Oct 30, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole point of this PR is to make the docker compose file easily extensible. Adding a base profile does not really help if we later introduces a new "profile". Then we need to introduce another profile, and maybe a new docker file as well with resets, making it hard to track how the variables are defined, and what docker compose file to use.

Using the extends approach, for each profile we have a single file listing all the services it needs and the override variables. This way is more structured than using the profile.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, that makes a lot of sense

Comment thread tools/deployment/package/docker-compose.yaml Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3cc9f7f and 3252f82.

📒 Files selected for processing (4)
  • tools/deployment/package/docker-compose-all.yaml (2 hunks)
  • tools/deployment/package/docker-compose-base.yaml (1 hunks)
  • tools/deployment/package/docker-compose.volume.yaml (1 hunks)
  • tools/deployment/package/docker-compose.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
PR: y-scope/clp#1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • tools/deployment/package/docker-compose.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: package-image

Comment thread tools/deployment/package/docker-compose.volume.yaml Outdated

@junhaoliao junhaoliao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes lgtm. i posted some stylistic comments

for the title, would this be better?

refactor(deployment): Refactor Package Docker Compose files using service-level compositions.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, that makes a lot of sense

Comment thread tools/deployment/package/docker-compose.volume.yaml Outdated
Comment thread tools/deployment/package/docker-compose.utils.yaml

@junhaoliao junhaoliao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(let's first see if the stylistic comments make sense and address as needed

Co-authored-by: Junhao Liao <junhao@junhao.ca>
@sitaowang1998 sitaowang1998 changed the title refactor(deployment): Use extends to remove redundant service definition. refactor(deployment): Refactor Package Docker Compose files using service-level compositions. Oct 30, 2025
@sitaowang1998 sitaowang1998 merged commit caa2f9d into y-scope:main Oct 30, 2025
20 checks passed
@sitaowang1998 sitaowang1998 deleted the compose-extends branch October 30, 2025 20:35
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026
…vice-level compositions. (y-scope#1522)

Co-authored-by: Junhao Liao <junhao@junhao.ca>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants