feat(clp-package): Add MCP Server config interface and Docker Compose orchestration.#1481
Conversation
… better reflect their purpose.
… by prefixing `_`; reorder methods.
… docstring for CLPConfig.transform_for_container
# Conflicts: # components/clp-package-utils/clp_package_utils/scripts/start_clp.py # components/clp-package-utils/clp_package_utils/scripts/stop_clp.py # components/clp-py-utils/clp_py_utils/clp_config.py # docs/src/dev-docs/building-package.md
…only base services when Presto query engine is configured.
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (5)
components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py (1)
41-46: Validate environment variables before use.Line 42 will raise
TypeErrorifCLP_LOGS_DIRis not set (Path(None)is invalid). Line 46 should provide a default forCLP_LOGGING_LEVEL.Apply defensive validation:
# Setup logging to file + clp_logs_dir = os.getenv("CLP_LOGS_DIR") + if clp_logs_dir is None: + logger.error("CLP_LOGS_DIR environment variable is not set.") + return 1 + log_file_path = Path(clp_logs_dir) / "mcp_server.log" - log_file_path = Path(os.getenv("CLP_LOGS_DIR")) / "mcp_server.log" + log_file_path.parent.mkdir(parents=True, exist_ok=True) logging_file_handler = logging.FileHandler(filename=log_file_path, encoding="utf-8") logging_file_handler.setFormatter(get_logging_formatter()) logger.addHandler(logging_file_handler) - set_logging_level(logger, os.getenv("CLP_LOGGING_LEVEL")) + set_logging_level(logger, os.getenv("CLP_LOGGING_LEVEL", "INFO"))tools/deployment/package/docker-compose.yaml (1)
146-165: Move healthcheck to the bottom for consistency.For consistency with other services like
query-scheduler(lines 61-68), move thehealthcheckblock below thecommandblock.Apply this reordering:
volumes: - *volume_clp_config_readonly - *volume_clp_logs - healthcheck: - <<: *healthcheck_defaults - test: [ - "CMD", - "curl", - "-f", - "http://mcp_server:8000/health" - ] depends_on: db-table-creator: condition: "service_completed_successfully" results-cache-indices-creator: condition: "service_completed_successfully" command: [ "python3", "-u", "-m", "clp_mcp_server.clp_mcp_server", "--host", "mcp_server", "--port", "8000", "--config-path", "/etc/clp-config.yml", ] + healthcheck: + <<: *healthcheck_defaults + test: [ + "CMD", + "curl", + "-f", + "http://mcp_server:8000/health" + ]components/clp-package-utils/clp_package_utils/controller.py (3)
533-538: Update docstring for consistency.The docstring should match the pattern used in other component setup methods (e.g.,
_set_up_env_for_reducerat line 385).Apply this change:
def _set_up_env_for_mcp_server(self) -> EnvVarsDict: """ - Sets up environment variables and directories for the MCP server component. + Sets up environment variables for the MCP server component. :return: Dictionary of environment variables necessary to launch the component. """
540-560: Fix logging call, reorder setup steps, add chown, and alphabetize env groups.Multiple issues need addressing:
- Line 540: f-string in logging violates Ruff G004
- Lines 542-545:
validate_mcp_server_configshould be called beforemkdir(per junhaoliao's comment and the pattern that validation should verify the path could be created, not that it already exists)- Missing
_chown_paths_if_root(logs_dir)call after mkdir- Lines 549-558: Env var groups should be alphabetized (Connection before Logging)
Based on learnings
Apply these fixes:
component_name = MCP_SERVER_COMPONENT_NAME - logger.info(f"Setting up environment for {component_name}...") + logger.info("Setting up environment for %s...", component_name) logs_dir = self._clp_config.logs_directory / component_name + validate_mcp_server_config(self._clp_config, logs_dir) logs_dir.mkdir(parents=True, exist_ok=True) - - validate_mcp_server_config(self._clp_config, logs_dir) + _chown_paths_if_root(logs_dir) env_vars = EnvVarsDict() # Connection config env_vars |= { "CLP_MCP_HOST": _get_ip_from_hostname(self._clp_config.mcp_server.host), "CLP_MCP_PORT": str(self._clp_config.mcp_server.port), } # Logging config env_vars |= { "CLP_MCP_LOGGING_LEVEL": self._clp_config.mcp_server.logging_level, }Note: This assumes
validate_mcp_server_configingeneral.pyhas been updated to check if the path could be a directory rather than checking if it already exists (per junhaoliao's comment on lines 542-545).
661-662: Consider adding explicitenabledflag for MCP server.The current check
len(self._clp_config.mcp_server.model_fields_set) > 0works but is implicit. An explicitenabled: boolfield in theMcpServerconfig model would make the intent clearer and align with junhaoliao's suggestion.Example config structure:
mcp_server: enabled: false host: "localhost" port: 8000This would allow:
if self._clp_config.mcp_server.enabled: cmd += ["--profile", "mcp"]This is a design decision for the maintainers to consider.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py(2 hunks)components/clp-package-utils/clp_package_utils/controller.py(5 hunks)tools/deployment/package/docker-compose.yaml(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-20T20:19:40.504Z
Learnt from: 20001020ycx
PR: y-scope/clp#1436
File: components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py:22-27
Timestamp: 2025-10-20T20:19:40.504Z
Learning: In the clp-mcp-server project (components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py), the --config Click option should use exists=True to validate the configuration file path at option processing time, even though this requires the default path to exist on all machines.
Applied to files:
components/clp-package-utils/clp_package_utils/controller.py
📚 Learning: 2025-09-28T15:00:22.170Z
Learnt from: LinZhihao-723
PR: y-scope/clp#1340
File: components/job-orchestration/job_orchestration/executor/compress/compression_task.py:528-528
Timestamp: 2025-09-28T15:00:22.170Z
Learning: In components/job-orchestration/job_orchestration/executor/compress/compression_task.py, there is a suggestion to refactor from passing logger as a parameter through multiple functions to creating a ClpCompressor class that takes the logger as a class member, with current helper functions becoming private member functions.
Applied to files:
components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py
📚 Learning: 2025-08-08T06:59:42.436Z
Learnt from: junhaoliao
PR: y-scope/clp#1152
File: components/clp-package-utils/clp_package_utils/scripts/start_clp.py:613-613
Timestamp: 2025-08-08T06:59:42.436Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/start_clp.py, generic_start_scheduler sets CLP_LOGGING_LEVEL using clp_config.query_scheduler.logging_level for both schedulers; compression scheduler should use its own logging level. Tracking via an issue created from PR #1152 discussion.
Applied to files:
components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py
🧬 Code graph analysis (2)
components/clp-package-utils/clp_package_utils/controller.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
validate_mcp_server_config(588-595)
components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py (1)
components/clp-py-utils/clp_py_utils/clp_logging.py (3)
get_logger(18-26)get_logging_formatter(14-15)set_logging_level(29-35)
🪛 Ruff (0.14.1)
components/clp-package-utils/clp_package_utils/controller.py
540-540: Logging statement uses f-string
(G004)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: package-image
Co-authored-by: Junhao Liao <junhao@junhao.ca>
…/clp into 9-30-clp-package-integration
junhaoliao
left a comment
There was a problem hiding this comment.
the final batch of comments. sorry for missing those earlier
junhaoliao
left a comment
There was a problem hiding this comment.
the final batch of comments. sorry for missing those earlier
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
579-585: Addtransform_for_container()method to McpServer.The
McpServerclass is missing atransform_for_container()method that all similar components implement. Without it, the MCP server will attempt to connect tolocalhost:8000in the container instead of using the Docker service hostnamemcp_server:8000.Apply this diff to add the method:
class McpServer(BaseModel): DEFAULT_PORT: ClassVar[int] = 8000 host: DomainStr = "localhost" port: Port = DEFAULT_PORT logging_level: LoggingLevel = "INFO" + + def transform_for_container(self): + self.host = MCP_SERVER_COMPONENT_NAME + self.port = self.DEFAULT_PORTAdditionally, ensure
CLPConfig.transform_for_container()calls this method. After line 807, add:self.query_scheduler.transform_for_container() self.reducer.transform_for_container() + if self.mcp_server is not None: + self.mcp_server.transform_for_container()components/clp-package-utils/clp_package_utils/controller.py (2)
540-543: Fix f-string logging violations.Lines 540 and 543 use f-strings in logging statements, violating Ruff rule G004. Use parameterized logging instead.
Apply this diff:
if self._clp_config.mcp_server is None: - logger.info(f"The MCP Server is not configured, skipping {component_name} creation...") + logger.info("The MCP Server is not configured, skipping %s creation...", component_name) return EnvVarsDict() - logger.info(f"Setting up environment for {component_name}...") + logger.info("Setting up environment for %s...", component_name)
545-547: Add_chown_paths_if_root()call after creating logs directory.The logs directory ownership should be corrected for root execution, consistent with other component setup methods.
Apply this diff:
logs_dir = self._clp_config.logs_directory / component_name validate_mcp_server_config(self._clp_config, logs_dir) logs_dir.mkdir(parents=True, exist_ok=True) + _chown_paths_if_root(logs_dir)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
components/clp-package-utils/clp_package_utils/controller.py(5 hunks)components/clp-py-utils/clp_py_utils/clp_config.py(3 hunks)components/package-template/src/etc/clp-config.yml(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-20T20:19:40.504Z
Learnt from: 20001020ycx
PR: y-scope/clp#1436
File: components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py:22-27
Timestamp: 2025-10-20T20:19:40.504Z
Learning: In the clp-mcp-server project (components/clp-mcp-server/clp_mcp_server/clp_mcp_server.py), the --config Click option should use exists=True to validate the configuration file path at option processing time, even though this requires the default path to exist on all machines.
Applied to files:
components/clp-package-utils/clp_package_utils/controller.py
🧬 Code graph analysis (1)
components/clp-package-utils/clp_package_utils/controller.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
validate_mcp_server_config(588-593)
🪛 Ruff (0.14.1)
components/clp-package-utils/clp_package_utils/controller.py
540-540: Logging statement uses f-string
(G004)
663-663: subprocess call: check for execution of untrusted input
(S603)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: package-image
🔇 Additional comments (6)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
38-38: LGTM! The constant follows naming conventions.The placement of
MCP_SERVER_COMPONENT_NAMEafterWEBUI_COMPONENT_NAMEand beforeGARBAGE_COLLECTOR_COMPONENT_NAMEaligns with the logical grouping of optional/ancillary services.
623-623: LGTM! Optional configs are properly grouped.The placement of
mcp_serverwith other optional configurations (immediately beforepresto) follows the established pattern for optional components.components/clp-package-utils/clp_package_utils/controller.py (4)
22-22: LGTM! Imports are correct.The new imports for
MCP_SERVER_COMPONENT_NAMEandvalidate_mcp_server_configare properly placed and alphabetically ordered within their respective import groups.Also applies to: 51-51
663-664: LGTM! Conditional profile activation is correct.The MCP profile is conditionally enabled only when
mcp_serveris configured, aligning with the "usenullto disable" pattern discussed in past reviews.Note: The S603 static analysis warning is a false positive—the subprocess command is constructed from trusted configuration values and string constants.
788-788: LGTM! Method call is correctly placed.The call to
_set_up_env_for_mcp_server()is properly sequenced with other component setups, and the null check is handled within the method itself.
551-560: Consider alphabetizing environment variable groups.For consistency with junhaoliao's feedback, place the Connection config group before the Logging config group alphabetically.
Based on learnings
Apply this diff:env_vars = EnvVarsDict() # Connection config env_vars |= { "CLP_MCP_HOST": _get_ip_from_hostname(self._clp_config.mcp_server.host), "CLP_MCP_PORT": str(self._clp_config.mcp_server.port), } # Logging config env_vars |= { "CLP_MCP_LOGGING_LEVEL": self._clp_config.mcp_server.logging_level, }Note: This is already in alphabetical order (Connection before Logging), so this comment is to confirm the current structure aligns with the suggested pattern.
Likely an incorrect or invalid review comment.
| #mcp_server: | ||
| # host: "localhost" | ||
| # port: 8000 | ||
| # logging_level: "INFO" |
There was a problem hiding this comment.
credit goes to rabbit
| mcp-server: | ||
| <<: *service_defaults | ||
| hostname: "mcp_server" | ||
| profiles: ["mcp"] |
There was a problem hiding this comment.
RFC: Standardize optional service enable/disable mechanisms in Docker Compose
We currently have three different mechanisms for optionally enabling or disabling services in our Docker Compose setup:
- Replica count toggling (0/1)
- Example: Garbage Collector
- Controlled via env var
deploy.replicas = 1or0
- Profiles
- Example: MCP server
- Enabled via
--profile=mcp - Requires controller logic to inject flags at startup
- Separate overlay compose files
- Example: Query services (and soon compression services via spider integration)
- Additional
docker-compose-XYZ.yamlextending the base - Enabled by passing
-f docker-compose-XYZ.yamlat runtime - Used for toggling groups of services when the composition varies significantly
Each mechanism serves a slightly different use case, but the result is fragmented enable/disable logic, increasing maintenance burden and making controller code inconsistent.
Problem
Because we use multiple patterns to control optional service activation:
- The controller code becomes inconsistent.
- Replica toggling uses environment setup functions (
CLP_GARBAGE_COLLECTOR_ENABLEDinside_set_up_env_for_garbage_collector()). - Profiles require injecting CLI flags in the
start()path (cmd += ["--profile", "mcp"]), which is less isolated.
- Replica toggling uses environment setup functions (
- It's harder to reason about which services support which activation pattern.
- Adding new optional services may copy-paste the existing inconsistency.
Proposal
Adopt (1) and (3) as the two primary supported mechanisms, and phase out (2) profiles.
- Use (1) replica count toggling for individual services that are simply enabled or disabled.
- Use (3) overlay compose files for groups of services whose compositions vary significantly depending on the configuration (e.g., different schedulers with different service sets).
Remove or migrate existing usage of (2) profiles, since it introduces controller-side CLI logic and is less maintainable. i.e., Remove the MCP server’s profile-based activation and migrate it to mechanism (1) using an environment-variable-controlled replica count (e.g., CLP_MCP_ENABLED -> replicas 0 or 1).
@coderabbitai create an issue
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Create GitHub issue for Docker Compose standardization RFC
gh issue create \
--repo y-scope/clp \
--title "RFC: Standardise optional service enable/disable mechanisms in Docker Compose" \
--body "## Background
We currently have three different mechanisms for optionally enabling or disabling services in our Docker Compose setup:
1. **Replica count toggling (0/1)**
* Example: Garbage Collector
* Controlled via env var
* \`deploy.replicas = 1\` or \`0\`
2. **Profiles**
* Example: MCP server
* Enabled via \`--profile=mcp\`
* Requires controller logic to inject flags at startup
3. **Separate overlay compose files**
* Example: Query services (and soon compression services via spider integration)
* Additional \`docker-compose-XYZ.yaml\` extending the base
* Enabled by passing \`-f docker-compose-XYZ.yaml\` at runtime
* Used for toggling **groups of services** when the composition varies significantly
Each mechanism serves a slightly different use case, but the result is fragmented enable/disable logic, increasing maintenance burden and making controller code inconsistent.
## Problem
Because we use multiple patterns to control optional service activation:
1. The controller code becomes inconsistent.
* Replica toggling uses environment setup functions (\`CLP_GARBAGE_COLLECTOR_ENABLED\` inside \`_set_up_env_for_garbage_collector()\`).
* Profiles require injecting CLI flags in the \`start()\` path (\`cmd += [\"--profile\", \"mcp\"]\`), which is less isolated.
2. It's harder to reason about which services support which activation pattern.
3. Adding new optional services may copy-paste the existing inconsistency.
## Proposal
Adopt (1) and (3) as the two primary supported mechanisms, and phase out (2) profiles.
* Use (1) replica count toggling for individual services that are simply enabled or disabled.
* Use (3) overlay compose files for groups of services whose compositions vary significantly depending on the configuration (e.g., different schedulers with different service sets).
Remove or migrate existing usage of (2) profiles, since it introduces controller-side CLI logic and is less maintainable. Specifically, remove the MCP server's profile-based activation and migrate it to mechanism (1) using an environment-variable-controlled replica count (e.g., \`CLP_MCP_ENABLED\` -> replicas 0 or 1).
---
**Related PR:** #1481
**Comment:** https://github.com/y-scope/clp/pull/1481#discussion_r2539586942
**Requested by:** @junhaoliao" \
--assignee junhaoliao
Length of output: 42
@junhaoliao I've created issue #1620 to track this RFC on standardising the Docker Compose service enable/disable mechanisms.
The issue captures your proposal to consolidate around replica count toggling for individual services and overlay compose files for service groups, while phasing out the profile-based approach currently used by the MCP server. It includes the complete problem statement, rationale, and proposed migration path.
🧠 Learnings used
Learnt from: 20001020ycx
Repo: y-scope/clp PR: 1368
File: components/clp-mcp-server/clp_mcp_server/__init__.py:11-15
Timestamp: 2025-10-02T15:48:58.961Z
Learning: In the clp-mcp-server component (components/clp-mcp-server/clp_mcp_server/__init__.py), the default host binding of 0.0.0.0 is intentional because the server is designed to be deployed in Docker containers where this binding is necessary to accept external connections.
… orchestration. (y-scope#1481) Co-authored-by: Junhao Liao <junhao@junhao.ca> Co-authored-by: Junhao Liao <junhao.liao@yscope.com> Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Description
Adding the MCP server as a service in docker-compose.yaml.
Checklist
breaking change.
Validation performed
task docker-images:packageto build the package../start-clp.shSummary by CodeRabbit
New Features
Documentation
Chores