Skip to content

feat(clp-package): Make API server optional and enforce storage engine compatibility; split clp-config templates by flavours.#1709

Merged
hoophalab merged 7 commits into
y-scope:mainfrom
hoophalab:twoconfig
Dec 2, 2025
Merged

Conversation

@hoophalab

@hoophalab hoophalab commented Dec 1, 2025

Copy link
Copy Markdown
Contributor

Description

  1. Add two new templates clp-config.template.text.yaml and clp-config.template.json.yaml
  2. Remove seds in clp-*-pkg-tar tasks.
  3. In clp-{flavour}-pkg-tar, rename clp-config.template.{flavour}.yaml as clp-config.yaml and delete two template files
  4. In task package, copy clp-config.template.{flavour}.yaml to clp-config.yaml
  5. Add a validation: if storage engine is clp, and api server isn't null, then raise an error

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. ls build/clp-package/etc/ has clp-config.yaml from json template
  2. task clp-json-pkg-tarhas clp-config.yaml from json template. template files are removed.
  3. task clp-text-pkg-tar has clp-config.yaml from text template. template files are removed.
  4. If api_server: null, then api server isn't started.
  5. If api_server is not null and storage engine is clp, then an error is raised.
  6. If the user do nothing and run clp-text-x86_64-v0.6.1.tar.gz, the api server won't boot

Summary by CodeRabbit

  • New Features

    • API server config is now optional; deployments can omit it.
    • Packaging now uses a config-template workflow for consistent configuration files.
  • Chores

    • Startup now skips API server setup when none is configured.
    • Improved config validation to surface incompatible API-server/storage combinations early.
    • Packaging tasks simplified (fewer required vars).

✏️ Tip: You can customize this high-level summary in your review settings.

@hoophalab hoophalab requested a review from a team as a code owner December 1, 2025 22:14
@coderabbitai

coderabbitai Bot commented Dec 1, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Replaces storage-engine-driven API server disabling with explicit config validation and nullable api_server. Startup now calls the new validation hook. Controller skips API setup when api_server is null. Packaging uses config templates instead of many replace-text steps; templates added/reordered.

Changes

Cohort / File(s) Summary
Config: clp_config
components/clp-py-utils/clp_py_utils/clp_config.py
ClpConfig.api_server is now nullable (`ApiServer
Startup script
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
Removed StorageEngine import; added early call to clp_config.validate_api_server(); removed prior conditional warning about API server + StorageEngine.CLP.
Controller env setup
components/clp-package-utils/clp_package_utils/controller.py
_set_up_env_for_api_server() now checks api_server is None and skips API-server setup returning CLP_API_SERVER_ENABLED="0". Removed storage-engine-based disabling logic.
Config template (YAML)
components/package-template/src/etc/clp-config.template.text.yaml
Added a comprehensive CLP YAML template scaffold (includes api_server: null, commented examples and subsystem defaults).
Config template (JSON/YAML ordering)
components/package-template/src/etc/clp-config.template.json.yaml
Moved API server config block earlier and removed a duplicate later in the file; values unchanged.
Build/taskfile
taskfile.yaml
Removed STORAGE_ENGINE from task vars/signatures; replaced multiple replace-text steps with template move/application and an rsync of etc/clp-config.template.json.yamletc/clp-config.yaml; added template cleanup steps.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Inspect clp_config.validate_api_server() logic and error message.
  • Verify start_clp.py calls validation at the correct point and handles exceptions.
  • Confirm _set_up_env_for_api_server() returns CLP_API_SERVER_ENABLED="0" when api_server is None and sets env vars correctly when present.
  • Review taskfile.yaml template move/rsync for each FLAVOUR to ensure produced etc/clp-config.yaml is correct.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main objectives of the PR: introducing separate config templates for text and json flavours, making the API server optional, and enforcing storage engine compatibility.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d7204e and 3821a88.

📒 Files selected for processing (3)
  • components/package-template/src/etc/clp-config.template.json.yaml (1 hunks)
  • components/package-template/src/etc/clp-config.template.text.yaml (1 hunks)
  • taskfile.yaml (3 hunks)
🧰 Additional context used
🧠 Learnings (22)
📓 Common learnings
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/package-template/src/etc/clp-config.template.json.yaml
  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-11-10T05:19:56.600Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1575
File: components/clp-py-utils/clp_py_utils/clp_config.py:602-607
Timestamp: 2025-11-10T05:19:56.600Z
Learning: In the y-scope/clp repository, the `ApiServer` class in `components/clp-py-utils/clp_py_utils/clp_config.py` does not need a `transform_for_container()` method because no other containerized service depends on the API server - it's only accessed from the host, so no docker-network communication is expected.

Applied to files:

  • components/package-template/src/etc/clp-config.template.json.yaml
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
Repo: y-scope/clp PR: 1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/package-template/src/etc/clp-config.template.json.yaml
  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-09-04T12:26:54.788Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1170
File: tools/deployment/presto-clp/scripts/generate-metadata-filter-file.py:0-0
Timestamp: 2025-09-04T12:26:54.788Z
Learning: The tools/deployment/presto-clp/coordinator/config-template/ directory already exists in the CLP deployment structure and contains configuration files, so directory creation is not needed when writing metadata-filter.json to this location.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-10-22T21:14:12.225Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1466
File: .github/workflows/clp-rust-checks.yaml:14-15
Timestamp: 2025-10-22T21:14:12.225Z
Learning: Repository y-scope/clp: In GitHub Actions workflows (e.g., .github/workflows/clp-rust-checks.yaml), YAML anchors/aliases are acceptable and preferred to avoid duplication; if actionlint flags an alias node (e.g., on push.paths) as an error, treat it as a tool limitation and do not require inlining unless the team asks to silence the warning.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
  • taskfile.yaml
📚 Learning: 2025-10-07T07:54:32.427Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-py-utils/clp_py_utils/clp_config.py:47-47
Timestamp: 2025-10-07T07:54:32.427Z
Learning: In components/clp-py-utils/clp_py_utils/clp_config.py, the CONTAINER_AWS_CONFIG_DIRECTORY constant is intentionally set to pathlib.Path("/") / ".aws" (i.e., `/.aws`) rather than a user-specific home directory. This hardcoded path is part of the container orchestration design.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-09-25T05:13:13.298Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-07-07T17:41:15.655Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-25T00:45:05.464Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1242
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:39-41
Timestamp: 2025-08-25T00:45:05.464Z
Learning: Task v3.44.1 has a regression that breaks shell command processing, particularly rsync commands with brace expansion (e.g., `file.{d.ts,js,wasm}`). This causes CI failures in clp-ffi-js project (issue #110), so CLP should avoid v3.44.1 and use v3.44.0 instead, which fixes the dynamic variable bug without the shell processing regression.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-01T14:52:02.418Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-01T14:51:19.172Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like `readonly` declarations.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-13T03:32:19.293Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1414
File: tools/docker-images/clp-package/Dockerfile:20-24
Timestamp: 2025-10-13T03:32:19.293Z
Learning: In the clp repository's Dockerfiles (e.g., tools/docker-images/clp-package/Dockerfile), ENV directives should be split into separate lines for readability rather than consolidated to reduce layer count. This is especially true for PATH modifications, as agreed upon in PR #1166. Later ENV settings may depend on earlier ones (e.g., referencing CLP_HOME).

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-19T14:41:28.901Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1152
File: components/clp-package-utils/clp_package_utils/general.py:0-0
Timestamp: 2025-08-19T14:41:28.901Z
Learning: In the CLP codebase, prefer explicit failures over automatic directory creation in utility functions like dump_config. The user junhaoliao prefers to let file operations fail when parent directories don't exist, as this helps catch implementation errors during development rather than masking setup issues with automatic directory creation.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-01T14:52:15.217Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like `apt-get update && apt-get install` or `apk update && apk add`), preserve this pattern for uniform script structure.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-27T04:05:01.560Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1507
File: taskfiles/deps/lock.yaml:66-76
Timestamp: 2025-10-27T04:05:01.560Z
Learning: In the CLP project, the `uv` CLI tool is documented as a prerequisite in docs/src/dev-docs/building-package.md and is not installed via a task dependency like the rust toolchain. UV is expected to be available in the environment rather than being provisioned by the taskfile system.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-09-25T19:26:32.436Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:01:31.391Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:01:31.391Z
Learning: In y-scope/clp, for the clp-rust-checks workflow (GitHub Actions) and the Taskfile target deps:lock:check-rust, we should verify Rust Cargo.lock is in sync with Cargo.toml using a non-mutating method (e.g., cargo metadata --locked / cargo check --locked) to keep CI deterministic and avoid updating dependencies during validation.

Applied to files:

  • taskfile.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: package-image
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: rust-checks
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: build (ubuntu-24.04)
🔇 Additional comments (4)
components/package-template/src/etc/clp-config.template.json.yaml (1)

1-14: API server configuration well-positioned and properly formatted.

The JSON template establishes a sensible default with storage_engine: "clp-s" and a fully configured api_server block with host, port, and query_job_polling parameters. The blank line after the schema hint and the consolidated API server definition align with the configuration template refactoring objectives. The uncommented package: and api_server: blocks enable easy customization for JSON deployments.

components/package-template/src/etc/clp-config.template.text.yaml (1)

1-9: Correct defaults prevent API server validation error.

The text template sets storage_engine: "clp" and api_server: null, which safely bypasses the validation error from validate_api_server() (the error is raised only when both storage_engine is "clp" AND api_server is not null). The blank line after the schema directive and uncommented configuration blocks follow the formatting guidelines established in prior commits.

taskfile.yaml (2)

470-470: Correct variable requirement after removing STORAGE_ENGINE.

Line 470 now requires only FLAVOUR, removing the STORAGE_ENGINE dependency per the PR's objective to eliminate sed-based substitutions and adopt a template-based approach. This aligns with the new packaging flow.


137-140: JSON template rsync paths properly quoted.

The rsync command to copy the JSON template to the package output directory uses fully quoted path arguments, maintaining consistency with robustness guidelines.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6b373c and 525b5bf.

📒 Files selected for processing (5)
  • components/clp-package-utils/clp_package_utils/controller.py (1 hunks)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2 hunks)
  • components/clp-py-utils/clp_py_utils/clp_config.py (2 hunks)
  • components/package-template/src/etc/clp-config.template.text.yaml (1 hunks)
  • taskfile.yaml (3 hunks)
🧰 Additional context used
🧠 Learnings (18)
📓 Common learnings
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1405
File: components/clp-package-utils/pyproject.toml:5-15
Timestamp: 2025-10-13T03:24:35.074Z
Learning: In the y-scope/clp repository, the Python 3.9 to 3.10 version requirement change was intentionally deferred to a separate PR (after PR #1405) to reduce review effort, as decided in an offline discussion between junhaoliao and kirkrodrigues.
📚 Learning: 2025-11-10T05:19:56.600Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1575
File: components/clp-py-utils/clp_py_utils/clp_config.py:602-607
Timestamp: 2025-11-10T05:19:56.600Z
Learning: In the y-scope/clp repository, the `ApiServer` class in `components/clp-py-utils/clp_py_utils/clp_config.py` does not need a `transform_for_container()` method because no other containerized service depends on the API server - it's only accessed from the host, so no docker-network communication is expected.

Applied to files:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
  • components/clp-package-utils/clp_package_utils/controller.py
  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2024-11-18T16:49:20.248Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.

Applied to files:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
📚 Learning: 2025-08-08T06:59:42.436Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1152
File: components/clp-package-utils/clp_package_utils/scripts/start_clp.py:613-613
Timestamp: 2025-08-08T06:59:42.436Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/start_clp.py, generic_start_scheduler sets CLP_LOGGING_LEVEL using clp_config.query_scheduler.logging_level for both schedulers; compression scheduler should use its own logging level. Tracking via an issue created from PR #1152 discussion.

Applied to files:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
📚 Learning: 2025-10-17T19:59:25.596Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:315-315
Timestamp: 2025-10-17T19:59:25.596Z
Learning: In components/clp-package-utils/clp_package_utils/controller.py, worker log directories (compression_worker, query_worker, reducer) created via `mkdir()` do not need `_chown_paths_if_root()` calls because directories are created with the same owner as the script caller. This differs from infrastructure service directories (database, queue, Redis, results cache) which do require explicit ownership changes.

Applied to files:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
  • components/clp-package-utils/clp_package_utils/controller.py
📚 Learning: 2025-10-07T07:54:32.427Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-py-utils/clp_py_utils/clp_config.py:47-47
Timestamp: 2025-10-07T07:54:32.427Z
Learning: In components/clp-py-utils/clp_py_utils/clp_config.py, the CONTAINER_AWS_CONFIG_DIRECTORY constant is intentionally set to pathlib.Path("/") / ".aws" (i.e., `/.aws`) rather than a user-specific home directory. This hardcoded path is part of the container orchestration design.

Applied to files:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-07-07T17:41:15.655Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-27T04:05:01.560Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1507
File: taskfiles/deps/lock.yaml:66-76
Timestamp: 2025-10-27T04:05:01.560Z
Learning: In the CLP project, the `uv` CLI tool is documented as a prerequisite in docs/src/dev-docs/building-package.md and is not installed via a task dependency like the rust toolchain. UV is expected to be available in the environment rather than being provisioned by the taskfile system.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-09-25T19:26:32.436Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:01:31.391Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:01:31.391Z
Learning: In y-scope/clp, for the clp-rust-checks workflow (GitHub Actions) and the Taskfile target deps:lock:check-rust, we should verify Rust Cargo.lock is in sync with Cargo.toml using a non-mutating method (e.g., cargo metadata --locked / cargo check --locked) to keep CI deterministic and avoid updating dependencies during validation.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/clp-package-utils/clp_package_utils/controller.py
  • components/package-template/src/etc/clp-config.template.text.yaml
  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-09-04T12:26:54.788Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1170
File: tools/deployment/presto-clp/scripts/generate-metadata-filter-file.py:0-0
Timestamp: 2025-09-04T12:26:54.788Z
Learning: The tools/deployment/presto-clp/coordinator/config-template/ directory already exists in the CLP deployment structure and contains configuration files, so directory creation is not needed when writing metadata-filter.json to this location.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
Repo: y-scope/clp PR: 1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-09-25T05:13:13.298Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
🧬 Code graph analysis (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
  • validate_api_server (893-897)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: rust-checks
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: antlr-code-committed (macos-15)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: build (ubuntu-24.04)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: build (macos-15)
🔇 Additional comments (5)
components/package-template/src/etc/clp-config.template.text.yaml (1)

1-148: Text template defaults look consistent with storage_engine clp and nullable API server

The schema reference, default package values, and api_server: null align with the new validation logic and existing directory conventions; I don’t see issues here.

components/clp-py-utils/clp_py_utils/clp_config.py (1)

736-748: Nullable api_server field aligns with template and controller changes

Updating the field to ApiServer | None = ApiServer() is compatible with the new templates (api_server: null) and the controller’s api_server is None guard. Configs that omit or set api_server: null remain valid, while still allowing a fully specified ApiServer when appropriate.

components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

6-74: Startup wiring of validate_api_server() looks good

Calling clp_config.validate_api_server() in the main validation block ensures misconfigured API server / storage engine combinations are caught before any directories are created or services started, and the simplified import reflects that StorageEngine is no longer needed here. This aligns cleanly with the new config/model logic.

components/clp-package-utils/clp_package_utils/controller.py (1)

533-551: Confirm CLP_API_SERVER_ENABLED semantics when API server is configured

The code sets CLP_API_SERVER_ENABLED="0" when api_server is None, but when configured, it only sets host and port without explicitly setting CLP_API_SERVER_ENABLED="1". Verify that either:

  • The unset default enables the API server, or
  • You add "CLP_API_SERVER_ENABLED": "1" in the configured path
taskfile.yaml (1)

470-487: Clarify intended default config flavour and tighten template handling

Two concerns about the template approach in package-tar:

  1. The task implicitly uses JSON as the default config in the build tree by creating it in package, then overwriting it per-flavour in package-tar. If TEXT is intended as first-class, clarify whether package should skip creating clp-config.yaml entirely and let package-tar materialize it.

  2. The rm {{.OUTPUT_DIR}}/etc/clp-config.template.*.yaml command assumes no other template files exist in that directory. If this assumption holds, document it; if not, consider a more targeted cleanup approach.

For consistency with surrounding commands, consider adding quotes to the mv and rm path arguments.

Comment thread components/clp-py-utils/clp_py_utils/clp_config.py
Comment thread taskfile.yaml Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
components/clp-py-utils/clp_py_utils/clp_config.py (1)

893-898: API server/storage engine compatibility logic is correct; consider tightening integration.

The validation method correctly enforces the constraint. However, as noted in the previous review, consider:

  1. Using is not None instead of != None for Python style compliance
  2. Converting to a Pydantic @model_validator(mode="after") to ensure the rule runs automatically after model construction, rather than relying on ad-hoc calls
taskfile.yaml (1)

137-140: Consider quoting rsync paths for robustness.

The rsync command uses unquoted template variables. For consistency with surrounding commands and to handle edge cases with spaces in paths, consider quoting the arguments.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 525b5bf and 9dccebc.

📒 Files selected for processing (2)
  • components/clp-py-utils/clp_py_utils/clp_config.py (2 hunks)
  • taskfile.yaml (3 hunks)
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
📚 Learning: 2025-07-07T17:41:15.655Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-25T00:45:05.464Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1242
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:39-41
Timestamp: 2025-08-25T00:45:05.464Z
Learning: Task v3.44.1 has a regression that breaks shell command processing, particularly rsync commands with brace expansion (e.g., `file.{d.ts,js,wasm}`). This causes CI failures in clp-ffi-js project (issue #110), so CLP should avoid v3.44.1 and use v3.44.0 instead, which fixes the dynamic variable bug without the shell processing regression.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-01T14:52:02.418Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:14:12.225Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1466
File: .github/workflows/clp-rust-checks.yaml:14-15
Timestamp: 2025-10-22T21:14:12.225Z
Learning: Repository y-scope/clp: In GitHub Actions workflows (e.g., .github/workflows/clp-rust-checks.yaml), YAML anchors/aliases are acceptable and preferred to avoid duplication; if actionlint flags an alias node (e.g., on push.paths) as an error, treat it as a tool limitation and do not require inlining unless the team asks to silence the warning.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-27T04:05:01.560Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1507
File: taskfiles/deps/lock.yaml:66-76
Timestamp: 2025-10-27T04:05:01.560Z
Learning: In the CLP project, the `uv` CLI tool is documented as a prerequisite in docs/src/dev-docs/building-package.md and is not installed via a task dependency like the rust toolchain. UV is expected to be available in the environment rather than being provisioned by the taskfile system.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-09-25T19:26:32.436Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:01:31.391Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:01:31.391Z
Learning: In y-scope/clp, for the clp-rust-checks workflow (GitHub Actions) and the Taskfile target deps:lock:check-rust, we should verify Rust Cargo.lock is in sync with Cargo.toml using a non-mutating method (e.g., cargo metadata --locked / cargo check --locked) to keep CI deterministic and avoid updating dependencies during validation.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-11-10T05:19:56.600Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1575
File: components/clp-py-utils/clp_py_utils/clp_config.py:602-607
Timestamp: 2025-11-10T05:19:56.600Z
Learning: In the y-scope/clp repository, the `ApiServer` class in `components/clp-py-utils/clp_py_utils/clp_config.py` does not need a `transform_for_container()` method because no other containerized service depends on the API server - it's only accessed from the host, so no docker-network communication is expected.

Applied to files:

  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-09-26T21:12:21.655Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1338
File: components/clp-py-utils/clp_py_utils/clp_config.py:753-753
Timestamp: 2025-09-26T21:12:21.655Z
Learning: Python linting rules that catch YODA conditions exist but are not enabled for Python modules in the CLP package due to time constraints. The team prefers to avoid YODA conditions in Python for consistency.

Applied to files:

  • components/clp-py-utils/clp_py_utils/clp_config.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: package-image
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: build (macos-15)
  • GitHub Check: build (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: rust-checks
🔇 Additional comments (2)
taskfile.yaml (1)

470-470: LGTM! Simplified variable requirements.

Removing STORAGE_ENGINE from the required variables simplifies the task interface, as FLAVOUR alone now determines which config template to apply. This aligns well with the PR's goal of template-based configuration.

components/clp-py-utils/clp_py_utils/clp_config.py (1)

736-736: Reconsider the default value for nullable api_server.

The field signature ApiServer | None = ApiServer() allows None but defaults to a non-None instance. This seems contradictory to the PR's intent of making the API server optional. If the API server should be absent by default (as indicated by the template setting api_server: null), consider changing the default to None:

-    api_server: ApiServer | None = ApiServer()
+    api_server: ApiServer | None = None

This would align with the template-based configuration approach where api_server: null disables the API server.

Comment thread taskfile.yaml Outdated
Comment on lines +484 to +489
# Apply the config template
- >-
mv
{{.OUTPUT_DIR}}/etc/clp-config.template.{{.FLAVOUR}}.yaml
{{.OUTPUT_DIR}}/etc/clp-config.yaml
- "rm {{.OUTPUT_DIR}}/etc/clp-config.template.*.yaml"

@coderabbitai coderabbitai Bot Dec 1, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Quote template variables in shell commands.

The mv and rm commands use unquoted template variables. If OUTPUT_DIR or FLAVOUR contain spaces or special characters, these commands will fail. For robustness and consistency with surrounding commands, quote all path arguments:

Apply this diff:

       # Apply the config template
       - >-
         mv
-        {{.OUTPUT_DIR}}/etc/clp-config.template.{{.FLAVOUR}}.yaml
-        {{.OUTPUT_DIR}}/etc/clp-config.yaml
+        "{{.OUTPUT_DIR}}/etc/clp-config.template.{{.FLAVOUR}}.yaml"
+        "{{.OUTPUT_DIR}}/etc/clp-config.yaml"
-      - "rm {{.OUTPUT_DIR}}/etc/clp-config.template.*.yaml"
+      - "rm '{{.OUTPUT_DIR}}/etc/clp-config.template.'*.yaml"

Note: The rm command uses single quotes around the base path and leaves the glob pattern unquoted so shell expansion works correctly.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In taskfile.yaml around lines 484 to 489, the mv and rm lines use unquoted
template variables; update them to quote path arguments so spaces/special chars
are safe: change the mv invocation to quote the source and destination (e.g.
quote the {{.OUTPUT_DIR}} and include {{.FLAVOUR}} inside the quoted source
path) and change the rm invocation to use single quotes around the base path
while leaving the glob wildcard unquoted so shell expansion still works (i.e.
quote the directory portion but keep the trailing *.yaml unquoted).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree with the rabbit - paths should be quoted

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junhaoliao Thanks for confirming! Here's the corrected version with properly quoted paths:

      # Apply the config template
      - >-
        mv
        "{{.OUTPUT_DIR}}/etc/clp-config.template.{{.FLAVOUR}}.yaml"
        "{{.OUTPUT_DIR}}/etc/clp-config.yaml"
      - "rm '{{.OUTPUT_DIR}}/etc/clp-config.template.'*.yaml"

The changes ensure:

  • Both mv source and destination paths are fully quoted to handle spaces or special characters
  • The rm command quotes the base path ('{{.OUTPUT_DIR}}/etc/clp-config.template.') while leaving the glob pattern (*.yaml) unquoted so shell expansion works correctly

🧠 Learnings used
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.

Learnt from: junhaoliao
Repo: y-scope/clp PR: 1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Learnt from: junhaoliao
Repo: y-scope/clp PR: 1414
File: tools/docker-images/clp-package/Dockerfile:20-24
Timestamp: 2025-10-13T03:32:19.293Z
Learning: In the clp repository's Dockerfiles (e.g., tools/docker-images/clp-package/Dockerfile), ENV directives should be split into separate lines for readability rather than consolidated to reduce layer count. This is especially true for PATH modifications, as agreed upon in PR #1166. Later ENV settings may depend on earlier ones (e.g., referencing CLP_HOME).

Learnt from: junhaoliao
Repo: y-scope/clp PR: 1242
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:39-41
Timestamp: 2025-08-25T00:45:05.464Z
Learning: Task v3.44.1 has a regression that breaks shell command processing, particularly rsync commands with brace expansion (e.g., `file.{d.ts,js,wasm}`). This causes CI failures in clp-ffi-js project (issue #110), so CLP should avoid v3.44.1 and use v3.44.0 instead, which fixes the dynamic variable bug without the shell processing regression.

Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Comment thread components/package-template/src/etc/clp-config.template.text.yaml Outdated
Comment thread components/package-template/src/etc/clp-config.template.text.yaml Outdated
Comment thread components/package-template/src/etc/clp-config.template.text.yaml
Comment thread taskfile.yaml Outdated
Comment thread taskfile.yaml Outdated
Comment on lines +484 to +489
# Apply the config template
- >-
mv
{{.OUTPUT_DIR}}/etc/clp-config.template.{{.FLAVOUR}}.yaml
{{.OUTPUT_DIR}}/etc/clp-config.yaml
- "rm {{.OUTPUT_DIR}}/etc/clp-config.template.*.yaml"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree with the rabbit - paths should be quoted

Comment thread components/package-template/src/etc/clp-config.template.json.yaml Outdated
Comment thread components/clp-package-utils/clp_package_utils/scripts/start_clp.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
components/package-template/src/etc/clp-config.template.json.yaml (1)

2-4: Uncomment the package: and api_server: blocks to match the TEXT template and past review feedback.

The JSON template currently has these sections commented, but the TEXT template has them uncommented. A previous review (junhaoliao) explicitly requested: "Let's leave the package: and api_server: blocks uncommented." Both templates should be consistent. Additionally, the text template shows api_server: null (active), which aligns with the PR's nullable API server design and validation logic.

Apply this diff to uncomment these blocks:

 # yaml-language-server: $schema=../usr/share/config-schemas/clp-config.schema.json
-#package:
-#  storage_engine: "clp-s"
-#  query_engine: "clp-s"
-#
-## API server config
-#api_server:
-#  host: "localhost"
-#  port: 3001
-#  default_max_num_query_results: 1000
-#  query_job_polling:
-#    initial_backoff_ms: 100
-#    max_backoff_ms: 5000
-#
+
+package:
+  storage_engine: "clp-s"
+  query_engine: "clp-s"
+
+# API server config
+api_server:
+  host: "localhost"
+  port: 3001
+  default_max_num_query_results: 1000
+  query_job_polling:
+    initial_backoff_ms: 100
+    max_backoff_ms: 5000
+

Also applies to: 6-14

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dccebc and 0d7204e.

📒 Files selected for processing (2)
  • components/package-template/src/etc/clp-config.template.json.yaml (1 hunks)
  • components/package-template/src/etc/clp-config.template.text.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
📚 Learning: 2025-09-04T12:26:54.788Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1170
File: tools/deployment/presto-clp/scripts/generate-metadata-filter-file.py:0-0
Timestamp: 2025-09-04T12:26:54.788Z
Learning: The tools/deployment/presto-clp/coordinator/config-template/ directory already exists in the CLP deployment structure and contains configuration files, so directory creation is not needed when writing metadata-filter.json to this location.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
  • components/package-template/src/etc/clp-config.template.json.yaml
📚 Learning: 2025-10-22T21:14:12.225Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1466
File: .github/workflows/clp-rust-checks.yaml:14-15
Timestamp: 2025-10-22T21:14:12.225Z
Learning: Repository y-scope/clp: In GitHub Actions workflows (e.g., .github/workflows/clp-rust-checks.yaml), YAML anchors/aliases are acceptable and preferred to avoid duplication; if actionlint flags an alias node (e.g., on push.paths) as an error, treat it as a tool limitation and do not require inlining unless the team asks to silence the warning.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
Repo: y-scope/clp PR: 1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
  • components/package-template/src/etc/clp-config.template.json.yaml
📚 Learning: 2025-10-07T07:54:32.427Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-py-utils/clp_py_utils/clp_config.py:47-47
Timestamp: 2025-10-07T07:54:32.427Z
Learning: In components/clp-py-utils/clp_py_utils/clp_config.py, the CONTAINER_AWS_CONFIG_DIRECTORY constant is intentionally set to pathlib.Path("/") / ".aws" (i.e., `/.aws`) rather than a user-specific home directory. This hardcoded path is part of the container orchestration design.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • components/package-template/src/etc/clp-config.template.text.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: package-image
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build (macos-15)
  • GitHub Check: build (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: rust-checks

Comment thread components/package-template/src/etc/clp-config.template.text.yaml

@junhaoliao junhaoliao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the pr title, how about:

feat(clp-package): Make API server optional and enforce storage engine compatibility; split clp-config templates by flavours.

@hoophalab hoophalab changed the title feat(clp-package): Add separate config templates for text and json flavour. feat(clp-package): Make API server optional and enforce storage engine compatibility; split clp-config templates by flavours. Dec 2, 2025
@hoophalab hoophalab merged commit 63b0ef1 into y-scope:main Dec 2, 2025
28 checks passed
@hoophalab hoophalab deleted the twoconfig branch December 2, 2025 18:20
junhaoliao pushed a commit to junhaoliao/clp that referenced this pull request May 17, 2026
…e compatibility; split clp-config templates by flavours. (y-scope#1709)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants