feat(clp-package)!: Change default flavor to clp-json; Fix pkg-tar tasks to uncomment and modify package block in etc/clp-config.yml (fixes #1568).#1572
Conversation
…r task regex patterns (fixes y-scope#1568).
WalkthroughDefault storage and query engine defaults changed from CLP to CLP_S in the Python config and package template; documentation and the taskfile were updated to apply edits via the YAML package block only. (22 words) Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-docs/quick-start/clp-text.md(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
Learnt from: gibber9809
Repo: y-scope/clp PR: 955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: anlowee
Repo: y-scope/clp PR: 1228
File: tools/deployment/presto-clp/scripts/init.py:298-306
Timestamp: 2025-09-02T13:35:33.396Z
Learning: In the CLP codebase, all configuration values loaded from YAML files are string values, so functions that retrieve config values should use Optional[str] rather than Optional[Any] for type hints.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1405
File: components/clp-package-utils/pyproject.toml:5-15
Timestamp: 2025-10-13T03:24:35.074Z
Learning: In the y-scope/clp repository, the Python 3.9 to 3.10 version requirement change was intentionally deferred to a separate PR (after PR #1405) to reduce review effort, as decided in an offline discussion between junhaoliao and kirkrodrigues.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1152
File: components/clp-package-utils/clp_package_utils/scripts/start_clp.py:613-613
Timestamp: 2025-08-08T06:59:42.436Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/start_clp.py, generic_start_scheduler sets CLP_LOGGING_LEVEL using clp_config.query_scheduler.logging_level for both schedulers; compression scheduler should use its own logging level. Tracking via an issue created from PR #1152 discussion.
🪛 markdownlint-cli2 (0.18.1)
docs/src/user-docs/quick-start/clp-text.md
42-42: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: package-image
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
- GitHub Check: build (macos-15)
- GitHub Check: rust-checks (ubuntu-24.04)
- GitHub Check: rust-checks (ubuntu-22.04)
- GitHub Check: rust-checks (macos-15)
🔇 Additional comments (1)
docs/src/user-docs/quick-start/clp-text.md (1)
32-43: Warning block content is accurate and contextually appropriate.The warning clearly communicates the breaking change (default engine is now
clp-s, optimized for JSON) and provides the correct configuration for users who need clp-text functionality. This directly addresses the PR's documentation requirements.
| :::{warning} | ||
| **Do not comment out or remove the `package` block in `etc/clp-config.yml`**, as the default storage | ||
| engine is `clp-s`, which is optimized for JSON logs rather than unstructured text logs. | ||
|
|
||
| To use `clp-text`, the `package` block should be configured as follows: | ||
|
|
||
| ```yaml | ||
| package: | ||
| storage_engine: "clp" | ||
| query_engine: "clp" | ||
| ``` | ||
| ::: |
There was a problem hiding this comment.
Add blank line after fenced code block (MD031).
The closing fence on line 42 requires a blank line before the closing ::: directive on line 43 to comply with Markdown linting standards.
package:
storage_engine: "clp"
query_engine: "clp":::
<!-- suggestion_start -->
<details>
<summary>📝 Committable suggestion</summary>
> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
:::{warning}
**Do not comment out or remove the `package` block in `etc/clp-config.yml`**, as the default storage
engine is `clp-s`, which is optimized for JSON logs rather than unstructured text logs.
To use `clp-text`, the `package` block should be configured as follows:
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
42-42: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
🤖 Prompt for AI Agents
In docs/src/user-docs/quick-start/clp-text.md around lines 32 to 43, the fenced
code block ending at line 42 needs a blank line before the closing ::: directive
on line 43 to satisfy MD031; insert a single empty line between the closing ```
and the subsequent closing `:::` so the document has a blank line after the
fenced code block and before the directive.
| package: | ||
| storage_engine: "clp-s" | ||
| query_engine: "clp-s" | ||
|
|
There was a problem hiding this comment.
Rather than doing this, couldn't we make the clp-text-pkg-tar task uncomment this block and change it?
| ``` | ||
|
|
||
| The build will be in `build/clp-package` and defaults to using the storage engine for `clp-text`. | ||
| The build will be in `build/clp-package` and defaults to using the storage engine for `clp-json`. |
There was a problem hiding this comment.
Can we also add package.query_engine to line 48?
…eview Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
…ons to uncomment package config
kirkrodrigues
left a comment
There was a problem hiding this comment.
For the PR title, how about:
feat(clp-package)!: Change default flavor to clp-json; Fix `pkg-tar` tasks by uncommenting and modifying `package` block in `etc/clp-config.yml` (fixes #1568).
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
pkg-tar tasks by uncommenting and modifying package block in etc/clp-config.yml (fixes #1568).
pkg-tar tasks by uncommenting and modifying package block in etc/clp-config.yml (fixes #1568).pkg-tar tasks to uncomment and modify package block in etc/clp-config.yml (fixes #1568).
kirkrodrigues
left a comment
There was a problem hiding this comment.
Edited the PR title slightly.
| storage_engine: StorageEngineStr = StorageEngine.CLP_S | ||
| query_engine: QueryEngineStr = QueryEngine.CLP_S |
There was a problem hiding this comment.
the webui settings json files, whose default values are used at development time, should also be updated to reflect this new default
@coderabbitai create an issue to track
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Create GitHub issue to track updating webui settings JSON files
gh issue create \
--repo y-scope/clp \
--title "Update webui settings JSON files to reflect CLP-JSON as default flavor" \
--body "The webui settings JSON files, whose default values are used at development time, should be updated to reflect the new default flavor (CLP-JSON instead of CLP Text).
This follows the change in PR #1572 where the default \`storage_engine\` and \`query_engine\` were changed from \`clp\` to \`clp-s\`.
**Related:**
- PR: https://github.com/y-scope/clp/pull/1572
- Comment: https://github.com/y-scope/clp/pull/1572#discussion_r2517681254
- Requested by: @junhaoliao" \
--assignee junhaoliao
Length of output: 42
@junhaoliao, I've created issue #1588 to track updating the webui settings JSON files to reflect the new CLP-JSON default flavor. The issue has been assigned to you.
🧬 Code Graph Analysis Results
File: components/clp-py-utils/clp_py_utils/core.py
- Snippet: lines 28-42
def get_config_value(config, key):
"""
Gets a value from the given dictionary using a dot-separated configuration
key, where each dot represents a deeper dictionary. NOTE: This method does
not support keys that contain dots since that is indistinguishable from a
deeper dictionary.
:param config:
:param key:
"""
singular_keys = key.split(".")
current_config = config
for current_key in singular_keys:
current_config = current_config[current_key]
return current_config
- Snippet: lines 45-56
def make_config_path_absolute(default_root: pathlib.Path, config_path: pathlib.Path):
"""
Turns relative paths into absolute paths by prefixing them with the
default_root
:param default_root:
:param config_path:
"""
if config_path.is_absolute():
return config_path
else:
return default_root / config_path
- Snippet: lines 59-65
def read_yaml_config_file(yaml_config_file_path: pathlib.Path):
with open(yaml_config_file_path, "r") as yaml_config_file:
try:
config = yaml.load(yaml_config_file, Loader=yaml.CSafeLoader)
except ParserError as ex:
raise ValueError(f"Unable to parse configuration from {yaml_config_file_path}: {ex}")
return config
- Snippet: lines 83-104
def resolve_host_path_in_container(host_path: pathlib.Path) -> pathlib.Path:
"""
Resolves a host path and translates it into its container-mount equivalent absolute path. See
`resolve_host_path` for details about the resolution.
:param host_path: The host path.
:return: The translated path (with /mnt/host prefix).
"""
host_path = host_path.expanduser()
if not host_path.is_absolute():
pwd_host = os.getenv("CLP_PWD_HOST")
if pwd_host is None:
raise ValueError("Relative host path provided but CLP_PWD_HOST is not set.")
host_path = pathlib.Path(pwd_host) / host_path
host_path = host_path.absolute()
resolved_path = _resolve_symlinks_in_path(host_path)
if resolved_path is not None:
return resolved_path
return CONTAINER_DIR_FOR_HOST_ROOT / host_path.relative_to("/")
- Snippet: lines 107-114
def validate_path_could_be_dir(path: pathlib.Path):
part = path
while True:
if part.exists():
if not part.is_dir():
raise ValueError(f"{part} is not a directory.")
return
part = part.parent
File: components/clp-py-utils/clp_py_utils/serialization_utils.py
- Snippet: lines 16-23
def serialize_path(path: pathlib.Path) -> str:
"""
Serializes a `pathlib.Path` to its string representation.
:param path:
:return: The string representation of the path.
"""
return str(path)
- Snippet: lines 6-13
def serialize_str_enum(member: StrEnum) -> str:
"""
Serializes a `strenum.StrEnum` member to its underlying value.
:param member:
:return: The underlying string value of the enum member.
"""
return member.value
🧠 Learnings used
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-py-utils/clp_py_utils/clp_config.py:47-47
Timestamp: 2025-10-07T07:54:32.427Z
Learning: In components/clp-py-utils/clp_py_utils/clp_config.py, the CONTAINER_AWS_CONFIG_DIRECTORY constant is intentionally set to pathlib.Path("/") / ".aws" (i.e., `/.aws`) rather than a user-specific home directory. This hardcoded path is part of the container orchestration design.
Learnt from: haiqi96
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-07-29T14:04:13.769Z
Learning: User haiqi96 requested creating a GitHub issue to document a bug fix from PR #1136, which addressed MySQL compatibility issues with invalid SQL CAST operations in the WebUI component.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1405
File: components/clp-package-utils/pyproject.toml:5-15
Timestamp: 2025-10-13T03:24:35.074Z
Learning: In the y-scope/clp repository, the Python 3.9 to 3.10 version requirement change was intentionally deferred to a separate PR (after PR #1405) to reduce review effort, as decided in an offline discussion between junhaoliao and kirkrodrigues.
…tasks to uncomment and modify `package` block in `etc/clp-config.yml` (fixes y-scope#1568). (y-scope#1572) Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Description
This PR changes the default package flavor to CLP-JSON, and it addresses #1568 by fixing the broken
clp-json-pkg-tarandclp-text-pkg-tarbuild taskspackageblock inclp-config.ymltemplate to explicitly set defaultsstorage_engineandquery_enginefromclptoclp-sinclp-config.ymlstorage_engineandquery_enginefromclptoclp-sinclp_config.pypackage-tartask. Remove replacement operations on remove clp_config.py file which is now backed into the CLP Package Docker image.packageconfiguration blockChecklist
breaking change.
Validation performed
Applied patch:
Summary by CodeRabbit
Configuration Changes
Documentation
Chores