feat(package): Add settings to configure webui Presto client (resolves #1259, resolves #1260).#1198
Conversation
WalkthroughAdds Presto configuration and wiring: new Presto model and CLPConfig validation, start_clp exposes ClpQueryEngine and conditionally writes PrestoHost/PrestoPort, package template and .env gains Presto placeholders/defaults, WebUI env/schema extended, and Presto plugin receives catalog/schema/user options. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Admin
participant Starter as start_clp.py
participant Config as CLPConfig
participant Settings as server_settings.json
participant WebUI
participant PrestoPlugin
Admin->>Starter: start_webui()
Starter->>Config: load CLPConfig (package.query_engine, presto?)
Note over Config: if package.query_engine == PRESTO\npresto{host,port} must be present
Config-->>Starter: return query_engine, presto{host,port}
Starter->>Settings: write {ClpQueryEngine, PrestoHost/PrestoPort|null}
Admin->>WebUI: launch
WebUI->>WebUI: load env (USER, PRESTO_CATALOG, PRESTO_SCHEMA)
WebUI->>PrestoPlugin: init({host: settings.PrestoHost, port: settings.PrestoPort, catalog, schema, user})
PrestoPlugin-->>WebUI: decorated Presto client
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (2)
🧰 Additional context used📓 Path-based instructions (1)**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (2)📓 Common learnings📚 Learning: 2025-08-25T16:27:50.549ZApplied to files:
🧬 Code graph analysis (1)components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
🔇 Additional comments (5)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
|
@kirkrodrigues do u want to review this or someone else? |
There was a problem hiding this comment.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (4)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py(1 hunks)components/clp-py-utils/clp_py_utils/clp_config.py(2 hunks)components/package-template/src/etc/clp-config.yml(1 hunks)components/webui/server/src/plugins/app/Presto.ts(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>rather than!<expression>.
Files:
components/webui/server/src/plugins/app/Presto.ts
🧬 Code Graph Analysis (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
validate_port(192-204)
🔇 Additional comments (4)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
699-712: LGTM: New Presto config model is consistent and validated.Good reuse of existing host/port validators and sensible defaults. This integrates cleanly with the rest of the config models.
731-731: LGTM: Exposingprestoon CLPConfig.This makes the values available to the launcher without affecting existing consumers. No conflicts with
dump_to_primitive_dict().components/webui/server/src/plugins/app/Presto.ts (1)
38-42: Verified — server settings.json includes ClpQueryEngine, PrestoHost, PrestoPortConfirmed: components/webui/server/settings.json defines the required keys and types, and the Presto plugin imports that file, so it will receive defined values.
- components/webui/server/settings.json — lines ~21–23: "ClpQueryEngine": "clp", "PrestoHost": "localhost", "PrestoPort": 8889
- components/webui/server/src/plugins/app/Presto.ts — imports ../../../settings.json and uses settings.ClpQueryEngine to gate initialization and settings.PrestoHost / settings.PrestoPort for clientOptions
No change required in this PR; the original concern is resolved.
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
915-918: Confirm the server settings.json path matches the module import pathI ran the check: the Presto plugin import "../../../settings.json" resolves to components/webui/server/dist/server/settings.json, but that file is not present in the repository (dist is likely a build artifact). I cannot conclusively verify from source alone — please confirm the built/deployed artifact contains settings.json at that path and that start_clp writes to it.
Files/locations to verify:
- components/clp-package-utils/clp_package_utils/scripts/start_clp.py (around lines 915–918) — ensure it writes to components/webui/server/dist/server/settings.json.
- components/webui/server/src/plugins/app/Presto.ts (and the built file components/webui/server/dist/server/src/plugins/app/Presto.js) — ensure "../../../settings.json" resolves to the same server/dist/server/settings.json at runtime.
| # port: 4000 | ||
| # results_metadata_collection_name: "results-metadata" | ||
| # | ||
| #presto: |
There was a problem hiding this comment.
should we consider making the order to be consistent with the order we declare them in CLPConfig? (line 731)
There was a problem hiding this comment.
sure but should garbage collector be higher up then? like above archive output settings?
There was a problem hiding this comment.
sure but should garbage collector be higher up then? like above archive output settings?
True, the order isn't that simple.
Do we start presto as a component, or will it be something user configure themselves and only need to let CLP know the port?
There was a problem hiding this comment.
per kirk, plan is for user to start their own presto cluster, then the package/webui connects to it
There was a problem hiding this comment.
How about this:
- Let's make the declaration of Presto to be consistent in the clp-config.yml and clp_config.py -- since now class Presto is defined after class garbage collector in clp_config.py, let's keep the same order and move Presto lower in the clp-config.yml
- for
presto: Presto = Presto(). I would say let's move it to the block starting at line734, the reasoning is that the target in the first block are launchable components.
It looks like that there might be some inconsistent decisions made on where to put what, I would say let's leave those for future issues and PRs.
| import settings from "../../../settings.json" with {type: "json"}; | ||
|
|
||
|
|
||
| const CLP_PRESTO_USER = "clp-webui"; |
There was a problem hiding this comment.
I am not familiar with the context, wonder if this value is only limited to webui.
If something else could use the value in the future, shall we consider moving it into clp-config.yml, and passin via settings?
There was a problem hiding this comment.
hmm i dont think anyone would really benefit from changing this. The username does nothing from a functional perspective, it is just metadata. Normally the library we are using defaults to using linux username automatically. This hardcode is basically a quick fix since when running in docker there is no username, and library we are using requires a user name. Maybe a better solution is to check process.env.USER, and if undefined, then set to hardcoded "clp-webui"?
There was a problem hiding this comment.
Sure, checking process.env.USER sounds reasonable.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
components/package-template/src/etc/clp-config.yml (1)
118-122: Add a brief usage note to reduce misconfiguration (when to enable, required storage engine, container reachability)This mirrors prior feedback and helps users wiring the web UI to an external Presto.
Apply:
## Presto client config #presto: # host: "localhost" # port: 8889 +# +# Note: +# - Enable this block only when `package.query_engine` is "presto". +# - Presto queries require `package.storage_engine` to be "clp-s". +# - Ensure host/port are reachable from the web UI container and match your Presto endpoint.components/webui/server/src/plugins/app/Presto.ts (2)
10-12: Allow an explicit env override and prefer nullish coalescingSupport CLP_PRESTO_USER for ops overrides, then fall back to process.env.USER, then the default. Using
??avoids treating empty strings as falsy.Apply:
-// Use process.env.USER if defined, otherwise fallback to "clp-webui" -const CLP_PRESTO_USER_FALLBACK = "clp-webui"; +// Resolve Presto user in priority order: +// 1) CLP_PRESTO_USER (explicit override for ops) +// 2) USER (typical Linux/Docker env) +// 3) "clp-webui" (safe fallback when no user is available, e.g., some containers) +const CLP_PRESTO_USER_FALLBACK = "clp-webui";
42-42: Use explicit user resolution chain (env override → USER → fallback)This matches the documented intent and prior feedback.
Apply:
- user: process.env.USER || CLP_PRESTO_USER_FALLBACK, + user: process.env.CLP_PRESTO_USER ?? process.env.USER ?? CLP_PRESTO_USER_FALLBACK,
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
components/clp-py-utils/clp_py_utils/clp_config.py(2 hunks)components/package-template/src/etc/clp-config.yml(1 hunks)components/webui/server/src/plugins/app/Presto.ts(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/webui/server/src/plugins/app/Presto.ts
🧠 Learnings (1)
📚 Learning: 2025-07-01T14:49:07.333Z
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Applied to files:
components/package-template/src/etc/clp-config.yml
🧬 Code graph analysis (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
validate_port(192-204)
🔇 Additional comments (1)
components/package-template/src/etc/clp-config.yml (1)
118-122: LGTM: sane defaults and placementPresto block is placed after the garbage collector section, matching the Python config’s declaration order. Defaults align with the new model. No functional concerns here.
| host: settings.PrestoHost, | ||
| port: settings.PrestoPort, | ||
| user: CLP_PRESTO_USER, | ||
| user: process.env.USER || CLP_PRESTO_USER_FALLBACK, |
haiqi96
left a comment
There was a problem hiding this comment.
The changes in python and yml looks good to me.
Leave one webui syntax for Junhao to review, since we need Junhao's approval anyway.
| const clientOptions: ClientOptions = { | ||
| host: settings.PrestoHost, | ||
| port: settings.PrestoPort, | ||
| user: process.env.USER || CLP_PRESTO_USER_FALLBACK, |
There was a problem hiding this comment.
do we want to use the fastify env plugin instead of directly reading it from process.env?
There was a problem hiding this comment.
That way, we don't have to configure the fallback with ||
There was a problem hiding this comment.
Sure, i made this change
| @@ -912,6 +912,9 @@ def start_webui( | |||
| "ClientDir": str(container_webui_dir / "client"), | |||
| "LogViewerDir": str(container_webui_dir / "yscope-log-viewer"), | |||
| "StreamTargetUncompressedSize": container_clp_config.stream_output.target_uncompressed_size, | |||
There was a problem hiding this comment.
off topic: @haiqi96 do you know why we read some of the settings from container_clp_config instead of clp_config?
There was a problem hiding this comment.
iirc, we once mounted host directories to different paths in the container (for example, the package directory on the host is mounted to /opt/clp in the container. The paths in clp_config are host path, whereas paths in container_clp_config are container paths.
That said, we implemented a change a while ago to let most (but not all) paths in the container to be the same as the host. So technically, we are now duplicating the entire config to just account for any path that's still different between the host and the container.
One possibility to improve this is to create a smaller class that only contains the different mounted path, and stick to the original clp_config (with everything loaded, such as password) for all other purposes. It's likely the CLPDockerMounts can already achieve something similar.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/webui/server/.env (1)
10-12: Clarify where host/port come from to prevent misconfiguration.Since host/port are sourced from settings (generated via start_clp.py) rather than env, add a brief comment so operators don’t look for PRESTO_HOST/PORT here.
PRESTO_CATALOG=clp PRESTO_SCHEMA=default + +# Note: Presto host/port are configured via settings.json (generated by start_clp.py), +# not via env. Update clp-config.yml (presto.host / presto.port) to override.
♻️ Duplicate comments (1)
components/webui/server/src/plugins/app/Presto.ts (1)
37-42: Env-driven Presto options: LGTM, and this addresses prior feedback.Using
fastify.config.USER,PRESTO_CATALOG, andPRESTO_SCHEMAis the right direction and integrates with the env plugin cleanly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
components/webui/server/.env(1 hunks)components/webui/server/src/plugins/app/Presto.ts(1 hunks)components/webui/server/src/plugins/external/env.ts(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/webui/server/src/plugins/app/Presto.tscomponents/webui/server/src/plugins/external/env.ts
🔇 Additional comments (2)
components/webui/server/.env (1)
10-11: Presto catalog/schema defaults: looks good.Defaults align with the new env schema and Presto client usage. No concerns.
components/webui/server/src/plugins/app/Presto.ts (1)
37-42: Guard against empty strings via validation (preferred) or fallback.If env values are set but empty, the client may receive empty headers/params. Prefer the schema hardening in env.ts (minLength: 1). Otherwise, defensively omit empty strings here.
If you’d like a code fallback instead, I can provide a patch that normalizes empty strings to
undefinedforcatalog/schemaor defaultsuserto"clp-webui".
Co-authored-by: haiqi96 <14502009+haiqi96@users.noreply.github.com>
haiqi96
left a comment
There was a problem hiding this comment.
One note to add:
The reason we put pre=true in root_validator is to prevent validating Presto config when it is not allowed.
For example, if user specifies a Presto config with invalid port when query_engine is NOT presto. Without pre=true, the script will first complain about invalid port, and only if user fixes the port, it will then let user know that presto is not allowed when query_engine is not presto.
kirkrodrigues
left a comment
There was a problem hiding this comment.
For the PR title, how about:
feat(package): Add settings to configure webui Presto client (resolves #1259, resolves #1260).
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
811-811: Typo in type annotation breaks Pydantic field definition
reducer: Reducer() = Reducer()should bereducer: Reducer = Reducer().- reducer: Reducer() = Reducer() + reducer: Reducer = Reducer()
♻️ Duplicate comments (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
780-793: Presto.port should be > 0 and reject unknown keysPort 0 isn’t valid for a remote Presto service; also forbid extraneous keys to catch typos early.
class Presto(BaseModel): host: str port: int @validator("host") def validate_host(cls, field): _validate_host(cls, field) return field @validator("port") def validate_port(cls, field): _validate_port(cls, field) + if field <= 0: + raise ValueError(f"{cls.__name__}.port must be greater than 0") return field + + class Config: + extra = "forbid"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
components/clp-py-utils/clp_py_utils/clp_config.py(3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Applied to files:
components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-04-21T16:21:07.371Z
Learnt from: haiqi96
PR: y-scope/clp#743
File: components/clp-py-utils/clp_py_utils/clp_config.py:339-365
Timestamp: 2025-04-21T16:21:07.371Z
Learning: When validating enum values in Pydantic models with a pre=True root_validator, convert string values to enum instances using EnumClass(string_value) within a try/except block before comparison, as direct comparison between strings and enum instances will always return False.
Applied to files:
components/clp-py-utils/clp_py_utils/clp_config.py
🧬 Code graph analysis (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
validate_port(195-207)
🪛 GitHub Actions: clp-lint
components/clp-py-utils/clp_py_utils/clp_config.py
[error] 983-994: Black formatting check failed. 1 file would be reformatted. Run 'black --write' to fix code style issues in this file.
🔇 Additional comments (2)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
821-821: Makingprestooptional is the right interfaceOptional
Prestoaligns with the root validator enforcing presence only when Presto is selected. LGTM.
983-989: Fix Black failure by collapsing the multi-line f-stringThe current two-line f-string violates Black’s formatting. The diff above replaces it with a single-line string, which satisfies Black.
| # We set `pre=True` so that we print errors about a mismatch between the query engine and presto | ||
| # config before errors about the presto config itself. | ||
| @root_validator(pre=True) | ||
| def validate_presto_config(cls, values): | ||
| package = values.get("package") | ||
| if not isinstance(package, Package): | ||
| # Skip validation since `package` is not a valid `Package` (Pydantic will validate it | ||
| # later and throw an error). | ||
| return values | ||
|
|
||
| query_engine = package.get("query_engine") | ||
| presto = values.get("presto") | ||
| if query_engine == QueryEngine.PRESTO and presto is None: | ||
| raise ValueError(f"`presto` config must be non-null when query_engine is" | ||
| f" `{query_engine}`") | ||
| return values | ||
|
|
There was a problem hiding this comment.
Root validator currently skips validation and uses an invalid API
- With
pre=True,packageis a dict; the early return prevents any Presto check from running. - If
packagewere a model,package.get(...)is invalid;BaseModelhas attributes, not.get.
Normalize query_engine to the enum and handle both dict/model inputs; also fix the long f-string that failed Black.
- @root_validator(pre=True)
- def validate_presto_config(cls, values):
- package = values.get("package")
- if not isinstance(package, Package):
- # Skip validation since `package` is not a valid `Package` (Pydantic will validate it
- # later and throw an error).
- return values
-
- query_engine = package.get("query_engine")
- presto = values.get("presto")
- if query_engine == QueryEngine.PRESTO and presto is None:
- raise ValueError(f"`presto` config must be non-null when query_engine is"
- f" `{query_engine}`")
- return values
+ @root_validator(pre=True)
+ def validate_presto_config(cls, values):
+ package = values.get("package")
+ # Handle both raw dict (pre=True) and already-validated model.
+ if isinstance(package, dict):
+ qe = package.get("query_engine")
+ elif isinstance(package, Package):
+ qe = package.query_engine
+ else:
+ return values
+
+ # Normalise to enum; if invalid or absent, let other validators raise later.
+ try:
+ qe_enum = QueryEngine(qe) if qe is not None else None
+ except Exception:
+ return values
+
+ presto = values.get("presto")
+ if qe_enum == QueryEngine.PRESTO and presto is None:
+ raise ValueError("`presto` config must be non-null when query_engine is 'presto'")
+ return values📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # We set `pre=True` so that we print errors about a mismatch between the query engine and presto | |
| # config before errors about the presto config itself. | |
| @root_validator(pre=True) | |
| def validate_presto_config(cls, values): | |
| package = values.get("package") | |
| if not isinstance(package, Package): | |
| # Skip validation since `package` is not a valid `Package` (Pydantic will validate it | |
| # later and throw an error). | |
| return values | |
| query_engine = package.get("query_engine") | |
| presto = values.get("presto") | |
| if query_engine == QueryEngine.PRESTO and presto is None: | |
| raise ValueError(f"`presto` config must be non-null when query_engine is" | |
| f" `{query_engine}`") | |
| return values | |
| # We set `pre=True` so that we print errors about a mismatch between the query engine and presto | |
| # config before errors about the presto config itself. | |
| @root_validator(pre=True) | |
| def validate_presto_config(cls, values): | |
| package = values.get("package") | |
| # Handle both raw dict (pre=True) and already-validated model. | |
| if isinstance(package, dict): | |
| qe = package.get("query_engine") | |
| elif isinstance(package, Package): | |
| qe = package.query_engine | |
| else: | |
| return values | |
| # Normalise to enum; if invalid or absent, let other validators raise later. | |
| try: | |
| qe_enum = QueryEngine(qe) if qe is not None else None | |
| except Exception: | |
| return values | |
| presto = values.get("presto") | |
| if qe_enum == QueryEngine.PRESTO and presto is None: | |
| raise ValueError("`presto` config must be non-null when query_engine is 'presto'") | |
| return values |
|
@kirkrodrigues - I tested this and it appears Presto Client only attempts connect when starting a query, not on startup. So it is okay as is |
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
818-818: Type annotation bug:reducer: Reducer() = Reducer()will break Pydantic field parsingThe annotation calls the type, creating an instance as the annotation. Use the class type in the annotation.
- reducer: Reducer() = Reducer() + reducer: Reducer = Reducer()
♻️ Duplicate comments (7)
components/package-template/src/etc/clp-config.yml (1)
119-121: Tighten the Presto section: remove stray “#” and add brief usage guidanceThe lone “#” looks accidental. Adding a short note prevents misconfiguration when enabling Presto.
-## Presto client config -#presto: null -# +## Presto client config +# Use only when `package.query_engine` is "presto" and `package.storage_engine` is "clp-s". +# Ensure host/port are reachable from the WebUI container. +#presto: nullOptional: also include a commented example block to guide users without setting defaults:
#presto: null +#presto: +# host: "localhost" +# port: 8889components/webui/server/src/plugins/external/env.ts (3)
43-47: Disallow blank USER to avoid emitting an empty Presto userEmpty strings can override the default when injected by containers; enforce non-empty.
// System USER: { - type: "string", - default: "clp-webui", + type: "string", + minLength: 1, + default: "clp-webui", },
10-10: Scope user to Presto; prefer PRESTO_USER with USER fallbackA generic USER key is ambiguous. Add PRESTO_USER now and let the Presto plugin prefer it over USER.
declare module "fastify" { export interface FastifyInstance { config: { PORT: number; HOST: string; USER: string; + PRESTO_USER?: string; CLP_DB_USER: string; CLP_DB_PASS: string; PRESTO_CATALOG: string; PRESTO_SCHEMA: string; RATE_LIMIT: number; }; } }// Presto +PRESTO_USER: { + type: "string", + minLength: 1, +}, PRESTO_CATALOG: { type: "string", default: "clp", }, PRESTO_SCHEMA: { type: "string", default: "default", },Follow-up (in Presto.ts, not shown here): use
fastify.config.PRESTO_USER ?? fastify.config.USER.Also applies to: 59-67
59-67: Validate catalog/schema are non-emptyEmpty strings may behave differently than omitting the field; enforce non-empty.
PRESTO_CATALOG: { - type: "string", - default: "clp", + type: "string", + minLength: 1, + default: "clp", }, PRESTO_SCHEMA: { - type: "string", - default: "default", + type: "string", + minLength: 1, + default: "default", },components/clp-py-utils/clp_py_utils/clp_config.py (1)
787-800: Presto.port should be > 0 and extra keys should be rejectedPort 0 is invalid for a remote service. Also forbid unexpected keys to catch typos early.
class Presto(BaseModel): host: str port: int @validator("host") def validate_host(cls, field): _validate_host(cls, field) return field @validator("port") def validate_port(cls, field): _validate_port(cls, field) + if field <= 0: + raise ValueError(f"{cls.__name__}.port must be greater than 0") return field + + class Config: + extra = "forbid"components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2)
891-892: Ensure webui server/client settings.json schemas include these keys.Because update_settings_object rejects unknown keys, adding ClpQueryEngine (and below, PrestoHost/PrestoPort) will hard-fail if the settings schemas don’t already define them. Verify the templates were updated, or relax the updater for explicitly allowed new keys.
Run:
#!/bin/bash set -euo pipefail echo "Scanning settings.json files for required keys..." fd -a -H 'settings.json' components/webui | while read -r p; do echo "== $p" ok=1 rg -nP '"ClpQueryEngine"\s*:' "$p" || { echo "MISSING: ClpQueryEngine"; ok=0; } rg -nP '"PrestoHost"\s*:' "$p" || { echo "MISSING: PrestoHost"; ok=0; } rg -nP '"PrestoPort"\s*:' "$p" || { echo "MISSING: PrestoPort"; ok=0; } [[ $ok -eq 1 ]] && echo "OK" done
917-924: PRESTO branch likely never executes; normalise to enum, add guard, and log.clp_config.package.query_engine can be a string; comparing to an enum member will fail, leaving PrestoHost/Port unset. Also add a defensive guard and a single info log to aid troubleshooting.
Apply:
- query_engine = clp_config.package.query_engine - if QueryEngine.PRESTO == query_engine: - server_settings_json_updates["PrestoHost"] = clp_config.presto.host - server_settings_json_updates["PrestoPort"] = clp_config.presto.port + query_engine = clp_config.package.query_engine + qe = QueryEngine(query_engine) # normalise to enum + if qe == QueryEngine.PRESTO: + if clp_config.presto is None: + raise ValueError("presto config is required when ClpQueryEngine is 'presto'") + server_settings_json_updates["PrestoHost"] = clp_config.presto.host + server_settings_json_updates["PrestoPort"] = clp_config.presto.port + logger.info( + f"Configuring web UI Presto client -> host: {clp_config.presto.host}, port: {clp_config.presto.port}" + ) else: server_settings_json_updates["PrestoHost"] = None server_settings_json_updates["PrestoPort"] = None
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py(3 hunks)components/clp-py-utils/clp_py_utils/clp_config.py(3 hunks)components/package-template/src/etc/clp-config.yml(1 hunks)components/webui/server/.env(1 hunks)components/webui/server/src/plugins/external/env.ts(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/webui/server/src/plugins/external/env.ts
🧠 Learnings (4)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Applied to files:
components/clp-package-utils/clp_package_utils/scripts/start_clp.pycomponents/clp-py-utils/clp_py_utils/clp_config.pycomponents/webui/server/.envcomponents/webui/server/src/plugins/external/env.tscomponents/package-template/src/etc/clp-config.yml
📚 Learning: 2025-04-21T16:21:07.371Z
Learnt from: haiqi96
PR: y-scope/clp#743
File: components/clp-py-utils/clp_py_utils/clp_config.py:339-365
Timestamp: 2025-04-21T16:21:07.371Z
Learning: When validating enum values in Pydantic models with a pre=True root_validator, convert string values to enum instances using EnumClass(string_value) within a try/except block before comparison, as direct comparison between strings and enum instances will always return False.
Applied to files:
components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-07-01T14:49:07.333Z
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Applied to files:
components/package-template/src/etc/clp-config.yml
🧬 Code graph analysis (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
QueryEngine(101-104)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
validate_port(195-207)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
29-29: LGTM: import is correct and used below.
| # We set `pre=True` so that we print errors about a mismatch between the query engine and presto | ||
| # config before errors about the presto config itself. | ||
| @root_validator(pre=True) | ||
| def validate_presto_config(cls, values): | ||
| package = values.get("package") | ||
| if not isinstance(package, Package): | ||
| # Skip validation since `package` is not a valid `Package` (Pydantic will validate it | ||
| # later and throw an error). | ||
| return values | ||
|
|
||
| query_engine = package.get("query_engine") | ||
| presto = values.get("presto") | ||
| if query_engine == QueryEngine.PRESTO and presto is None: | ||
| raise ValueError( | ||
| f"`presto` config must be non-null when query_engine is `{query_engine}`" | ||
| ) | ||
| return values | ||
|
|
There was a problem hiding this comment.
Root validator never enforces Presto requirement; fix dict/model handling and enum normalisation
With pre=True, package is a dict; your early return skips validation. When it’s a model, package.get(...) is invalid. Normalise and handle both cases.
- # We set `pre=True` so that we print errors about a mismatch between the query engine and presto
- # config before errors about the presto config itself.
- @root_validator(pre=True)
- def validate_presto_config(cls, values):
- package = values.get("package")
- if not isinstance(package, Package):
- # Skip validation since `package` is not a valid `Package` (Pydantic will validate it
- # later and throw an error).
- return values
-
- query_engine = package.get("query_engine")
- presto = values.get("presto")
- if query_engine == QueryEngine.PRESTO and presto is None:
- raise ValueError(
- f"`presto` config must be non-null when query_engine is `{query_engine}`"
- )
- return values
+ # We set `pre=True` so that we print errors about a mismatch between the query engine and presto
+ # config before errors about the presto config itself.
+ @root_validator(pre=True)
+ def validate_presto_config(cls, values):
+ package = values.get("package")
+ # Handle both raw dict (pre=True) and already-validated model.
+ if isinstance(package, dict):
+ qe = package.get("query_engine")
+ elif isinstance(package, Package):
+ qe = package.query_engine
+ else:
+ return values
+
+ # Normalise to enum if possible; if invalid/absent, let other validators raise later.
+ try:
+ qe_enum = QueryEngine(qe) if qe is not None else None
+ except Exception:
+ return values
+
+ presto = values.get("presto")
+ if qe_enum == QueryEngine.PRESTO and presto is None:
+ raise ValueError("`presto` config must be non-null when query_engine is 'presto'")
+ return values📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # We set `pre=True` so that we print errors about a mismatch between the query engine and presto | |
| # config before errors about the presto config itself. | |
| @root_validator(pre=True) | |
| def validate_presto_config(cls, values): | |
| package = values.get("package") | |
| if not isinstance(package, Package): | |
| # Skip validation since `package` is not a valid `Package` (Pydantic will validate it | |
| # later and throw an error). | |
| return values | |
| query_engine = package.get("query_engine") | |
| presto = values.get("presto") | |
| if query_engine == QueryEngine.PRESTO and presto is None: | |
| raise ValueError( | |
| f"`presto` config must be non-null when query_engine is `{query_engine}`" | |
| ) | |
| return values | |
| # We set `pre=True` so that we print errors about a mismatch between the query engine and presto | |
| # config before errors about the presto config itself. | |
| @root_validator(pre=True) | |
| def validate_presto_config(cls, values): | |
| package = values.get("package") | |
| # Handle both raw dict (pre=True) and already-validated model. | |
| if isinstance(package, dict): | |
| qe = package.get("query_engine") | |
| elif isinstance(package, Package): | |
| qe = package.query_engine | |
| else: | |
| # Skip validation since `package` is neither a dict nor a Package model. | |
| return values | |
| # Normalise to enum if possible; if invalid/absent, let other validators raise later. | |
| try: | |
| qe_enum = QueryEngine(qe) if qe is not None else None | |
| except Exception: | |
| return values | |
| presto = values.get("presto") | |
| if qe_enum == QueryEngine.PRESTO and presto is None: | |
| raise ValueError("`presto` config must be non-null when query_engine is 'presto'") | |
| return values |
🤖 Prompt for AI Agents
In components/clp-py-utils/clp_py_utils/clp_config.py around lines 982–999, the
root_validator with pre=True both sees package as a dict and as a model in
different runs and currently uses package.get(...) which fails for model objects
and causes the validator to be skipped; fix by reading query_engine robustly for
both dict and model (e.g. if isinstance(package, dict): q =
package.get("query_engine") else: q = getattr(package, "query_engine", None)),
normalize q into the QueryEngine enum safely (try to coerce via QueryEngine(q)
or handle string names/case), then enforce that when q equals QueryEngine.PRESTO
the presto value is non-null and raise the ValueError otherwise. Ensure the
validator returns values at the end.
| #Presto | ||
| PRESTO_CATALOG=clp | ||
| PRESTO_SCHEMA=default |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Standardize header comment and keep section consistent
Use a space after the hash to match other sections, and clarify intent.
-#Presto
+# Presto
PRESTO_CATALOG=clp
PRESTO_SCHEMA=defaultOptional: if you later introduce PRESTO_USER (see env.ts comment), consider documenting it here with a commented example for discoverability.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #Presto | |
| PRESTO_CATALOG=clp | |
| PRESTO_SCHEMA=default | |
| # Presto | |
| PRESTO_CATALOG=clp | |
| PRESTO_SCHEMA=default |
🤖 Prompt for AI Agents
In components/webui/server/.env around lines 14 to 16, the section header uses
"#Presto" which is inconsistent with other headers—change it to "# Presto" (add
a space after the hash) to match formatting, and keep the section content
aligned; optionally add a commented example for PRESTO_USER (e.g. "#
PRESTO_USER=your_user") below the existing lines to document the variable for
discoverability if you plan to introduce it later.
Description
Adds host/port settings to start a presto client in webui.
Also added catalog,schema required for connector, but should not be configurable by users. Devs can change by modifying env variables if they want to use ui with a different connector.
Presto has fancier authentication methods, but will leave for a future PR.
Also added a user since neccesary i believe when running the webui in docker container.
Checklist
breaking change.
Validation performed
Tested package sets variables, and client starts succesfully
Summary by CodeRabbit
New Features
Configuration
Validation