Skip to content

feat(package): Add settings to configure webui Presto client (resolves #1259, resolves #1260).#1198

Merged
davemarco merged 35 commits into
y-scope:mainfrom
davemarco:prestoArgs
Sep 10, 2025
Merged

feat(package): Add settings to configure webui Presto client (resolves #1259, resolves #1260).#1198
davemarco merged 35 commits into
y-scope:mainfrom
davemarco:prestoArgs

Conversation

@davemarco

@davemarco davemarco commented Aug 13, 2025

Copy link
Copy Markdown
Contributor

Description

Adds host/port settings to start a presto client in webui.

Also added catalog,schema required for connector, but should not be configurable by users. Devs can change by modifying env variables if they want to use ui with a different connector.

Presto has fancier authentication methods, but will leave for a future PR.

Also added a user since neccesary i believe when running the webui in docker container.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Tested package sets variables, and client starts succesfully

Summary by CodeRabbit

  • New Features

    • Support for a Presto-backed query engine; Web UI shows the selected engine and, when Presto is chosen, exposes Presto host/port plus catalog, schema and user options.
  • Configuration

    • Added a Presto section to the sample config.
    • Added PRESTO_CATALOG, PRESTO_SCHEMA and USER environment variables with defaults.
    • Server and client settings now expose the CLP query engine selection; Presto host/port are set only when Presto is selected.
  • Validation

    • Presto configuration is required when the Presto engine is selected.

@coderabbitai

coderabbitai Bot commented Aug 13, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Adds Presto configuration and wiring: new Presto model and CLPConfig validation, start_clp exposes ClpQueryEngine and conditionally writes PrestoHost/PrestoPort, package template and .env gains Presto placeholders/defaults, WebUI env/schema extended, and Presto plugin receives catalog/schema/user options.

Changes

Cohort / File(s) Summary of Changes
Startup wiring (server & client settings)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
Import QueryEngine. start_webui writes ClpQueryEngine to client and server settings and conditionally sets PrestoHost/PrestoPort from clp_config.presto when QueryEngine.PRESTO; otherwise sets them to None.
Core config model (Presto)
components/clp-py-utils/clp_py_utils/clp_config.py
Add Presto model (host: str, port: int) and optional presto field on CLPConfig. Add root validator requiring presto when package.query_engine == PRESTO. Update env var helper to raise ValueError for missing required env vars.
Config template
components/package-template/src/etc/clp-config.yml
Add placeholder Presto client config block (presto: null) to example clp-config.yml.
WebUI env schema / types
components/webui/server/src/plugins/external/env.ts
Extend Fastify config typings and env schema with USER (default clp-webui), PRESTO_CATALOG (default clp), and PRESTO_SCHEMA (default default).
WebUI Presto plugin
components/webui/server/src/plugins/app/Presto.ts
Include catalog, schema, and user in Presto client options (sourced from fastify.config); host/port still from settings.PrestoHost/settings.PrestoPort.
WebUI env defaults
components/webui/server/.env
Add PRESTO_CATALOG=clp and PRESTO_SCHEMA=default.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Admin
  participant Starter as start_clp.py
  participant Config as CLPConfig
  participant Settings as server_settings.json
  participant WebUI
  participant PrestoPlugin

  Admin->>Starter: start_webui()
  Starter->>Config: load CLPConfig (package.query_engine, presto?)
  Note over Config: if package.query_engine == PRESTO\npresto{host,port} must be present
  Config-->>Starter: return query_engine, presto{host,port}
  Starter->>Settings: write {ClpQueryEngine, PrestoHost/PrestoPort|null}

  Admin->>WebUI: launch
  WebUI->>WebUI: load env (USER, PRESTO_CATALOG, PRESTO_SCHEMA)
  WebUI->>PrestoPlugin: init({host: settings.PrestoHost, port: settings.PrestoPort, catalog, schema, user})
  PrestoPlugin-->>WebUI: decorated Presto client
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 87af21e and 0defd24.

📒 Files selected for processing (2)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (3 hunks)
  • components/webui/server/src/plugins/app/Presto.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/plugins/app/Presto.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/webui/server/src/plugins/app/Presto.ts
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
🧬 Code graph analysis (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
  • QueryEngine (101-104)
🔇 Additional comments (5)
components/webui/server/src/plugins/app/Presto.ts (2)

38-43: Sourcing catalog/schema/user from env looks good.

Matches the intended split: host/port from package settings; identity/catalog/schema from env. No issues spotted here.


38-43: Ensure env.ts schema provides safe defaults for PRESTO_CATALOG, PRESTO_SCHEMA and USER
The .env file doesn’t define these keys, so verify your env.ts schema uses .default() (or equivalent) for each to prevent client‐init failures when they’re unset.

components/clp-package-utils/clp_package_utils/scripts/start_clp.py (3)

891-892: Surfacing ClpQueryEngine in server settings is correct.

Enables the server to gate Presto init. Looks good.


878-893: Server settings schema includes ClpQueryEngine, PrestoHost, and PrestoPort; no changes needed.


917-924: The grep on CLPConfig shows that presto is defined as a required Pydantic model field (not Optional) and will never be None. Coupled with the fact that QueryEngine inherits from KebabCaseStrEnum (a subclass of str), comparing query_engine == QueryEngine.PRESTO is already safe even if you accidentally compare to a raw string. The extra .value fallback and None-guard are unnecessary. Remove that refactor suggestion.

Likely an incorrect or invalid review comment.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@davemarco davemarco changed the title feat(webui): Add basic settings to start a presto client to clp config. feat(webui): Add basic settings to start a presto client in webui to clp config. Aug 14, 2025
@davemarco davemarco changed the title feat(webui): Add basic settings to start a presto client in webui to clp config. feat(webui): Add basic settings to start a presto client in webui to clp package config. Aug 14, 2025
@davemarco davemarco changed the title feat(webui): Add basic settings to start a presto client in webui to clp package config. feat(webui): Add basic settings to configure webui presto client to clp package config. Aug 14, 2025
@davemarco

Copy link
Copy Markdown
Contributor Author

@kirkrodrigues do u want to review this or someone else?

@davemarco davemarco marked this pull request as ready for review August 14, 2025 18:19
@davemarco davemarco requested a review from a team as a code owner August 14, 2025 18:19

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1fe0224 and 1fc6b20.

📒 Files selected for processing (4)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1 hunks)
  • components/clp-py-utils/clp_py_utils/clp_config.py (2 hunks)
  • components/package-template/src/etc/clp-config.yml (1 hunks)
  • components/webui/server/src/plugins/app/Presto.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit Configuration File

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/plugins/app/Presto.ts
🧬 Code Graph Analysis (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
  • validate_port (192-204)
🔇 Additional comments (4)
components/clp-py-utils/clp_py_utils/clp_config.py (2)

699-712: LGTM: New Presto config model is consistent and validated.

Good reuse of existing host/port validators and sensible defaults. This integrates cleanly with the rest of the config models.


731-731: LGTM: Exposing presto on CLPConfig.

This makes the values available to the launcher without affecting existing consumers. No conflicts with dump_to_primitive_dict().

components/webui/server/src/plugins/app/Presto.ts (1)

38-42: Verified — server settings.json includes ClpQueryEngine, PrestoHost, PrestoPort

Confirmed: components/webui/server/settings.json defines the required keys and types, and the Presto plugin imports that file, so it will receive defined values.

  • components/webui/server/settings.json — lines ~21–23: "ClpQueryEngine": "clp", "PrestoHost": "localhost", "PrestoPort": 8889
  • components/webui/server/src/plugins/app/Presto.ts — imports ../../../settings.json and uses settings.ClpQueryEngine to gate initialization and settings.PrestoHost / settings.PrestoPort for clientOptions

No change required in this PR; the original concern is resolved.

components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

915-918: Confirm the server settings.json path matches the module import path

I ran the check: the Presto plugin import "../../../settings.json" resolves to components/webui/server/dist/server/settings.json, but that file is not present in the repository (dist is likely a build artifact). I cannot conclusively verify from source alone — please confirm the built/deployed artifact contains settings.json at that path and that start_clp writes to it.

Files/locations to verify:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (around lines 915–918) — ensure it writes to components/webui/server/dist/server/settings.json.
  • components/webui/server/src/plugins/app/Presto.ts (and the built file components/webui/server/dist/server/src/plugins/app/Presto.js) — ensure "../../../settings.json" resolves to the same server/dist/server/settings.json at runtime.

Comment thread components/clp-package-utils/clp_package_utils/scripts/start_clp.py
Comment thread components/package-template/src/etc/clp-config.yml Outdated
Comment thread components/webui/server/src/plugins/app/Presto.ts Outdated
@davemarco davemarco requested a review from haiqi96 August 14, 2025 20:59
# port: 4000
# results_metadata_collection_name: "results-metadata"
#
#presto:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we consider making the order to be consistent with the order we declare them in CLPConfig? (line 731)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure but should garbage collector be higher up then? like above archive output settings?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure but should garbage collector be higher up then? like above archive output settings?

True, the order isn't that simple.
Do we start presto as a component, or will it be something user configure themselves and only need to let CLP know the port?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per kirk, plan is for user to start their own presto cluster, then the package/webui connects to it

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

  1. Let's make the declaration of Presto to be consistent in the clp-config.yml and clp_config.py -- since now class Presto is defined after class garbage collector in clp_config.py, let's keep the same order and move Presto lower in the clp-config.yml
  2. for presto: Presto = Presto(). I would say let's move it to the block starting at line734, the reasoning is that the target in the first block are launchable components.

It looks like that there might be some inconsistent decisions made on where to put what, I would say let's leave those for future issues and PRs.

import settings from "../../../settings.json" with {type: "json"};


const CLP_PRESTO_USER = "clp-webui";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not familiar with the context, wonder if this value is only limited to webui.
If something else could use the value in the future, shall we consider moving it into clp-config.yml, and passin via settings?

@davemarco davemarco Aug 20, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm i dont think anyone would really benefit from changing this. The username does nothing from a functional perspective, it is just metadata. Normally the library we are using defaults to using linux username automatically. This hardcode is basically a quick fix since when running in docker there is no username, and library we are using requires a user name. Maybe a better solution is to check process.env.USER, and if undefined, then set to hardcoded "clp-webui"?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, checking process.env.USER sounds reasonable.

@davemarco davemarco requested a review from haiqi96 August 22, 2025 13:39

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
components/package-template/src/etc/clp-config.yml (1)

118-122: Add a brief usage note to reduce misconfiguration (when to enable, required storage engine, container reachability)

This mirrors prior feedback and helps users wiring the web UI to an external Presto.

Apply:

 ## Presto client config
 #presto:
 #  host: "localhost"
 #  port: 8889
+#
+# Note:
+# - Enable this block only when `package.query_engine` is "presto".
+# - Presto queries require `package.storage_engine` to be "clp-s".
+# - Ensure host/port are reachable from the web UI container and match your Presto endpoint.
components/webui/server/src/plugins/app/Presto.ts (2)

10-12: Allow an explicit env override and prefer nullish coalescing

Support CLP_PRESTO_USER for ops overrides, then fall back to process.env.USER, then the default. Using ?? avoids treating empty strings as falsy.

Apply:

-// Use process.env.USER if defined, otherwise fallback to "clp-webui"
-const CLP_PRESTO_USER_FALLBACK = "clp-webui";
+// Resolve Presto user in priority order:
+// 1) CLP_PRESTO_USER (explicit override for ops)
+// 2) USER (typical Linux/Docker env)
+// 3) "clp-webui" (safe fallback when no user is available, e.g., some containers)
+const CLP_PRESTO_USER_FALLBACK = "clp-webui";

42-42: Use explicit user resolution chain (env override → USER → fallback)

This matches the documented intent and prior feedback.

Apply:

-            user: process.env.USER || CLP_PRESTO_USER_FALLBACK,
+            user: process.env.CLP_PRESTO_USER ?? process.env.USER ?? CLP_PRESTO_USER_FALLBACK,
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1fc6b20 and 1c49b00.

📒 Files selected for processing (3)
  • components/clp-py-utils/clp_py_utils/clp_config.py (2 hunks)
  • components/package-template/src/etc/clp-config.yml (1 hunks)
  • components/webui/server/src/plugins/app/Presto.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/plugins/app/Presto.ts
🧠 Learnings (1)
📚 Learning: 2025-07-01T14:49:07.333Z
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.

Applied to files:

  • components/package-template/src/etc/clp-config.yml
🧬 Code graph analysis (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
  • validate_port (192-204)
🔇 Additional comments (1)
components/package-template/src/etc/clp-config.yml (1)

118-122: LGTM: sane defaults and placement

Presto block is placed after the garbage collector section, matching the Python config’s declaration order. Defaults align with the new model. No functional concerns here.

Comment thread components/clp-py-utils/clp_py_utils/clp_config.py
Comment thread components/clp-py-utils/clp_py_utils/clp_config.py Outdated
host: settings.PrestoHost,
port: settings.PrestoPort,
user: CLP_PRESTO_USER,
user: process.env.USER || CLP_PRESTO_USER_FALLBACK,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junhaoliao Defer to Junhao for review.

haiqi96
haiqi96 previously approved these changes Aug 22, 2025

@haiqi96 haiqi96 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in python and yml looks good to me.
Leave one webui syntax for Junhao to review, since we need Junhao's approval anyway.

@junhaoliao

Copy link
Copy Markdown
Member

Let's merge main into this branch and add #1259 & #1260 to the PR title

Comment thread components/clp-py-utils/clp_py_utils/clp_config.py Outdated
const clientOptions: ClientOptions = {
host: settings.PrestoHost,
port: settings.PrestoPort,
user: process.env.USER || CLP_PRESTO_USER_FALLBACK,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to use the fastify env plugin instead of directly reading it from process.env?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That way, we don't have to configure the fallback with ||

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, i made this change

@@ -912,6 +912,9 @@ def start_webui(
"ClientDir": str(container_webui_dir / "client"),
"LogViewerDir": str(container_webui_dir / "yscope-log-viewer"),
"StreamTargetUncompressedSize": container_clp_config.stream_output.target_uncompressed_size,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

off topic: @haiqi96 do you know why we read some of the settings from container_clp_config instead of clp_config?

@haiqi96 haiqi96 Aug 25, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc, we once mounted host directories to different paths in the container (for example, the package directory on the host is mounted to /opt/clp in the container. The paths in clp_config are host path, whereas paths in container_clp_config are container paths.

That said, we implemented a change a while ago to let most (but not all) paths in the container to be the same as the host. So technically, we are now duplicating the entire config to just account for any path that's still different between the host and the container.

One possibility to improve this is to create a smaller class that only contains the different mounted path, and stick to the original clp_config (with everything loaded, such as password) for all other purposes. It's likely the CLPDockerMounts can already achieve something similar.

@davemarco davemarco changed the title feat(webui): Add basic settings to configure webui presto client to clp package config. feat(webui): Add basic settings to configure webui presto client to clp package config (resolves #1259, resolves #1260). Aug 25, 2025

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/webui/server/.env (1)

10-12: Clarify where host/port come from to prevent misconfiguration.

Since host/port are sourced from settings (generated via start_clp.py) rather than env, add a brief comment so operators don’t look for PRESTO_HOST/PORT here.

 PRESTO_CATALOG=clp
 PRESTO_SCHEMA=default
+
+# Note: Presto host/port are configured via settings.json (generated by start_clp.py),
+# not via env. Update clp-config.yml (presto.host / presto.port) to override.
♻️ Duplicate comments (1)
components/webui/server/src/plugins/app/Presto.ts (1)

37-42: Env-driven Presto options: LGTM, and this addresses prior feedback.

Using fastify.config.USER, PRESTO_CATALOG, and PRESTO_SCHEMA is the right direction and integrates with the env plugin cleanly.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1c49b00 and f79915b.

📒 Files selected for processing (3)
  • components/webui/server/.env (1 hunks)
  • components/webui/server/src/plugins/app/Presto.ts (1 hunks)
  • components/webui/server/src/plugins/external/env.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/plugins/app/Presto.ts
  • components/webui/server/src/plugins/external/env.ts
🔇 Additional comments (2)
components/webui/server/.env (1)

10-11: Presto catalog/schema defaults: looks good.

Defaults align with the new env schema and Presto client usage. No concerns.

components/webui/server/src/plugins/app/Presto.ts (1)

37-42: Guard against empty strings via validation (preferred) or fallback.

If env values are set but empty, the client may receive empty headers/params. Prefer the schema hardening in env.ts (minLength: 1). Otherwise, defensively omit empty strings here.

If you’d like a code fallback instead, I can provide a patch that normalizes empty strings to undefined for catalog/schema or defaults user to "clp-webui".

Comment thread components/webui/server/src/plugins/app/Presto.ts
Comment thread components/webui/server/src/plugins/external/env.ts
Comment thread components/webui/server/src/plugins/external/env.ts
Comment thread components/webui/server/src/plugins/external/env.ts
Comment thread components/webui/server/src/plugins/app/Presto.ts
@davemarco davemarco requested a review from haiqi96 August 27, 2025 17:35
Comment thread components/clp-py-utils/clp_py_utils/clp_config.py Outdated
davemarco and others added 2 commits August 27, 2025 16:39
Co-authored-by: haiqi96 <14502009+haiqi96@users.noreply.github.com>
@davemarco davemarco requested a review from haiqi96 August 27, 2025 20:42
haiqi96
haiqi96 previously approved these changes Aug 27, 2025

@haiqi96 haiqi96 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One note to add:
The reason we put pre=true in root_validator is to prevent validating Presto config when it is not allowed.

For example, if user specifies a Presto config with invalid port when query_engine is NOT presto. Without pre=true, the script will first complain about invalid port, and only if user fixes the port, it will then let user know that presto is not allowed when query_engine is not presto.

@kirkrodrigues kirkrodrigues left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

feat(package): Add settings to configure webui Presto client (resolves #1259, resolves #1260).

Comment thread components/clp-py-utils/clp_py_utils/clp_config.py
Comment thread components/clp-py-utils/clp_py_utils/clp_config.py
Comment thread components/clp-py-utils/clp_py_utils/clp_config.py Outdated
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
davemarco and others added 2 commits September 2, 2025 09:47
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
@davemarco davemarco changed the title feat(webui): Add basic settings to configure webui presto client to clp package config (resolves #1259, resolves #1260). feat(package): Add settings to configure webui Presto client (resolves #1259, resolves #1260). Sep 2, 2025

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)

811-811: Typo in type annotation breaks Pydantic field definition

reducer: Reducer() = Reducer() should be reducer: Reducer = Reducer().

-    reducer: Reducer() = Reducer()
+    reducer: Reducer = Reducer()
♻️ Duplicate comments (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)

780-793: Presto.port should be > 0 and reject unknown keys

Port 0 isn’t valid for a remote Presto service; also forbid extraneous keys to catch typos early.

 class Presto(BaseModel):
     host: str
     port: int
 
     @validator("host")
     def validate_host(cls, field):
         _validate_host(cls, field)
         return field
 
     @validator("port")
     def validate_port(cls, field):
         _validate_port(cls, field)
+        if field <= 0:
+            raise ValueError(f"{cls.__name__}.port must be greater than 0")
         return field
+
+    class Config:
+        extra = "forbid"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f7df0cf and e100051.

📒 Files selected for processing (1)
  • components/clp-py-utils/clp_py_utils/clp_config.py (3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-04-21T16:21:07.371Z
Learnt from: haiqi96
PR: y-scope/clp#743
File: components/clp-py-utils/clp_py_utils/clp_config.py:339-365
Timestamp: 2025-04-21T16:21:07.371Z
Learning: When validating enum values in Pydantic models with a pre=True root_validator, convert string values to enum instances using EnumClass(string_value) within a try/except block before comparison, as direct comparison between strings and enum instances will always return False.

Applied to files:

  • components/clp-py-utils/clp_py_utils/clp_config.py
🧬 Code graph analysis (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
  • validate_port (195-207)
🪛 GitHub Actions: clp-lint
components/clp-py-utils/clp_py_utils/clp_config.py

[error] 983-994: Black formatting check failed. 1 file would be reformatted. Run 'black --write' to fix code style issues in this file.

🔇 Additional comments (2)
components/clp-py-utils/clp_py_utils/clp_config.py (2)

821-821: Making presto optional is the right interface

Optional Presto aligns with the root validator enforcing presence only when Presto is selected. LGTM.


983-989: Fix Black failure by collapsing the multi-line f-string

The current two-line f-string violates Black’s formatting. The diff above replaces it with a single-line string, which satisfies Black.

Comment on lines +975 to +991
# We set `pre=True` so that we print errors about a mismatch between the query engine and presto
# config before errors about the presto config itself.
@root_validator(pre=True)
def validate_presto_config(cls, values):
package = values.get("package")
if not isinstance(package, Package):
# Skip validation since `package` is not a valid `Package` (Pydantic will validate it
# later and throw an error).
return values

query_engine = package.get("query_engine")
presto = values.get("presto")
if query_engine == QueryEngine.PRESTO and presto is None:
raise ValueError(f"`presto` config must be non-null when query_engine is"
f" `{query_engine}`")
return values

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Root validator currently skips validation and uses an invalid API

  • With pre=True, package is a dict; the early return prevents any Presto check from running.
  • If package were a model, package.get(...) is invalid; BaseModel has attributes, not .get.

Normalize query_engine to the enum and handle both dict/model inputs; also fix the long f-string that failed Black.

-    @root_validator(pre=True)
-    def validate_presto_config(cls, values):
-        package = values.get("package")
-        if not isinstance(package, Package):
-            # Skip validation since `package` is not a valid `Package` (Pydantic will validate it
-            # later and throw an error).
-            return values
-
-        query_engine = package.get("query_engine")
-        presto = values.get("presto")
-        if query_engine == QueryEngine.PRESTO and presto is None:
-            raise ValueError(f"`presto` config must be non-null when query_engine is"
-                             f" `{query_engine}`")
-        return values
+    @root_validator(pre=True)
+    def validate_presto_config(cls, values):
+        package = values.get("package")
+        # Handle both raw dict (pre=True) and already-validated model.
+        if isinstance(package, dict):
+            qe = package.get("query_engine")
+        elif isinstance(package, Package):
+            qe = package.query_engine
+        else:
+            return values
+
+        # Normalise to enum; if invalid or absent, let other validators raise later.
+        try:
+            qe_enum = QueryEngine(qe) if qe is not None else None
+        except Exception:
+            return values
+
+        presto = values.get("presto")
+        if qe_enum == QueryEngine.PRESTO and presto is None:
+            raise ValueError("`presto` config must be non-null when query_engine is 'presto'")
+        return values
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# We set `pre=True` so that we print errors about a mismatch between the query engine and presto
# config before errors about the presto config itself.
@root_validator(pre=True)
def validate_presto_config(cls, values):
package = values.get("package")
if not isinstance(package, Package):
# Skip validation since `package` is not a valid `Package` (Pydantic will validate it
# later and throw an error).
return values
query_engine = package.get("query_engine")
presto = values.get("presto")
if query_engine == QueryEngine.PRESTO and presto is None:
raise ValueError(f"`presto` config must be non-null when query_engine is"
f" `{query_engine}`")
return values
# We set `pre=True` so that we print errors about a mismatch between the query engine and presto
# config before errors about the presto config itself.
@root_validator(pre=True)
def validate_presto_config(cls, values):
package = values.get("package")
# Handle both raw dict (pre=True) and already-validated model.
if isinstance(package, dict):
qe = package.get("query_engine")
elif isinstance(package, Package):
qe = package.query_engine
else:
return values
# Normalise to enum; if invalid or absent, let other validators raise later.
try:
qe_enum = QueryEngine(qe) if qe is not None else None
except Exception:
return values
presto = values.get("presto")
if qe_enum == QueryEngine.PRESTO and presto is None:
raise ValueError("`presto` config must be non-null when query_engine is 'presto'")
return values

@davemarco

davemarco commented Sep 3, 2025

Copy link
Copy Markdown
Contributor Author

@kirkrodrigues - I tested this and it appears Presto Client only attempts connect when starting a query, not on startup. So it is okay as is

Comment thread components/clp-py-utils/clp_py_utils/clp_config.py Outdated
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)

818-818: Type annotation bug: reducer: Reducer() = Reducer() will break Pydantic field parsing

The annotation calls the type, creating an instance as the annotation. Use the class type in the annotation.

-    reducer: Reducer() = Reducer()
+    reducer: Reducer = Reducer()
♻️ Duplicate comments (7)
components/package-template/src/etc/clp-config.yml (1)

119-121: Tighten the Presto section: remove stray “#” and add brief usage guidance

The lone “#” looks accidental. Adding a short note prevents misconfiguration when enabling Presto.

-## Presto client config
-#presto: null
-#
+## Presto client config
+# Use only when `package.query_engine` is "presto" and `package.storage_engine` is "clp-s".
+# Ensure host/port are reachable from the WebUI container.
+#presto: null

Optional: also include a commented example block to guide users without setting defaults:

 #presto: null
+#presto:
+#  host: "localhost"
+#  port: 8889
components/webui/server/src/plugins/external/env.ts (3)

43-47: Disallow blank USER to avoid emitting an empty Presto user

Empty strings can override the default when injected by containers; enforce non-empty.

 // System
 USER: {
-    type: "string",
-    default: "clp-webui",
+    type: "string",
+    minLength: 1,
+    default: "clp-webui",
 },

10-10: Scope user to Presto; prefer PRESTO_USER with USER fallback

A generic USER key is ambiguous. Add PRESTO_USER now and let the Presto plugin prefer it over USER.

 declare module "fastify" {
     export interface FastifyInstance {
         config: {
             PORT: number;
             HOST: string;
             USER: string;
+            PRESTO_USER?: string;
             CLP_DB_USER: string;
             CLP_DB_PASS: string;
             PRESTO_CATALOG: string;
             PRESTO_SCHEMA: string;
             RATE_LIMIT: number;
         };
     }
 }
 // Presto
+PRESTO_USER: {
+    type: "string",
+    minLength: 1,
+},
 PRESTO_CATALOG: {
     type: "string",
     default: "clp",
 },
 PRESTO_SCHEMA: {
     type: "string",
     default: "default",
 },

Follow-up (in Presto.ts, not shown here): use fastify.config.PRESTO_USER ?? fastify.config.USER.

Also applies to: 59-67


59-67: Validate catalog/schema are non-empty

Empty strings may behave differently than omitting the field; enforce non-empty.

 PRESTO_CATALOG: {
-    type: "string",
-    default: "clp",
+    type: "string",
+    minLength: 1,
+    default: "clp",
 },
 PRESTO_SCHEMA: {
-    type: "string",
-    default: "default",
+    type: "string",
+    minLength: 1,
+    default: "default",
 },
components/clp-py-utils/clp_py_utils/clp_config.py (1)

787-800: Presto.port should be > 0 and extra keys should be rejected

Port 0 is invalid for a remote service. Also forbid unexpected keys to catch typos early.

 class Presto(BaseModel):
     host: str
     port: int
 
     @validator("host")
     def validate_host(cls, field):
         _validate_host(cls, field)
         return field
 
     @validator("port")
     def validate_port(cls, field):
         _validate_port(cls, field)
+        if field <= 0:
+            raise ValueError(f"{cls.__name__}.port must be greater than 0")
         return field
+
+    class Config:
+        extra = "forbid"
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2)

891-892: Ensure webui server/client settings.json schemas include these keys.

Because update_settings_object rejects unknown keys, adding ClpQueryEngine (and below, PrestoHost/PrestoPort) will hard-fail if the settings schemas don’t already define them. Verify the templates were updated, or relax the updater for explicitly allowed new keys.

Run:

#!/bin/bash
set -euo pipefail
echo "Scanning settings.json files for required keys..."
fd -a -H 'settings.json' components/webui | while read -r p; do
  echo "== $p"
  ok=1
  rg -nP '"ClpQueryEngine"\s*:' "$p" || { echo "MISSING: ClpQueryEngine"; ok=0; }
  rg -nP '"PrestoHost"\s*:' "$p" || { echo "MISSING: PrestoHost"; ok=0; }
  rg -nP '"PrestoPort"\s*:' "$p" || { echo "MISSING: PrestoPort"; ok=0; }
  [[ $ok -eq 1 ]] && echo "OK"
done

917-924: PRESTO branch likely never executes; normalise to enum, add guard, and log.

clp_config.package.query_engine can be a string; comparing to an enum member will fail, leaving PrestoHost/Port unset. Also add a defensive guard and a single info log to aid troubleshooting.

Apply:

-    query_engine = clp_config.package.query_engine
-    if QueryEngine.PRESTO == query_engine:
-        server_settings_json_updates["PrestoHost"] = clp_config.presto.host
-        server_settings_json_updates["PrestoPort"] = clp_config.presto.port
+    query_engine = clp_config.package.query_engine
+    qe = QueryEngine(query_engine)  # normalise to enum
+    if qe == QueryEngine.PRESTO:
+        if clp_config.presto is None:
+            raise ValueError("presto config is required when ClpQueryEngine is 'presto'")
+        server_settings_json_updates["PrestoHost"] = clp_config.presto.host
+        server_settings_json_updates["PrestoPort"] = clp_config.presto.port
+        logger.info(
+            f"Configuring web UI Presto client -> host: {clp_config.presto.host}, port: {clp_config.presto.port}"
+        )
     else:
         server_settings_json_updates["PrestoHost"] = None
         server_settings_json_updates["PrestoPort"] = None
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c9fefa7 and 87af21e.

📒 Files selected for processing (5)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (3 hunks)
  • components/clp-py-utils/clp_py_utils/clp_config.py (3 hunks)
  • components/package-template/src/etc/clp-config.yml (1 hunks)
  • components/webui/server/.env (1 hunks)
  • components/webui/server/src/plugins/external/env.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/plugins/external/env.ts
🧠 Learnings (4)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
  • components/clp-py-utils/clp_py_utils/clp_config.py
  • components/webui/server/.env
  • components/webui/server/src/plugins/external/env.ts
  • components/package-template/src/etc/clp-config.yml
📚 Learning: 2025-04-21T16:21:07.371Z
Learnt from: haiqi96
PR: y-scope/clp#743
File: components/clp-py-utils/clp_py_utils/clp_config.py:339-365
Timestamp: 2025-04-21T16:21:07.371Z
Learning: When validating enum values in Pydantic models with a pre=True root_validator, convert string values to enum instances using EnumClass(string_value) within a try/except block before comparison, as direct comparison between strings and enum instances will always return False.

Applied to files:

  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-07-01T14:49:07.333Z
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.

Applied to files:

  • components/package-template/src/etc/clp-config.yml
🧬 Code graph analysis (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
  • QueryEngine (101-104)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-package-utils/clp_package_utils/general.py (1)
  • validate_port (195-207)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

29-29: LGTM: import is correct and used below.

Comment on lines +982 to +999
# We set `pre=True` so that we print errors about a mismatch between the query engine and presto
# config before errors about the presto config itself.
@root_validator(pre=True)
def validate_presto_config(cls, values):
package = values.get("package")
if not isinstance(package, Package):
# Skip validation since `package` is not a valid `Package` (Pydantic will validate it
# later and throw an error).
return values

query_engine = package.get("query_engine")
presto = values.get("presto")
if query_engine == QueryEngine.PRESTO and presto is None:
raise ValueError(
f"`presto` config must be non-null when query_engine is `{query_engine}`"
)
return values

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Root validator never enforces Presto requirement; fix dict/model handling and enum normalisation

With pre=True, package is a dict; your early return skips validation. When it’s a model, package.get(...) is invalid. Normalise and handle both cases.

-    # We set `pre=True` so that we print errors about a mismatch between the query engine and presto
-    # config before errors about the presto config itself.
-    @root_validator(pre=True)
-    def validate_presto_config(cls, values):
-        package = values.get("package")
-        if not isinstance(package, Package):
-            # Skip validation since `package` is not a valid `Package` (Pydantic will validate it
-            # later and throw an error).
-            return values
-
-        query_engine = package.get("query_engine")
-        presto = values.get("presto")
-        if query_engine == QueryEngine.PRESTO and presto is None:
-            raise ValueError(
-                f"`presto` config must be non-null when query_engine is `{query_engine}`"
-            )
-        return values
+    # We set `pre=True` so that we print errors about a mismatch between the query engine and presto
+    # config before errors about the presto config itself.
+    @root_validator(pre=True)
+    def validate_presto_config(cls, values):
+        package = values.get("package")
+        # Handle both raw dict (pre=True) and already-validated model.
+        if isinstance(package, dict):
+            qe = package.get("query_engine")
+        elif isinstance(package, Package):
+            qe = package.query_engine
+        else:
+            return values
+
+        # Normalise to enum if possible; if invalid/absent, let other validators raise later.
+        try:
+            qe_enum = QueryEngine(qe) if qe is not None else None
+        except Exception:
+            return values
+
+        presto = values.get("presto")
+        if qe_enum == QueryEngine.PRESTO and presto is None:
+            raise ValueError("`presto` config must be non-null when query_engine is 'presto'")
+        return values
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# We set `pre=True` so that we print errors about a mismatch between the query engine and presto
# config before errors about the presto config itself.
@root_validator(pre=True)
def validate_presto_config(cls, values):
package = values.get("package")
if not isinstance(package, Package):
# Skip validation since `package` is not a valid `Package` (Pydantic will validate it
# later and throw an error).
return values
query_engine = package.get("query_engine")
presto = values.get("presto")
if query_engine == QueryEngine.PRESTO and presto is None:
raise ValueError(
f"`presto` config must be non-null when query_engine is `{query_engine}`"
)
return values
# We set `pre=True` so that we print errors about a mismatch between the query engine and presto
# config before errors about the presto config itself.
@root_validator(pre=True)
def validate_presto_config(cls, values):
package = values.get("package")
# Handle both raw dict (pre=True) and already-validated model.
if isinstance(package, dict):
qe = package.get("query_engine")
elif isinstance(package, Package):
qe = package.query_engine
else:
# Skip validation since `package` is neither a dict nor a Package model.
return values
# Normalise to enum if possible; if invalid/absent, let other validators raise later.
try:
qe_enum = QueryEngine(qe) if qe is not None else None
except Exception:
return values
presto = values.get("presto")
if qe_enum == QueryEngine.PRESTO and presto is None:
raise ValueError("`presto` config must be non-null when query_engine is 'presto'")
return values
🤖 Prompt for AI Agents
In components/clp-py-utils/clp_py_utils/clp_config.py around lines 982–999, the
root_validator with pre=True both sees package as a dict and as a model in
different runs and currently uses package.get(...) which fails for model objects
and causes the validator to be skipped; fix by reading query_engine robustly for
both dict and model (e.g. if isinstance(package, dict): q =
package.get("query_engine") else: q = getattr(package, "query_engine", None)),
normalize q into the QueryEngine enum safely (try to coerce via QueryEngine(q)
or handle string names/case), then enforce that when q equals QueryEngine.PRESTO
the presto value is non-null and raise the ValueError otherwise. Ensure the
validator returns values at the end.

Comment on lines +14 to +16
#Presto
PRESTO_CATALOG=clp
PRESTO_SCHEMA=default

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Standardize header comment and keep section consistent

Use a space after the hash to match other sections, and clarify intent.

-#Presto
+# Presto
 PRESTO_CATALOG=clp
 PRESTO_SCHEMA=default

Optional: if you later introduce PRESTO_USER (see env.ts comment), consider documenting it here with a commented example for discoverability.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#Presto
PRESTO_CATALOG=clp
PRESTO_SCHEMA=default
# Presto
PRESTO_CATALOG=clp
PRESTO_SCHEMA=default
🤖 Prompt for AI Agents
In components/webui/server/.env around lines 14 to 16, the section header uses
"#Presto" which is inconsistent with other headers—change it to "# Presto" (add
a space after the hash) to match formatting, and keep the section content
aligned; optionally add a commented example for PRESTO_USER (e.g. "#
PRESTO_USER=your_user") below the existing lines to document the variable for
discoverability if you plan to introduce it later.

@davemarco davemarco merged commit 61f74cc into y-scope:main Sep 10, 2025
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants