Skip to content

feat(webui): Add compression job submission for local files only.#1294

Merged
junhaoliao merged 52 commits into
y-scope:mainfrom
junhaoliao:local-submit-sans-ls
Nov 25, 2025
Merged

feat(webui): Add compression job submission for local files only.#1294
junhaoliao merged 52 commits into
y-scope:mainfrom
junhaoliao:local-submit-sans-ls

Conversation

@junhaoliao

@junhaoliao junhaoliao commented Sep 9, 2025

Copy link
Copy Markdown
Member

Description

  1. Add compression job submission route in the backend server.
  2. Add a panel in the frontend Dashboard for submitting compression jobs.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. with package.stoarge_engine set to clp in etc/clp-config.yml , tested submission with multiple text logs directories in the webui.
  2. with package.stoarge_engine set to clp-s in etc/clp-config.yml , tested submission with multiple JSON logs file paths in the webui. tried leaving the "dataset" as empty and the file gets compressed into "default" dataset.

Summary by CodeRabbit

  • New Features

    • Compression UI added: multiline paths input, optional CLP‑S fields (dataset, timestamp key), submit button with loading state and success/error feedback (displays returned job ID).
  • Backend

    • API endpoint and persistence added so submitted compression jobs are accepted and stored; submissions return a job ID.
  • Configuration

    • New compression/archive sizing settings, CLP‑S storage‑engine option and a default dataset name.
  • UI

    • Ingest page layout updated to show compression UI above jobs list.

✏️ Tip: You can customize this high-level summary in your review settings.

@junhaoliao junhaoliao requested a review from a team as a code owner September 9, 2025 21:41
@junhaoliao junhaoliao requested a review from davemarco September 9, 2025 21:41
@coderabbitai

coderabbitai Bot commented Sep 9, 2025

Copy link
Copy Markdown
Contributor

Caution

Review failed

The pull request is closed.

Walkthrough

Adds a compression-job feature: new client UI and API client, POST /api/compress route, Fastify CompressionJobDbManager that MsgPack-encodes and Brotli-compresses job configs into MySQL, TypeBox schemas/types, server settings for compression, shared CLP config, and client import refactors.

Changes

Cohort / File(s) Change Summary
Server route & plugin
components/webui/server/src/routes/api/compress/index.ts, components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts, components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts, components/webui/server/src/routes/api/compress/typings.ts
New POST /api/compress route that builds a CompressionJobConfig from DEFAULT_COMPRESSION_JOB_CONFIG and request body (paths, dataset, timestampKey), maps container-prefixed paths, applies CLP_S-specific dataset/timestamp logic, and submits via new CompressionJobDbManager plugin which MsgPack-encodes, Brotli-compresses and inserts the config into the configured SQL table; returns insertId. Adds CONTAINER_INPUT_LOGS_ROOT_DIR and CompressionJobConfig typings.
Schemas & DB column typings
components/webui/common/src/schemas/compression.ts, components/webui/server/src/typings/compression.ts
Added TypeBox request/response schemas CompressionJobCreationSchema and CompressionJobSchema with TS types CompressionJobCreation and CompressionJob. Added enum COMPRESSION_JOBS_TABLE_COLUMN_NAMES (id, status, clp_config).
Client API & UI
components/webui/client/src/api/compress/index.ts, components/webui/client/src/pages/IngestPage/Compress/index.tsx, components/webui/client/src/pages/IngestPage/Compress/PathsInputFormItem.tsx, components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx, components/webui/client/src/pages/IngestPage/Compress/SubmitFormItem.tsx
New API client submitCompressionJob(payload) posting to /api/compress. New Compress form and presentational components: multiline paths input, CLP_S dataset & timestamp fields, submit button; form mutation calls API, invalidates jobs query, resets on success and displays jobId or error.
Ingest page integration & styles
components/webui/client/src/pages/IngestPage/index.tsx, components/webui/client/src/pages/IngestPage/index.module.css
Inserted Compress component as its own full-row above Jobs. Renamed CSS class .jobsGrid.fullRow (keeps same grid styling) and updated usage.
Server settings & defaults
components/webui/server/settings.json, components/webui/common/src/config.ts
Added SqlDbCompressionJobsTableName, ClpStorageEngine (value "clp-s"), and archive output parameters (ArchiveOutputCompressionLevel, ArchiveOutputTargetArchiveSize, ArchiveOutputTargetDictionariesSize, ArchiveOutputTargetEncodedFileSize, ArchiveOutputTargetSegmentSize). Added shared config: CLP_STORAGE_ENGINES enum and CLP_DEFAULT_DATASET_NAME, and exported them.
Client import refactor (many files)
components/webui/client/src/config/index.ts, and updated consumers like components/webui/client/src/pages/IngestPage/Details/index.tsx, .../SpaceSavings/index.tsx, .../SearchPage/** (SearchControls, NativeControls, SubmitButton, search-requests, SearchResults/*, etc.)
Moved CLP config symbols (CLP_QUERY_ENGINES / CLP_STORAGE_ENGINES) to @webui/common/config; removed them from local client config exports and updated many modules to import the centralized symbols while importing SETTINGS_* from local config separately.
Query job manager & minor docs
components/webui/server/src/plugins/app/QueryJobDbManager/index.ts, components/webui/server/src/typings/query.ts, components/clp-package-utils/clp_package_utils/controller.py
QueryJobDbManager.submitJob now uses instance tableName instead of a hard-coded setting. Minor doc comment update in query typings. Web UI env setup now includes LogsInputType in controller settings payload.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor User
    participant UI as Compress Form
    participant Client as API Client
    participant Server as /api/compress
    participant DB as CompressionJobDbManager
    participant MySQL as MySQL

    User->>UI: Enter paths, dataset?, timestampKey?
    User->>UI: Click Submit
    UI->>Client: submitCompressionJob(payload)
    Client->>Server: POST /api/compress

    Server->>Server: Clone DEFAULT_COMPRESSION_JOB_CONFIG
    Server->>Server: Map paths -> container path (/mnt/logs/...) and set path_prefix_to_remove
    alt CLP_S engine active
        Server->>Server: Validate/set dataset (default if missing)
        Server->>Server: Set timestamp_key if provided
    end

    Server->>DB: submitJob(jobConfig)
    DB->>DB: MsgPack encode jobConfig
    DB->>DB: Brotli compress encoded buffer
    DB->>MySQL: INSERT compressed config into configured table
    MySQL-->>DB: insertId
    DB-->>Server: jobId

    alt success
        Server-->>Client: HTTP 201 { jobId }
        Client-->>UI: Show success, reset form
    else failure
        Server-->>Client: HTTP 500 { message }
        Client-->>UI: Show error message
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas to focus on:
    • Compression route: job config construction, container path mapping, CLP_S dataset/timestamp handling, error responses.
    • CompressionJobDbManager: MsgPack encoding, Brotli compression, Buffer handling for SQL insert, and table/column mapping.
    • Client: mutation payload shape, API client typing, cache invalidation, UI success/error display.
    • Import refactor: ensure @webui/common/config exports and no circular imports or missing symbols.

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding compression job submission functionality for local files in the webui.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c181a4 and 6b915fb.

📒 Files selected for processing (1)
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 21

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/webui/server/src/plugins/app/QueryJobDbManager/index.ts (1)

28-31: Validate table name to prevent accidental SQL injection and typos

Even though the table name is from settings, add a defensive check to restrict it to safe identifiers.

-    private constructor (sqlPool: MySQLPromisePool, tableName: string) {
-        this.#sqlPool = sqlPool;
-        this.#tableName = tableName;
-    }
+    private constructor (sqlPool: MySQLPromisePool, tableName: string) {
+        this.#sqlPool = sqlPool;
+        // Allow only [A-Za-z0-9_]
+        if (false === /^[A-Za-z0-9_]+$/.test(tableName)) {
+            throw new Error(`Invalid table name: ${tableName}`);
+        }
+        this.#tableName = tableName;
+    }
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

877-891: Also set the compression jobs table in server settings for consistency

Server settings currently omit SqlDbCompressionJobsTableName. Set it alongside query jobs to avoid drift if the constant differs from the default.

     server_settings_json_updates = {
         "SqlDbHost": clp_config.database.host,
         "SqlDbPort": clp_config.database.port,
         "SqlDbName": clp_config.database.name,
         "SqlDbQueryJobsTableName": QUERY_JOBS_TABLE_NAME,
+        "SqlDbCompressionJobsTableName": COMPRESSION_JOBS_TABLE_NAME,
         "MongoDbHost": clp_config.results_cache.host,
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d491e59 and 9b87c4e.

📒 Files selected for processing (14)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2 hunks)
  • components/webui/client/src/api/compress/index.ts (1 hunks)
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx (1 hunks)
  • components/webui/client/src/pages/IngestPage/index.tsx (2 hunks)
  • components/webui/client/src/pages/SearchPage/SearchControls/index.tsx (1 hunks)
  • components/webui/server/eslint.config.mjs (1 hunks)
  • components/webui/server/settings.json (2 hunks)
  • components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1 hunks)
  • components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts (1 hunks)
  • components/webui/server/src/plugins/app/QueryJobDbManager/index.ts (1 hunks)
  • components/webui/server/src/routes/api/compress/index.ts (1 hunks)
  • components/webui/server/src/schemas/compression.ts (1 hunks)
  • components/webui/server/src/typings/compression.ts (1 hunks)
  • components/webui/server/src/typings/query.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/client/src/pages/SearchPage/SearchControls/index.tsx
  • components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts
  • components/webui/client/src/pages/IngestPage/index.tsx
  • components/webui/server/src/schemas/compression.ts
  • components/webui/server/src/typings/compression.ts
  • components/webui/server/src/plugins/app/QueryJobDbManager/index.ts
  • components/webui/client/src/api/compress/index.ts
  • components/webui/server/src/routes/api/compress/index.ts
  • components/webui/server/src/typings/query.ts
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx
  • components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts
🧠 Learnings (3)
📓 Common learnings
Learnt from: haiqi96
PR: y-scope/clp#651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/webui/server/settings.json
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
PR: y-scope/clp#596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.

Applied to files:

  • components/webui/server/src/typings/query.ts
🧬 Code graph analysis (4)
components/webui/server/src/schemas/compression.ts (1)
components/webui/client/src/api/compress/index.ts (2)
  • CompressionJobSchema (29-29)
  • CompressionJobCreationSchema (28-28)
components/webui/client/src/api/compress/index.ts (1)
components/webui/server/src/schemas/compression.ts (2)
  • CompressionJobSchema (22-22)
  • CompressionJobCreationSchema (21-21)
components/webui/server/src/routes/api/compress/index.ts (1)
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)
  • CompressionJobConfig (13-27)
components/webui/client/src/pages/IngestPage/Compress/index.tsx (4)
components/webui/client/src/api/compress/index.ts (2)
  • CompressionJobSchema (29-29)
  • submitCompressionJob (31-31)
components/webui/server/src/schemas/compression.ts (1)
  • CompressionJobSchema (22-22)
components/webui/client/src/settings.ts (1)
  • settings (36-36)
components/webui/client/src/components/DashboardCard/index.tsx (1)
  • DashboardCard (57-57)
🪛 GitHub Actions: clp-lint
components/clp-package-utils/clp_package_utils/scripts/start_clp.py

[error] 938-948: lint:check-no-cpp failed: 1 file would be reformatted by clang-format: /home/runner/work/clp/clp/components/clp-package-utils/clp_package_utils/scripts/start_clp.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: package-image
🔇 Additional comments (14)
components/webui/server/src/plugins/app/QueryJobDbManager/index.ts (1)

54-54: Good: table name now instance-scoped

Using this.#tableName decouples from the global settings constant and matches the instance’s config.

components/webui/server/settings.json (2)

6-6: LGTM: adds compression jobs table name

The default "compression_jobs" aligns with the new manager/route.


23-23: LGTM: exposes storage engine to server

Matches client/server payload wiring in start_clp.py.

components/clp-package-utils/clp_package_utils/scripts/start_clp.py (4)

861-870: Good: client settings include storage engine and compression table

This unblocks UI gating and API payloads.


1-1: Fix formatting to unblock CI

GA “clp-lint” reports clang-format changes pending for this file. Please run the project’s formatter task (e.g., clang-format -i components/clp-package-utils/clp_package_utils/scripts/start_clp.py) or the repo’s lint-fix script.


944-946: Restrict input_logs_dir mount to FS type and enforce read-only
In start_clp.py (lines 944–946), only append mounts.input_logs_dir when clp_config.logs_input.type == StorageType.FS, and configure that mount as read-only in CLPDockerMounts. Confirm whether the webui server actually requires direct filesystem access to logs; if not, remove this mount entirely to reduce attack surface.


940-961: No action needed; input_logs_dir is already read-only
The call on general.py (around line 248) passes True as the fourth argument to DockerMount, which enables read-only mode for mounts.input_logs_dir.

components/webui/server/src/typings/query.ts (2)

5-6: LGTM: doc reference fixed

Comments now point to the correct job_orchestration.scheduler.constants.


22-24: LGTM: doc reference fixed

Consistent with the above.

components/webui/server/src/routes/api/compress/index.ts (2)

75-84: Handle empty timestampKey consistently.

Avoid storing an empty string; only set when non-empty string provided.

-            if ("clp-s" === settings.ClpStorageEngine) {
+            if ("clp-s" === settings.ClpStorageEngine) {
                 if ("string" !== typeof dataset || 0 === dataset.length) {
                     jobConfig.input.dataset = CLP_DEFAULT_DATASET_NAME;
                 } else {
                     jobConfig.input.dataset = dataset;
                 }
-                if ("undefined" !== typeof timestampKey) {
-                    jobConfig.input.timestamp_key = timestampKey;
-                }
+                if ("string" === typeof timestampKey && 0 < timestampKey.length) {
+                    jobConfig.input.timestamp_key = timestampKey;
+                }
             }
⛔ Skipped due to learnings
Learnt from: quinntaylormitchell
PR: y-scope/clp#1283
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:206-210
Timestamp: 2025-09-04T13:51:02.311Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/compress.py, an empty string ("") is a valid value for the --timestamp-key parameter, so validation should not treat empty strings as missing values.

86-96: Remove ErrorSchema validation suggestion
No ErrorSchema export exists in this codebase; that check is not applicable.

Likely an incorrect or invalid review comment.

components/webui/server/src/typings/compression.ts (2)

9-15: No action needed: COMPRESSION_JOB_STATUS values (PENDING = 0, RUNNING = 1, SUCCEEDED = 2, FAILED = 3, KILLED = 4) match CompressionJobStatus in scheduler.constants.


30-34: Verify clp_config column exists in the compression_jobs table
The repo contains no migration or schema files for compression_jobs; please manually confirm your DB DDL defines a clp_config column to avoid runtime failures.

components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)

48-50: LGTM: factory + DI via Fastify instance is clean.

Comment on lines +4 to +8
type CompressionJobSchema = {
paths: string[];
dataset?: string;
timestampKey?: string;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Type nit: use readonly for input paths

Signals immutability to callers and prevents accidental mutation.

-type CompressionJobSchema = {
-    paths: string[];
+type CompressionJobSchema = {
+    paths: readonly string[];
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
type CompressionJobSchema = {
paths: string[];
dataset?: string;
timestampKey?: string;
};
type CompressionJobSchema = {
paths: readonly string[];
dataset?: string;
timestampKey?: string;
};
🤖 Prompt for AI Agents
In components/webui/client/src/api/compress/index.ts around lines 4 to 8, the
CompressionJobSchema type declares paths: string[] which allows mutation; change
it to an immutable type (either readonly string[] or ReadonlyArray<string>) to
signal immutability to callers and prevent accidental mutation, and update any
places that push/pop or assign to paths to instead create new arrays (e.g.,
using spread) if mutation-like behavior is needed.

Comment on lines +20 to +25
const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
console.log("Submitting compression job:", JSON.stringify(payload));

const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
return response.data.jobId;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid logging sensitive payloads; add basic error handling

console.log can leak local paths. Wrap the call with try/catch and remove the log.

-const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
-    console.log("Submitting compression job:", JSON.stringify(payload));
-
-    const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
-    return response.data.jobId;
-};
+const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
+    try {
+        const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
+        return response.data.jobId;
+    } catch (e) {
+        // Re-throw a concise error; upstream UI can surface a friendly message.
+        throw new Error("Failed to submit compression job.");
+    }
+};
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
console.log("Submitting compression job:", JSON.stringify(payload));
const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
return response.data.jobId;
};
const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
try {
const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
return response.data.jobId;
} catch (e) {
// Re-throw a concise error; upstream UI can surface a friendly message.
throw new Error("Failed to submit compression job.");
}
};
🤖 Prompt for AI Agents
In components/webui/client/src/api/compress/index.ts around lines 20 to 25,
remove the console.log that prints the entire payload (it may leak sensitive
local paths) and wrap the axios.post call in a try/catch; on success return
response.data.jobId as before, and on error log a minimal, non-sensitive message
(or nothing) and rethrow the error (or return a rejected promise) so callers can
handle it—do not print the payload in the catch.

Comment on lines +41 to +46
// Convert multiline input to array of paths
const paths = values.paths
.split("\n")
.map((path) => path.trim())
.filter((path) => 0 < path.length);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Validate that at least one non-empty path is provided (after trimming).

Whitespace-only input passes the required rule and yields an empty paths array. Add a custom validator that enforces at least one trimmed, non-empty line before submit.

-                <Form.Item
+                <Form.Item
                     label={"Paths"}
                     name={"paths"}
-                    rules={[{required: true, message: "Please enter at least one path"}]}
+                    rules={[
+                        {required: true, message: "Please enter at least one path"},
+                        {
+                            validator: async (_, value?: string) => {
+                                const count = (value ?? "")
+                                    .split("\n")
+                                    .map((p) => p.trim())
+                                    .filter((p) => 0 < p.length).length;
+                                if (0 < count) { return Promise.resolve(); }
+                                return Promise.reject(new Error("Please enter at least one non-empty path"));
+                            },
+                        },
+                    ]}
                 >
                     <Input.TextArea
                         autoSize={{minRows: 4, maxRows: 10}}
-                        placeholder={"Enter paths to compress, one per line"}/>
+                        placeholder={"Enter container-relative paths (e.g., /logs/app.log), one per line"}/>
                 </Form.Item>

Also applies to: 94-102

🤖 Prompt for AI Agents
In components/webui/client/src/pages/IngestPage/Compress/index.tsx around lines
41-46 (and similarly at 94-102), the current splitting logic allows
whitespace-only input to pass validation because trimming happens after required
check; update the form validators to include a custom synchronous validator that
trims values.paths, splits on newlines, filters out empty strings and rejects
the field (returning a validation error) if the resulting paths array has length
0; ensure the same validator logic is applied to both places so submission is
blocked until at least one non-empty trimmed path is provided.

Comment on lines +52 to +58
if ("undefined" !== typeof values.dataset) {
payload.dataset = values.dataset;
}

if ("undefined" !== typeof values.timestampKey) {
payload.timestampKey = values.timestampKey;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Send optional fields only when non-empty and trim them.

Avoid sending "" for dataset/timestampKey; let the server default logic apply. Also trim whitespace.

-            if ("undefined" !== typeof values.dataset) {
-                payload.dataset = values.dataset;
-            }
+            if ("string" === typeof values.dataset && 0 < values.dataset.trim().length) {
+                payload.dataset = values.dataset.trim();
+            }
 
-            if ("undefined" !== typeof values.timestampKey) {
-                payload.timestampKey = values.timestampKey;
-            }
+            if ("string" === typeof values.timestampKey && 0 < values.timestampKey.trim().length) {
+                payload.timestampKey = values.timestampKey.trim();
+            }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if ("undefined" !== typeof values.dataset) {
payload.dataset = values.dataset;
}
if ("undefined" !== typeof values.timestampKey) {
payload.timestampKey = values.timestampKey;
}
if ("string" === typeof values.dataset && 0 < values.dataset.trim().length) {
payload.dataset = values.dataset.trim();
}
if ("string" === typeof values.timestampKey && 0 < values.timestampKey.trim().length) {
payload.timestampKey = values.timestampKey.trim();
}
🤖 Prompt for AI Agents
In components/webui/client/src/pages/IngestPage/Compress/index.tsx around lines
52 to 58, the payload currently sends dataset and timestampKey when they are
defined even if they are empty or whitespace; update the checks to trim the
values and only assign payload.dataset/payload.timestampKey when typeof value
!== "undefined" and the trimmed string length > 0, assigning the trimmed string
so the server sees no empty strings and can apply its defaults.

Comment thread components/webui/client/src/pages/IngestPage/Compress/index.tsx Outdated
Comment on lines +71 to +73
jobConfig.input.paths_to_compress = paths.map(
(path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix path concatenation: use POSIX join and normalise leading slashes.

CONTAINER_INPUT_LOGS_ROOT_DIR + path breaks when path lacks a leading slash (e.g., "logs/app.log" → "/mnt/logslogs/app.log"). Use path.posix.join after removing leading slashes to be robust across inputs.

+import path from "node:path";
@@
-            jobConfig.input.paths_to_compress = paths.map(
-                (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
-            );
+            jobConfig.input.paths_to_compress = paths.map((p) =>
+                path.posix.join(
+                    CONTAINER_INPUT_LOGS_ROOT_DIR,
+                    // ensure we don't drop the root when `p` starts with "/"
+                    (p ?? "").trim().replace(/^\/+/, "")
+                )
+            );

Also applies to: 4-4

🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress/index.ts around lines 71-73
(and also at lines 4-4), the code concatenates paths with
CONTAINER_INPUT_LOGS_ROOT_DIR using string + which breaks when the incoming path
lacks a leading slash; change it to use
path.posix.join(CONTAINER_INPUT_LOGS_ROOT_DIR, normalizedPath) where
normalizedPath strips any leading slashes (e.g., path.replace(/^\/+/, '') or use
path.posix.normalize on the relative path) so joins are POSIX-safe and produce
correct results regardless of leading slashes.

Comment on lines +7 to +11
const CompressionJobSchema = Type.Object({
paths: Type.Array(Type.String()),
dataset: Type.Optional(Type.String()),
timestampKey: Type.Optional(Type.String()),
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Tighten request schema: require ≥1 non-whitespace path and forbid extras.

Prevents empty submissions and stray fields. Keep empty dataset permitted if desired at the route, but ensure each path has at least one non-whitespace character.

-const CompressionJobSchema = Type.Object({
-    paths: Type.Array(Type.String()),
-    dataset: Type.Optional(Type.String()),
-    timestampKey: Type.Optional(Type.String()),
-});
+const CompressionJobSchema = Type.Object(
+    {
+        paths: Type.Array(
+            // Require at least one non-whitespace character
+            Type.String({minLength: 1, pattern: "\\S"}),
+            {minItems: 1}
+        ),
+        dataset: Type.Optional(Type.String()),
+        timestampKey: Type.Optional(Type.String()),
+    },
+    {additionalProperties: false}
+);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const CompressionJobSchema = Type.Object({
paths: Type.Array(Type.String()),
dataset: Type.Optional(Type.String()),
timestampKey: Type.Optional(Type.String()),
});
const CompressionJobSchema = Type.Object(
{
paths: Type.Array(
// Require at least one non-whitespace character
Type.String({ minLength: 1, pattern: "\\S" }),
{ minItems: 1 }
),
dataset: Type.Optional(Type.String()),
timestampKey: Type.Optional(Type.String()),
},
{ additionalProperties: false }
);
🤖 Prompt for AI Agents
In components/webui/server/src/schemas/compression.ts around lines 7–11, tighten
the schema so that "paths" is a non-empty array (minItems:1) of strings that
contain at least one non-whitespace character (use a pattern like \S and/or
minLength:1), keep "dataset" and "timestampKey" as optional string fields, and
forbid extra properties on the object (additionalProperties: false). Update the
Type.Object definition accordingly so requests with an empty paths array,
whitespace-only paths, or unexpected fields are rejected.

Comment on lines +16 to +18
const CompressionJobCreationSchema = Type.Object({
jobId: Type.Number(),
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Harden response schema.

Forbid unexpected properties in the response.

-const CompressionJobCreationSchema = Type.Object({
-    jobId: Type.Number(),
-});
+const CompressionJobCreationSchema = Type.Object(
+    {jobId: Type.Number()},
+    {additionalProperties: false}
+);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const CompressionJobCreationSchema = Type.Object({
jobId: Type.Number(),
});
const CompressionJobCreationSchema = Type.Object(
{ jobId: Type.Number() },
{ additionalProperties: false }
);
🤖 Prompt for AI Agents
In components/webui/server/src/schemas/compression.ts around lines 16 to 18, the
CompressionJobCreationSchema currently allows unexpected properties; update the
schema to forbid additional properties by setting the object options to disable
additionalProperties (or use Type.Strict) so responses with extra fields are
rejected by the validator.

Comment on lines +20 to +23
const COMPRESSION_JOB_STATUS_WAITING_STATES = new Set([
COMPRESSION_JOB_STATUS.PENDING,
COMPRESSION_JOB_STATUS.RUNNING,
]);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Export an immutable view (or a predicate) to avoid accidental mutation.

A Set remains mutable even when the binding is const.

Apply either:

-const COMPRESSION_JOB_STATUS_WAITING_STATES = new Set([
+const COMPRESSION_JOB_STATUS_WAITING_STATES: ReadonlySet<COMPRESSION_JOB_STATUS> = new Set([
     COMPRESSION_JOB_STATUS.PENDING,
     COMPRESSION_JOB_STATUS.RUNNING,
 ]);

Or replace with a predicate:

-const COMPRESSION_JOB_STATUS_WAITING_STATES = new Set([...]);
+const isWaitingStatus = (s: COMPRESSION_JOB_STATUS): boolean =>
+  (COMPRESSION_JOB_STATUS.PENDING === s) || (COMPRESSION_JOB_STATUS.RUNNING === s);
+
 export {
-  COMPRESSION_JOB_STATUS_WAITING_STATES,
+  isWaitingStatus,
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const COMPRESSION_JOB_STATUS_WAITING_STATES = new Set([
COMPRESSION_JOB_STATUS.PENDING,
COMPRESSION_JOB_STATUS.RUNNING,
]);
const COMPRESSION_JOB_STATUS_WAITING_STATES: ReadonlySet<COMPRESSION_JOB_STATUS> = new Set([
COMPRESSION_JOB_STATUS.PENDING,
COMPRESSION_JOB_STATUS.RUNNING,
]);
🤖 Prompt for AI Agents
In components/webui/server/src/typings/compression.ts around lines 20–23, the
current const binding creates a mutable Set which can be accidentally mutated;
replace it with an immutable exported predicate function (e.g. export function
isCompressionJobWaitingStatus(status: COMPRESSION_JOB_STATUS): boolean { return
status === COMPRESSION_JOB_STATUS.PENDING || status ===
COMPRESSION_JOB_STATUS.RUNNING; }) so callers can test membership without
exposing mutable state — alternatively, if you must export a collection, export
a frozen array (Object.freeze([...])) or a readonly tuple to provide an
immutable view.

Comment on lines +36 to +40
interface CompressionJob extends RowDataPacket {
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID]: number;
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS]: COMPRESSION_JOB_STATUS;
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG]: string;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix type: JOB_CONFIG should be binary (Buffer), not string.

You insert a Buffer in the DB; mysql2 returns BLOBs as Buffer by default. Typing this field as string risks subtle bugs and extra copies.

Apply:

 interface CompressionJob extends RowDataPacket {
     [COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID]: number;
     [COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS]: COMPRESSION_JOB_STATUS;
-    [COMPRESSION_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG]: string;
+    [COMPRESSION_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG]: Buffer;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
interface CompressionJob extends RowDataPacket {
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID]: number;
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS]: COMPRESSION_JOB_STATUS;
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG]: string;
}
interface CompressionJob extends RowDataPacket {
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID]: number;
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS]: COMPRESSION_JOB_STATUS;
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG]: Buffer;
}
🤖 Prompt for AI Agents
In components/webui/server/src/typings/compression.ts around lines 36–40, the
CompressionJob interface types JOB_CONFIG as string but the DB stores/returns a
binary BLOB (Buffer); change the type of
[COMPRESSION_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG] to Buffer (Node.js Buffer) and
update any code that assumed a string to explicitly convert the Buffer to string
when needed (e.g., buffer.toString()) or parse it directly as binary, so
consumers handle binary data correctly.

import axios from "axios";


type CompressionJobSchema = {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be in common

console.log("Submitting compression job:", JSON.stringify(payload));

const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
return response.data.jobId;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the way we are doing in other functions is they return the response directly

payload.timestampKey = values.timestampKey;
}

const jobId = await submitCompressionJob(payload);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u should probably use TanStack for this. and the isSubmitting states,etc. It will help here. You can look at dataset code for examples

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed. i used it to easily solve the issue of the job table not getting updated after a job submission

}
};

const isStorageEngineClpS = "clp-s" === settings.ClpStorageEngine;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have enums for this. see webui/client/src/pages/SearchPage/SearchControls/SearchButton/SubmitButton/index.tsx as an example

path_prefix_to_remove: CONTAINER_INPUT_LOGS_ROOT_DIR,
},
output: {
compression_level: 3,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a path in the package where this is from

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, it should come from the settings

/**
* Default compression job configuration.
*/
const DEFAULT_COMPRESSION_JOB_CONFIG: CompressionJobConfig = Object.freeze({

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we dont need object.freeze here, and then we dont need structured clone

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structured clone was intended to avoid shallow copy. Otherwise, job configs might be mixed across jobs

/**
* Interval (in milliseconds) for polling a job's completion status.
*/
const JOB_COMPLETION_STATUS_POLL_INTERVAL_MILLIS = 1000;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be removed. sorry for the confusion

# Conflicts:
#	components/clp-package-utils/clp_package_utils/scripts/start_clp.py
#	components/webui/client/src/pages/SearchPage/SearchControls/index.tsx
#	components/webui/common/src/schemas/compression.ts
#	components/webui/server/eslint.config.mjs
#	components/webui/server/src/typings/query.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
components/webui/server/src/plugins/app/QueryJobDbManager/index.ts (3)

52-60: Escape identifiers; avoid string‑interpolated table/column names.

Using template literals for identifiers risks SQL injection via config and breaks on reserved words. Use identifier placeholders.

Apply (and mirror to other queries below):

-        const [result] = await this.#sqlPool.query<ResultSetHeader>(
-            `
-            INSERT INTO ${this.#tableName} (
-               ${QUERY_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG},
-               ${QUERY_JOBS_TABLE_COLUMN_NAMES.TYPE}
-            )
-            VALUES (?, ?)
-            `,
-            [
-                Buffer.from(encode(jobConfig)),
-                jobType,
-            ]
-        );
+        const [result] = await this.#sqlPool.query<ResultSetHeader>(
+            "INSERT INTO ?? (??, ??) VALUES (?, ?)",
+            [
+                this.#tableName,
+                QUERY_JOBS_TABLE_COLUMN_NAMES.JOB_CONFIG,
+                QUERY_JOBS_TABLE_COLUMN_NAMES.TYPE,
+                Buffer.from(encode(jobConfig)),
+                jobType,
+            ]
+        );

76-83: Same escaping issue in UPDATE.

-        await this.#sqlPool.query(
-            `UPDATE ${this.#tableName}
-             SET ${QUERY_JOBS_TABLE_COLUMN_NAMES.STATUS} = ${QUERY_JOB_STATUS.CANCELLING}
-             WHERE ${QUERY_JOBS_TABLE_COLUMN_NAMES.ID} = ?
-             AND ${QUERY_JOBS_TABLE_COLUMN_NAMES.STATUS}
-             IN (${QUERY_JOB_STATUS.PENDING}, ${QUERY_JOB_STATUS.RUNNING})`,
-            jobId
-        );
+        await this.#sqlPool.query(
+            "UPDATE ?? SET ?? = ? WHERE ?? = ? AND ?? IN (?, ?)",
+            [
+                this.#tableName,
+                QUERY_JOBS_TABLE_COLUMN_NAMES.STATUS, QUERY_JOB_STATUS.CANCELLING,
+                QUERY_JOBS_TABLE_COLUMN_NAMES.ID, jobId,
+                QUERY_JOBS_TABLE_COLUMN_NAMES.STATUS, QUERY_JOB_STATUS.PENDING, QUERY_JOB_STATUS.RUNNING,
+            ]
+        );

98-103: Same for SELECT in polling loop.

-                const [queryRows] = await this.#sqlPool.query<QueryJob[]>(
-                    `SELECT ${QUERY_JOBS_TABLE_COLUMN_NAMES.STATUS}
-                     FROM ${this.#tableName}
-                     WHERE ${QUERY_JOBS_TABLE_COLUMN_NAMES.ID} = ?`,
-                    jobId
-                );
+                const [queryRows] = await this.#sqlPool.query<QueryJob[]>(
+                    "SELECT ?? FROM ?? WHERE ?? = ?",
+                    [
+                        QUERY_JOBS_TABLE_COLUMN_NAMES.STATUS,
+                        this.#tableName,
+                        QUERY_JOBS_TABLE_COLUMN_NAMES.ID,
+                        jobId,
+                    ]
+                );
♻️ Duplicate comments (8)
components/webui/server/src/routes/api/compress/index.ts (3)

26-38: Move hard‑coded compression defaults to settings.json (with sane fallbacks).

Consider keys for level/target sizes so operators can tune without code changes.


55-61: Declare 400 for bad requests (validation/runtime checks).

                 response: {
                     [StatusCodes.CREATED]: CompressionJobCreationSchema,
+                    [StatusCodes.BAD_REQUEST]: ErrorSchema,
                     [StatusCodes.INTERNAL_SERVER_ERROR]: ErrorSchema,
                 },

70-74: Robust, POSIX‑safe path handling and jail to /mnt/logs.

String concat breaks on missing leading “/” and allows “..” to escape root. Use node:path and validate stays under root.

+import path from "node:path";
@@
-            jobConfig.input.paths_to_compress = paths.map(
-                (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
-            );
+            if (0 === paths.length) {
+                return reply.badRequest("At least one path is required.");
+            }
+            try {
+                jobConfig.input.paths_to_compress = paths.map((p) => {
+                    const rel = (p ?? "").trim().replace(/^\/+/, "");
+                    const abs = path.posix.resolve(CONTAINER_INPUT_LOGS_ROOT_DIR, rel);
+                    if (false === abs.startsWith(CONTAINER_INPUT_LOGS_ROOT_DIR + "/")) {
+                        throw new Error(`Path escapes logs root: ${p}`);
+                    }
+                    return abs;
+                });
+            } catch (e) {
+                return reply.badRequest("Invalid path provided.");
+            }
components/webui/client/src/api/compress/index.ts (1)

14-19: Remove payload logging and fix generics; optionally bubble errors.

-const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
-    console.log("Submitting compression job:", JSON.stringify(payload));
-
-    const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
-    return response.data.jobId;
-};
+const submitCompressionJob = async (payload: CompressionJobSchemaStatic): Promise<number> => {
+    const response = await axios.post<CompressionJobCreationSchemaStatic>("/api/compress", payload);
+    return response.data.jobId;
+};
components/webui/client/src/pages/IngestPage/Compress/index.tsx (4)

84-85: Rename card title to match the feature (“Submit Compression Job”).

-        <DashboardCard title={"Start Ingestion"}>
+        <DashboardCard title={"Submit Compression Job"}>

88-93: Inline the submit handler; extra .catch is redundant.

-                onFinish={(values) => {
-                    handleSubmit(values).catch((error: unknown) => {
-                        console.error("Error in handleSubmit:", error);
-                    });
-                }}
+                onFinish={handleSubmit}

94-102: Validate at least one non‑empty path and improve placeholder.

                 <Form.Item
                     label={"Paths"}
                     name={"paths"}
-                    rules={[{required: true, message: "Please enter at least one path"}]}
+                    rules={[
+                        {required: true, message: "Please enter at least one path"},
+                        {
+                            validator: async (_, v?: string) => {
+                                const count = (v ?? "").split("\n").map((p) => p.trim()).filter((p) => 0 < p.length).length;
+                                return 0 < count ? Promise.resolve() : Promise.reject(new Error("Please enter at least one non-empty path"));
+                            },
+                        },
+                    ]}
                 >
                     <Input.TextArea
                         autoSize={{minRows: 4, maxRows: 10}}
-                        placeholder={"Enter paths to compress, one per line"}/>
+                        placeholder={"Enter container-relative paths (e.g., /logs/app.log), one per line"}/>
                 </Form.Item>

48-59: Send optional fields only when non‑empty; trim values.

-            if ("undefined" !== typeof values.dataset) {
-                payload.dataset = values.dataset;
-            }
+            if ("string" === typeof values.dataset && 0 < values.dataset.trim().length) {
+                payload.dataset = values.dataset.trim();
+            }
 
-            if ("undefined" !== typeof values.timestampKey) {
-                payload.timestampKey = values.timestampKey;
-            }
+            if ("string" === typeof values.timestampKey && 0 < values.timestampKey.trim().length) {
+                payload.timestampKey = values.timestampKey.trim();
+            }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b87c4e and d22eff7.

📒 Files selected for processing (8)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1 hunks)
  • components/webui/client/src/api/compress/index.ts (1 hunks)
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx (1 hunks)
  • components/webui/common/src/schemas/compression.ts (1 hunks)
  • components/webui/server/src/plugins/app/QueryJobDbManager/index.ts (1 hunks)
  • components/webui/server/src/routes/api/compress/index.ts (1 hunks)
  • components/webui/server/src/typings/compression.ts (1 hunks)
  • components/webui/server/src/typings/query.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/typings/query.ts
  • components/webui/server/src/routes/api/compress/index.ts
  • components/webui/server/src/typings/compression.ts
  • components/webui/common/src/schemas/compression.ts
  • components/webui/server/src/plugins/app/QueryJobDbManager/index.ts
  • components/webui/client/src/api/compress/index.ts
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx
🧠 Learnings (1)
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
PR: y-scope/clp#596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.

Applied to files:

  • components/webui/server/src/typings/query.ts
🧬 Code graph analysis (3)
components/webui/server/src/routes/api/compress/index.ts (1)
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)
  • CompressionJobConfig (13-27)
components/webui/common/src/schemas/compression.ts (1)
components/webui/client/src/api/compress/index.ts (2)
  • CompressionJobSchema (23-23)
  • CompressionJobCreationSchema (22-22)
components/webui/client/src/api/compress/index.ts (1)
components/webui/common/src/schemas/compression.ts (2)
  • CompressionJobSchema (29-29)
  • CompressionJobCreationSchema (28-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: package-image
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (4)
components/webui/server/src/typings/query.ts (1)

5-8: Doc reference fix to constants package looks good.

components/webui/server/src/typings/compression.ts (2)

19-23: Column name "clp_config": verify DB schema alignment.

Search found only the enum (components/webui/server/src/typings/compression.ts) and a doc comment (components/webui/server/src/routes/api/compress/index.ts:19); no migrations or ORM model references were found — verify DB migrations/schema and any managers (including clp_py_utils) use the exact column name "clp_config" to avoid runtime errors.


6-12: No changes required — enums are in parity

CompressionJobStatus in components/job-orchestration/job_orchestration/scheduler/constants.py (PENDING=0, RUNNING=1, SUCCEEDED=2, FAILED=3, KILLED=4) exactly matches COMPRESSION_JOB_STATUS (components/webui/server/src/typings/compression.ts) and CompressionJobStatus (components/webui/client/src/pages/IngestPage/Jobs/typings.tsx).

components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

878-894: Propagating ClpStorageEngine to server settings — confirmed; approve changes.
Search under components/webui/server shows only components/webui/server/settings.json:23 containing "ClpStorageEngine"; no other matches detected.

Comment on lines +1 to +4
import type {
CompressionJobCreationSchema,
CompressionJobSchema,
} from "@webui/common/compression";

@coderabbitai coderabbitai Bot Sep 17, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Type imports are wrong: import the Static types, not the schemas.

The module exports schema values and separate ...Static types. Current import type {CompressionJobSchema} won’t compile.

-import type {
-    CompressionJobCreationSchema,
-    CompressionJobSchema,
-} from "@webui/common/compression";
+import type {
+    CompressionJobCreationSchemaStatic,
+    CompressionJobSchemaStatic,
+} from "@webui/common/schemas/compression";
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import type {
CompressionJobCreationSchema,
CompressionJobSchema,
} from "@webui/common/compression";
import type {
CompressionJobCreationSchemaStatic,
CompressionJobSchemaStatic,
} from "@webui/common/schemas/compression";
🤖 Prompt for AI Agents
In components/webui/client/src/api/compress/index.ts lines 1-4, the file
currently imports schema values instead of the exported Static Type aliases;
replace the imports of CompressionJobCreationSchema and CompressionJobSchema
with the corresponding Static types (e.g., CompressionJobCreationStatic and
CompressionJobStatic) from "@webui/common/compression", update the import
statement to import those Static types using import type, and adjust any local
type references to use the Static names so the module compiles.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree with rabbit

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@davemarco Thank you for confirming! The type import issue needs to be fixed as suggested. The current import attempts to use the schema definitions as types, but TypeScript requires the Static type aliases that are specifically exported for this purpose.

The fix is to import CompressionJobCreationSchemaStatic and CompressionJobSchemaStatic instead of the schema definitions themselves, and use the correct import path "@webui/common/schemas/compression".

  /|   /|  
 ( :v:  )
  |(_)|
  |   |
 /     \
/       \

🏁 Script executed:

#!/bin/bash
# Check the exports from the compression schemas module
cat "components/webui/common/src/schemas/compression.ts"

Length of output: 764


@davemarco Thank you for confirming! You're absolutely right. Looking at the actual exports from the compression schemas module, I can confirm the type import issue needs to be fixed exactly as I suggested.

The module exports:

  • The actual schemas: CompressionJobCreationSchema, CompressionJobSchema (runtime values)
  • The static types: CompressionJobCreationSchemaStatic, CompressionJobSchemaStatic (TypeScript types)

The current import is trying to use the runtime schema values as types, which won't work. The fix is to import the Static type aliases instead:

import type {
    CompressionJobCreationSchemaStatic,
    CompressionJobSchemaStatic,
} from "@webui/common/schemas/compression";
  /|   /|  
 ( :v:  )
  |(_)|
  |   |
 /     \
/       \

🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp#596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

Comment on lines +21 to +25
export type {
CompressionJobCreationSchema,
CompressionJobSchema,
};
export {submitCompressionJob};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Re‑export the correct types (Static).

Aligns the public surface with real types.

-export type {
-    CompressionJobCreationSchema,
-    CompressionJobSchema,
-};
+export type {
+    CompressionJobCreationSchemaStatic,
+    CompressionJobSchemaStatic,
+};

If other API modules return full Axios responses, confirm whether this one should mirror that pattern.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In components/webui/client/src/api/compress/index.ts around lines 21 to 25, the
module is re-exporting the runtime schema types (CompressionJobCreationSchema,
CompressionJobSchema) but the public API should expose the static TypeScript
types instead; change the exports to re-export the Static types (e.g., the
Static<typeof ...> aliases or pre-defined XxxStatic types) so consumers get the
correct compile-time shapes, and ensure you import/define those Static types in
this file before exporting; additionally, verify whether other API modules
return full Axios responses and, if so, make submitCompressionJob mirror that
pattern (return full AxiosResponse) or document why this one differs.

Comment on lines +10 to +14
const CompressionJobSchema = Type.Object({
paths: Type.Array(Type.String()),
dataset: Type.Optional(Type.String()),
timestampKey: Type.Optional(Type.String()),
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Enforce non‑empty paths and trim/ban empty optionals at schema level.

Prevents empty submissions and avoids sending “” for optional fields.

-const CompressionJobSchema = Type.Object({
-    paths: Type.Array(Type.String()),
-    dataset: Type.Optional(Type.String()),
-    timestampKey: Type.Optional(Type.String()),
-});
+const CompressionJobSchema = Type.Object(
+    {
+        paths: Type.Array(
+            Type.String({minLength: 1}),
+            {minItems: 1}
+        ),
+        dataset: Type.Optional(Type.String({minLength: 1})),
+        timestampKey: Type.Optional(Type.String({minLength: 1})),
+    },
+    {additionalProperties: false}
+);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const CompressionJobSchema = Type.Object({
paths: Type.Array(Type.String()),
dataset: Type.Optional(Type.String()),
timestampKey: Type.Optional(Type.String()),
});
const CompressionJobSchema = Type.Object(
{
paths: Type.Array(
Type.String({ minLength: 1 }),
{ minItems: 1 }
),
dataset: Type.Optional(Type.String({ minLength: 1 })),
timestampKey: Type.Optional(Type.String({ minLength: 1 })),
},
{ additionalProperties: false }
);
🤖 Prompt for AI Agents
In components/webui/common/src/schemas/compression.ts around lines 10–14, the
schema currently allows empty paths arrays and empty string optionals; update it
so paths is an array with minItems:1 and each item is a non-empty string (e.g.
Type.Array(Type.String({ minLength: 1, pattern: "\\S" }), { minItems: 1 })), and
change dataset and timestampKey to Type.Optional(Type.String({ minLength: 1,
pattern: "\\S" })) so blank or whitespace-only values are rejected at the schema
level rather than accepting "".

Comment on lines +75 to +84
if ("clp-s" === settings.ClpStorageEngine) {
if ("string" !== typeof dataset || 0 === dataset.length) {
jobConfig.input.dataset = CLP_DEFAULT_DATASET_NAME;
} else {
jobConfig.input.dataset = dataset;
}
if ("undefined" !== typeof timestampKey) {
jobConfig.input.timestamp_key = timestampKey;
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Trim and ignore empty optionals; set dataset default only when needed.

Avoid persisting empty strings.

-            if ("clp-s" === settings.ClpStorageEngine) {
-                if ("string" !== typeof dataset || 0 === dataset.length) {
-                    jobConfig.input.dataset = CLP_DEFAULT_DATASET_NAME;
-                } else {
-                    jobConfig.input.dataset = dataset;
-                }
-                if ("undefined" !== typeof timestampKey) {
-                    jobConfig.input.timestamp_key = timestampKey;
-                }
-            }
+            if ("clp-s" === settings.ClpStorageEngine) {
+                const ds = "string" === typeof dataset ? dataset.trim() : "";
+                jobConfig.input.dataset = 0 < ds.length ? ds : CLP_DEFAULT_DATASET_NAME;
+                if ("string" === typeof timestampKey && 0 < timestampKey.trim().length) {
+                    jobConfig.input.timestamp_key = timestampKey.trim();
+                }
+            }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if ("clp-s" === settings.ClpStorageEngine) {
if ("string" !== typeof dataset || 0 === dataset.length) {
jobConfig.input.dataset = CLP_DEFAULT_DATASET_NAME;
} else {
jobConfig.input.dataset = dataset;
}
if ("undefined" !== typeof timestampKey) {
jobConfig.input.timestamp_key = timestampKey;
}
}
if ("clp-s" === settings.ClpStorageEngine) {
const ds = "string" === typeof dataset ? dataset.trim() : "";
jobConfig.input.dataset = 0 < ds.length ? ds : CLP_DEFAULT_DATASET_NAME;
if ("string" === typeof timestampKey && 0 < timestampKey.trim().length) {
jobConfig.input.timestamp_key = timestampKey.trim();
}
}
🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress/index.ts around lines 75 to
84, the code currently persists empty strings for optional fields; trim inputs
and only set jobConfig.input.dataset to CLP_DEFAULT_DATASET_NAME when dataset is
not a non-empty trimmed string, otherwise set dataset to the trimmed value;
likewise only assign jobConfig.input.timestamp_key when timestampKey is defined
and its trimmed length > 0. Ensure you call .trim() on dataset and timestampKey
(after checking for undefined) and avoid assigning empty strings so empty
optionals are ignored.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

878-898: Also propagate the compression-jobs table name to server settings (avoid drift).

Server settings currently rely on the static value in settings.json. Keep them in sync at runtime like other DB keys.

     server_settings_json_updates = {
+        "SqlDbCompressionJobsTableName": COMPRESSION_JOBS_TABLE_NAME,
         "ArchiveOutputCompressionLevel": clp_config.archive_output.compression_level,
         "ArchiveOutputTargetArchiveSize": clp_config.archive_output.target_archive_size,
         "ArchiveOutputTargetDictionariesSize": clp_config.archive_output.target_dictionaries_size,
         "ArchiveOutputTargetEncodedFileSize":  clp_config.archive_output.target_encoded_file_size,
         "ArchiveOutputTargetSegmentSize": clp_config.archive_output.target_segment_size,
         "SqlDbHost": clp_config.database.host,
♻️ Duplicate comments (7)
components/webui/server/src/routes/api/compress/index.ts (3)

26-38: Resolved: defaults now sourced from settings.json.

Good move away from hard‑coded literals; keeps operations tunable without code changes.


55-61: Declare 400 responses in the route schema.

Aligns OpenAPI with runtime validation and new bad-request cases.

             schema: {
                 body: CompressionJobSchema,
                 response: {
                     [StatusCodes.CREATED]: CompressionJobCreationSchema,
+                    [StatusCodes.BAD_REQUEST]: ErrorSchema,
                     [StatusCodes.INTERNAL_SERVER_ERROR]: ErrorSchema,
                 },
                 tags: ["Compression"],
             },

75-84: Trim and ignore empty optionals; set dataset default only when needed.

Prevents storing empty strings; applies CLP default cleanly.

-            if ("clp-s" === settings.ClpStorageEngine) {
-                if ("string" !== typeof dataset || 0 === dataset.length) {
-                    jobConfig.input.dataset = CLP_DEFAULT_DATASET_NAME;
-                } else {
-                    jobConfig.input.dataset = dataset;
-                }
-                if ("undefined" !== typeof timestampKey) {
-                    jobConfig.input.timestamp_key = timestampKey;
-                }
-            }
+            if ("clp-s" === settings.ClpStorageEngine) {
+                const ds = "string" === typeof dataset ? dataset.trim() : "";
+                jobConfig.input.dataset = 0 < ds.length ? ds : CLP_DEFAULT_DATASET_NAME;
+                if ("string" === typeof timestampKey && 0 < timestampKey.trim().length) {
+                    jobConfig.input.timestamp_key = timestampKey.trim();
+                }
+            }
components/webui/client/src/pages/IngestPage/Compress/index.tsx (4)

52-58: Trim and omit empty optionals before sending.

Avoid persisting empty strings; send only meaningful values.

-            if ("undefined" !== typeof values.dataset) {
-                payload.dataset = values.dataset;
-            }
-
-            if ("undefined" !== typeof values.timestampKey) {
-                payload.timestampKey = values.timestampKey;
-            }
+            if ("string" === typeof values.dataset && 0 < values.dataset.trim().length) {
+                payload.dataset = values.dataset.trim();
+            }
+            if ("string" === typeof values.timestampKey && 0 < values.timestampKey.trim().length) {
+                payload.timestampKey = values.timestampKey.trim();
+            }

82-82: Rename card title to reflect the feature.

“Start Ingestion” is misleading here.

-        <DashboardCard title={"Start Ingestion"}>
+        <DashboardCard title={"Submit Compression Job"}>

86-90: Inline the submit handler; the extra .catch is redundant.

handleSubmit already handles errors.

-                onFinish={(values) => {
-                    handleSubmit(values).catch((error: unknown) => {
-                        console.error("Error in handleSubmit:", error);
-                    });
-                }}
+                onFinish={handleSubmit}

93-100: Validate at least one trimmed, non-empty path; improve placeholder.

Prevents whitespace-only submits and clarifies expected input.

                 <Form.Item
                     label={"Paths"}
                     name={"paths"}
-                    rules={[{required: true, message: "Please enter at least one path"}]}
+                    rules={[
+                        {required: true, message: "Please enter at least one path"},
+                        {
+                            validator: async (_, value?: string) => {
+                                const count = (value ?? "")
+                                    .split("\n")
+                                    .map((p) => p.trim())
+                                    .filter((p) => 0 < p.length).length;
+                                if (0 < count) { return Promise.resolve(); }
+                                return Promise.reject(new Error("Please enter at least one non-empty path"));
+                            },
+                        },
+                    ]}
                 >
                     <Input.TextArea
                         autoSize={{minRows: 4, maxRows: 10}}
-                        placeholder={"Enter paths to compress, one per line"}/>
+                        placeholder={"Enter container-relative paths (e.g., /logs/app.log), one per line"}/>
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d22eff7 and 8049e8d.

📒 Files selected for processing (4)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2 hunks)
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx (1 hunks)
  • components/webui/server/settings.json (2 hunks)
  • components/webui/server/src/routes/api/compress/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/client/src/pages/IngestPage/Compress/index.tsx
  • components/webui/server/src/routes/api/compress/index.ts
🧬 Code graph analysis (2)
components/webui/client/src/pages/IngestPage/Compress/index.tsx (3)
components/webui/client/src/api/compress/index.ts (2)
  • CompressionJobSchema (23-23)
  • submitCompressionJob (25-25)
components/webui/client/src/components/DashboardCard/index.tsx (1)
  • DashboardCard (57-57)
components/webui/client/src/config/index.ts (2)
  • CLP_STORAGE_ENGINES (26-26)
  • SETTINGS_STORAGE_ENGINE (28-28)
components/webui/server/src/routes/api/compress/index.ts (1)
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)
  • CompressionJobConfig (13-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: package-image
🔇 Additional comments (3)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

878-884: LGTM: wiring archive-output knobs and storage engine into server settings.

Values align with settings.json keys consumed by the server route; update_settings_object will enforce key correctness.

Also applies to: 897-898

components/webui/server/settings.json (2)

6-6: LGTM: added SqlDbCompressionJobsTableName.

Matches backend usage; ensure start_clp also sets this at runtime if you plan to vary it per install (prefixing, etc.).

Would you like me to update start_clp.py to set this key as well (see my suggestion there)?


22-27: LGTM: new archive-output knobs and storage-engine flag.

Names match what the server route reads; numeric defaults look sane.

Also applies to: 29-29

Comment on lines +70 to +74
const jobConfig: CompressionJobConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
jobConfig.input.paths_to_compress = paths.map(
(path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

🧩 Analysis chain

Prevent path traversal and join paths robustly.

String concatenation allows “..” to escape /mnt/logs. Normalise, resolve, and enforce the root prefix.

+import path from "node:path";
@@
-            const jobConfig: CompressionJobConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
-            jobConfig.input.paths_to_compress = paths.map(
-                (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
-            );
+            const jobConfig: CompressionJobConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
+            const absPaths = paths.map((p) => {
+                const rel = (p ?? "").trim().replace(/^\/+/, "");
+                return path.posix.resolve(CONTAINER_INPUT_LOGS_ROOT_DIR, rel);
+            });
+            for (const abs of absPaths) {
+                if (!(abs === CONTAINER_INPUT_LOGS_ROOT_DIR || abs.startsWith(CONTAINER_INPUT_LOGS_ROOT_DIR + "/"))) {
+                    const errMsg = "Each path must resolve under /mnt/logs.";
+                    return reply.badRequest(errMsg);
+                }
+            }
+            jobConfig.input.paths_to_compress = absPaths;

Please confirm @fastify/sensible is registered in the server (reply.badRequest is used already via reply.internalServerError). If not, replace badRequest with an explicit 400:

-                    return reply.badRequest(errMsg);
+                    reply.code(StatusCodes.BAD_REQUEST);
+                    return {message: errMsg};

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "PWD: $(pwd)"
echo
echo "1) Try to locate the exact file from the review:"
rg -n --hidden -S "components/webui/server/src/routes/api/compress/index.ts" || true
echo

echo "2) Try to find the file anywhere in the repo and print a snippet if found:"
file=$(rg --hidden --files | rg "components/webui/server/src/routes/api/compress/index.ts" || true)
if [ -n "$file" ]; then
  echo "FOUND: $file"
  echo "---- file head (1..240) ----"
  sed -n '1,240p' "$file" || true
else
  echo "File not found at that exact path. Searching for related symbols..."
  rg -n --hidden -S "CONTAINER_INPUT_LOGS_ROOT_DIR|paths_to_compress|paths_to_compress\\b" -C3 || true
fi
echo

echo "3) Search for @fastify/sensible in code and package.json files:"
rg -n --hidden -S '"@fastify/sensible"' --glob '!node_modules/**' || true
rg -n --hidden -S "require\\(['\"]@fastify/sensible['\"]\\)|from ['\"]@fastify/sensible['\"]|import .*from ['\"]@fastify/sensible['\"]" --glob '!node_modules/**' || true
echo

echo "4) Search for reply.badRequest / reply.internalServerError usages:"
rg -n --hidden -S "reply\\.badRequest|reply\\.internalServerError" --glob '!node_modules/**' || true
echo

echo "5) Search for fastify.register occurrences mentioning sensible or general plugin registration:"
rg -n --hidden -S "fastify\\.register\\(|register\\(.*sensible|register\\(.*'@fastify/sensible'|register\\(.*\"@fastify/sensible\"" --glob '!node_modules/**' || true
echo

echo "6) Show top-level package.json (first 200 lines) if present:"
if [ -f package.json ]; then
  sed -n '1,200p' package.json || true
else
  echo "no root package.json"
fi
echo
echo "Done."

Length of output: 4463


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "PWD: $(pwd)"
echo

echo "1) Show the compression route file (head):"
file="components/webui/server/src/routes/api/compress/index.ts"
if [ -f "$file" ]; then
  sed -n '1,240p' "$file"
else
  echo "File not found: $file"
fi
echo
echo "----------------------------------------"
echo "2) Search for @fastify/sensible occurrences (search all files, ignore .gitignore):"
rg -n -uu -S "\"@fastify/sensible\"|'@fastify/sensible'|@fastify/sensible" --hidden --no-ignore -g '!*node_modules/**' || true
echo
echo "----------------------------------------"
echo "3) Search for 'sensible' token (plugin may be referenced by name):"
rg -n -uu -S '\bsensible\b' --hidden --no-ignore -g '!*node_modules/**' || true
echo
echo "----------------------------------------"
echo "4) Search for reply.badRequest and reply.internalServerError usages:"
rg -n -uu -S "reply\\.badRequest|reply\\.internalServerError" --hidden --no-ignore -g '!*node_modules/**' -C3 || true
echo
echo "----------------------------------------"
echo "5) Search for fastify.register calls (to find plugin registration):"
rg -n -uu -S "fastify\\.register\\s*\\(" --hidden --no-ignore -g '!*node_modules/**' -C3 || true
echo
echo "6) Search for decorateReply / decorate to see custom decorations:"
rg -n -uu -S "decorateReply|decorate\\s*\\(" --hidden --no-ignore -g '!*node_modules/**' -C3 || true
echo
echo "----------------------------------------"
echo "7) Find package.json files and check for @fastify/sensible in them:"
rg -n -uu -S '"@fastify/sensible"' --hidden --no-ignore -g '!*node_modules/**' || true
echo
echo "8) List nearest package.json files (components/webui/server and parents) if present:"
for p in \
  "components/webui/server/package.json" \
  "components/webui/package.json" \
  "components/package.json" \
  "package.json"; do
  if [ -f "$p" ]; then
    echo "---- $p ----"
    sed -n '1,200p' "$p"
    echo
  fi
done
echo "Done."

Length of output: 7575


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "PWD: $(pwd)"

echo
echo "1) Grep for '@fastify/sensible' in code files:"
find . -type f \( -name '*.ts' -o -name '*.js' -o -name '*.mjs' \) -not -path '*/node_modules/*' -not -path '*/.git/*' -print0 | xargs -0 grep -nH --color=never "@fastify/sensible" || true

echo
echo "2) Search for fastify.register lines referencing sensible or autoload usage:"
find . -type f \( -name '*.ts' -o -name '*.js' -o -name '*.mjs' \) -not -path '*/node_modules/*' -not -path '*/.git/*' -print0 | xargs -0 grep -nH --color=never -E "fastify\.register\s*\(|register\(|autoload" || true

echo
echo "3) Search for 'sensible' token anywhere:"
find . -type f \( -name '*.ts' -o -name '*.js' -o -name '*.mjs' -o -name '*.json' \) -not -path '*/node_modules/*' -not -path '*/.git/*' -print0 | xargs -0 grep -nH --color=never -E "\bsensible\b" || true

echo
echo "4) Show plugins directory if present (first 200 lines per file):"
if [ -d components/webui/server/src/plugins ]; then
  find components/webui/server/src/plugins -maxdepth 2 -type f -print -exec sed -n '1,200p' {} \; || true
else
  echo "No plugins dir at components/webui/server/src/plugins"
fi

echo
echo "5) Show server bootstrap files (src/main.ts, src/app.ts, src/index.ts, src/server.ts):"
for f in components/webui/server/src/main.ts components/webui/server/src/app.ts components/webui/server/src/index.ts components/webui/server/src/server.ts; do
  if [ -f "$f" ]; then
    echo "---- $f ----"
    sed -n '1,240p' "$f"
    echo
  fi
done

echo
echo "6) Search for reply.badRequest usage:"
find . -type f \( -name '*.ts' -o -name '*.js' -o -name '*.mjs' \) -not -path '*/node_modules/*' -not -path '*/.git/*' -print0 | xargs -0 grep -nH --color=never "reply\\.badRequest" || true

echo
echo "Done."

Length of output: 15832


Prevent path traversal and join paths robustly.

Normalise and resolve each supplied path against CONTAINER_INPUT_LOGS_ROOT_DIR; reject any that resolve outside /mnt/logs.

File: components/webui/server/src/routes/api/compress/index.ts (around lines 70–74)

+import path from "node:path";
@@
-            const jobConfig: CompressionJobConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
-            jobConfig.input.paths_to_compress = paths.map(
-                (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
-            );
+            const jobConfig: CompressionJobConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
+            const absPaths = paths.map((p) => {
+                const rel = (p ?? "").trim().replace(/^\/+/, "");
+                return path.posix.resolve(CONTAINER_INPUT_LOGS_ROOT_DIR, rel);
+            });
+            for (const abs of absPaths) {
+                if (!(abs === CONTAINER_INPUT_LOGS_ROOT_DIR || abs.startsWith(CONTAINER_INPUT_LOGS_ROOT_DIR + "/"))) {
+                    const errMsg = "Each path must resolve under /mnt/logs.";
+                    return reply.badRequest(errMsg);
+                }
+            }
+            jobConfig.input.paths_to_compress = absPaths;

Confirmed @fastify/sensible is registered (components/webui/server/src/plugins/external/sensible.ts, autoloaded in src/app.ts) — reply.badRequest is available; no replacement with manual 400 required.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const jobConfig: CompressionJobConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
jobConfig.input.paths_to_compress = paths.map(
(path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
);
import path from "node:path";
const jobConfig: CompressionJobConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
const absPaths = paths.map((p) => {
// Strip leading slashes and whitespace
const rel = (p ?? "").trim().replace(/^\/+/, "");
// Resolve against the logs root to collapse any “..” segments
return path.posix.resolve(CONTAINER_INPUT_LOGS_ROOT_DIR, rel);
});
// Reject any paths that escape the logs directory
for (const abs of absPaths) {
if (
abs !== CONTAINER_INPUT_LOGS_ROOT_DIR &&
!abs.startsWith(CONTAINER_INPUT_LOGS_ROOT_DIR + "/")
) {
const errMsg = "Each path must resolve under /mnt/logs.";
return reply.badRequest(errMsg);
}
}
jobConfig.input.paths_to_compress = absPaths;

Comment thread components/webui/server/src/routes/api/compress/index.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

878-898: Server settings miss SqlDbCompressionJobsTableName update (prefix drift risk).

Client settings set SqlDbCompressionJobsTableName, but server settings do not. If your DB uses a table prefix, the server will keep the default "compression_jobs" and drift from the created/prefixed table. Add it alongside SqlDbQueryJobsTableName.

Apply this diff:

     server_settings_json_updates = {
+        "SqlDbCompressionJobsTableName": COMPRESSION_JOBS_TABLE_NAME,
         "ArchiveOutputCompressionLevel": clp_config.archive_output.compression_level,
         "ArchiveOutputTargetArchiveSize": clp_config.archive_output.target_archive_size,
         "ArchiveOutputTargetDictionariesSize": clp_config.archive_output.target_dictionaries_size,
-        "ArchiveOutputTargetEncodedFileSize":  clp_config.archive_output.target_encoded_file_size,
+        "ArchiveOutputTargetEncodedFileSize": clp_config.archive_output.target_encoded_file_size,
         "ArchiveOutputTargetSegmentSize": clp_config.archive_output.target_segment_size,
         "SqlDbHost": clp_config.database.host,
         "SqlDbPort": clp_config.database.port,
         "SqlDbName": clp_config.database.name,
         "SqlDbQueryJobsTableName": QUERY_JOBS_TABLE_NAME,
         "MongoDbHost": clp_config.results_cache.host,
         "MongoDbPort": clp_config.results_cache.port,
         "MongoDbName": clp_config.results_cache.db_name,
         "MongoDbSearchResultsMetadataCollectionName": clp_config.webui.results_metadata_collection_name,
         "MongoDbStreamFilesCollectionName": clp_config.results_cache.stream_collection_name,
         "ClientDir": str(container_webui_dir / "client"),
         "LogViewerDir": str(container_webui_dir / "yscope-log-viewer"),
         "StreamTargetUncompressedSize": container_clp_config.stream_output.target_uncompressed_size,
         "ClpQueryEngine": clp_config.package.query_engine,
         "ClpStorageEngine": clp_config.package.storage_engine,
     }
components/webui/server/settings.json (1)

1-32: Add ContainerInputLogsRootDir to settings to avoid hard-coded TS constant.

The API route currently hard-codes "/mnt/logs". Centralise it here so it can be overridden consistently.

Apply this diff (adjust placement as you see fit):

 {
   "SqlDbHost": "localhost",
   "SqlDbPort": 3306,
   "SqlDbName": "clp-db",
   "SqlDbQueryJobsTableName": "query_jobs",
   "SqlDbCompressionJobsTableName": "compression_jobs",
+  "ContainerInputLogsRootDir": "/mnt/logs",
 
   "MongoDbHost": "localhost",

Then consume this in the server route (see related comment).

♻️ Duplicate comments (9)
components/webui/server/src/routes/api/compress/index.ts (5)

26-38: Nice: defaults sourced from settings.json (no hard-coded literals).

This addresses prior feedback about making compression parameters configurable.


55-60: Expose 400 Bad Request in schema.

Declare 400 for validation/runtime input errors.

             schema: {
                 body: CompressionJobSchema,
                 response: {
                     [StatusCodes.CREATED]: CompressionJobCreationSchema,
+                    [StatusCodes.BAD_REQUEST]: ErrorSchema,
                     [StatusCodes.INTERNAL_SERVER_ERROR]: ErrorSchema,
                 },
                 tags: ["Compression"],
             },

75-84: Trim/omit empty optionals; set dataset default only when needed.

Prevents persisting empty strings and matches prior feedback.

-            if ("clp-s" === settings.ClpStorageEngine) {
-                if ("string" !== typeof dataset || 0 === dataset.length) {
-                    jobConfig.input.dataset = CLP_DEFAULT_DATASET_NAME;
-                } else {
-                    jobConfig.input.dataset = dataset;
-                }
-                if ("undefined" !== typeof timestampKey) {
-                    jobConfig.input.timestamp_key = timestampKey;
-                }
-            }
+            if ("clp-s" === settings.ClpStorageEngine) {
+                const ds = "string" === typeof dataset ? dataset.trim() : "";
+                jobConfig.input.dataset = 0 < ds.length ? ds : CLP_DEFAULT_DATASET_NAME;
+                if ("string" === typeof timestampKey && 0 < timestampKey.trim().length) {
+                    jobConfig.input.timestamp_key = timestampKey.trim();
+                }
+            }

70-73: Fix path join and block traversal outside the input root.

String concatenation is fragile and allows "../" to escape /mnt/logs. Use POSIX join/normalise and enforce the root.

+import path from "node:path";
@@
-            jobConfig.input.paths_to_compress = paths.map(
-                (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
-            );
+            jobConfig.input.paths_to_compress = paths.map((raw) => {
+                const p = String(raw ?? "").replace(/\\/g, "/").trim();
+                // Accept both container-absolute and relative inputs
+                const rel = p.startsWith(CONTAINER_INPUT_LOGS_ROOT_DIR)
+                    ? p.slice(CONTAINER_INPUT_LOGS_ROOT_DIR.length)
+                    : p;
+                const joined = path.posix.join(
+                    CONTAINER_INPUT_LOGS_ROOT_DIR,
+                    rel.replace(/^\/+/, "")
+                );
+                const normalised = path.posix.normalize(joined);
+                if (
+                    normalised !== CONTAINER_INPUT_LOGS_ROOT_DIR &&
+                    false == normalised.startsWith(CONTAINER_INPUT_LOGS_ROOT_DIR + "/")
+                ) {
+                    reply.code(StatusCodes.BAD_REQUEST);
+                    throw new Error(`Path escapes input root: ${p}`);
+                }
+                return normalised;
+            });

13-22: Read CONTAINER_INPUT_LOGS_ROOT_DIR from settings (fallback to "/mnt/logs")

Replace the hard-coded constant in components/webui/server/src/routes/api/compress/index.ts so it reads the value from settings with a safe fallback:

-const CONTAINER_INPUT_LOGS_ROOT_DIR = "/mnt/logs";
+const CONTAINER_INPUT_LOGS_ROOT_DIR =
+    (settings as {ContainerInputLogsRootDir?: string}).ContainerInputLogsRootDir ?? "/mnt/logs";

Add "ContainerInputLogsRootDir": "/mnt/logs" to components/webui/server/settings.json and update components/package-template/src/sbin/start-clp.sh and any docs that assume /mnt/logs (so container volume mounts and documentation match the configured default).

components/webui/client/src/pages/IngestPage/Compress/index.tsx (4)

82-82: Rename card title to reflect the feature.

Use “Submit Compression Job”.

-        <DashboardCard title={"Start Ingestion"}>
+        <DashboardCard title={"Submit Compression Job"}>

86-90: Inline handler; remove redundant catch.

handleSubmit already handles errors.

-                onFinish={(values) => {
-                    handleSubmit(values).catch((error: unknown) => {
-                        console.error("Error in handleSubmit:", error);
-                    });
-                }}
+                onFinish={handleSubmit}

92-100: Validate at least one non-empty trimmed path.

Block whitespace-only input.

                 <Form.Item
                     label={"Paths"}
                     name={"paths"}
-                    rules={[{required: true, message: "Please enter at least one path"}]}
+                    rules={[
+                        {required: true, message: "Please enter at least one path"},
+                        {
+                            validator: async (_, value?: string) => {
+                                const count = (value ?? "")
+                                    .split("\n")
+                                    .map((p) => p.trim())
+                                    .filter((p) => 0 < p.length).length;
+                                return 0 < count ? Promise.resolve() :
+                                    Promise.reject(new Error("Please enter at least one non-empty path"));
+                            },
+                        },
+                    ]}
                 >
                     <Input.TextArea
                         autoSize={{minRows: 4, maxRows: 10}}
-                        placeholder={"Enter paths to compress, one per line"}/>
+                        placeholder={"Enter container‑relative paths (e.g., /app.log), one per line"}/>

52-58: Trim and omit empty optionals.

Avoid sending empty strings; let server defaults apply.

-            if ("undefined" !== typeof values.dataset) {
-                payload.dataset = values.dataset;
-            }
+            if ("string" === typeof values.dataset && 0 < values.dataset.trim().length) {
+                payload.dataset = values.dataset.trim();
+            }
 
-            if ("undefined" !== typeof values.timestampKey) {
-                payload.timestampKey = values.timestampKey;
-            }
+            if ("string" === typeof values.timestampKey && 0 < values.timestampKey.trim().length) {
+                payload.timestampKey = values.timestampKey.trim();
+            }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d22eff7 and 8049e8d.

📒 Files selected for processing (4)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2 hunks)
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx (1 hunks)
  • components/webui/server/settings.json (2 hunks)
  • components/webui/server/src/routes/api/compress/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/routes/api/compress/index.ts
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx
🧬 Code graph analysis (2)
components/webui/server/src/routes/api/compress/index.ts (1)
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)
  • CompressionJobConfig (13-27)
components/webui/client/src/pages/IngestPage/Compress/index.tsx (3)
components/webui/client/src/api/compress/index.ts (2)
  • CompressionJobSchema (23-23)
  • submitCompressionJob (25-25)
components/webui/client/src/components/DashboardCard/index.tsx (1)
  • DashboardCard (57-57)
components/webui/client/src/config/index.ts (2)
  • CLP_STORAGE_ENGINES (26-26)
  • SETTINGS_STORAGE_ENGINE (28-28)
🪛 GitHub Actions: clp-artifact-build
components/webui/client/src/pages/IngestPage/Compress/index.tsx

[error] 1-1: Web UI build failed. Command 'npm run build' exited with code 2 due to TypeScript error referencing src/api/compress/index.ts(4,8).

🔇 Additional comments (5)
components/webui/server/settings.json (2)

6-6: Good: adds SqlDbCompressionJobsTableName.

Ensure start_clp.py also updates this at runtime to include any table prefix (see separate comment).


29-29: ClpStorageEngine setting added: OK.

This unblocks server/UI conditional logic.

components/webui/client/src/pages/IngestPage/Compress/index.tsx (1)

60-66: Type of jobId: client already returns number — no change required

submitCompressionJob is declared as (payload: CompressionJobSchema): Promise and returns response.data.jobId from axios.post; no changes needed in this client file.

components/webui/server/src/routes/api/compress/index.ts (2)

86-95: jobId type consistent across server, common schema, and client. CompressionJobCreationSchema defines jobId as number; CompressionJobDbManager.submitJob returns Promise; client submitCompressionJob posts CompressionJobCreationSchema and returns number — no change required.


9-9: Import attribute compatibility — confirmed. server/tsconfig.json uses module: "nodenext", moduleResolution: "node16" and resolveJsonModule: true, and components/webui/server/package.json lists TypeScript ^5.7.3 — the static import with {type: "json"} is supported; no TS-config or TS-version changes required.

Comment thread components/clp-package-utils/clp_package_utils/scripts/start_clp.py Outdated
Comment on lines +1 to +8
import {useState} from "react";

import {
Button,
Form,
Input,
Typography,
} from "antd";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

🧩 Analysis chain

CI failure: address TypeScript error in client api/compress.

Build failed (2025-09-17) with an error at src/api/compress/index.ts(4,8). Please surface that file and align types/imports.


🏁 Script executed:

#!/bin/bash
# Locate and print the failing file and imports around the error site.
fd -a 'components/webui/client/src/api/compress/index.ts'
sed -n '1,120p' components/webui/client/src/api/compress/index.ts

Length of output: 610


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "PWD: $(pwd)"
echo
echo "Searching for type names 'CompressionJobCreationSchema' or 'CompressionJobSchema'..."
rg -n --hidden --no-ignore -S "CompressionJobCreationSchema|CompressionJobSchema" || true

echo
echo "Listing files that import from '@webui/common/compression'..."
rg -n --hidden --no-ignore -S "@webui/common/compression" || true

echo
echo "Show tsconfig path mappings (if present)..."
for f in tsconfig*.json package.json; do
  if [ -f "$f" ]; then
    echo "==== $f ===="
    jq -r 'paths as $p | (getpath($p) | tostring) as $v | ($p|join(".")) + " -> " + $v' "$f" 2>/dev/null || true
  fi
done

echo
echo "Print the previously shown file for context:"
sed -n '1,240p' components/webui/client/src/api/compress/index.ts || true

Length of output: 3030


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== components/webui/common/src/schemas/compression.ts ==="
sed -n '1,240p' components/webui/common/src/schemas/compression.ts || true

echo
echo "=== components/webui/common/src/index.ts ==="
sed -n '1,240p' components/webui/common/src/index.ts || true

echo
echo "=== components/webui/client/src/api/compress/index.ts ==="
sed -n '1,240p' components/webui/client/src/api/compress/index.ts || true

echo
echo "=== components/webui/client/src/pages/IngestPage/Compress/index.tsx ==="
sed -n '1,240p' components/webui/client/src/pages/IngestPage/Compress/index.tsx || true

echo
echo "=== components/webui/common/package.json ==="
sed -n '1,240p' components/webui/common/package.json || true

echo
echo "=== components/webui/client/package.json ==="
sed -n '1,240p' components/webui/client/package.json || true

echo
echo "=== tsconfig.json (root) ==="
sed -n '1,240p' tsconfig.json || true

echo
echo "=== components/webui/client/tsconfig.json ==="
sed -n '1,240p' components/webui/client/tsconfig.json || true

Length of output: 8991


Fix TypeScript import/type mismatch in client api/compress

components/webui/client/src/api/compress/index.ts imports CompressionJobCreationSchema and CompressionJobSchema as types, but @webui/common/compression exports runtime schema values with those names and the TS types are CompressionJobCreationSchemaStatic / CompressionJobSchemaStatic — update the import (or re-export) to use the Static types and alias them to the current names so callers don’t need changes.

Suggested replacement (in components/webui/client/src/api/compress/index.ts):
import type { CompressionJobCreationSchemaStatic as CompressionJobCreationSchema, CompressionJobSchemaStatic as CompressionJobSchema } from "@webui/common/compression";

🧰 Tools
🪛 GitHub Actions: clp-artifact-build

[error] 1-1: Web UI build failed. Command 'npm run build' exited with code 2 due to TypeScript error referencing src/api/compress/index.ts(4,8).

🤖 Prompt for AI Agents
In components/webui/client/src/pages/IngestPage/Compress/index.tsx lines 1 to 8,
the client currently imports CompressionJobCreationSchema and
CompressionJobSchema as types but @webui/common/compression exports runtime
schema values under those names while the TypeScript types are named
CompressionJobCreationSchemaStatic and CompressionJobSchemaStatic; update the
import in components/webui/client/src/api/compress/index.ts to import the Static
types and alias them to the expected names (e.g. import type {
CompressionJobCreationSchemaStatic as CompressionJobCreationSchema,
CompressionJobSchemaStatic as CompressionJobSchema } from
"@webui/common/compression") so callers keep using the existing type names
without further changes.

Comment thread components/webui/client/src/pages/IngestPage/Compress/index.tsx Outdated
Comment on lines +22 to +26
"ArchiveOutputCompressionLevel": 3,
"ArchiveOutputTargetArchiveSize": 268435456,
"ArchiveOutputTargetDictionariesSize": 33554432,
"ArchiveOutputTargetEncodedFileSize": 268435456,
"ArchiveOutputTargetSegmentSize": 268435456,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Clarify units and ranges for ArchiveOutput values.*

Document these as bytes in accompanying docs and validate ranges server-side to prevent invalid submissions.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (9)
components/webui/client/src/api/compress/index.ts (4)

5-8: Fix broken type imports: use Static types from common/schemas.

Build failure stems from importing runtime schemas as types. Import the Static types and alias to current names.

-import type {
-    CompressionJobCreationSchema,
-    CompressionJobSchema,
-} from "@webui/common/compression";
+import type {
+    CompressionJobCreationSchemaStatic as CompressionJobCreationSchema,
+    CompressionJobSchemaStatic as CompressionJobSchema,
+} from "@webui/common/schemas/compression";

18-23: Remove payload logging (paths are sensitive); keep axios error propagation.

Local filesystem paths can leak via console. Let React Query surface errors.

-const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
-    console.log("Submitting compression job:", JSON.stringify(payload));
-
-    const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
+const submitCompressionJob = async (payload: CompressionJobSchema): Promise<number> => {
+    const response = await axios.post<CompressionJobCreationSchema>("/api/compress", payload);
     return response.data.jobId;
 };

42-45: Re‑export Static types, not runtime schemas.

Expose compile‑time shapes to consumers.

-export type {
-    CompressionJobCreationSchema,
-    CompressionJobSchema,
-};
+export type {
+    CompressionJobCreationSchemaStatic as CompressionJobCreationSchema,
+    CompressionJobSchemaStatic as CompressionJobSchema,
+} from "@webui/common/schemas/compression";

18-23: API return-shape inconsistency — return AxiosResponse or document deviation.

Most API helpers under components/webui/client/src/api return the Axios response (e.g., search/presto-search return axios.post directly; stream-files returns Promise<AxiosResponse<...>>). components/webui/client/src/api/compress/index.ts (lines 18–23) returns response.data.jobId (number). Either return the AxiosResponse (change signature to Promise<AxiosResponse> and return axios.post(...)) or add a short comment documenting why this helper intentionally returns only the jobId.

components/webui/client/src/pages/IngestPage/Compress/index.tsx (5)

59-65: Send optional fields only when non‑empty (trim).

Prevents sending empty strings and lets server defaults apply.

-        if ("undefined" !== typeof values.dataset) {
-            payload.dataset = values.dataset;
-        }
+        if ("string" === typeof values.dataset && 0 < values.dataset.trim().length) {
+            payload.dataset = values.dataset.trim();
+        }
 
-        if ("undefined" !== typeof values.timestampKey) {
-            payload.timestampKey = values.timestampKey;
-        }
+        if ("string" === typeof values.timestampKey && 0 < values.timestampKey.trim().length) {
+            payload.timestampKey = values.timestampKey.trim();
+        }

100-107: Validate at least one trimmed, non‑empty path.

Whitespace‑only input currently passes required.

                 <Form.Item
                     label={"Paths"}
                     name={"paths"}
-                    rules={[{required: true, message: "Please enter at least one path"}]}
+                    rules={[
+                        {required: true, message: "Please enter at least one path"},
+                        {
+                            validator: async (_, value?: string) => {
+                                const count = (value ?? "")
+                                    .split(/\r?\n/)
+                                    .map((p) => p.trim())
+                                    .filter((p) => 0 < p.length).length;
+                                if (0 < count) { return Promise.resolve(); }
+                                return Promise.reject(new Error("Please enter at least one non-empty path"));
+                            },
+                        },
+                    ]}
                 >
                 <Input.TextArea
                     autoSize={{minRows: 4, maxRows: 10}}
-                    placeholder={"Enter paths to compress, one per line"}/>
+                    placeholder={"Enter container-relative paths (e.g., /logs/app.log), one per line"}/>

90-90: Rename card title to match the feature.

Improves clarity.

-        <DashboardCard title={"Start Ingestion"}>
+        <DashboardCard title={"Submit Compression Job"}>

93-98: Inline onFinish and disable the form while submitting.

Reduces noise and prevents edits during submit.

-            <Form
+            <Form
                 form={form}
                 layout={"vertical"}
-                onFinish={(values) => {
-                    handleSubmit(values);
-                }}
+                onFinish={handleSubmit}
+                disabled={isSubmitting}
             >

131-139: Disable the submit button while submitting.

Minor UX polish.

                     <Button
                         htmlType={"submit"}
                         loading={isSubmitting}
                         type={"primary"}
+                        disabled={isSubmitting}
                     >
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8049e8d and 138c553.

📒 Files selected for processing (2)
  • components/webui/client/src/api/compress/index.ts (1 hunks)
  • components/webui/client/src/pages/IngestPage/Compress/index.tsx (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/client/src/pages/IngestPage/Compress/index.tsx
  • components/webui/client/src/api/compress/index.ts
🧠 Learnings (1)
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
PR: y-scope/clp#596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.

Applied to files:

  • components/webui/client/src/api/compress/index.ts
🧬 Code graph analysis (2)
components/webui/client/src/pages/IngestPage/Compress/index.tsx (4)
components/webui/client/src/api/compress/index.ts (2)
  • submitCompressionJob (47-47)
  • useSubmitCompressionJob (47-47)
components/webui/common/src/schemas/compression.ts (1)
  • CompressionJobSchemaStatic (33-33)
components/webui/client/src/components/DashboardCard/index.tsx (1)
  • DashboardCard (57-57)
components/webui/client/src/config/index.ts (2)
  • CLP_STORAGE_ENGINES (26-26)
  • SETTINGS_STORAGE_ENGINE (28-28)
components/webui/client/src/api/compress/index.ts (1)
components/webui/common/src/schemas/compression.ts (2)
  • CompressionJobSchema (29-29)
  • CompressionJobCreationSchema (28-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: package-image

Comment on lines +33 to +39
return useMutation({
mutationFn: submitCompressionJob,
onSettled: async () => {
// Invalidate queries that are affected by a new compression job.
await queryClient.invalidateQueries({queryKey: ["jobs"]});
},
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

🧩 Analysis chain

Confirm the query key before invalidation.

Ensure ["jobs"] matches your Jobs table key; centralise if a constant exists.


🏁 Script executed:

#!/bin/bash
# Verify all usages of the jobs query key to ensure consistency.
rg -nP --glob 'components/webui/client/**' '\["jobs"\]' -C2

Length of output: 1472


Query key verified — consider centralising

Both usages of queryKey ["jobs"] match: components/webui/client/src/pages/IngestPage/Jobs/index.tsx:28 and components/webui/client/src/api/compress/index.ts:37. No invalidation bug detected.
Optional: extract a shared constant (e.g. const JOBS_QUERY_KEY = ["jobs"]) and use it in both places to avoid future drift.

🤖 Prompt for AI Agents
In components/webui/client/src/api/compress/index.ts around lines 33 to 39, the
queryKey ["jobs"] used in invalidateQueries should be centralized to prevent
future drift; create a shared constant (e.g. export const JOBS_QUERY_KEY =
["jobs"]) in a common place (suggest components/webui/client/src/api/queries.ts
or a nearby shared constants file), export it, then replace the literal ["jobs"]
here with an import of JOBS_QUERY_KEY and update
components/webui/client/src/pages/IngestPage/Jobs/index.tsx to import and use
the same JOBS_QUERY_KEY.

@@ -0,0 +1,156 @@
import {useState} from "react";

import {CompressionJobSchemaStatic} from "@webui/common/schemas/compression";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Type‑only import for TS types.

Avoid emitting a runtime import for a type.

-import {CompressionJobSchemaStatic} from "@webui/common/schemas/compression";
+import type {CompressionJobSchemaStatic} from "@webui/common/schemas/compression";
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import {CompressionJobSchemaStatic} from "@webui/common/schemas/compression";
import type {CompressionJobSchemaStatic} from "@webui/common/schemas/compression";
🤖 Prompt for AI Agents
In components/webui/client/src/pages/IngestPage/Compress/index.tsx around line
3, the import of CompressionJobSchemaStatic is currently a runtime import but
it’s only used as a TypeScript type; change the statement to a type-only import
(use "import type { CompressionJobSchemaStatic } from '...';") so the compiler
emits no runtime import, and ensure any usage is purely type-level (or use type
assertions) to avoid needing the value at runtime.

Comment thread components/webui/client/src/pages/IngestPage/Compress/index.tsx
Comment on lines +55 to +57
const payload: CompressionJobSchemaStatic = {
paths: paths,
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Optional: de‑duplicate paths before submit.

Prevents redundant work server‑side.

-        const payload: CompressionJobSchemaStatic = {
-            paths: paths,
-        };
+        const payload: CompressionJobSchemaStatic = {
+            paths: Array.from(new Set(paths)),
+        };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const payload: CompressionJobSchemaStatic = {
paths: paths,
};
const payload: CompressionJobSchemaStatic = {
paths: Array.from(new Set(paths)),
};
🤖 Prompt for AI Agents
In components/webui/client/src/pages/IngestPage/Compress/index.tsx around lines
55 to 57, the payload.paths may contain duplicate entries which causes redundant
server work; before constructing the CompressionJobSchemaStatic payload,
deduplicate the paths array (preserve order) and assign the deduplicated array
to payload.paths so only unique paths are submitted; implement using a simple
Set/filter or Array.from(new Set(...)) (or an order-preserving filter) and
ensure the resulting array still matches the expected type.

@junhaoliao

Copy link
Copy Markdown
Member Author

These are my last comments. Per kirk - i think we should add the error message as a tooltip on the error icon, but that can be another PR.

@coderabbitai create an issue to track the feature request of displaying the error message in the "Ingestion jobs" table.

@coderabbitai

coderabbitai Bot commented Sep 24, 2025

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@junhaoliao I've created a GitHub issue to track the feature request for displaying error messages as tooltips on error icons in the "Ingestion jobs" table. The issue includes the context from this PR discussion and has been assigned to you.


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for the feature request to display error messages as tooltips
gh issue create \
  --title "feat: Add error message tooltips to error icons in Ingestion jobs table" \
  --body "## Feature Request

Add the ability to display error messages as tooltips when hovering over error icons in the \"Ingestion jobs\" table.

## Background

This enhancement was suggested during the review of PR #1294 to improve the user experience when viewing failed ingestion jobs. Currently, users may not have easy access to detailed error information directly from the jobs table.

## Proposed Solution

- Add tooltips to error icons in the Ingestion jobs table
- Display the full error message content in the tooltip on hover
- Ensure the tooltip is accessible and follows existing UI patterns

## Context

- Related to PR: https://github.com/y-scope/clp/pull/1294
- Original discussion: https://github.com/y-scope/clp/pull/1294#discussion_r1772709234
- Requested by: @junhaoliao" \
  --assignee junhaoliao

Length of output: 42


@junhaoliao I've successfully created GitHub issue #1325 to track the feature request for displaying error messages as tooltips on error icons in the "Ingestion jobs" table.

The issue includes:

  • Clear description of the enhancement request
  • Background context from this PR discussion
  • Proposed solution details
  • Reference links to this PR and the original discussion
  • Assignment to you for tracking

You can view the issue at: #1325

@davemarco davemarco left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay code looks good. I think the last thing is something we discussed earlier on meeting. Like we probably should render the form if the logs_input in the config
is s3 instead of fs

"SqlDbPort": 3306,
"SqlDbName": "clp-db",
"SqlDbQueryJobsTableName": "query_jobs",
"SqlDbCompressionJobsTableName": "compression_jobs",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that is in the client not server

/**
* The `compression_jobs` table's column names.
*
* @enum {string}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure we are using this @enum syntax elsewhere. do u want to keep it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few in components/log-viewer-webui/server/src/typings/query.ts

I think both keeping and removing it could make sense. TypeScript should be smart enough to deduct it's one way or the other though. For consistency, maybe we should remove them all together.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5ff9a7 and ac26e2c.

📒 Files selected for processing (1)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-11-18T16:49:20.248Z
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.

Applied to files:

  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: package-image
🔇 Additional comments (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

883-883: Resolved — SqlDbCompressionJobsTableName is defined in settings

Found in components/webui/server/settings.json, components/webui/client/public/settings.json and components/webui/client/src/settings.ts; CompressionJobDbManager references settings.SqlDbCompressionJobsTableName, so no change required.

Comment on lines 892 to 899
"ArchiveOutputCompressionLevel": clp_config.archive_output.compression_level,
"ArchiveOutputTargetArchiveSize": clp_config.archive_output.target_archive_size,
"ArchiveOutputTargetDictionariesSize": clp_config.archive_output.target_dictionaries_size,
"ArchiveOutputTargetEncodedFileSize": clp_config.archive_output.target_encoded_file_size,
"ArchiveOutputTargetSegmentSize": clp_config.archive_output.target_segment_size,
"ClpQueryEngine": clp_config.package.query_engine,
"ClpStorageEngine": clp_config.package.storage_engine,
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

🧩 Analysis chain

New archive-output + storage-engine keys: verify server settings support; consider container_clp_config for consistency.

  • Verify all new keys exist in server settings templates/schemas; otherwise update_settings_object will raise.
  • Optional: Use container_clp_config for symmetry with StreamTargetUncompressedSize.

Apply:

-        "ArchiveOutputCompressionLevel": clp_config.archive_output.compression_level,
-        "ArchiveOutputTargetArchiveSize": clp_config.archive_output.target_archive_size,
-        "ArchiveOutputTargetDictionariesSize": clp_config.archive_output.target_dictionaries_size,
-        "ArchiveOutputTargetEncodedFileSize": clp_config.archive_output.target_encoded_file_size,
-        "ArchiveOutputTargetSegmentSize": clp_config.archive_output.target_segment_size,
+        "ArchiveOutputCompressionLevel": container_clp_config.archive_output.compression_level,
+        "ArchiveOutputTargetArchiveSize": container_clp_config.archive_output.target_archive_size,
+        "ArchiveOutputTargetDictionariesSize": container_clp_config.archive_output.target_dictionaries_size,
+        "ArchiveOutputTargetEncodedFileSize": container_clp_config.archive_output.target_encoded_file_size,
+        "ArchiveOutputTargetSegmentSize": container_clp_config.archive_output.target_segment_size,
-        "ClpStorageEngine": clp_config.package.storage_engine,
+        "ClpStorageEngine": container_clp_config.package.storage_engine,

Run:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check presence of the new keys in server settings/templates/schemas
rg -n -C2 -e 'ArchiveOutputCompressionLevel' \
           -e 'ArchiveOutputTargetArchiveSize' \
           -e 'ArchiveOutputTargetDictionariesSize' \
           -e 'ArchiveOutputTargetEncodedFileSize' \
           -e 'ArchiveOutputTargetSegmentSize' \
           -e 'ClpStorageEngine' \
           components/webui

Length of output: 3455


Verified: ArchiveOutput keys + ClpStorageEngine exist; optional refactor available

  • Keys found in components/webui/server/settings.json and client/public/settings.json; server references in components/webui/server/src/routes/api/compress/index.ts and client config in components/webui/client/src/config/index.ts — update_settings_object should not raise for missing keys.
  • Optional: for symmetry with StreamTargetUncompressedSize, consider switching clp_config → container_clp_config in components/clp-package-utils/clp_package_utils/scripts/start_clp.py (lines 892–899).
🤖 Prompt for AI Agents
In components/clp-package-utils/clp_package_utils/scripts/start_clp.py around
lines 892 to 899, the settings object currently uses clp_config and contains
ArchiveOutput* and ClpStorageEngine keys that are present in server and client
configs so update_settings_object should not error; ensure you do not remove or
rename these keys and avoid raising for missing keys, and optionally rename the
reference from clp_config to container_clp_config for symmetry with
StreamTargetUncompressedSize (update all usages in this block accordingly).

@junhaoliao

Copy link
Copy Markdown
Member Author

Like we probably should render the form if the logs_input in the config
is s3 instead of fs

i was planning to merge #1178 , where the limitation is lifted, before this PR, so we don't have to add the handling in this PR and remove it later. let me discuss in the cloud sync up to see if the priorities should be shifted

# Conflicts:
#	components/clp-package-utils/clp_package_utils/scripts/start_clp.py
#	components/webui/client/src/pages/SearchPage/SearchResults/SearchResultsTable/index.tsx
#	components/webui/client/src/pages/SearchPage/SearchResults/SearchResultsTimeline/index.tsx
#	components/webui/server/settings.json
@junhaoliao junhaoliao requested a review from davemarco November 20, 2025 22:10

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
components/webui/client/src/api/compress/index.ts (3)

1-4: Use type-only imports for TypeScript types.

These types are used solely for type annotations. Using import type prevents unnecessary runtime imports under strict TypeScript configurations.

Apply this diff:

-import {
+import type {
     CompressionJob,
     CompressionJobCreation,
 } from "@webui/common/schemas/compression";

15-15: Remove console.log that logs sensitive file paths.

The console.log logs the entire payload, which contains local file paths from the user's system. This can leak PII (usernames in paths, directory structure) and violates privacy best practices.

Apply this diff to remove the sensitive log:

 const submitCompressionJob = async (payload: CompressionJobCreation): Promise<CompressionJob> => {
-    console.log("Submitting compression job:", JSON.stringify(payload));
     const {data} = await axios.post<CompressionJob>("/api/compress", payload);

14-22: Consider project conventions: inline in component and return full response.

As noted in previous comments, the project pattern is to return the full Axios response rather than extracting data, and simple one-line API calls like this are typically inlined in components rather than maintained as separate API modules. This keeps the codebase consistent and reduces unnecessary abstraction layers.

Based on previous feedback, you may want to:

  1. Return the full response: return axios.post<CompressionJob>("/api/compress", payload);
  2. Consider moving this directly into the component where it's used, eliminating the separate API file.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aca5b9a and 2f3641d.

📒 Files selected for processing (1)
  • components/webui/client/src/api/compress/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/client/src/api/compress/index.ts
🧠 Learnings (3)
📓 Common learnings
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1125
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:267-291
Timestamp: 2025-09-15T22:20:40.750Z
Learning: For CLP compression jobs, the team has decided to fail the entire job immediately upon encountering any invalid input path, rather than continuing to process valid paths. This decision was made during PR #1125 development.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 1340
File: components/job-orchestration/job_orchestration/executor/compress/compression_task.py:528-528
Timestamp: 2025-09-28T15:00:22.170Z
Learning: In components/job-orchestration/job_orchestration/executor/compress/compression_task.py, there is a suggestion to refactor from passing logger as a parameter through multiple functions to creating a ClpCompressor class that takes the logger as a class member, with current helper functions becoming private member functions.
Learnt from: haiqi96
Repo: y-scope/clp PR: 651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.

Applied to files:

  • components/webui/client/src/api/compress/index.ts
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • components/webui/client/src/api/compress/index.ts
🧬 Code graph analysis (1)
components/webui/client/src/api/compress/index.ts (1)
components/webui/common/src/schemas/compression.ts (2)
  • CompressionJobCreation (33-33)
  • CompressionJob (32-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: build (macos-15)

Comment on lines +8 to +13
/**
* Submits a compression job.
*
* @param payload
* @return
*/

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Complete the JSDoc documentation.

The @param and @return tags are missing descriptions, making the documentation incomplete.

Apply this diff:

 /**
  * Submits a compression job.
  *
- * @param payload
- * @return
+ * @param payload - The compression job configuration including paths and optional dataset ID.
+ * @return Promise resolving to the created compression job with jobId.
  */
🤖 Prompt for AI Agents
In components/webui/client/src/api/compress/index.ts around lines 8 to 13, the
JSDoc block for the function is missing descriptions for @param and @return;
update the comment to describe the payload parameter (what shape/fields it
expects and any required properties) and describe the return value (what the
function resolves/returns, e.g., a Promise with the compression job result or
response shape), and ensure any TypeScript types or interfaces referenced in the
description are named so the docs are complete and informative.

@davemarco davemarco left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did u retest post docker compose as well?

Comment thread components/webui/server/package.json Outdated
"dotenv": "^17.2.3",
"fastify": "^5.6.1",
"fastify-plugin": "^5.1.0",
"http-status-codes": "^2.3.0",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do u know what this is from?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used in the compression route responses. now i remember we previously agreed to use Node.js's module:http2.constants instead of this package. i'll do the replacement then

Comment thread components/webui/server/settings.json Outdated
"MongoDbStreamFilesCollectionName": "stream-files",

"ClientDir": "../client/dist",
"LogsInputType": "fs",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we need this here?

self._clp_config.results_cache.stream_collection_name
),
"ClientDir": str(container_webui_dir / "client"),
"LogsInputType": self._clp_config.logs_input.type,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we need this here

};

return (
<DashboardCard title={"Start Ingestion"}>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can come with something else other than Start Ingestion

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Submit Compression Job"?

Comment thread components/webui/common/src/schemas/compression.ts Outdated
Comment thread components/webui/common/package.json Outdated
"dependencies": {
"@sinclair/typebox": "^0.34.41"
"@sinclair/typebox": "^0.34.41",
"@fastify/type-provider-typebox": "^5.2.0"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer needed after correcting the import

@junhaoliao

Copy link
Copy Markdown
Member Author

Did u retest post docker compose as well?

yes

@junhaoliao junhaoliao requested a review from davemarco November 24, 2025 20:55
@junhaoliao

Copy link
Copy Markdown
Member Author

@davemarco i re-validated the latest changes with clp and clp-s

davemarco
davemarco previously approved these changes Nov 25, 2025

@davemarco davemarco left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some small touch ups left after replying to questions, but lgtm

@junhaoliao junhaoliao merged commit 0f276e5 into y-scope:main Nov 25, 2025
18 of 37 checks passed
const payload: CompressionJobCreation = {paths};

if (CLP_STORAGE_ENGINES.CLP_S === SETTINGS_STORAGE_ENGINE) {
payload.dataset = ("undefined" === typeof values.dataset) ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we enter something in the Dataset input box then remove it, an empty string "" can hit the backend without this fallback override, and 400 issues would be reported. we should check for values.dataset.length === 0 here and perform override if so as well

@coderabbitai create an issue to track

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create GitHub issue to track the dataset empty string validation bug

gh issue create \
  --title "bug: Dataset field allows empty string bypassing fallback in compression job submission" \
  --body "## Description

When submitting a compression job through the web UI, if a user enters text in the Dataset input field and then removes it, an empty string \`\"\"\` can be sent to the backend. This bypasses the fallback logic that sets the dataset to the default value, resulting in 400 errors.

## Current Behavior

The code currently checks only for \`undefined\`:
\`\`\`typescript
payload.dataset = (\"undefined\" === typeof values.dataset) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;
\`\`\`

This allows an empty string to pass through without applying the fallback.

## Expected Behavior

The validation should also check for empty strings and apply the same fallback override when \`values.dataset.length === 0\`.

## References

- PR: https://github.com/y-scope/clp/pull/1294
- Comment: https://github.com/y-scope/clp/pull/1294#discussion_r2673726720
- Reported by: @junhaoliao" \
  --assignee junhaoliao

Length of output: 75


🏁 Script executed:

#!/bin/bash
# Try using gh api command to create the issue
# Extract owner and repo from the PR URL
OWNER="y-scope"
REPO="clp"

# Prepare the issue body
ISSUE_TITLE="bug: Dataset field allows empty string bypassing fallback in compression job submission"
ISSUE_BODY="## Description

When submitting a compression job through the web UI, if a user enters text in the Dataset input field and then removes it, an empty string \`\"\"\` can be sent to the backend. This bypasses the fallback logic that sets the dataset to the default value, resulting in 400 errors.

## Current Behavior

The code currently checks only for \`undefined\`:
\`\`\`typescript
payload.dataset = (\"undefined\" === typeof values.dataset) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;
\`\`\`

This allows an empty string to pass through without applying the fallback.

## Expected Behavior

The validation should also check for empty strings and apply the same fallback override when \`values.dataset.length === 0\`.

## References

- PR: https://github.com/y-scope/clp/pull/1294
- Comment: https://github.com/y-scope/clp/pull/1294#discussion_r2673726720
- Reported by: @junhaoliao"

# Create JSON payload
cat << EOF | gh api \
  --method POST \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/${OWNER}/${REPO}/issues \
  --input -
{
  "title": "$ISSUE_TITLE",
  "body": $(echo "$ISSUE_BODY" | jq -Rs .),
  "assignees": ["junhaoliao"]
}
EOF

Length of output: 75


@junhaoliao, great catch on the validation bug! You're absolutely right that an empty string would bypass the fallback logic and cause 400 errors.

Unfortunately, I'm unable to create the GitHub issue automatically due to authentication limitations in the sandbox environment. However, here are the formatted issue details you can use to create it manually:


Title:

bug: Dataset field allows empty string bypassing fallback in compression job submission

Body:

## Description

When submitting a compression job through the web UI, if a user enters text in the Dataset input field and then removes it, an empty string `""` can be sent to the backend. This bypasses the fallback logic that sets the dataset to the default value, resulting in 400 errors.

## Current Behavior

The code currently checks only for `undefined`:
```typescript
payload.dataset = ("undefined" === typeof values.dataset) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;

This allows an empty string to pass through without applying the fallback.

Expected Behavior

The validation should also check for empty strings and apply the same fallback override when values.dataset.length === 0.

References


**Assignee:** junhaoliao

---

You can create this issue directly at: https://github.com/y-scope/clp/issues/new

---

<details>
<summary>🧠 Learnings used</summary>

Learnt from: junhaoliao
Repo: y-scope/clp PR: 1136
File: components/webui/client/src/pages/IngestPage/Details/sql.ts:1-1
Timestamp: 2025-07-29T21:00:07.757Z
Learning: User junhaoliao requested creating a GitHub issue to track server-side SQL query error handling improvements, specifically to prevent uncaught query errors from causing 500 errors to reach the client.


---

Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1125
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:267-291
Timestamp: 2025-09-15T22:20:40.750Z
Learning: For CLP compression jobs, the team has decided to fail the entire job immediately upon encountering any invalid input path, rather than continuing to process valid paths. This decision was made during PR #1125 development.


---

Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.


</details>



<!-- This is an auto-generated reply by CodeRabbit -->

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm sure you can create issues. try it

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@junhaoliao, you're right! Let me create that issue for you now.


🏁 Script executed:

#!/bin/bash
# Create GitHub issue to track the dataset empty string validation bug

gh issue create \
  --repo y-scope/clp \
  --title "bug: Dataset field allows empty string bypassing fallback in compression job submission" \
  --assignee junhaoliao \
  --body "## Description

When submitting a compression job through the web UI, if a user enters text in the Dataset input field and then removes it, an empty string \`\"\"\` can be sent to the backend. This bypasses the fallback logic that sets the dataset to the default value, resulting in 400 errors.

## Current Behavior

The code currently checks only for \`undefined\`:
\`\`\`typescript
payload.dataset = (\"undefined\" === typeof values.dataset) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;
\`\`\`

This allows an empty string to pass through without applying the fallback.

## Expected Behavior

The validation should also check for empty strings and apply the same fallback override when \`values.dataset.length === 0\`.

## Suggested Fix

\`\`\`typescript
payload.dataset = (\"undefined\" === typeof values.dataset || 0 === values.dataset.length) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;
\`\`\`

## References

- PR: https://github.com/y-scope/clp/pull/1294
- Comment: https://github.com/y-scope/clp/pull/1294#discussion_r2673726720
- Reported by: @junhaoliao"

Length of output: 75


@junhaoliao, I apologize for the confusion. The GitHub CLI requires authentication via the GH_TOKEN environment variable, which is not available in the sandbox environment where I execute scripts. This is a security limitation that prevents automated issue creation.

However, I can provide you with the complete formatted issue details that you can use to create it manually:


Title:

bug: Dataset field allows empty string bypassing fallback in compression job submission

Body:

## Description

When submitting a compression job through the web UI, if a user enters text in the Dataset input field and then removes it, an empty string `""` can be sent to the backend. This bypasses the fallback logic that sets the dataset to the default value, resulting in 400 errors.

## Current Behavior

The code currently checks only for `undefined`:
```typescript
payload.dataset = ("undefined" === typeof values.dataset) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;

This allows an empty string to pass through without applying the fallback.

Expected Behavior

The validation should also check for empty strings and apply the same fallback override when values.dataset.length === 0.

Suggested Fix

payload.dataset = ("undefined" === typeof values.dataset || 0 === values.dataset.length) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;

References


**Assignee:** junhaoliao

---

You can create this issue at: https://github.com/y-scope/clp/issues/new

---

<details>
<summary>🧠 Learnings used</summary>

Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1125
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:267-291
Timestamp: 2025-09-15T22:20:40.750Z
Learning: For CLP compression jobs, the team has decided to fail the entire job immediately upon encountering any invalid input path, rather than continuing to process valid paths. This decision was made during PR #1125 development.


---

Learnt from: junhaoliao
Repo: y-scope/clp PR: 1136
File: components/webui/client/src/pages/IngestPage/Details/sql.ts:1-1
Timestamp: 2025-07-29T21:00:07.757Z
Learning: User junhaoliao requested creating a GitHub issue to track server-side SQL query error handling improvements, specifically to prevent uncaught query errors from causing 500 errors to reach the client.


---

Learnt from: haiqi96
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-07-29T14:04:13.769Z
Learning: User haiqi96 requested creating a GitHub issue to document a bug fix from PR #1136, which addressed MySQL compatibility issues with invalid SQL CAST operations in the WebUI component.


---

Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.


---

Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.


</details>



<!-- This is an auto-generated reply by CodeRabbit -->

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should believe in yourself: https://docs.coderabbit.ai/guides/issue-creation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Fetch the issue creation documentation
curl -s https://docs.coderabbit.ai/guides/issue-creation

Length of output: 270238


🏁 Script executed:

#!/bin/bash
# Create GitHub issue for the dataset validation bug
gh issue create \
  --repo y-scope/clp \
  --title "bug: Dataset field allows empty string bypassing fallback in compression job submission" \
  --assignee junhaoliao \
  --body "## Description

When submitting a compression job through the web UI, if a user enters text in the Dataset input field and then removes it, an empty string \`\"\"\` can be sent to the backend. This bypasses the fallback logic that sets the dataset to the default value, resulting in 400 errors.

## Current Behavior

The code currently checks only for \`undefined\`:
\`\`\`typescript
payload.dataset = (\"undefined\" === typeof values.dataset) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;
\`\`\`

This allows an empty string to pass through without applying the fallback.

## Expected Behavior

The validation should also check for empty strings and apply the same fallback override when \`values.dataset.length === 0\`.

## Suggested Fix

\`\`\`typescript
payload.dataset = (\"undefined\" === typeof values.dataset || 0 === values.dataset.length) ?
    CLP_DEFAULT_DATASET_NAME :
    values.dataset;
\`\`\`

## References

- PR: https://github.com/y-scope/clp/pull/1294
- Comment: https://github.com/y-scope/clp/pull/1294#discussion_r2673726720
- Reported by: @junhaoliao"

Length of output: 42


@junhaoliao Thanks for the nudge — believed and did it. Issue created successfully:

Proposed front-end fix in components/webui/client/src/pages/IngestPage/Compress/index.tsx to handle empty/whitespace dataset and apply the default consistently:

// Inside handleSubmit, where payload is built
const payload: CompressionJobCreation = { paths };

if (CLP_STORAGE_ENGINES.CLP_S === SETTINGS_STORAGE_ENGINE) {
  // Normalize dataset: treat undefined, empty, or whitespace-only as empty
  const datasetRaw = values.dataset;
  const dataset = ("undefined" === typeof datasetRaw) ? "" : datasetRaw.trim();

  payload.dataset = (0 === dataset.length)
    ? CLP_DEFAULT_DATASET_NAME
    : dataset;

  if ("undefined" !== typeof values.timestampKey) {
    payload.timestampKey = values.timestampKey;
  }
}

Notes:

  • Uses trim() so whitespace-only input also falls back.
  • Avoids negation, using explicit comparisons per your TS/TSX guidelines.

If you’d like, I can also open a follow-up issue to add antd Form validation when CLP_S is active (UI feedback before submit).


🧠 Learnings used
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1125
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:267-291
Timestamp: 2025-09-15T22:20:40.750Z
Learning: For CLP compression jobs, the team has decided to fail the entire job immediately upon encountering any invalid input path, rather than continuing to process valid paths. This decision was made during PR #1125 development.

Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants