fix(job-orchestration): Make tag_ids a required list[int] for compatibility with Spider compressor.#1453
Conversation
WalkthroughInitialized Changes
Sequence Diagram(s)sequenceDiagram
participant Scheduler
participant DB
participant WorkerExecutor
rect #DFF0D8
note right of Scheduler: scheduling phase
Scheduler->>DB: read/create tag rows (if any)
DB-->>Scheduler: tag_ids (may be [])
Scheduler->>WorkerExecutor: enqueue compress task with tag_ids (list[int])
end
rect #FFF4D6
note right of WorkerExecutor: execution phase
WorkerExecutor->>WorkerExecutor: run_clp / compression_entry_point(tag_ids)
alt tag_ids non-empty
WorkerExecutor->>WorkerExecutor: update_job_metadata_and_tags -> update_tags(tag_ids)
else tag_ids empty
WorkerExecutor->>WorkerExecutor: skip update_tags (no-op)
end
WorkerExecutor->>DB: persist compression metadata
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/job-orchestration/job_orchestration/executor/compress/compression_task.py (1)
315-315: Add type hint fortag_idsparameter.The
tag_idsparameter lacks a type hint, which is inconsistent withupdate_job_metadata_and_tags(line 90) where it's typed asList[int].Apply this diff to add the type hint:
tag_ids, + tag_ids: List[int],Note: This also applies to
compression_entry_pointat line 525.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/job-orchestration/job_orchestration/executor/compress/compression_task.py(1 hunks)components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: package-image
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: rust-checks (ubuntu-24.04)
- GitHub Check: rust-checks (ubuntu-22.04)
🔇 Additional comments (2)
components/job-orchestration/job_orchestration/executor/compress/compression_task.py (1)
93-102: The fix correctly addresses the issue.The conditional guard ensures that
update_tagsis only called whentag_idsis non-empty, whileincrement_compression_job_metadatais always executed. This aligns with the scheduler initializingtag_idsto an empty list and prevents the failure when iterating overNone.components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py (1)
345-359: Excellent fix for the root cause.Initializing
tag_idsto an empty list instead ofNoneensures that the variable is always iterable. The subsequent logic correctly populatestag_idsfrom the database when tags are present, or leaves it as an empty list otherwise. This prevents the failure described in issue #1446.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/job-orchestration/job_orchestration/executor/compress/celery_compress.py(2 hunks)components/job-orchestration/job_orchestration/executor/compress/compression_task.py(3 hunks)
🧰 Additional context used
🪛 Ruff (0.14.1)
components/job-orchestration/job_orchestration/executor/compress/celery_compress.py
1-1: typing.List is deprecated, use list instead
(UP035)
🔇 Additional comments (3)
components/job-orchestration/job_orchestration/executor/compress/compression_task.py (3)
93-94: Conditional check appropriately addresses past feedback.The guard condition
if len(tag_ids) > 0:prevents unnecessary database calls when there are no tags to update. This implementation addresses the previous review feedback and is clear and explicit.
315-315: Type annotation correctly addresses reviewer feedback.The
List[int]type annotation makes the contract explicit and addresses LinZhihao-723's specific request to update type annotations throughout the call chain.
525-525: Type annotation correctly added.The
List[int]type annotation completes the chain of type annotations from the Celery task through to the core compression logic, ensuring consistency throughout.
junhaoliao
left a comment
There was a problem hiding this comment.
for the pr title, how about:
fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with spider compressor.
LinZhihao-723
left a comment
There was a problem hiding this comment.
Junhao's title should be ok, but make sure to use Spider not spider.
None for tag_ids.tag_ids a required list[int] for compatibility with Spider compressor.
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421) * docs: Add Slack community invite badge to home page README. (y-scope#1418) * refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417) Co-authored-by: Junhao Liao <junhao@junhao.ca> Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405) * chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411) * fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430) - Save cache entries using unique key per entry. - Restore latest entries using key prefix. - Avoid using outputs from optionally-run `restore` task. * fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445) Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444) * fix(webui): Submit queries that failed ANTLR validation to Presto. (y-scope#1450) * feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434) * feat(webui): Show query speed in native search status. (y-scope#1429) * fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453) * feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454) Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448) * feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459) * Add filters. * Update cargo lock. --------- Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com> Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com> Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com> Co-authored-by: Junhao Liao <junhao@junhao.ca> Co-authored-by: Junhao Liao <junhao.liao@yscope.com> Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com> Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com> Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com> Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421) * docs: Add Slack community invite badge to home page README. (y-scope#1418) * refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417) Co-authored-by: Junhao Liao <junhao@junhao.ca> Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405) * chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411) * fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430) - Save cache entries using unique key per entry. - Restore latest entries using key prefix. - Avoid using outputs from optionally-run `restore` task. * fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445) Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444) * fix(webui): Submit queries that failed ANTLR validation to Presto. (y-scope#1450) * feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434) * feat(webui): Show query speed in native search status. (y-scope#1429) * fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453) * feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454) Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448) * feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459) * Add filters. * Update cargo lock. --------- Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com> Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com> Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com> Co-authored-by: Junhao Liao <junhao@junhao.ca> Co-authored-by: Junhao Liao <junhao.liao@yscope.com> Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com> Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com> Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com> Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421) * docs: Add Slack community invite badge to home page README. (y-scope#1418) * refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417) Co-authored-by: Junhao Liao <junhao@junhao.ca> Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405) * chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411) * fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430) - Save cache entries using unique key per entry. - Restore latest entries using key prefix. - Avoid using outputs from optionally-run `restore` task. * fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445) Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444) * fix(webui): Submit queries that failed ANTLR validation to Presto. (y-scope#1450) * feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434) * feat(webui): Show query speed in native search status. (y-scope#1429) * fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453) * feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454) Co-authored-by: Junhao Liao <junhao.liao@yscope.com> * feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448) * feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459) * Add filters. * Update cargo lock. * Stupid fix --------- Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com> Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com> Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com> Co-authored-by: Junhao Liao <junhao@junhao.ca> Co-authored-by: Junhao Liao <junhao.liao@yscope.com> Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com> Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com> Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com> Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
…patibility with Spider compressor. (y-scope#1453)
Description
Compression orchestration fails, as identified by #1446, because
tag_idscould beNone, which is not supported by Spider.This PR fixes #1446 by initializing
tag_idsto empty list instead ofNone, so thattag_idswill never beNone, and is safe to be directly passed into Spider.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Bug Fixes
Chores