fix(job-orchestration): Update celery to 5.5.3 with redis extra to replace direct redis dependency (fixes #1059). #1213

Merged
junhaoliao merged 2 commits into y-scope:main from junhaoliao:celery-stuck-upgrade-redis on Aug 18, 2025
Conversation

@junhaoliao junhaoliao commented Aug 17, 2025

Member

Description

Update Celery to 5.5.3 with the redis extra and drop the direct redis dependency to fix a deadlock in the Celery Redis backend. The deadlock occurred when AsyncResult.__del__() issued a Redis UNSUBSCRIBE while another frame held the PubSub lock.
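
The shape of that deadlock can be illustrated with a toy model. The sketch below is not redis-py's or Celery's actual code: `FakePubSub` and its methods are hypothetical names, and the only assumption is that the lock involved is non-reentrant, so a finalizer that runs in a frame already holding the lock can never acquire it. A timeout is used purely so the example terminates instead of hanging.

```python
import threading


class FakePubSub:
    """Hypothetical stand-in for a pub/sub client guarded by a non-reentrant lock."""

    def __init__(self):
        # threading.Lock (unlike RLock) cannot be re-acquired by the thread holding it.
        self._lock = threading.Lock()
        self.unsubscribed = False

    def unsubscribe(self, timeout=0.1):
        # A client without a timeout would block here forever; the timeout
        # exists only so this sketch returns instead of hanging.
        if not self._lock.acquire(timeout=timeout):
            return False
        try:
            self.unsubscribed = True
            return True
        finally:
            self._lock.release()

    def read_while_finalizer_runs(self, finalizer):
        # Simulates a frame that holds the lock at the moment a finalizer
        # (for example a __del__ triggered by garbage collection) runs in
        # the same thread.
        with self._lock:
            return finalizer()


pubsub = FakePubSub()
blocked = pubsub.read_while_finalizer_runs(pubsub.unsubscribe)  # lock already held
free = pubsub.unsubscribe()  # lock not held
print(blocked, free)
```

In this model the first call cannot take the lock and the second can, which mirrors why the hang appeared only intermittently: it depends on when the finalizer happens to run.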

Although pyproject.toml specified redis = "^5.0.1", the ^ specifier allowed it to be upgraded to as high as 5.3.1, which has the deadlock bug. With celery = {extras = ["redis"]}, Celery enforces its own constraint (currently redis = ["redis (>=4.5.2,!=4.5.5,!=5.0.2,<=5.2.1)"]), so a (hopefully) compatible version of redis is installed instead.
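
As a rough check of the reasoning above, the following sketch compares the two constraints using plain tuple comparison. It assumes simple three-part numeric versions and uses the `^5.0.1` constraint shown in the diff; `parse`, `caret_allows`, and `celery_extra_allows` are illustrative helpers, not part of Poetry or Celery.

```python
def parse(version):
    """Turn '5.3.1' into (5, 3, 1); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def caret_allows(base, candidate):
    """Poetry caret rule for a non-zero major version: >=base and <next major."""
    b, c = parse(base), parse(candidate)
    return b <= c < (b[0] + 1, 0, 0)


def celery_extra_allows(candidate):
    """The constraint quoted in the description: >=4.5.2,!=4.5.5,!=5.0.2,<=5.2.1."""
    c = parse(candidate)
    return (4, 5, 2) <= c <= (5, 2, 1) and c not in {(4, 5, 5), (5, 0, 2)}


for version in ("5.0.1", "5.2.1", "5.3.1"):
    print(version, caret_allows("5.0.1", version), celery_extra_allows(version))
```

Under these rules 5.3.1 satisfies the caret constraint but not the constraint Celery places on its redis extra, which is the gap this PR closes.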

See the RCA at #1059 (comment) for more details. The latest strategy is to respect the redis version constraint specified by Celery and avoid specifying redis as a direct dependency.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

cd clp-package/sbin
./start-clp.sh
./compress.sh ~/sample/hive-24hr/

Then, in the WebUI, search with the query string "123" 20 times. No hang was observed.

Also, to cover the case where new jobs are created in handle_finished_search_job, I compressed 10 copies of the hive-24hr logs. Queries in the WebUI then made the backend schedule multiple Celery jobs, and no hang was observed across 20 queries.

Summary by CodeRabbit

  • New Features
    • No user-facing changes in this release.
  • Chores
    • Upgraded Redis dependency to 6.4.x for the job orchestration component.
    • This update may require reinstalling dependencies or refreshing your environment.
    • Background improvements only; existing workflows and behaviour remain unchanged.

@junhaoliao junhaoliao marked this pull request as ready for review August 17, 2025 02:23
@junhaoliao junhaoliao requested a review from a team as a code owner August 17, 2025 02:23

coderabbitai Bot commented Aug 17, 2025

Contributor

Walkthrough

Update to the job-orchestration component's Poetry config: Celery spec changed to include redis extras and a newer version, standalone redis dependency removed, msgpack and mysql-connector-python added, and build-backend metadata added.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Poetry dependency changes: components/job-orchestration/pyproject.toml | Replace celery = "^5.3.6" with celery = {extras = ["redis"], version = "^5.5.3"}; remove redis = "^5.0.1"; add msgpack = "^1.0.7" and mysql-connector-python = "^8.2.0" under [tool.poetry.dependencies]. |
| Build-system metadata: components/job-orchestration/pyproject.toml | Add build-backend = "poetry.core.masonry.api" under [build-system] while retaining requires = ["poetry-core"]. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

@junhaoliao junhaoliao Aug 17, 2025

Member Author

The lock file was updated by running

poetry update redis

inside components/job-orchestration after changing the redis version in pyproject.toml
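
After refreshing the lock file, the resolved versions can be confirmed by reading poetry.lock directly. This is a minimal sketch, assuming the lock file keeps its usual `[[package]]` tables with `name` and `version` keys; `resolved_versions` is a hypothetical helper and `SAMPLE_LOCK` is a hand-written excerpt for illustration, not the real lock file.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older interpreters


def resolved_versions(lock_text, names):
    """Map each requested package name to the version recorded in the lock text."""
    data = tomllib.loads(lock_text)
    return {
        package["name"]: package["version"]
        for package in data.get("package", [])
        if package["name"] in names
    }


# Hand-written excerpt in the shape of a poetry.lock file (illustrative only).
SAMPLE_LOCK = """
[[package]]
name = "celery"
version = "5.5.3"

[[package]]
name = "redis"
version = "5.2.1"
"""

print(resolved_versions(SAMPLE_LOCK, {"celery", "redis"}))
```

For the real file, the text would come from something like `pathlib.Path("components/job-orchestration/poetry.lock").read_text()`.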

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 619392f and 6f32b09.

⛔ Files ignored due to path filters (1)
  • components/job-orchestration/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • components/job-orchestration/pyproject.toml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (2)
components/job-orchestration/pyproject.toml (2)

24-24: LGTM: Upgrade to redis 6.4.0 aligns with Python constraint and addresses the PubSub deadlock.

  • Python constraint (>=3.9) is compatible with redis-py 6.x.
  • The change is scoped and matches the PR objective to resolve the Celery Redis backend deadlock.

24-24: Lockfile updated and exports verified

  • components/job-orchestration/poetry.lock now lists Redis v6.4.0 as expected.
  • No requirements*.txt files were found—there are no exported requirements to regenerate.
  • Be sure to clear or rebuild any CI/container caches so the new Redis version is picked up.

 pymongo = "^4.7.2"
 PyYAML = "^6.0.1"
-redis = "^5.0.1"
+redis = "^6.4.0"

Contributor

🧹 Nitpick (assertive)

Optional: Document the rationale for pinning to >=6.4.0 to prevent accidental downgrades.

Add an inline comment referencing the deadlock context (issue #1059). This helps future maintainers avoid “optimizing” the version downwards and reintroducing the issue.

Apply this diff:

 [tool.poetry.dependencies]
 python = ">= 3.9"
@@
 PyYAML = "^6.0.1"
-redis = "^6.4.0"
+# NOTE: 6.4.0+ required to avoid Celery Redis backend deadlock (see #1059).
+redis = "^6.4.0"
🤖 Prompt for AI Agents
In components/job-orchestration/pyproject.toml around line 24, the redis
dependency is pinned to "^6.4.0" but lacks an explanation; update that line to
include an inline comment stating the rationale: we require redis >=6.4.0 to
avoid a known deadlock bug (reference issue #1059) so maintainers don’t
accidentally downgrade; ensure the comment is concise, mentions "deadlock (issue
#1059)" and why the minimum is required, and keep formatting valid for TOML.

💡 Verification agent

🧩 Analysis chain

Verify Celery/kombu compatibility with redis-py 6.4.0 and absence of deprecated API usage.

redis-py 6.x introduces behavioural changes; ensure Celery 5.3.6 (via kombu) is fully compatible in your deployment matrix and that the codebase doesn’t use deprecated redis APIs (e.g., StrictRedis, older PubSub semantics).

Run this script to:

  • Confirm no usage of potentially incompatible redis APIs across the repo.
  • Ensure poetry.lock(s) resolve redis to 6.x and show Celery version(s) in use.

Additionally, please confirm Celery 5.3.6’s compatibility with redis-py 6.x:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "==> Searching for potentially incompatible redis APIs (StrictRedis, redis.asyncio direct usage)"
rg -n -C2 --type=python -P '(from\s+redis\s+import\s+StrictRedis\b|redis\.StrictRedis\b|redis\.asyncio\b)' || echo "No deprecated/fragile redis APIs found."

echo
echo "==> Locating poetry.lock files and checking resolved versions for redis and celery"
fd -a -t f '^poetry\.lock$' | while read -r lock; do
  echo "--- $lock ---"
  echo "[[package]] name = \"redis\" entries near version:"
  rg -n -C3 -F 'name = "redis"' "$lock" || echo "redis not found in $lock"
  echo
  echo "[[package]] name = \"celery\" entries near version:"
  rg -n -C3 -F 'name = "celery"' "$lock" || echo "celery not found in $lock"
  echo
done

echo
echo "==> Verifying the dependency declaration in components/job-orchestration/pyproject.toml"
rg -n -C2 -F 'redis = "^6.4.0"' components/job-orchestration/pyproject.toml

Length of output: 3692


🌐 Web query:

Does Celery 5.3.6 officially support redis-py 6.x (including kombu’s redis transport)?

💡 Result:

Short answer: No — Celery 5.3.6 does not officially support redis‑py 6.x. The Redis transport comes from Kombu, and the Kombu versions that Celery 5.3.6 depends on target redis‑py 4.x (e.g. Kombu 5.3.x explicitly states support for redis‑py 4.5.x). (docs.celeryq.dev)

Why:

  • Celery 5.3.x uses Kombu 5.3.x (Celery 5.3.6 bumps kombu to the 5.3 series). (docs.celeryq.dev)
  • Kombu 5.3.x documents support for redis‑py 4.5.x / requires redis >= 4.5.2 (it does not list redis‑py 6.x as supported). The Kombu/redis transport is therefore the component to watch for redis‑py compatibility. (docs.celeryq.dev)

Practical guidance:

  • If you are running Celery 5.3.6, pin your redis client to the Kombu-supported range (for Kombu 5.3.x that means something like redis>=4.5.2,<5.0) to avoid breakage. (docs.celeryq.dev)
  • Alternatively, upgrade to a newer Kombu/Celery release that explicitly documents support for redis‑py 5/6 — but check Kombu’s changelog/requirements first (compatibility is controlled by Kombu’s redis transport).
  • If you want, tell me the exact Celery/Kombu/redis-py versions you have (or your pyproject/requirements), and I’ll recommend a safe version constraint or an upgrade path.

Incompatible Celery/Kombu with redis-py 6.x detected

Celery 5.3.6 (via Kombu 5.3.x) officially supports only redis-py 4.x, not 6.x. Locking redis = "^6.4.0" risks breaking the Redis transport.

Please apply one of the following critical fixes:

  • Pin redis to the Kombu-supported range:
    • In components/job-orchestration/pyproject.toml (line 24), change
    redis = "^6.4.0"
    to something like
    redis = ">=4.5.2,<5.0"
  • OR upgrade Celery/Kombu to a release that explicitly supports redis-py 6.x:
    • Bump celery (and implicitly Kombu) in pyproject.toml, then run poetry lock --no-update redis to ensure redis-py resolves to a compatible version.
  • After making either change, regenerate your lockfile and verify:
    poetry.lock should show redis 4.x under components/job-orchestration/poetry.lock
    celery should remain at 5.3.6 (if not upgrading) or move to your target version if you choose to bump.
🤖 Prompt for AI Agents
In components/job-orchestration/pyproject.toml around line 24, the redis
dependency is pinned to an incompatible 6.x (redis = "^6.4.0") which breaks
Celery/Kombu; either change that line to a Kombu-supported range (for example
redis = ">=4.5.2,<5.0") OR upgrade celery/kombu to a version that explicitly
supports redis-py 6.x, then regenerate the lockfile; after making the change run
poetry lock (or poetry lock --no-update redis if you upgraded Celery) and verify
components/job-orchestration/poetry.lock shows redis 4.x (and celery remains at
5.3.6 if you did not upgrade).

@junhaoliao junhaoliao changed the title from "fix(job-orchestration): Upgrade redis-py to 6.4.0 to resolve celery deadlock (fixes #1059)." to "fix(job-orchestration): Upgrade redis-py to 6.4.0 to resolve Celery deadlock (fixes #1059)." on Aug 17, 2025
 pymongo = "^4.7.2"
 PyYAML = "^6.0.1"
-redis = "^5.0.1"
+redis = "^6.4.0"

Member

I was looking at the lock file and realized that Celery actually specifies a version constraint on the "extra" redis-py. Since we're not using redis-py directly in our code, we should install redis-py as an extra of celery (akin to pip install celery[redis]).

We can do this by removing redis from pyproject.toml and updating the celery install to:

-celery = "^5.3.6"
+celery = {extras = ["redis"], version = "^5.3.6"}

Note that Celery locks to redis-py <= v5.2.1 rather than v6.4.0. I tested the change locally and it also seems to resolve the issue (at least no freezes after 200 queries).
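
One way to see the constraint Celery places on its redis extra is to filter the requirement strings in its package metadata. The sketch below assumes those strings carry an `extra == "redis"` environment marker; `requirements_for_extra` is a hypothetical helper, and `SAMPLE` is a made-up list that only reuses the redis constraint quoted in the PR description.

```python
import re


def requirements_for_extra(requires, extra):
    """Return the requirement specs gated behind the given extra marker."""
    pattern = re.compile(r"""extra\s*==\s*['"]{}['"]""".format(re.escape(extra)))
    selected = []
    for requirement in requires or []:
        # Everything after ';' is the environment marker, if there is one.
        spec, _, marker = requirement.partition(";")
        if pattern.search(marker):
            selected.append(spec.strip())
    return selected


# Made-up metadata entries for illustration; only the redis constraint
# comes from the PR description.
SAMPLE = [
    "kombu",
    'redis!=4.5.5,!=5.0.2,<=5.2.1,>=4.5.2; extra == "redis"',
    'pymongo; extra == "mongodb"',
]

print(requirements_for_extra(SAMPLE, "redis"))
```

With Celery installed, the input list could come from `importlib.metadata.requires("celery")`, which returns the distribution's requirement strings, or `None` when the metadata declares none.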

@junhaoliao junhaoliao changed the title from "fix(job-orchestration): Upgrade redis-py to 6.4.0 to resolve Celery deadlock (fixes #1059)." to "fix(job-orchestration): Update Celery to 5.5.3 with redis extra; Drop direct redis dependency (fixes #1059)." on Aug 18, 2025

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

♻️ Duplicate comments (1)
components/job-orchestration/pyproject.toml (1)

14-14: Document the deadlock rationale on the dependency line.

Add a concise inline comment so future bumps don’t regress the redis-py fix context (issue #1059).

Apply this diff:

 [tool.poetry.dependencies]
 python = ">= 3.9"
 Brotli = "^1.1.0"
+# NOTE: Using celery[redis] with a redis-py version that includes the PubSub UNSUBSCRIBE deadlock fix (issue #1059; redis-py PR 3677).
 celery = {extras = ["redis"], version = "^5.5.3"}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f32b09 and 1dd3b82.

⛔ Files ignored due to path filters (1)
  • components/job-orchestration/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • components/job-orchestration/pyproject.toml (1 hunks)

 python = ">= 3.9"
 Brotli = "^1.1.0"
-celery = "^5.3.6"
+celery = {extras = ["redis"], version = "^5.5.3"}

Contributor

🛠️ Refactor suggestion

Optionally enforce redis-py >= 6.4.0 to prevent accidental downgrades (only if Kombu in this Celery series supports redis 6.x).

If your lock check confirms Kombu in Celery ^5.5.3 supports redis 6.x, consider adding an explicit lower bound to avoid future downgrades when the lock is refreshed. If this conflicts during resolution, keep relying solely on celery[redis] and skip this.

Apply this diff only if resolution succeeds:

 [tool.poetry.dependencies]
 python = ">= 3.9"
 Brotli = "^1.1.0"
 celery = {extras = ["redis"], version = "^5.5.3"}
+# Enforce min redis-py that includes the deadlock fix (issue #1059; redis-py PR 3677).
+redis = ">=6.4.0,<7"

Follow up:

  • Run: poetry lock --no-interaction
  • Re-verify the lockfile contains redis >= 6.4.0 and Kombu consistent with Celery.

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

[tool.poetry.dependencies]
python = ">= 3.9"
Brotli = "^1.1.0"
celery = {extras = ["redis"], version = "^5.5.3"}
# Enforce min redis-py that includes the deadlock fix (issue #1059; redis-py PR 3677).
redis = ">=6.4.0,<7"
🤖 Prompt for AI Agents
In components/job-orchestration/pyproject.toml around line 14, the celery[redis]
extra may allow redis-py to be downgraded; if Kombu used by Celery ^5.5.3 is
compatible with redis-py 6.x, add an explicit lower bound by changing the
dependency to require redis >=6.4.0 (e.g. add a redis = ">=6.4.0" entry or
adjust celery extras accordingly), then run `poetry lock --no-interaction` and
verify the lockfile contains redis >= 6.4.0 and that Kombu resolution remains
consistent; if dependency resolution fails or Kombu is incompatible, leave
celery[redis] as-is and do not apply the change.

💡 Verification agent

🧩 Analysis chain

Verify the lock resolves redis-py >= 6.4.0 and a Kombu that supports it.

The PR goal is to land redis-py 6.4.0+ to pick up the PubSub deadlock fix. Using celery[redis] is good, but the resolved redis-py version is determined by Celery/Kombu’s constraints. Please verify the lockfile actually pulls redis-py >= 6.4.0 and that the bundled Kombu is compatible.

Run this script from the repo root:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "==> Checking resolved versions in poetry.lock (all occurrences)"
fd -a -t f '^poetry\.lock$' | while read -r lock; do
  echo
  echo "--- $lock ---"
  rg -n -C2 -F 'name = "celery"' "$lock" || true
  rg -n -C2 -F 'name = "kombu"' "$lock" || true
  rg -n -C2 -F 'name = "redis"' "$lock" || true
done

echo
echo "==> Confirming the dependency declaration that drives resolution"
rg -n -C2 -F 'celery = {extras = ["redis"]' components/job-orchestration/pyproject.toml || true

echo
echo "==> Guard-rail: scan for fragile/legacy redis APIs in code"
rg -n -C2 --type=python -P '(from\s+redis\s+import\s+StrictRedis\b|redis\.StrictRedis\b|redis\.asyncio\b)' || echo "No legacy redis APIs found."

echo
echo "==> Summary check"
if fd -a -t f '^poetry\.lock$' | xargs rg -nP 'name = "redis"\n.*\nversion = "([0-9.]+)"' -U | awk -F'"' '{print $6}' | awk 'BEGIN{ok=0} {split($0,a,"."); if (a[1]>=6 && (a[2]>4 || (a[2]==4 && a[3]>=0))) ok=1} END{exit ok?0:1}'; then
  echo "redis-py appears to be >= 6.4.0 in the lock."
else
  echo "redis-py is below 6.4.0 or not found at the expected place in the lock; please investigate."
  exit 1
fi

Length of output: 3244


Celery[redis] is locking redis-py at 5.2.1, below the required 6.4.0. Critical dependency bump needed.

Please update the dependency declaration and lockfile so that redis-py >= 6.4.0 is actually pulled in, and ensure Kombu remains compatible with the newer client. For example:

  • In components/job-orchestration/pyproject.toml (around line 14), either:

    • Add an explicit top-level dependency
      redis = ">=6.4.0"
      alongside the existing celery = {extras = ["redis"], version = "^5.5.3"}, or
    • Bump Celery (and thus its bundled Kombu) to a version whose extras allow redis-py 6.x.
  • Run poetry lock and verify in components/job-orchestration/poetry.lock that:

    • name = "redis" shows version = ">=6.4.0" (or a specific 6.4.x+ entry), and
    • name = "kombu" remains at a version known to support redis-py 6.x (e.g. Kombu 5.5.4+).

After locking, test the PubSub use cases to confirm the deadlock fix is in effect.

🤖 Prompt for AI Agents
In components/job-orchestration/pyproject.toml around line 14, Celery's redis
extra is pulling redis-py 5.2.1 which is below the required 6.4.0; fix by adding
an explicit top-level dependency redis = ">=6.4.0" next to celery = {extras =
["redis"], version = "^5.5.3"} (or alternatively bump Celery to a version whose
extras allow redis-py 6.x), then run `poetry lock` to regenerate
components/job-orchestration/poetry.lock and verify the lockfile contains a
redis entry at 6.4.0+ and that kombu remains at a version compatible with
redis-py 6.x (adjust Celery/Kombu if needed), finally run your PubSub tests to
confirm the deadlock is resolved.

@kirkrodrigues kirkrodrigues left a comment

Member

For the PR title, how about:

fix(job-orchestration): Update `celery` to 5.5.3 with `redis` extra to replace direct `redis` dependency (fixes #1059).

@junhaoliao junhaoliao changed the title from "fix(job-orchestration): Update Celery to 5.5.3 with redis extra; Drop direct redis dependency (fixes #1059)." to "fix(job-orchestration): Update celery to 5.5.3 with redis extra to replace direct redis dependency (fixes #1059)." on Aug 18, 2025
@junhaoliao junhaoliao merged commit d2e97ce into y-scope:main Aug 18, 2025
12 checks passed
@junhaoliao junhaoliao deleted the celery-stuck-upgrade-redis branch August 18, 2025 03:03
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026