Skip to content

fix(query-scheduler): Gracefully terminate reducer connections that do not attempt a connection handshake (fixes #1386).#1389

Merged
junhaoliao merged 3 commits into
y-scope:mainfrom
junhaoliao:graceful-reducer-handler
Oct 9, 2025
Merged

fix(query-scheduler): Gracefully terminate reducer connections that do not attempt a connection handshake (fixes #1386).#1389
junhaoliao merged 3 commits into
y-scope:mainfrom
junhaoliao:graceful-reducer-handler

Conversation

@junhaoliao

@junhaoliao junhaoliao commented Oct 6, 2025

Copy link
Copy Markdown
Member

Description

This PR adds graceful handling for empty reducer connections to the query scheduler, by ignoring such benign noises.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. Built and started the package.
  2. Ran docker ps to find out the instance id of the query-scheduler.
  3. Ran docker exec <query-scheduler-id> bash -c '< /dev/tcp/localhost/7000' and observed the command exited successfully.
  4. Ran docker logs <query-scheduler-id> and observed no exception logs were printed as before.

Summary by CodeRabbit

  • Bug Fixes

    • Treat reducer connections that close without sending data as benign to avoid spurious errors or crashes.
    • Short-circuit connection handling when no message is received and log these cases at debug level to reduce noisy error logs.
    • Updated docstrings to clarify return behavior and error handling for closed connections.
  • Reliability

    • Improved resilience of connection handling for smoother job orchestration under intermittent network conditions.

@junhaoliao junhaoliao requested a review from a team as a code owner October 6, 2025 21:54
@coderabbitai

coderabbitai Bot commented Oct 6, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Change _recv_msg_from_reducer to return Optional[bytes] and handle asyncio.IncompleteReadError: if the exception's partial data is empty, log a debug message and return None; if partial data exists, re-raise the exception. Reducer handler short-circuits on None. Signature changed; no other public APIs altered.

Changes

Cohort / File(s) Summary of changes
Reducer connection handling
components/job-orchestration/job_orchestration/scheduler/query/reducer_handler.py
Update _recv_msg_from_reducer return type to Optional[bytes]; catch asyncio.IncompleteReadError, return None and log when err.partial is empty, re-raise when non-empty; short-circuit reducer connection handler when received message is None; update docstrings.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Reducer
  participant Scheduler as Scheduler::ReducerHandler

  Note over Scheduler: Await initial reducer read via _recv_msg_from_reducer

  Reducer->>Scheduler: Open connection / send data (or close)
  alt Read completes with data (bytes returned)
    Scheduler-->>Reducer: Continue normal processing
  else asyncio.IncompleteReadError raised
    alt err.partial length == 0
      Note over Scheduler: Debug log and return None (short-circuit)
      Scheduler-->>Reducer: Close handling (no error)
    else err.partial length > 0
      Note over Scheduler: Re-raise exception (propagate error)
      Scheduler-->>Reducer: Error propagated
    end
  end
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly describes the primary change to the query scheduler by indicating that reducer connections without a handshake will be gracefully terminated, follows Conventional Commit conventions, and clearly references the related issue, making it both specific and informative.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@junhaoliao junhaoliao requested a review from gibber9809 October 6, 2025 21:55

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba34c27 and f256f14.

📒 Files selected for processing (1)
  • components/job-orchestration/job_orchestration/scheduler/query/reducer_handler.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: package-image
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-22.04)
  • GitHub Check: rust-checks (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)

Comment thread components/job-orchestration/job_orchestration/scheduler/query/reducer_handler.py Outdated

@gibber9809 gibber9809 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. For PR title how about:
fix(query-scheduler): Gracefully terminate reducer connections that do not attempt a connection handshake (fixes #1386).

@junhaoliao junhaoliao changed the title fix(query-scheduler): Handle empty reducer connections gracefully (fixes #1386). fix(query-scheduler): Gracefully terminate reducer connections that do not attempt a connection handshake (fixes #1386). Oct 9, 2025
@junhaoliao junhaoliao merged commit b8f7b75 into y-scope:main Oct 9, 2025
24 checks passed
@junhaoliao junhaoliao deleted the graceful-reducer-handler branch May 7, 2026 19:46
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants