[PD] Add retry interval in ensure_prefill_info #20832

Merged
ShangmingCai merged 1 commit into main from pd_retry_interval
Mar 18, 2026

@ShangmingCai
Collaborator

Motivation

A follow-up PR to #20785.

  • Non-blocking version with a retry interval. Without one, _max_ensure_retries would be meaningless in production, since all 30 retries fail within 1 second.

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

Signed-off-by: Shangming Cai <csmthu@gmail.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request makes the ensure_prefill_info step in the disaggregation decode component more robust. A time-based retry interval stops the system from re-attempting a failed prefill lookup immediately, which gives temporary network or resource issues time to clear. As a result, the retry budget is no longer exhausted within a very short timeframe, and the retry logic becomes more effective and less resource-intensive.

Highlights

  • Retry Mechanism Enhancement: Introduced a retry interval (_ensure_retry_interval) to prevent rapid, consecutive retries for ensure_prefill_info calls, making the system more resilient to transient failures.
  • Configuration Adjustment: Decreased the maximum number of retries (_max_ensure_retries) from 30 to 20, aligning with the new retry interval logic.
  • Time-based Retries: Implemented a mechanism to track the last attempt time for each bootstrap address, ensuring that subsequent retries adhere to the defined interval.
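
The three highlights above can be sketched as a small non-blocking gate. This is an illustration only, not the code merged in this PR: the class name `RetryGate`, its method names, and the 1.0 second interval are assumptions made for the example. Only the limit of 20 retries and the idea of tracking the last attempt time per bootstrap address come from the summary.

```python
import time
from typing import Callable, Dict


class RetryGate:
    """Hypothetical helper: spaces out retries per bootstrap address."""

    def __init__(
        self,
        max_retries: int = 20,          # value named in the PR summary
        retry_interval: float = 1.0,    # placeholder; the PR text gives no value
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_retries = max_retries
        self.retry_interval = retry_interval
        self._clock = clock
        self._retry_count: Dict[str, int] = {}
        self._last_attempt: Dict[str, float] = {}

    def should_attempt(self, addr: str) -> bool:
        """Return True if addr may be tried now. Never sleeps."""
        now = self._clock()
        last = self._last_attempt.get(addr)
        if last is not None and now - last < self.retry_interval:
            # Too soon: skip this round and let the caller move on.
            return False
        self._last_attempt[addr] = now
        return True

    def record_failure(self, addr: str) -> bool:
        """Count a failed attempt; return True once the budget is spent."""
        count = self._retry_count.get(addr, 0) + 1
        self._retry_count[addr] = count
        if count >= self.max_retries:
            self.clear(addr)
            return True
        return False

    def clear(self, addr: str) -> None:
        """Drop all per-address state (after success or after giving up)."""
        self._retry_count.pop(addr, None)
        self._last_attempt.pop(addr, None)
```

With a gate like this, the retry budget spans at least `max_retries * retry_interval` seconds instead of being consumed by back-to-back polling iterations, and the polling loop itself is never blocked by a sleep.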

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ShangmingCai
Collaborator Author

/rerun-stage stage-c-test-8-gpu-h20

@github-actions
Contributor

✅ Triggered stage-c-test-8-gpu-h20 to run independently (skipping dependencies).

@github-actions
Contributor

🔗 View workflow run

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a retry interval to the ensure_prefill_info function, which is a good improvement to prevent rapid-fire retries in a non-blocking scenario. The changes are logical and correctly implement the intended delay using time.monotonic. My feedback focuses on minor code style improvements to make the cleanup logic more concise and robust.
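
As a brief aside on why time.monotonic suits this kind of interval check: the sketch below inspects Python's clock metadata with the standard `time.get_clock_info` call. The helper name `elapsed_since` is made up for the example and does not appear in the PR.

```python
import time


def elapsed_since(start: float) -> float:
    """Seconds elapsed since a value previously read from time.monotonic()."""
    return time.monotonic() - start


# The wall clock can be changed by an administrator or by NTP, so the
# difference between two time.time() readings may jump or go negative.
wall = time.get_clock_info("time")

# The monotonic clock cannot go backwards, so differences between two
# readings are safe to compare against a retry interval.
mono = time.get_clock_info("monotonic")

start = time.monotonic()
waited = elapsed_since(start)
```

Only differences between monotonic readings are meaningful; the absolute value has no defined reference point, so it should not be logged or compared across processes as a timestamp.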

Comment thread python/sglang/srt/disaggregation/decode.py
Comment thread python/sglang/srt/disaggregation/decode.py
@ShangmingCai
Collaborator Author

ShangmingCai commented Mar 18, 2026

Related CI has passed.

@ShangmingCai ShangmingCai merged commit 8b46f1f into main Mar 18, 2026
89 of 97 checks passed
@ShangmingCai ShangmingCai deleted the pd_retry_interval branch March 18, 2026 08:02
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026
Signed-off-by: Shangming Cai <csmthu@gmail.com>
0-693 pushed a commit to 0-693/sglang that referenced this pull request Mar 25, 2026
Signed-off-by: Shangming Cai <csmthu@gmail.com>
dutsc pushed a commit to dutsc/sglang that referenced this pull request Mar 30, 2026
Signed-off-by: Shangming Cai <csmthu@gmail.com>
JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026
Signed-off-by: Shangming Cai <csmthu@gmail.com>
yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026
Signed-off-by: Shangming Cai <csmthu@gmail.com>