Fix spec info's filter when reqs are finished right after prefill by hnyls2002 · Pull Request #14742 · sgl-project/sglang

hnyls2002 · 2025-12-09T14:41:40Z

This PR fixes #14368

Before the fix, the scheduler crashed in filter_batch:

[2025-12-11 04:05:55] Scheduler hit an exception: Traceback (most recent call last):
  File "/host_home/common_sync/sglang/python/sglang/srt/managers/scheduler.py", line 2706, in run_scheduler_process
    scheduler.event_loop_normal()
  File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/host_home/common_sync/sglang/python/sglang/srt/managers/scheduler.py", line 989, in event_loop_normal
    batch = self.get_next_batch_to_run()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/host_home/common_sync/sglang/python/sglang/srt/managers/scheduler.py", line 1676, in get_next_batch_to_run
    self.last_batch.filter_batch(
  File "/host_home/common_sync/sglang/python/sglang/srt/managers/schedule_batch.py", line 1857, in filter_batch
    self.spec_info.filter_batch(
  File "/host_home/common_sync/sglang/python/sglang/srt/speculative/eagle_info.py", line 746, in filter_batch
    raise ValueError(error_msg)
ValueError: length of new_indices: 6 != length of topk_p: 7, this should not happen

[2025-12-11 04:05:55] SIGQUIT received. signum=None, frame=None. It usually means one child failed.
[1]    1715904 killed     python test_eagle_infer_a.py

Background of the issue

spec_info holds per-request batch state and should be filtered together with the scheduler batch. Currently it is not:
- DECODE: In spec v1's decoding stage, we filter spec_info right after verification, so it is filtered earlier than, and out of step with, the scheduler batch.
- PREFILL: forward_draft_extend never sees finished requests, so in the prefill/extend stage spec_info is filtered together with the scheduler batch, which differs from the decoding stage.

We introduced has_been_filtered to indicate whether the filtering in the scheduler batch should also apply to the spec_info, but this flag was incorrectly assigned previously.
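
The role of the flag can be illustrated with a minimal sketch. It assumes a toy speculative state that keeps one `topk_p` entry per request in a plain list; the class name `ToySpecInfo` is hypothetical and the body is not sglang's implementation. Only the wording of the error message is taken from the traceback above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySpecInfo:
    # Hypothetical stand-in: one entry per request in the batch.
    topk_p: List[float] = field(default_factory=list)

    def filter_batch(self, new_indices: List[int], has_been_filtered: bool = True) -> None:
        if has_been_filtered:
            # The caller claims spec_info was already filtered elsewhere,
            # so its length must already match the surviving requests.
            if len(new_indices) != len(self.topk_p):
                raise ValueError(
                    f"length of new_indices: {len(new_indices)} != "
                    f"length of topk_p: {len(self.topk_p)}, this should not happen"
                )
            return
        # Otherwise apply the same keep-indices as the scheduler batch.
        self.topk_p = [self.topk_p[i] for i in new_indices]


# Seven requests, one of which finishes; six indices survive.
spec = ToySpecInfo(topk_p=[0.1 * i for i in range(7)])
keep = [0, 1, 2, 3, 4, 6]
spec.filter_batch(keep, has_been_filtered=False)  # filters alongside the batch
```

In this model, passing `has_been_filtered=True` for the same seven-entry state would raise the `ValueError`, because nothing had filtered `topk_p` beforehand.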

When there are requests finished immediately after the prefill/extend stage, and there is no chunked prefill request, the previous implementation would consider this filter a DECODE filter and wouldn't filter the spec_info at all.
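
The failure case can be sketched as a pair of decision functions. This is a simplified model rather than the code from the PR: `ForwardMode` is a local stand-in enum, `old_has_been_filtered` is a guess at the shape of the previous heuristic based on the description above (no chunked prefill request was taken to mean a DECODE filter), and `new_has_been_filtered` assumes the fix keys the flag on whether the batch just ran prefill/extend.

```python
from enum import Enum, auto
from typing import Optional


class ForwardMode(Enum):  # local stand-in, not sglang's enum
    EXTEND = auto()
    DECODE = auto()


def old_has_been_filtered(chunked_req: Optional[str]) -> bool:
    # Assumed previous heuristic: without a chunked prefill request the filter
    # was treated as a DECODE filter, so spec_info was presumed already filtered.
    return chunked_req is None


def new_has_been_filtered(last_mode: ForwardMode) -> bool:
    # Assumed fixed rule: spec_info is pre-filtered only after a decode step;
    # after prefill/extend it must be filtered with the scheduler batch.
    is_extend_filter = last_mode is ForwardMode.EXTEND
    return not is_extend_filter


# Requests finish right after extend and there is no chunked prefill request.
old_flag = old_has_been_filtered(chunked_req=None)
new_flag = new_has_been_filtered(ForwardMode.EXTEND)
```

Under these assumptions the old rule reports the spec_info as already filtered and skips it, while the new rule requests filtering together with the scheduler batch.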

gemini-code-assist · 2025-12-09T14:41:55Z

Summary of Changes

Hello @hnyls2002, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue within the schedule_batch manager where the spec_info filtering mechanism was not behaving as expected, particularly when requests finished their prefill stage. The change refactors the logic for a flag that indicates whether filtering has occurred, ensuring that spec_info is always processed with the correct state, thereby preventing potential errors in speculative decoding or related operations.

Highlights

- Refactored Filtering Logic: Simplified the determination of the has_been_filtered flag passed to spec_info.filter_batch by introducing a new boolean variable is_extend_filter.
- Corrected Spec Info Filtering: Addressed a bug where spec_info was incorrectly filtered when requests completed immediately after prefill, ensuring the filter_batch method receives the accurate filtering state.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request refactors the logic for determining the has_been_filtered flag in filter_batch, which fixes a subtle bug related to speculative decoding when requests are finished right after prefill. The change is correct and simplifies the implementation. I've added a small suggestion to further improve readability by inlining the logic, making it more self-contained.

jimmy-evo · 2025-12-10T03:52:52Z

Works fine for me.
LGTM

hnyls2002 · 2025-12-11T01:57:41Z

/tag-and-rerun-ci

fix

c1a3a30

hnyls2002 requested review from Ying1123, merrymercy, xiezhq-hermann and zhyncs as code owners December 9, 2025 14:41

gemini-code-assist Bot reviewed Dec 9, 2025

Comment thread python/sglang/srt/managers/schedule_batch.py

hnyls2002 mentioned this pull request Dec 9, 2025

fix: Handle spec_info length mismatch in Eagle Prefill/Extend phase #14536

Closed

6 tasks

hnyls2002 added 3 commits December 10, 2025 13:36

remove the filter batch in forward_draft_extend

f2104b0

default strict check

8c661fe

Merge branch 'main' into lsyin/fix-spec-info-filter

a15cffb

github-actions Bot added the run-ci label Dec 11, 2025

hnyls2002 added 3 commits December 11, 2025 11:06

add new ci

9dcd0ee

update

513b0d2

add

f32a6da

ispobock approved these changes Dec 11, 2025

hnyls2002 added 5 commits December 11, 2025 21:01

work around retract filter

aab0427

robust test

1e6aeb6

fix

5a20aab

Merge branch 'main' into lsyin/fix-spec-info-filter

f078da8

Merge branch 'main' into lsyin/fix-spec-info-filter

f0c7819

hnyls2002 merged commit ed52d01 into main Dec 13, 2025
110 of 125 checks passed

hnyls2002 deleted the lsyin/fix-spec-info-filter branch December 13, 2025 16:32

YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026

Fix spec info's filter when reqs are finished right after prefill (sg…

94efa0e

…l-project#14742)

This was referenced Feb 13, 2026

[Bug] Issue: Warning "length of new_indices != length of topk_p" during NEXTN speculative decoding with Qwen3-Next #15412

Closed

[Spec] Move forward timeout before verify to fix Eagle v1 filter mismatch #18760

Merged

hnyls2002 merged 12 commits into main from lsyin/fix-spec-info-filter

3 participants
