[Spec] Move forward timeout before verify to fix Eagle v1 filter mismatch #18760

Merged
hnyls2002 merged 2 commits into main from fix/move-forward-timeout-before-verify
Feb 13, 2026

@hnyls2002 hnyls2002 commented Feb 13, 2026

Motivation

Fixes ValueError: length of new_indices != length of topk_p in Eagle speculative decoding v1.

Related: #14742, #17831

Root Cause

Forward timeout was checked in process_batch_result_decode, which runs after verify has already sized topk_p to N-K entries (for a batch of N requests, K of which finished during verify). The timeout check set req.to_finish, and check_finished() immediately converted it to finished_reason, so M additional requests finished after verify. The next iteration's filter_batch(v1_spec_info_filtered=True) then found N-K-M active requests while topk_p still had N-K entries, which is the length mismatch.

Abort doesn't have this problem because it only sets to_finish (in process_input_requests), and filter_batch only checks finished_reason. The pending to_finish is caught by verify's check_finished() in the next forward pass, keeping it consistent with topk_p.
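The ordering problem can be reproduced in a few lines with a toy model. This is only a sketch: ToyReq, toy_verify, toy_filter_batch and old_ordering are hypothetical stand-ins that mimic the to_finish / finished_reason behaviour described above, not sglang's actual classes or functions.

```python
# Toy model only: every name here is a hypothetical stand-in, not sglang code.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyReq:
    to_finish: Optional[str] = None        # pending finish marker
    finished_reason: Optional[str] = None  # set once the request is finished

    def check_finished(self) -> None:
        # Convert a pending marker into a final finished_reason.
        if self.finished_reason is None and self.to_finish is not None:
            self.finished_reason = self.to_finish

    def finished(self) -> bool:
        return self.finished_reason is not None


def toy_verify(reqs: List[ToyReq]) -> List[float]:
    """Resolve pending finishes, then size topk_p to the survivors."""
    for r in reqs:
        r.check_finished()
    return [0.0 for r in reqs if not r.finished()]  # stand-in for topk_p


def toy_filter_batch(reqs: List[ToyReq], topk_p: List[float]) -> List[ToyReq]:
    """Keep unfinished requests; topk_p is assumed to be filtered already."""
    new_indices = [i for i, r in enumerate(reqs) if not r.finished()]
    if len(new_indices) != len(topk_p):
        raise ValueError("length of new_indices != length of topk_p")
    return [reqs[i] for i in new_indices]


def old_ordering() -> None:
    reqs = [ToyReq() for _ in range(4)]  # N = 4
    reqs[0].to_finish = "abort"          # pending before verify, so K = 1
    topk_p = toy_verify(reqs)            # topk_p has N-K = 3 entries
    reqs[1].to_finish = "timeout"        # timeout detected after verify, M = 1
    reqs[1].check_finished()             # converted immediately
    toy_filter_batch(reqs, topk_p)       # 2 active requests vs 3 entries: raises
```

Calling old_ordering() raises the ValueError because one request finishes between the point where topk_p is sized and the point where the batch is filtered.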

Fix

  • Move forward timeout detection from process_batch_result_decode / process_batch_result_prefill to get_next_batch_to_run(), next to _abort_on_queued_timeout(). Forward timeout now follows the same pattern as abort: only set to_finish, let verify's check_finished() handle conversion.
  • Remove the dead-code filter in forward_draft_extend: req.finished() always returned False there, since check_finished() was never called before that point.
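The pattern in the first bullet can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: ToyReq, mark_forward_timeouts, toy_verify, toy_filter_batch and the timeout_s parameter are hypothetical names, and only forward_entry_time, to_finish, finished_reason and check_finished() correspond to names used in this thread.

```python
# Sketch of "only set to_finish before verify"; all helpers are hypothetical.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyReq:
    forward_entry_time: float
    to_finish: Optional[str] = None
    finished_reason: Optional[str] = None

    def check_finished(self) -> None:
        # Convert a pending marker into a final finished_reason.
        if self.finished_reason is None and self.to_finish is not None:
            self.finished_reason = self.to_finish

    def finished(self) -> bool:
        return self.finished_reason is not None


def mark_forward_timeouts(reqs: List[ToyReq], now: float, timeout_s: float) -> None:
    """Runs before the forward pass: sets to_finish, never finished_reason."""
    for r in reqs:
        if r.to_finish is None and now - r.forward_entry_time > timeout_s:
            r.to_finish = "forward_timeout"


def toy_verify(reqs: List[ToyReq]) -> List[float]:
    """The only place pending markers are converted; topk_p is sized afterwards."""
    for r in reqs:
        r.check_finished()
    return [0.0 for r in reqs if not r.finished()]  # stand-in for topk_p


def toy_filter_batch(reqs: List[ToyReq], topk_p: List[float]) -> List[ToyReq]:
    """Keep unfinished requests and check that topk_p matches them."""
    keep = [r for r in reqs if not r.finished()]
    if len(keep) != len(topk_p):
        raise ValueError("length of new_indices != length of topk_p")
    return keep


def one_iteration(reqs: List[ToyReq], now: float, timeout_s: float) -> List[ToyReq]:
    mark_forward_timeouts(reqs, now, timeout_s)  # before verify
    topk_p = toy_verify(reqs)                    # conversion happens here
    return toy_filter_batch(reqs, topk_p)        # lengths agree
```

Because the timeout marker is set before verify runs, the conversion to finished_reason and the sizing of topk_p happen in the same step, so the later filter sees matching lengths.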


@hnyls2002 hnyls2002 changed the title Fix the abortion between verify(spec-v1) and process decode. [Spec] Move forward timeout check before verify to fix Eagle v1 filter mismatch Feb 13, 2026
@hnyls2002 hnyls2002 changed the title [Spec] Move forward timeout check before verify to fix Eagle v1 filter mismatch [Spec] Move forward timeout before verify to fix Eagle v1 filter mismatch Feb 13, 2026
@hnyls2002 (Collaborator, Author) commented:

Reproduction

Launch the server:

python3 -m sglang.launch_server \
    --model meta-llama/Meta-Llama-3.1-8B-Instruct \
    --host 0.0.0.0 --port 23333 \
    --speculative-draft-model lmsys/sglang-EAGLE-LLaMA3-Instruct-8B \
    --speculative-algorithm EAGLE \
    --speculative-num-steps 5 \
    --speculative-eagle-topk 1 \
    --speculative-num-draft-tokens 6 \
    --dtype float16

Then run this client script:

import time
import threading
import requests

URL = "http://localhost:23333/v1/chat/completions"

def send(tag, max_tokens):
    try:
        r = requests.post(URL, json={
            "model": "default",
            "messages": [{"role": "user", "content": "Write a long story."}],
            "max_tokens": max_tokens,
        }, timeout=30)
        print(f"[{tag}] {r.status_code}")
    except Exception as e:
        print(f"[{tag}] {e}")

# Wave 1: older forward_entry_time, so these requests time out after 3s
threads = [threading.Thread(target=send, args=(f"old-{i}", 512)) for i in range(32)]
for t in threads:
    t.start()

# Wait for wave 1 to enter decode
time.sleep(2)

# Wave 2: newer forward_entry_time, so these requests have not timed out yet
threads2 = [threading.Thread(target=send, args=(f"new-{i}", 512)) for i in range(16)]
for t in threads2:
    t.start()

for t in threads + threads2:
    t.join()

@hnyls2002 (Collaborator, Author) commented:

/tag-and-rerun-ci

@hnyls2002 hnyls2002 force-pushed the fix/move-forward-timeout-before-verify branch from 0f1be09 to e0c304a February 13, 2026 02:38
@hnyls2002 (Collaborator, Author) commented:

test_abort.py passed; the other failures are unrelated.

@hnyls2002 hnyls2002 merged commit d29e331 into main Feb 13, 2026
274 of 301 checks passed
@hnyls2002 hnyls2002 deleted the fix/move-forward-timeout-before-verify branch February 13, 2026 04:58
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026
magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026