
fix(scheduler): preserve prefill error outputs during decode #1622

Merged
jundot merged 1 commit into jundot:main from ken-zzzzz:fix/preserve-prefill-error-outputs
Jun 3, 2026

Conversation

@ken-zzzzz

Contributor

Closes #1621

Summary

  • Preserve scheduler rejection outputs when decode responses are produced in the same step().
  • Add a regression test covering a prefill failure output followed by decode output in the same scheduler step.

Why

When prefill failed, _schedule_waiting() could return a terminal error RequestOutput for the failed request. If another request produced decode output in the same Scheduler.step(), the decode path replaced output.outputs instead of appending to it, dropping the prefill error output.

That left the failed request removed from scheduler state but without a finished output delivered to engine_core, so the client could wait indefinitely with no tokens and no error.
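
The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the project's code: `RequestOutput` and `SchedulerOutput` here are hypothetical stand-ins with only the fields needed for the illustration, and `step_with_replace` is an invented name that models the old behaviour of assigning to `output.outputs`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RequestOutput:
    """Hypothetical stand-in for the scheduler's per-request output."""
    request_id: str
    finished: bool = False
    error: Optional[str] = None


@dataclass
class SchedulerOutput:
    """Hypothetical stand-in for the object returned by one step()."""
    outputs: List[RequestOutput] = field(default_factory=list)


def step_with_replace() -> SchedulerOutput:
    output = SchedulerOutput()
    # Prefill of req-a fails, so a terminal error output is recorded first.
    output.outputs.append(
        RequestOutput("req-a", finished=True, error="Prefill failed")
    )
    # req-b produces a decode output in the same step.
    decode_outputs = [RequestOutput("req-b")]
    # Assignment replaces the list, discarding req-a's error output.
    output.outputs = decode_outputs
    return output


delivered_ids = [o.request_id for o in step_with_replace().outputs]
print(delivered_ids)  # ['req-b']
```

Only `req-b` is delivered, so nothing downstream ever learns that `req-a` finished with an error.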

Changes

  • Changed Scheduler.step() to append decode outputs with output.outputs.extend(outputs).
  • Changed finished request ID handling to merge with output.finished_request_ids.update(finished_ids).
  • Added TestSchedulerStepOutputs.test_decode_outputs_preserve_prefill_rejections.
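
The merge behaviour listed above can be sketched as follows. This is an illustration under stated assumptions rather than the merged diff: `StepOutput`, `merge_decode_results`, and `check_merge_keeps_prefill_rejection` are hypothetical names, and outputs are modelled as plain strings. Only the `extend` and `update` calls mirror what the PR describes.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class StepOutput:
    """Hypothetical stand-in for the result of one scheduler step."""
    outputs: List[str] = field(default_factory=list)
    finished_request_ids: Set[str] = field(default_factory=set)


def merge_decode_results(
    output: StepOutput, outputs: Iterable[str], finished_ids: Iterable[str]
) -> StepOutput:
    # Append instead of assigning, so earlier entries (such as a
    # prefill rejection) stay in the list.
    output.outputs.extend(outputs)
    # Merge instead of assigning, so earlier finished IDs are kept.
    output.finished_request_ids.update(finished_ids)
    return output


def check_merge_keeps_prefill_rejection() -> StepOutput:
    # State after a prefill failure for req-a in this step.
    output = StepOutput(outputs=["error:req-a"], finished_request_ids={"req-a"})
    # Decode results for req-b arrive in the same step.
    merge_decode_results(output, ["token:req-b"], {"req-b"})
    assert output.outputs == ["error:req-a", "token:req-b"]
    assert output.finished_request_ids == {"req-a", "req-b"}
    return output


merged = check_merge_keeps_prefill_rejection()
```

A check of this shape fails if either call is swapped back to plain assignment, which is the property the regression test is meant to guard.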

Tests

.venv/bin/python -m pytest tests/test_scheduler.py -q

Result:

105 passed

Also verified the new regression test fails when the scheduler fix is reverted.

@jundot

jundot commented Jun 3, 2026

Owner

Thanks, this matches the scheduler output assembly bug from #1621.

I verified that the current scheduler can drop a prefill rejection output when decode responses are produced in the same step, and this change preserves the terminal error output by appending decode outputs instead of replacing the list. The regression test covers the exact mixed rejection/decode case, and the scheduler test suite passes on the merged tree.

This looks good to me, and I'm going to merge it.

@jundot jundot merged commit 8fdabbf into jundot:main Jun 3, 2026
@ken-zzzzz ken-zzzzz deleted the fix/preserve-prefill-error-outputs branch June 3, 2026 08:26
@ken-zzzzz

Contributor Author

Thanks for addressing this issue so quickly. I really appreciate your amazing work on the project 👍



Development

Successfully merging this pull request may close these issues.

Request stalls after Prefill failed for <req-id>
