[Scheduler] Unify idle checks into is_fully_idle() and fix weight update test #20296

Merged
hnyls2002 merged 6 commits into main from lsyin/cleanup-scheduling-idle on Mar 11, 2026

hnyls2002 (Collaborator) commented on Mar 10, 2026

Summary

  • Consolidate _is_no_request(), _is_idle_for_hicache_storage_op(), and inline health-check logic into a single is_fully_idle(for_health_check=False)
  • Tighten the release_memory_occupation and flush_cache guards to use is_fully_idle(), which now also checks chunked_req, dllm_manager, waiting_queue, and grammar_queue

Background

  • Old _is_no_request() did not check waiting_queue or grammar_queue, so flush_cache could clear KV cache while retracted requests sat in waiting_queue
  • Old _is_idle_for_hicache_storage_op() was a superset of _is_no_request() but missed chunked_req and dllm_manager
  • Health-check skip logic was inlined with ~25 lines of ad-hoc checks duplicating the above
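
To make the first gap concrete, here is a toy model of it. This is not SGLang's scheduler: the class, its fields, and the `retract` helper are illustrative assumptions, and only the names `waiting_queue` and `grammar_queue` come from the description above. It shows a check that looks only at running state reporting "idle" while a retracted request is still pending.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyScheduler:
    # Hypothetical, heavily simplified state; the real scheduler tracks much more.
    running_batch: List[str] = field(default_factory=list)
    waiting_queue: List[str] = field(default_factory=list)
    grammar_queue: List[str] = field(default_factory=list)

    def old_is_no_request(self) -> bool:
        # Models the old gap: only batch-running state is inspected,
        # the waiting and grammar queues are never looked at.
        return not self.running_batch

    def retract(self, req: str) -> None:
        # A retracted request leaves the running batch and waits to be rescheduled.
        self.running_batch.remove(req)
        self.waiting_queue.append(req)


sched = ToyScheduler(running_batch=["req-0"])
sched.retract("req-0")

old_says_idle = sched.old_is_no_request()  # True, although work is pending
still_pending = len(sched.waiting_queue)   # 1
```

In this model, a cache flush guarded by `old_is_no_request()` would proceed even though `req-0` is still expected to resume later.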

Fix

  • is_fully_idle() covers all batch-running state (including chunked_req, dllm_manager, overlap, PP, disagg) plus all waiting queues
  • for_health_check=True skips the grammar queue and the prefill inflight queue. Requests there don't produce immediate batch results, so only ongoing requests in the other queues are treated as already carrying health info
  • release_memory_occupation now asserts is_fully_idle() instead of _is_no_request()
  • Fix test_update_weights to use pause_generation("abort") which fully drains all requests before weight update (old "retract" mode left requests in waiting_queue, incompatible with the stricter idle check)
  • Update hicache docs to reference is_fully_idle()
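
The behavior listed above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the merged implementation: `ToyScheduler`, its field layout, and the bodies of `pause_generation` and `release_memory_occupation` are hypothetical, and the dllm_manager, overlap, PP, and disagg state is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyScheduler:
    # Batch-running state (hypothetical subset).
    running_batch: List[str] = field(default_factory=list)
    chunked_req: Optional[str] = None
    # Waiting queues.
    waiting_queue: List[str] = field(default_factory=list)
    grammar_queue: List[str] = field(default_factory=list)
    prefill_inflight_queue: List[str] = field(default_factory=list)

    def is_fully_idle(self, for_health_check: bool = False) -> bool:
        if self.running_batch or self.chunked_req is not None:
            return False
        if self.waiting_queue:
            return False
        if for_health_check:
            # The grammar and prefill inflight queues are ignored on this path.
            return True
        return not self.grammar_queue and not self.prefill_inflight_queue

    def pause_generation(self, mode: str) -> None:
        if mode == "abort":
            # Assumed behavior: drop every request everywhere.
            self.running_batch.clear()
            self.chunked_req = None
            self.waiting_queue.clear()
            self.grammar_queue.clear()
            self.prefill_inflight_queue.clear()
        elif mode == "retract":
            # Assumed behavior: running requests move back to the waiting queue.
            self.waiting_queue.extend(self.running_batch)
            self.running_batch.clear()
        else:
            raise ValueError(f"unknown mode: {mode}")

    def release_memory_occupation(self) -> str:
        assert self.is_fully_idle(), "scheduler must be fully idle"
        return "released"


sched = ToyScheduler(running_batch=["req-0"])
sched.pause_generation("retract")
idle_after_retract = sched.is_fully_idle()  # False: req-0 sits in waiting_queue
sched.pause_generation("abort")
idle_after_abort = sched.is_fully_idle()    # True: everything was drained
```

In this model, calling `release_memory_occupation()` after a "retract" pause would trip the assertion, which mirrors why the test had to switch to "abort".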

hnyls2002 (Collaborator, Author) commented

/rerun-ut test_update_weights_from_tensor.py

github-actions (Contributor) commented

✅ Triggered /rerun-ut on 1-gpu-5090 runner:

cd test/ && python3 registered/rl/test_update_weights_from_tensor.py

github-actions bot added labels on Mar 10, 2026: documentation (Improvements or additions to documentation), hicache (Hierarchical Caching for SGLang)
hnyls2002 changed the title from "[Scheduler] Strengthen release_memory_occupation idle guard and fix test to use abort mode" to "[Scheduler] Unify idle checks into is_fully_idle() and fix weight update test" on Mar 10, 2026
hnyls2002 merged commit 50953ae into main on Mar 11, 2026
273 of 286 checks passed
hnyls2002 deleted the lsyin/cleanup-scheduling-idle branch on March 11, 2026 at 00:50
liubiyongge pushed a commit to liubiyongge/sglang that referenced this pull request Mar 13, 2026
whybeyoung pushed a commit to whybeyoung/sglang that referenced this pull request Mar 14, 2026
whybeyoung added a commit to whybeyoung/sglang that referenced this pull request Mar 14, 2026
hnyls2002 added a commit that referenced this pull request Mar 17, 2026
Move disagg queue checks (bootstrap/prealloc/transfer) from the
health-check idle path to the true-idle-only path. These queues
may have items without any request running on GPU, so they cannot
piggyback health check results.

Related: #20296, #20191
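
The effect of that follow-up commit can be illustrated with a small model. The queue names follow the commit message, but the class, the field names, and the `should_run_health_check` helper are hypothetical, and the sketch assumes a health check is run only when the health-check idle path reports idle.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyDisaggScheduler:
    running_batch: List[str] = field(default_factory=list)
    waiting_queue: List[str] = field(default_factory=list)
    # Disaggregation queues named in the commit message; contents are illustrative.
    bootstrap_queue: List[str] = field(default_factory=list)
    prealloc_queue: List[str] = field(default_factory=list)
    transfer_queue: List[str] = field(default_factory=list)

    def is_fully_idle(self, for_health_check: bool = False) -> bool:
        if self.running_batch or self.waiting_queue:
            return False
        if for_health_check:
            # Disagg queues are not consulted here: their items may not be
            # running on GPU, so nothing would carry the health result.
            return True
        # True-idle-only path: any queued disagg item means "not idle".
        return not (self.bootstrap_queue or self.prealloc_queue or self.transfer_queue)

    def should_run_health_check(self) -> bool:
        # Hypothetical helper: run a dedicated health check only when no
        # ongoing request can carry health info.
        return self.is_fully_idle(for_health_check=True)


sched = ToyDisaggScheduler(transfer_queue=["req-7"])
runs_health_check = sched.should_run_health_check()  # True
truly_idle = sched.is_fully_idle()                   # False
```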
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026
KHAEntertainment pushed a commit to Clarit-AI/Engram that referenced this pull request Mar 31, 2026
JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026
yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

Labels: documentation, hicache, high priority, run-ci