dp-attention: add follow_bootstrap_room + auto load-balance; drop decode_round_robin #16110

Merged
hnyls2002 merged 1 commit into sgl-project:main from mufeez-amjad:mufeez/dp-attention-refactor on Dec 30, 2025
@mufeez-amjad
Contributor

Motivation

Towards #16080

Modifications

  • Introduced follow_bootstrap_room as the routing method for PD prefill (it routes by each request's bootstrap_room), and made round_robin a true round-robin policy.
  • Changed --load-balance-method default to auto, resolving to:
    • non-PD: round_robin
    • PD prefill: follow_bootstrap_room
    • PD decode: round_robin
  • Removed the obsolete decode_round_robin policy and its related code (reverting #15164, "add decode round robin policy").
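
The resolution rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names resolve_load_balance_method and DPRouter, the mode values "prefill" / "decode" / None, and the modulo mapping from bootstrap_room to a worker index are all assumptions made for the example.

```python
from typing import Optional


def resolve_load_balance_method(method: str, disaggregation_mode: Optional[str]) -> str:
    """Resolve "auto" to a concrete policy (hypothetical helper).

    disaggregation_mode is assumed to be "prefill", "decode", or None (non-PD).
    """
    if method != "auto":
        # An explicit choice is passed through unchanged.
        return method
    if disaggregation_mode == "prefill":
        return "follow_bootstrap_room"
    # Non-PD and PD decode both resolve to round_robin.
    return "round_robin"


class DPRouter:
    """Toy dispatcher that picks a DP worker index (hypothetical class)."""

    def __init__(self, dp_size: int, method: str) -> None:
        self.dp_size = dp_size
        self.method = method
        self._next = 0  # cursor used by round_robin

    def pick_worker(self, bootstrap_room: Optional[int] = None) -> int:
        if self.method == "follow_bootstrap_room":
            if bootstrap_room is None:
                raise ValueError("follow_bootstrap_room needs a bootstrap_room")
            # Assumption: the worker is derived from bootstrap_room by modulo.
            return bootstrap_room % self.dp_size
        # round_robin: cycle through workers regardless of request contents.
        worker = self._next
        self._next = (self._next + 1) % self.dp_size
        return worker
```

With a sketch like this, a prefill server started with the default would call resolve_load_balance_method("auto", "prefill") and route each request by its bootstrap_room, while decode and non-PD servers would cycle through workers in order.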

Accuracy Tests

Added a few tests to test/srt/test_server_args.py.

Benchmarking and Profiling

Checklist

github-actions bot added the documentation label on Dec 30, 2025
mufeez-amjad force-pushed the mufeez/dp-attention-refactor branch from b99d7d9 to 56223e5 on December 30, 2025 01:13
@hnyls2002
Collaborator

/tag-and-rerun-ci

@hnyls2002
Collaborator

@mufeez-amjad Good Job

hnyls2002 merged commit cbff7ad into sgl-project:main on Dec 30, 2025
262 of 378 checks passed
YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026
Labels

documentation, run-ci

2 participants