Support moe_dp_size = 1 for various attention_cp_size #22003

Merged
ch-wan merged 5 commits into sgl-project:main from Shunkangz:support_moe_dp_size_1
Apr 20, 2026
@Shunkangz Shunkangz commented Apr 3, 2026

Motivation

Previously, only attention_cp_size == moe_dp_size was supported, which is too restrictive. In real-world deployments, the MoE part should stay unchanged and context parallelism should be applied only to the attention layers.
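The constraint described above can be sketched as a small validity check. This is only an illustration: `check_cp_moe_dp` and its `relaxed` flag are hypothetical names, not functions from the sglang code base, and the relaxed rule (attention_cp_size % moe_dp_size == 0) is an assumption taken from an earlier title of this PR rather than from the merged code.

```python
def check_cp_moe_dp(attn_cp_size: int, moe_dp_size: int, relaxed: bool = True) -> bool:
    """Hypothetical check of an (attn_cp_size, moe_dp_size) pair.

    relaxed=False: the old rule, both sizes must be equal.
    relaxed=True:  ASSUMED new rule, attn_cp_size must be a multiple of
                   moe_dp_size (moe_dp_size = 1 always passes).
    """
    if attn_cp_size < 1 or moe_dp_size < 1:
        raise ValueError("parallel sizes must be positive integers")
    if relaxed:
        return attn_cp_size % moe_dp_size == 0
    return attn_cp_size == moe_dp_size


# The "After" configuration in this PR uses attn_cp_size=2, moe_dp_size=1.
old_rule_ok = check_cp_moe_dp(2, 1, relaxed=False)  # rejected by the equality rule
new_rule_ok = check_cp_moe_dp(2, 1, relaxed=True)   # accepted by the divisibility rule
```

Under this sketch, moe_dp_size = 1 is accepted for any attention_cp_size, which matches the final PR title.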

Modifications

Accuracy Tests

Previous

--tp-size 4 --moe-dp-size 2 --ep-size 2 --attn-cp-size 2
Total latency: 171.566 s
Score: 0.965
Output throughput: 941.495 token/s

After

--tp-size 4 --moe-dp-size 1 --ep-size 4 --attn-cp-size 2
Total latency: 73.150 s
Score: 0.975
Output throughput: 1953.284 token/s
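To make the comparison concrete, the ratios between the two runs can be computed directly from the figures reported above. This is plain arithmetic on the posted numbers; the dictionary keys are illustrative names, and note that the two runs differ in both --moe-dp-size and --ep-size.

```python
# Figures copied from the "Previous" and "After" runs above.
previous = {"latency_s": 171.566, "score": 0.965, "throughput_tok_s": 941.495}
after = {"latency_s": 73.150, "score": 0.975, "throughput_tok_s": 1953.284}

# Ratio > 1 means the "After" run is faster.
latency_speedup = previous["latency_s"] / after["latency_s"]
throughput_gain = after["throughput_tok_s"] / previous["throughput_tok_s"]
score_delta = after["score"] - previous["score"]

print(f"latency speedup:  {latency_speedup:.2f}x")   # 2.35x
print(f"throughput gain:  {throughput_gain:.2f}x")   # 2.07x
print(f"score difference: {score_delta:+.3f}")       # +0.010
```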

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

@Shunkangz Shunkangz force-pushed the support_moe_dp_size_1 branch 3 times, most recently from 607b101 to 4d6d17d Compare April 3, 2026 05:22
@Shunkangz
Contributor Author

/tag-and-rerun-ci

@github-actions github-actions Bot added the run-ci label Apr 3, 2026
@Shunkangz
Contributor Author

/tag-and-rerun-ci

@Shunkangz
Contributor Author

/rerun-failed-ci

@Shunkangz
Contributor Author

/rerun-failed-ci

Comment thread python/sglang/srt/layers/dp_attention.py Outdated
Comment thread python/sglang/srt/layers/dp_attention.py Outdated
Comment thread python/sglang/srt/distributed/parallel_state.py Outdated
Comment thread python/sglang/srt/distributed/parallel_state.py
Comment thread python/sglang/srt/layers/utils/cp_utils.py Outdated
Comment thread python/sglang/srt/layers/communicator.py
Comment thread python/sglang/srt/server_args.py Outdated
Comment thread python/sglang/srt/layers/communicator.py
@Shunkangz Shunkangz force-pushed the support_moe_dp_size_1 branch from 3da6de9 to 688f894 Compare April 13, 2026 05:09
@Shunkangz
Contributor Author

/tag-and-rerun-ci

@Shunkangz
Contributor Author

/rerun-failed-ci

@Shunkangz
Contributor Author

/rerun-failed-ci

Collaborator

@ch-wan ch-wan left a comment

Could you fix the PR title? I will merge it when it's fixed.

@Shunkangz Shunkangz changed the title from "Support general attention_cp_size % moe_dp_size == 0" to "Support attention_cp_size % moe_dp_size == 0 and moe_dp_size = 1" Apr 20, 2026
@ch-wan ch-wan changed the title from "Support attention_cp_size % moe_dp_size == 0 and moe_dp_size = 1" to "Support moe_dp_size = 1 for various attention_cp_size" Apr 20, 2026
@ch-wan ch-wan merged commit 3dc1491 into sgl-project:main Apr 20, 2026
359 of 461 checks passed
zhangying098 pushed a commit to zhangying098/sglang that referenced this pull request Apr 23, 2026
)

Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
kyx1999 pushed a commit to KMSorSMS/sglang that referenced this pull request Apr 27, 2026
)

Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>