
[AMD] Optimize MiniMax-M2.5 - use aiter biased_grouped_topk for sigmoid scoring in MoE routing #23611

Merged
HaiShaw merged 1 commit into sgl-project:main from yctseng0211:aiter_topk on Apr 25, 2026


@yctseng0211 (Collaborator) commented Apr 24, 2026

Motivation

  • For models using sigmoid scoring with correction bias (e.g., MiniMax-M2.5),
    use aiter.biased_grouped_topk (ASM kernel) instead of sgl_kernel.topk_sigmoid
    on AMD GPUs.
  • The aiter kernel runs at ~6 µs/call vs ~9.3 µs/call for the sgl_kernel variant,
    cutting the latency of each MoE routing top-k call by ~35%.
  • Benchmarked on MI355X with MiniMax-M2.5 FP8 (TP=4, ISL=8192, OSL=1024):
    +2.0% output throughput at conc=64, +2.4% at conc=32. No regression at any
    concurrency level (conc=4..128).
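
For readers unfamiliar with the routing scheme, the following is a minimal pure-Python sketch of sigmoid scoring with a correction bias for a single token. It is not the aiter or sgl_kernel implementation. The function name `biased_sigmoid_topk` and its parameters are hypothetical. It assumes a single expert group and the common convention that the bias affects only which experts are selected, while the routing weights come from the unbiased sigmoid scores.

```python
import math


def biased_sigmoid_topk(logits, correction_bias, top_k, renormalize=True):
    """Reference routing for one token (hypothetical helper, not the aiter API)."""
    # Sigmoid scores: each expert is scored independently, unlike softmax.
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # Assumption: the correction bias only influences which experts are chosen.
    biased = [s + b for s, b in zip(scores, correction_bias)]
    # Pick the top_k expert ids by biased score (ties broken by lower id).
    ids = sorted(range(len(biased)), key=lambda i: (-biased[i], i))[:top_k]
    # Assumption: routing weights are taken from the unbiased sigmoid scores.
    weights = [scores[i] for i in ids]
    if renormalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return ids, weights


ids, weights = biased_sigmoid_topk(
    logits=[2.0, 0.0, -1.0, 1.0],
    correction_bias=[0.0, 0.5, 0.0, 0.0],
    top_k=2,
)
```

In this toy input, the bias promotes expert 1 into the selection ahead of expert 3, but expert 0 still receives the larger routing weight because the weights ignore the bias. A fused GPU kernel performs the same steps for all tokens in one launch, which is where the per-call latency difference reported above would come from.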

Modifications

Accuracy Tests

| | baseline (topk_sigmoid) | patch (aiter biased_grouped_topk) |
|---|---|---|
| GSM8K Accuracy | 93.3% | 93.4% |

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

@gemini-code-assist (Contributor)

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@yctseng0211 changed the title from "[AMD] Use aiter biased_grouped_topk for sigmoid scoring in MoE routing" to "[AMD] Optimize MiniMax-M2.5 - use aiter biased_grouped_topk for sigmoid scoring in MoE routing" Apr 24, 2026
@yctseng0211 yctseng0211 marked this pull request as ready for review April 24, 2026 08:21
@HaiShaw HaiShaw merged commit fb272d2 into sgl-project:main Apr 25, 2026
57 of 65 checks passed
vguduruTT pushed a commit to vguduruTT/sglang that referenced this pull request May 2, 2026
3 participants