[Perf] Restore torch.compile fusion for topk postprocessing by nvcastet · Pull Request #21771 · sgl-project/sglang

nvcastet · 2026-03-31T18:02:10Z

Motivation

PR #16945 reorganized the topk logic into `_post_process_topk_ids` but inlined `topk_ids_logical_to_physical` and `_mask_topk_ids_padded_region` instead of calling the existing `@torch.compile`-decorated `_biased_grouped_topk_postprocess`. This was flagged during review by @fzyzcjy (comment):

qq: does this mean this will launch a kernel while this should be fused in many cases

The regression causes these two operations to run as separate eager kernels on CUDA instead of being fused via `torch.compile`, impacting expert-parallel / EPLB paths.

Current ToT:

This PR restores fusion present before PR #16945:

Modifications

Replace the two inlined calls in `_post_process_topk_ids` with a single call to the existing compiled `_biased_grouped_topk_postprocess`, restoring kernel fusion. The function was already defined with `@torch.compile(dynamic=True, backend=get_compiler_backend())` but had become dead code after #16945.
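
As a rough illustration of the pattern (not the actual sglang code), the sketch below wraps two small tensor operations in one function so that `torch.compile` sees them together and can fuse them. All names here (`map_logical_to_physical`, `mask_padded_region`, `postprocess`, `get_postprocess`) are hypothetical stand-ins, the lookup table and masking rule are simplified assumptions, and the sglang-specific `backend=get_compiler_backend()` argument is omitted.

```python
import torch


def map_logical_to_physical(
    topk_ids: torch.Tensor, logical_to_physical: torch.Tensor
) -> torch.Tensor:
    # Hypothetical stand-in: look up one physical expert id per logical id.
    return logical_to_physical[topk_ids]


def mask_padded_region(
    topk_ids: torch.Tensor, num_token_non_padded: torch.Tensor
) -> torch.Tensor:
    # Hypothetical stand-in: rows at or past num_token_non_padded are
    # padding, so their expert ids are replaced with -1.
    row_ids = torch.arange(topk_ids.shape[0], device=topk_ids.device)
    keep = row_ids[:, None] < num_token_non_padded
    return torch.where(keep, topk_ids, torch.full_like(topk_ids, -1))


def postprocess(
    topk_ids: torch.Tensor,
    logical_to_physical: torch.Tensor,
    num_token_non_padded: torch.Tensor,
) -> torch.Tensor:
    # Both steps live in one function, so a compiler can fuse them
    # rather than launching a separate eager kernel for each.
    topk_ids = map_logical_to_physical(topk_ids, logical_to_physical)
    return mask_padded_region(topk_ids, num_token_non_padded)


def get_postprocess(use_compile: bool):
    # Assumption for this sketch: compile only when requested (e.g. on CUDA).
    # dynamic=True asks the compiler not to specialize on tensor shapes.
    return torch.compile(postprocess, dynamic=True) if use_compile else postprocess


# Eager reference run on a tiny input: 3 tokens, top-2 experts each,
# with the last token treated as padding.
topk_ids = torch.tensor([[0, 2], [1, 3], [2, 0]])
logical_to_physical = torch.tensor([10, 11, 12, 13])
result = get_postprocess(use_compile=False)(
    topk_ids, logical_to_physical, torch.tensor(2)
)
print(result.tolist())
```

Calling the two helpers back to back outside a compiled function gives the same values but leaves each step as its own eager launch, which is the behavior this PR removes.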

Checklist

Format: pre-commit run --all-files

PR sgl-project#16945 refactored topk postprocessing into `_post_process_topk_ids` but inlined the `topk_ids_logical_to_physical` and `_mask_topk_ids_padded_region` calls instead of delegating to the existing `@torch.compile`-decorated `_biased_grouped_topk_postprocess`. This caused those two operations to run as separate eager kernels instead of being fused by `torch.compile`, a regression for CUDA paths using expert-parallel / EPLB.

Fix: call `_biased_grouped_topk_postprocess` (which already carries `@torch.compile(dynamic=True)`) from within `_post_process_topk_ids`, restoring the compiled kernel fusion.

Ref: sgl-project#16945 (comment)

nvcastet · 2026-03-31T18:04:13Z

/tag-and-rerun-ci

gemini-code-assist

Code Review

This pull request refactors the post-processing logic for top-k IDs in the MoE layer by replacing sequential calls to topk_ids_logical_to_physical and _mask_topk_ids_padded_region with a consolidated call to _biased_grouped_topk_postprocess when running on CUDA. I have no feedback to provide.

trevor-m

LGTM

…stprocessing (sgl-project#21771) Upstream SHA: 490fa9f Cherry-picked from sgl-project/sglang Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ect#21771)

nvcastet requested review from BBuf, Edwardf0t1, Fridge003, HaiShaw, Ying1123, ch-wan, ispobock and merrymercy as code owners March 31, 2026 18:02

github-actions Bot added the run-ci label Mar 31, 2026

gemini-code-assist Bot reviewed Mar 31, 2026

trevor-m approved these changes Mar 31, 2026

YAMY1234 mentioned this pull request Apr 4, 2026

[Performance Regression] ~2x throughput drop in disaggregated PD mode with Wide-EP (DeepSeek-R1 FP4, GB200) between SGLang v0.5.8 and latest nightly #22095

Closed

Fridge003 approved these changes Apr 6, 2026

Merge branch 'main' into fix/restore-torch-compile-topk-postprocess

fe74d40

Fridge003 merged commit 490fa9f into sgl-project:main Apr 7, 2026
206 of 249 checks passed

yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

[Perf] Restore torch.compile fusion for topk postprocessing (sgl-proj…

33e3d08

…ect#21771)

Fridge003 merged 2 commits into sgl-project:main from nvcastet:fix/restore-torch-compile-topk-postprocess

3 participants