[RL] Fix weight update for mxfp8 flashinfer_cutlass gemm backend by zianglih · Pull Request #22484 · sgl-project/sglang

zianglih · 2026-04-10T03:32:15Z

Motivation

#21576 refactored mxfp8 scaling-factor swizzling to an in-place style. However, on the flashinfer_cutlass mxfp8 code path, block_scale_interleave may pad the scales, so the stored scale tensor no longer matches the shape the weight update expects, which violates the shape contract for weight updates. Therefore, we revert to the previous dual-buffer approach, which keeps both the original and the swizzled scales.
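To make the shape-contract problem concrete, here is a minimal sketch in plain Python, with nested lists standing in for tensors. Everything in it is an assumption for illustration: `pad_and_interleave` is a hypothetical stand-in for block_scale_interleave (it only models padding, with assumed 128 x 4 tile multiples), and `DualBufferScales` is not a class from SGLang. It only shows why an in-place swizzle can break a shape check while a dual buffer does not.

```python
# Illustrative sketch only: plain lists stand in for tensors, and
# pad_and_interleave is a hypothetical stand-in, not flashinfer's kernel.

def round_up(x, multiple):
    return (x + multiple - 1) // multiple * multiple

def pad_and_interleave(scales, row_multiple=128, col_multiple=4):
    """Hypothetical swizzle that pads the scale grid to assumed tile multiples."""
    rows, cols = len(scales), len(scales[0])
    padded_rows = round_up(rows, row_multiple)
    padded_cols = round_up(cols, col_multiple)
    # A real swizzle also permutes elements; only the padding matters here.
    padded = [[0] * padded_cols for _ in range(padded_rows)]
    for r in range(rows):
        padded[r][:cols] = scales[r]
    return padded

def shape(grid):
    return (len(grid), len(grid[0]))

class DualBufferScales:
    """Keeps the loader-facing scales and the kernel-facing swizzled copy apart."""

    def __init__(self, scales):
        self.original = [row[:] for row in scales]
        self.swizzled = pad_and_interleave(self.original)

    def update(self, new_scales):
        # Weight-update contract: the incoming shape must match the stored shape.
        if shape(new_scales) != shape(self.original):
            raise ValueError("scale shape mismatch")
        self.original = [row[:] for row in new_scales]
        self.swizzled = pad_and_interleave(self.original)

checkpoint_scales = [[127] * 3 for _ in range(100)]  # 100 x 3, not tile-aligned
buf = DualBufferScales(checkpoint_scales)

# An in-place swizzle would leave only the padded shape behind:
in_place_shape = shape(pad_and_interleave(checkpoint_scales))  # (128, 4)
```

With the in-place style, a later update carrying 100 x 3 scales would be compared against a 128 x 4 buffer and fail the check. The dual buffer pays for one extra copy of the scales so that `update` always sees the original shape.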

For the full mxfp8 DeepSeek 671B model, the duplicated ue8m0 scale buffers account for less than 1 GB in total, so the overhead is negligible.

In the future, we should rely on a restore_weights_before_loading API, which is still under development.

Modifications

Revert the flashinfer_cutlass mxfp8 gemm to the style it used prior to "[FlashInver v0.6.7] Integrate flashinfer_trtllm mxfp8 gemm" (#21576).

Accuracy Tests

Speed Tests and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review and Merge Process

Ping Merge Oncalls to start the process. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

gemini-code-assist · 2026-04-10T03:32:20Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

zianglih · 2026-04-10T03:35:19Z

Hi @b8zhong , could you take a look at the tiny reverting fix? Thank you!

b8zhong · 2026-04-11T03:24:43Z

/tag-and-rerun-ci

zianglih · 2026-04-11T23:53:55Z

/rerun-failed-ci

zianglih · 2026-04-12T00:51:31Z

/rerun-failed-ci

zianglih · 2026-04-12T03:42:57Z

/rerun-failed-ci


zianglih requested review from AniZpZ, BBuf, Edwardf0t1, FlamingoPg, HaiShaw, b8zhong and ch-wan as code owners April 10, 2026 03:32

Fix

2f03d08

zianglih force-pushed the fi-ctlass-fix branch from 355f101 to 2f03d08 April 11, 2026 03:17

b8zhong approved these changes Apr 11, 2026


b8zhong enabled auto-merge (squash) April 11, 2026 03:24

github-actions Bot added the run-ci label Apr 11, 2026

Merge branch 'main' into fi-ctlass-fix

0810d65

b8zhong added the high priority label Apr 11, 2026

Merge branch 'main' into fi-ctlass-fix

87bf145

b8zhong merged commit 31453bb into sgl-project:main Apr 12, 2026
108 of 113 checks passed

zianglih mentioned this pull request Apr 13, 2026

[Roadmap] Blackwell MXFP8 and NVFP4 RL training radixark/miles#615

Open

30 tasks

pyc96 pushed a commit to pyc96/sglang that referenced this pull request Apr 14, 2026

[RL] Fix weight update for mxfp8 flashinfer_cutlass gemm backend (sgl…

394a6d0

…-project#22484)

yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

[RL] Fix weight update for mxfp8 flashinfer_cutlass gemm backend (sgl…

1287a80

…-project#22484)

zianglih deleted the fi-ctlass-fix branch April 24, 2026 17:38

b8zhong merged 3 commits into sgl-project:main from zianglih:fi-ctlass-fix


2 participants
