[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support by BBuf · Pull Request #22574 · sgl-project/sglang

BBuf · 2026-04-11T04:27:06Z

Summary

add a FLUX.1-dev ModelOpt NVFP4 mixed-transformer builder for SGLang diffusion
make NVFP4 loading configurable for nibble swapping and preserve validated FLUX.1-dev export layout
fix FLUX attention/single-block quant prefixes so FLUX.1 fallback excludes match the intended modules
add unit coverage for the new NVFP4 config and FLUX prefix behavior
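For readers unfamiliar with the term, the sketch below shows one common meaning of "nibble swapping" for 4-bit packed weights: each byte holds two 4-bit codes, and swapping exchanges the high and low halves. This is an illustration only, under the assumption that the PR's option refers to this byte-level operation. The helper names `swap_nibbles` and `unpack_fp4_codes` are hypothetical and are not SGLang or ModelOpt APIs.

```python
def swap_nibbles(packed: bytes) -> bytes:
    """Exchange the high and low 4-bit halves of every byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)


def unpack_fp4_codes(packed: bytes, low_nibble_first: bool = True) -> list[int]:
    """Split each byte into two 4-bit codes in the requested order."""
    codes = []
    for b in packed:
        lo, hi = b & 0x0F, b >> 4
        codes.extend((lo, hi) if low_nibble_first else (hi, lo))
    return codes


# Hypothetical packed weights: two bytes, four 4-bit codes.
packed = bytes([0x1A, 0xF0])
swapped = swap_nibbles(packed)

# Reading swapped bytes low-nibble-first is the same as reading the
# original bytes high-nibble-first, which is why a loader that assumes
# one order needs a switch for checkpoints exported in the other.
same = unpack_fp4_codes(swapped) == unpack_fp4_codes(packed, low_nibble_first=False)
```

Making the swap configurable lets a loader accept either packing order without rewriting the checkpoint on disk.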

Validation

Remote RTX 5090 (4 GPUs), torch.compile disabled throughout benchmark/profile/correctness runs
pytest -q python/sglang/multimodal_gen/test/unit/test_transformer_quant.py -q in the remote diffusion container
BF16 benchmark denoise: 37.6940s
NVFP4 benchmark denoise: 29.0421s (22.95% faster)
BF16 end-to-end: 38.2545s
NVFP4 end-to-end: 29.4954s (22.90% faster)
Correctness check against BF16 at 512x512 / 8 steps: trajectory cosine 0.9933, final image PSNR 28.16 dB
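The "faster" percentages above are consistent with a relative reduction in wall-clock time, (BF16 - NVFP4) / BF16. The short check below reproduces them from the reported timings; the function names are illustrative and not part of the PR.

```python
def percent_time_saved(baseline_s: float, candidate_s: float) -> float:
    """Relative reduction in wall-clock time, in percent."""
    return (baseline_s - candidate_s) / baseline_s * 100.0


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Throughput-style speedup factor (baseline time / candidate time)."""
    return baseline_s / candidate_s


# Timings copied from the Validation section.
denoise_saved = percent_time_saved(37.6940, 29.0421)
e2e_saved = percent_time_saved(38.2545, 29.4954)
denoise_speedup = speedup(37.6940, 29.0421)
e2e_speedup = speedup(38.2545, 29.4954)

print(f"denoise: {denoise_saved:.2f}% less time, {denoise_speedup:.2f}x")
print(f"end-to-end: {e2e_saved:.2f}% less time, {e2e_speedup:.2f}x")
```

Expressed as a speedup factor, both measurements come out at roughly 1.30x.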

Sample outputs (images omitted): bf16, nvfp4
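The PR does not say how the cosine and PSNR figures were computed. As a rough guide to what such numbers mean, here is a minimal pure-Python sketch of the two standard metrics, assuming flattened value sequences and an 8-bit peak of 255 for PSNR. The function names and toy inputs are hypothetical, and "trajectory cosine" is assumed here to be a cosine similarity over intermediate latents.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy data standing in for flattened BF16 and NVFP4 outputs.
ref_pixels = [10.0, 20.0, 30.0, 40.0]
test_pixels = [12.0, 22.0, 28.0, 38.0]

print(f"cosine: {cosine_similarity(ref_pixels, test_pixels):.4f}")
print(f"PSNR:   {psnr(ref_pixels, test_pixels):.2f} dB")
```

Under these definitions, a cosine close to 1 means the two runs point in nearly the same direction in latent space, and a higher PSNR means a smaller mean squared pixel error.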

Notes

The validated FLUX.1-dev path uses --transformer-path for the mixed SGLang transformer override.
Profiling traces were captured on both main and this branch with identical 4-GPU settings and torch.compile disabled.

gemini-code-assist · 2026-04-11T04:27:12Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

BBuf · 2026-04-12T06:43:43Z

/tag-and-rerun-ci

diffusion: add FLUX.1-dev modelopt nvfp4 support

42df059

BBuf requested review from mickqian, ping1jing2, yhyang201 and yingluosanqian as code owners April 11, 2026 04:27

github-actions Bot added quant LLM Quantization blackwell SM100/SM120 diffusion SGLang Diffusion labels Apr 11, 2026

docs: refresh diffusion FLUX NVFP4 skills

29241e2

github-actions Bot added the documentation Improvements or additions to documentation label Apr 11, 2026

BBuf added 4 commits April 11, 2026 12:55

docs: narrow diffusion skill updates

84a9e7e

docs: move modelopt support matrix to quant docs

6555844

docs: drop stale fp8 converter references

d3eb46d

diffusion: unify modelopt transformer builders

4071ce1

BBuf requested review from DarkSharpness, HydraQYH, celve and yuan-luo as code owners April 11, 2026 15:12

github-actions Bot added the jit-kernel label Apr 11, 2026

github-actions Bot added the run-ci label Apr 12, 2026

docs: trim modelopt matrix note

792a53f

BBuf merged commit 03a1a7b into sgl-project:main Apr 12, 2026
86 of 124 checks passed

mickqian added a commit that referenced this pull request Apr 13, 2026

Revert "[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (#22574)"

2184b4e

This reverts commit 03a1a7b

mickqian mentioned this pull request Apr 13, 2026

Revert "[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (#22574)" #22649

Merged

5 tasks

mickqian added a commit that referenced this pull request Apr 13, 2026

Revert "[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (#22574)" (#22649)

bf022e1

BBuf added a commit that referenced this pull request Apr 13, 2026

Revert "Revert "[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (#22574)" (#22649)"

fbdc957

This reverts commit bf022e1.

pyc96 pushed a commit to pyc96/sglang that referenced this pull request Apr 14, 2026

[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (sgl-project#22574)

0d041f0

pyc96 pushed a commit to pyc96/sglang that referenced this pull request Apr 14, 2026

Revert "[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (sgl-project#22574)" (sgl-project#22649)

fdcc906

yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (sgl-project#22574)

dccd029

yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

Revert "[Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support (sgl-project#22574)" (sgl-project#22649)

e70cbce

BBuf mentioned this pull request Apr 29, 2026

SGLang AI Agent Performance Optimization PRs (2026-01-29 to 2026-04-29) BBuf/AI-Infra-Auto-Driven-SKILLS#46

Open

BBuf merged 7 commits into sgl-project:main from BBuf:codex/flux1-modelopt-nvfp4
