[diffusion] model: Fix FLUX.1 output correctness #21041

Merged
sglang-npu-bot merged 2 commits into sgl-project:main from avjves:feature/flux_correctness
Mar 24, 2026

Conversation

@avjves
Contributor

@avjves avjves commented Mar 20, 2026

Motivation

Currently, FLUX.1-dev has two correctness issues that degrade output quality: the output image is blocky, and the model cannot generate the higher-resolution (2048 x 2048) images that it should support. This PR fixes both issues, improving output quality significantly.

Modifications

  1. The image sequence length calculation now properly accounts for patchification. Without this fix, the mu values are too large, incorrectly pushing the sigma values toward 1.0.
  2. Change the self-attention so that it does not replicate the un-sharded text tokens. This follows how it is done for FLUX.2, for example. Already fixed in "[diffusion] fix: fix accuracy for some image models" (#20679).
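
To illustrate the first modification, here is a minimal sketch of how the image sequence length feeds into mu and the sigma shift. It is not the code changed in this PR. The VAE scale factor (8), the 2x2 patch size, and the shift constants (`base_seq_len=256`, `max_seq_len=4096`, `base_shift=0.5`, `max_shift=1.15`) are assumptions based on the commonly used FLUX.1-dev scheduler defaults, and the helper names are hypothetical.

```python
import math

# Assumed FLUX.1-dev constants (not stated in this PR).
VAE_SCALE_FACTOR = 8  # pixels per latent cell along each side
PATCH_SIZE = 2        # latents are packed into 2x2 patches


def image_seq_len(height, width, patchify=True):
    """Number of image tokens seen by the transformer (hypothetical helper)."""
    h, w = height // VAE_SCALE_FACTOR, width // VAE_SCALE_FACTOR
    if patchify:
        h, w = h // PATCH_SIZE, w // PATCH_SIZE
    return h * w


def calculate_mu(seq_len, base_seq_len=256, max_seq_len=4096,
                 base_shift=0.5, max_shift=1.15):
    """Linear interpolation of the shift parameter mu over sequence length."""
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    intercept = base_shift - slope * base_seq_len
    return seq_len * slope + intercept


def shift_sigma(sigma, mu):
    """Exponential time shift applied to a sigma in (0, 1]."""
    return math.exp(mu) / (math.exp(mu) + (1.0 / sigma - 1.0))


for patchify in (True, False):
    n = image_seq_len(1024, 1024, patchify=patchify)
    mu = calculate_mu(n)
    print(f"patchify={patchify}: seq_len={n}, mu={mu:.3f}, "
          f"shifted sigma(0.5)={shift_sigma(0.5, mu):.3f}")
```

Under these assumptions, ignoring patchification quadruples the sequence length (16384 instead of 4096 tokens at 1024x1024), which inflates mu well beyond the intended maximum shift and moves a mid-schedule sigma close to 1.0.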

Accuracy Tests

Without fixes:

1 GPU:

image

Run command:

sglang generate --model-path black-forest-labs/FLUX.1-dev --seed 42 --prompt "A small cat" --height 1024 --width 1024 --num-inference-steps 25 --ulysses-degree 1 --guidance-scale 0.0 --num-gpus 1

8 GPUs:

image

Run command:

sglang generate --model-path black-forest-labs/FLUX.1-dev --seed 42 --prompt "A small cat" --height 1024 --width 1024 --num-inference-steps 25 --ulysses-degree 8 --guidance-scale 0.0 --num-gpus 8

8 GPUs (2048x2048):

image

Run command:

sglang generate --model-path black-forest-labs/FLUX.1-dev --seed 42 --prompt "A small cat" --height 2048 --width 2048 --num-inference-steps 25 --ulysses-degree 8 --guidance-scale 0.0 --num-gpus 8

With fixes:

1 GPU:

image

Run command:

sglang generate --model-path black-forest-labs/FLUX.1-dev --seed 42 --prompt "A small cat" --height 1024 --width 1024 --num-inference-steps 25 --ulysses-degree 1 --guidance-scale 0.0 --num-gpus 1

8 GPUs:

image

Run command:

sglang generate --model-path black-forest-labs/FLUX.1-dev --seed 42 --prompt "A small cat" --height 1024 --width 1024 --num-inference-steps 25 --ulysses-degree 8 --guidance-scale 0.0 --num-gpus 8

8 GPUs (2048x2048):

image

Run command:

sglang generate --model-path black-forest-labs/FLUX.1-dev --seed 42 --prompt "A small cat" --height 2048 --width 2048 --num-inference-steps 25 --ulysses-degree 8 --guidance-scale 0.0 --num-gpus 8

Benchmarking and Profiling

| Stage Name | Baseline (ms) | New (ms) | Diff (ms) | Diff (%) | Status |
|---|---|---|---|---|---|
| InputValidationStage | 0.04 | 0.05 | +0.01 | +24.1% | ⚪️ |
| TextEncodingStage | 3713.29 | 3790.03 | +76.74 | +2.1% | ⚪️ |
| TimestepPreparationStage | 14.74 | 14.20 | -0.54 | -3.7% | ⚪️ |
| LatentPreparationStage | 7.39 | 5.91 | -1.48 | -20.0% | ⚪️ |
| DenoisingStage | 6975.15 | 7064.81 | +89.66 | +1.3% | ⚪️ |
| DecodingStage | 1112.75 | 1115.90 | +3.15 | +0.3% | ⚪️ |

@github-actions github-actions Bot added the diffusion SGLang Diffusion label Mar 20, 2026
@mickqian
Collaborator

excellent, could you attach the result of diffusers as well?

@avjves
Contributor Author

avjves commented Mar 23, 2026

Hi,

there's still a small difference between SGL-D and diffusers output:
image

However, the quality seems to be on par here: the image is no longer "blocky" and 2048x2048 images can be generated; the result is just a slightly different image. There are still some discrepancies in the implementation, I wager.

It seems that modification 2) was already fixed in #20679, so I'll remove it from my PR.

@avjves avjves force-pushed the feature/flux_correctness branch from 1bad8fe to 6dd8f1d Compare March 23, 2026 08:40
Collaborator

@mickqian mickqian left a comment

outstanding work

@mickqian
Collaborator

/tag-and-rerun-ci

@yhyang201
Collaborator

/tag-and-rerun-ci

@yhyang201
Collaborator

/tag-and-rerun-ci

@yhyang201
Collaborator

/rerun-failed-ci

@yhyang201
Collaborator

/rerun-failed-ci

@sglang-npu-bot sglang-npu-bot merged commit eefb504 into sgl-project:main Mar 24, 2026
83 of 89 checks passed
@ping1jing2
Collaborator

I merged it as Mick approved and all CIs passed

Labels

diffusion SGLang Diffusion run-ci
