
[diffusion] Validate attention backend for Ring Attention in USPAttention#21828

Merged
mickqian merged 1 commit into sgl-project:main from yeahdongcn:xd/assert_ring
Apr 4, 2026
Conversation

@yeahdongcn
Collaborator

@yeahdongcn yeahdongcn commented Apr 1, 2026

Motivation

This issue surfaced in a MUSA container where MATE is not installed. ServerArgs._adjust_attention_backend() validates the backend only at the string level and picks fa for Ring Attention when no attention backend is set. However, when the backend is actually resolved via get_attn_backend(), Torch SDPA can still be selected because it is the fallback. The result is silently incorrect behavior or confusing errors downstream.

> sglang generate --model-path /home/dist/diffusion/yeahdongcn/Qwen/Qwen-Image \
    --sp-degree 2 --ulysses-degree 1 --ring-degree 2 --num-gpus 2 \
    --warmup --prompt "Doraemon is eating dorayaki"

Modifications

Validate attention backend for Ring Attention in USPAttention.
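
As a rough illustration of the kind of check described here, the sketch below validates the resolved backend rather than the configured string. It is not the code from this PR: `AttnBackend`, `validate_ring_attention_backend`, and the enum values other than `fa` are hypothetical names chosen for the example, and the set of accepted backends follows the review summary (FlashAttention or SageAttention).

```python
from enum import Enum


class AttnBackend(Enum):
    # Hypothetical enum for illustration; only "fa" appears in the PR text.
    FLASH_ATTN = "fa"
    SAGE_ATTN = "sage_attn"
    TORCH_SDPA = "torch_sdpa"


# Backends assumed to work with Ring Attention, per the review summary.
RING_COMPATIBLE = frozenset({AttnBackend.FLASH_ATTN, AttnBackend.SAGE_ATTN})


def validate_ring_attention_backend(resolved: AttnBackend, ring_world_size: int) -> None:
    """Fail fast if ring parallelism is active with an unsupported backend.

    `resolved` is the backend actually selected at runtime (after any
    fallback), not the string the user passed on the command line.
    """
    if ring_world_size > 1 and resolved not in RING_COMPATIBLE:
        supported = ", ".join(sorted(b.value for b in RING_COMPATIBLE))
        raise RuntimeError(
            f"Ring Attention requires one of [{supported}], "
            f"but the resolved attention backend is '{resolved.value}'."
        )


# Ring degree 2 with a fallback backend is rejected up front.
try:
    validate_ring_attention_backend(AttnBackend.TORCH_SDPA, ring_world_size=2)
except RuntimeError as err:
    print(err)
```

Checking the resolved backend matters because a fallback can replace the configured choice after the string-level validation has already passed.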

Accuracy Tests

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a validation check to ensure that Ring Attention is only used with FlashAttention or SageAttention backends when the ring parallel world size is greater than one. Feedback suggests replacing the assert statement with a RuntimeError to ensure the check is always performed even when Python optimizations are enabled, along with a suggestion to refine the error message for better readability.
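
To make the point about Python optimizations concrete, here is a small self-contained sketch (not taken from the PR) that compiles the same two checks with `optimize=1`, which matches what `python -O` does to `assert` statements. The function names `check_with_assert` and `check_with_raise` are hypothetical.

```python
SOURCE = '''
def check_with_assert(ok):
    assert ok, "backend not supported"
    return "passed"

def check_with_raise(ok):
    if not ok:
        raise RuntimeError("backend not supported")
    return "passed"
'''


def load(optimize: int) -> dict:
    """Compile SOURCE at the given optimization level and return its namespace."""
    namespace: dict = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace


optimized = load(optimize=1)  # asserts are stripped, as with `python -O`

# The assert-based check silently passes even though the condition is false.
assert_result = optimized["check_with_assert"](False)

# The explicit raise still fires at the same optimization level.
try:
    optimized["check_with_raise"](False)
    raise_fired = False
except RuntimeError:
    raise_fired = True

print(assert_result, raise_fired)
```

With optimizations enabled, the assert-based check returns normally for an invalid input, while the explicit `RuntimeError` is still raised.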

Comment thread python/sglang/multimodal_gen/runtime/layers/attention/layer.py Outdated
…tion

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
@yeahdongcn yeahdongcn changed the title [diffusion] Add attention backend assertion for Ring Attention [diffusion] Validate attention backend for Ring Attention in USPAttention Apr 1, 2026
Collaborator

@mickqian mickqian left a comment

/tag-and-rerun-ci

@mickqian
Collaborator

mickqian commented Apr 4, 2026

/tag-and-rerun-ci

@github-actions github-actions Bot added the run-ci label Apr 4, 2026
@yhyang201
Collaborator

/tag-and-rerun-ci

@yhyang201
Collaborator

/rerun-failed-ci

@mickqian mickqian merged commit 1fb4bf3 into sgl-project:main Apr 4, 2026
159 of 169 checks passed
sundar24295s pushed a commit to sundar24295s/sglang that referenced this pull request Apr 4, 2026
…Attention (sgl-project#21828)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026
…Attention (sgl-project#21828)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Fridge003 pushed a commit that referenced this pull request Apr 7, 2026
…Attention (#21828)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
xiezhq-hermann pushed a commit to antgroup/sglang that referenced this pull request Apr 7, 2026
…Attention (sgl-project#21828)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026
…Attention (sgl-project#21828)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Labels

diffusion SGLang Diffusion run-ci

3 participants