[diffusion] [AMD] model: allow AITER backends in Flux 2 pipeline #22802
HaiShaw merged 4 commits into sgl-project:main
Conversation
Code Review
This pull request adds AITER and AITER_SAGE to the supported attention backends for the Flux2Transformer2DModel. However, feedback indicates that enabling these backends is premature due to critical integration issues, including a tensor layout mismatch in the AITER implementation, missing entries in the Ring Attention whitelist, and signature incompatibilities in the AITER_SAGE forward method.
```python
AttentionBackendEnum.AITER,
AttentionBackendEnum.AITER_SAGE,
```
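For readers outside the codebase, the quoted diff amounts to extending a supported-backend set. The sketch below is illustrative only: the real `AttentionBackendEnum` and the Flux2 backend list live in the sglang codebase, and the surrounding structure here (member names beyond the four shown, the set name `FLUX2_SUPPORTED_BACKENDS`) is assumed, not taken from the PR.

```python
from enum import Enum, auto

# Illustrative sketch -- not the actual sglang definitions.
class AttentionBackendEnum(Enum):
    SDPA = auto()
    FA = auto()
    AITER = auto()
    AITER_SAGE = auto()

# Hypothetical supported-backend set for the Flux2 transformer.
FLUX2_SUPPORTED_BACKENDS = {
    AttentionBackendEnum.SDPA,
    AttentionBackendEnum.FA,
    AttentionBackendEnum.AITER,       # added by this PR
    AttentionBackendEnum.AITER_SAGE,  # added by this PR
}

print(AttentionBackendEnum.AITER in FLUX2_SUPPORTED_BACKENDS)  # True
```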
Enabling these backends for Flux2 is premature due to several integration issues in the underlying attention infrastructure:
1. **AITER Layout Mismatch (High Severity):** The `AITerImpl.forward` implementation in `aiter.py` expects a `[batch_size, num_heads, seq_len, head_dim]` layout (as explicitly stated in its docstring and comments). However, `USPAttention` (which Flux2 uses) provides tensors in `[batch_size, seq_len, num_heads, head_dim]` format for both the local and replicated-prefix paths. This will result in incorrect attention computation.
2. **Ring Attention Whitelist (Medium Severity):** Both `AITER` and `AITER_SAGE` are currently missing from the Ring Attention whitelist in `python/sglang/multimodal_gen/runtime/layers/attention/layer.py`. Any attempt to use Flux2 with these backends and Ring Attention enabled will trigger a `RuntimeError` in `USPAttention.__init__`.
3. **AITER_SAGE Signature (Medium Severity):** The `AITERSageImpl.forward` method does not accept `**kwargs`. Even if the Ring Attention whitelist is updated, `ring_attn` might still fail if it attempts to pass additional parameters (like `dropout_p` or `is_causal`) to the implementation.
Please ensure the backends in aiter.py, aiter_sage.py, and the whitelist in layer.py are updated to support the Flux2 pipeline requirements before enabling them here.
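To make the layout dispute above concrete, here is a minimal sketch (plain NumPy; the layout abbreviations and helper name are illustrative, not sglang APIs) of the two tensor layouts being discussed and the axis swap that converts between them:

```python
import numpy as np

# Two attention tensor layouts discussed in the review (names are illustrative):
#   BSHD: [batch_size, seq_len, num_heads, head_dim]  -- what USPAttention provides
#   BHSD: [batch_size, num_heads, seq_len, head_dim]  -- what the review claimed AITER expects

def bshd_to_bhsd(x: np.ndarray) -> np.ndarray:
    """Swap the seq_len and num_heads axes: [B, S, H, D] -> [B, H, S, D]."""
    return np.swapaxes(x, 1, 2)

batch, seq_len, num_heads, head_dim = 2, 16, 4, 8
q_bshd = np.random.randn(batch, seq_len, num_heads, head_dim)
q_bhsd = bshd_to_bhsd(q_bshd)

print(q_bshd.shape)  # (2, 16, 4, 8)
print(q_bhsd.shape)  # (2, 4, 16, 8)
```

If an implementation reads a `[B, S, H, D]` tensor as if it were `[B, H, S, D]`, attention is computed over the wrong axis entirely, which is why the reviewer flagged this as high severity; the maintainer's later reply clarifies that only the docstring, not the actual layout, was wrong.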
They are already supported; it's just the Flux DiT that disabled them. Tested running with AITER just fine.
Hmm, @avjves, could you please provide some insight here? Also, it would be good to provide example runs with the aiter and aiter_sage backends in the PR body.
Sorry, my response to Gemini here maybe wasn't as clear as it should have been.
- The AITER implementation has a wrong docstring. It says "[batch_size, seq_len, head_num, head_dim]", but that layout would crash whenever AITER is used anywhere in the codebase. The seq_len and head_num dimensions are in reality swapped. As such, there is no layout mismatch here. I can change the docstring as well, though it's not really in the scope of this PR.
- A check for ring-attn backends was added in #21828 ([diffusion] Validate attention backend for Ring Attention in USPAttention), but it doesn't include AITER. It should be added there as well, but again, that's not in the scope of this PR, as it isn't a Flux2 issue per se. I try to avoid folding too many "unrelated" fixes into a PR after the fact to keep the history clear. I do see this adds review overhead, so maybe it is better to bundle everything together and change the scope of the PR afterwards.
- This is a newer change, as most attention backends do not take extra arguments. This should be looked at when fixing 2).
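As an aside, the `**kwargs` issue from point 3) of the review is easy to illustrate. The sketch below is a toy example, not the actual sglang signatures: a `forward()` without `**kwargs` raises `TypeError` when a caller (such as a ring-attention wrapper) passes extra keyword arguments, while accepting `**kwargs` lets an implementation silently ignore parameters it does not use.

```python
# Toy illustration -- class and parameter names are hypothetical,
# not the real AITERSageImpl / ring_attn interfaces.

class StrictImpl:
    def forward(self, q, k, v):
        return q  # stand-in for the real attention computation

class TolerantImpl:
    def forward(self, q, k, v, **kwargs):
        return q  # extra kwargs such as dropout_p / is_causal are ignored

strict, tolerant = StrictImpl(), TolerantImpl()

# A wrapper passing extra keyword arguments breaks the strict signature...
try:
    strict.forward(1, 2, 3, dropout_p=0.0)
    strict_ok = True
except TypeError:
    strict_ok = False

# ...but is absorbed harmlessly by the tolerant one.
tolerant_ok = tolerant.forward(1, 2, 3, dropout_p=0.0, is_causal=False) == 1
print(strict_ok, tolerant_ok)  # False True
```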
Sure, I'll add some example outputs as well :)
Added the output images as well :)
Fixed the docstring, but I think 2) and 3) warrant their own PR.
@amd-bot ci-status
CI Status for PR #22802
PR: [diffusion] [AMD] model: allow AITER backends in Flux 2 pipeline
AMD: 2 failures (0 likely related) | Others: 1 failure (0 related)
AMD CI Failures
Other CI Failures
Details: All failures are unrelated to this PR's change.
Verdict: No action needed from the PR author. All failures are pre-existing bugs (now fixed) or infrastructure issues. Re-running CI after rebasing on latest main (which includes #22690) should resolve the remaining failures.
/tag-and-rerun-ci
@HaiShaw Is everything now OK with this? Would be great to get it merged ASAP.
Can you provide a full run example with the command (which is useful for others to begin with)?
Sure, added to the PR description. Posting here as well: the images shown in the description were generated with this command.
Motivation
This PR is related to #22690.
PR #22423 specified supported backends for Flux2, but this only included SDPA and FA, not AITER. This is a big performance regression for AMD devices.

Modifications

- Added AITER and AITER_SAGE as supported backends for Flux2

Accuracy Tests
With SDPA as backend:

With AITER as backend:

Run command for both (the first was run without the fixes in this PR, the latter with them):
Speed Tests and Profiling
Checklist
Review and Merge Process
/tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci