Fix AuraFlow attn processors applying norm_added_q to key projection #13533

Merged
yiyixuxu merged 1 commit into huggingface:main from Ricardo-M-L:fix-auraflow-norm-added-k on Apr 21, 2026

Conversation

@Ricardo-M-L
Contributor

What does this PR do?

Both AuraFlowAttnProcessor2_0 and FusedAuraFlowAttnProcessor2_0 in attention_processor.py contain a copy-paste error where attn.norm_added_q is called on the encoder key projection while guarded by a check on attn.norm_added_k:

if attn.norm_added_q is not None:
    encoder_hidden_states_query_proj = attn.norm_added_q(encoder_hidden_states_query_proj)
if attn.norm_added_k is not None:
    encoder_hidden_states_key_proj = attn.norm_added_q(encoder_hidden_states_key_proj)  # bug: should be norm_added_k

This silently applies the query's added-QK norm to the key projection, which produces wrong results whenever the two layers have learned different parameters, and is in any case incorrect as a statement of the attention computation.
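
To make the effect concrete, here is a minimal, hypothetical sketch. It uses torch.nn.RMSNorm (available in recent PyTorch) as a stand-in for the processors' added-QK norm layers; the setup is illustrative and not taken from the diffusers code:

import torch

# Stand-in layers: in the real processors, norm_added_q and norm_added_k are
# separate norm modules with independently learned weights.
norm_added_q = torch.nn.RMSNorm(8)
norm_added_k = torch.nn.RMSNorm(8)
with torch.no_grad():
    norm_added_k.weight.mul_(0.5)  # pretend training made the weights diverge

key_proj = torch.randn(2, 8)
# Once the weights differ, applying the query norm to the key projection
# no longer matches the intended computation.
print(torch.allclose(norm_added_q(key_proj), norm_added_k(key_proj)))  # False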

Every other attention processor in this file that defines both norm_added_q and norm_added_k (e.g. FluxAttnProcessor, CogVideoXAttnProcessor, HunyuanAttnProcessor, etc.) correctly applies norm_added_k to the key.

This PR fixes both occurrences (lines 2143 and 2240).
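
After the change, each guard applies the matching layer (a sketch of the corrected lines, using the same names as the snippet above):

if attn.norm_added_q is not None:
    encoder_hidden_states_query_proj = attn.norm_added_q(encoder_hidden_states_query_proj)
if attn.norm_added_k is not None:
    encoder_hidden_states_key_proj = attn.norm_added_k(encoder_hidden_states_key_proj)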

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?

Who can review?

@yiyixuxu @sayakpaul

Both AuraFlowAttnProcessor2_0 and FusedAuraFlowAttnProcessor2_0 were
calling attn.norm_added_q on encoder_hidden_states_key_proj while
guarded by a check on attn.norm_added_k. This applies the query
normalization layer to the key, which is a copy-paste error.

Consistent with every other attention processor in this file that
defines both norm_added_q and norm_added_k (e.g. FluxAttnProcessor,
CogVideoXAttnProcessor, HunyuanAttnProcessor), where norm_added_k is
applied to the added key projection.
@github-actions github-actions bot added the models and size/S (PR with diff < 50 LOC) labels on Apr 21, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@yiyixuxu left a comment


thanks!

@yiyixuxu yiyixuxu merged commit b9d6420 into huggingface:main Apr 21, 2026
13 of 14 checks passed
terarachang pushed a commit to terarachang/diffusers that referenced this pull request Apr 30, 2026
…uggingface#13533)
