
[Bugfix] fix ragged attention kernel auto-tuning table key#9497

Merged: yaochengji merged 7 commits into master from chengji/fix-attn-table on Jul 23, 2025
Conversation

@yaochengji (Collaborator) commented Jul 22, 2025

Fix a bug where q_heads, kv_heads, and max_model_len were accidentally padded to a power of 2.

Without the fix, models like Qwen2.5-7B (q_heads=28, kv_heads=4) cannot hit the attention auto-tuning table.
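To make the failure mode concrete, here is a minimal, hypothetical sketch (names and table contents are illustrative, not the actual torch_xla code): the tuning table is keyed on the model's exact head counts, so rounding the key up to a power of 2 produces a key that misses the table.

```python
# Illustrative sketch of the lookup bug; tuning_table and its entries are
# made-up stand-ins for the real auto-tuning table.

def next_power_of_2(x: int) -> int:
    """Smallest power of 2 >= x (for x >= 1)."""
    return 1 if x <= 0 else 1 << (x - 1).bit_length()

# Table entries were generated with the model's real head counts.
tuning_table = {(28, 4): {"num_kv_pages_per_block": 16}}  # Qwen2.5-7B

q_heads, kv_heads = 28, 4

# Buggy lookup: head counts get padded to powers of 2 -> (32, 4), a miss.
padded_key = (next_power_of_2(q_heads), next_power_of_2(kv_heads))
assert padded_key not in tuning_table

# Fixed lookup: use the raw head counts -> hit.
assert (q_heads, kv_heads) in tuning_table
```

With the padding in place, every model whose head counts are not already powers of 2 silently falls back to default kernel parameters instead of the tuned ones.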

@yaochengji yaochengji force-pushed the chengji/fix-attn-table branch from 53077e8 to 3b99bc8 Compare July 22, 2025 19:44
@yaochengji yaochengji requested a review from vanbasten23 July 22, 2025 19:45
@yaochengji (Collaborator, Author) commented:
@bythew3i could you take a look?

@bythew3i (Contributor) left a comment:

Thanks Chengji.

The padding is intended. I think the bug is in our autotuning rather than here: we should pad num_heads before running autotune so the table stays smaller.

To unblock the model this PR LGTM, but we really need to fix the autotune script.
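The alternative bythew3i describes can be sketched as follows (again with hypothetical names): pad the head counts when *generating* the table, and pad the same way at lookup time, so padded keys always hit and multiple models can share one table entry.

```python
# Illustrative sketch of padding consistently on both sides; table_key and
# tuning_table are made-up stand-ins, not the real autotune script.

def next_power_of_2(x: int) -> int:
    return 1 if x <= 0 else 1 << (x - 1).bit_length()

def table_key(q_heads: int, kv_heads: int) -> tuple[int, int]:
    # Apply the same padding at autotune time and at lookup time.
    return (next_power_of_2(q_heads), next_power_of_2(kv_heads))

# Autotune step records results under the padded key.
tuning_table = {table_key(28, 4): {"num_kv_pages_per_block": 16}}

# Lookup with the same padding hits, even though 28 isn't a power of 2.
assert table_key(28, 4) in tuning_table  # key is (32, 4)
```

The trade-off: a smaller table (models with q_heads 17 through 32 share one entry), at the cost of tuning against padded rather than exact shapes.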

@yaochengji (Collaborator, Author) replied:


Thanks @bythew3i, I will change it back once the auto-tuning table is corrected.

@yaochengji yaochengji enabled auto-merge (squash) July 22, 2025 21:46
Review comment threads on torch_xla/experimental/pallas_kernels/ragged_paged_attention_v2.py (one marked outdated).
@vanbasten23 (Collaborator) left a comment:

Thanks Chengji!

@yaochengji yaochengji force-pushed the chengji/fix-attn-table branch from 5ef3bc6 to 0bbf02d Compare July 23, 2025 05:26
@yaochengji yaochengji merged commit 31c4c2f into master Jul 23, 2025
23 of 24 checks passed