
Remove dynamic grid #8896

Merged
yaochengji merged 1 commit into pytorch:master from bythew3i:ragged-attn-v2
Mar 27, 2025

Conversation

@bythew3i (Contributor) commented Mar 27, 2025

The call site of the kernel probably did not check that cu_q_lens[num_seqs[0]] (i.e., the actual total number of batched q tokens) is <= max_num_batched_tokens. This could cause a hang if we use a dynamic grid in the kernel.

We should consider adding runtime validation before calling the kernel to prevent cases like this.
For now, we simply roll back the dynamic grid change so the integration works.
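The runtime validation suggested above could look something like the sketch below. This is a hypothetical helper, not code from this PR: the names cu_q_lens, num_seqs, and max_num_batched_tokens are taken from the description, and the function name and signature are assumptions for illustration.

```python
def validate_ragged_attention_inputs(cu_q_lens, num_seqs, max_num_batched_tokens):
    """Hypothetical pre-call check (not from this PR).

    cu_q_lens is the cumulative q-length array, so cu_q_lens[num_seqs[0]]
    is the actual total number of batched q tokens. If that exceeds the
    static max_num_batched_tokens bound, a dynamically sized kernel grid
    could hang, so we fail fast on the host instead.
    """
    actual_total_q_len = cu_q_lens[num_seqs[0]]
    if actual_total_q_len > max_num_batched_tokens:
        raise ValueError(
            f"cu_q_lens[num_seqs[0]] = {actual_total_q_len} exceeds "
            f"max_num_batched_tokens = {max_num_batched_tokens}"
        )
```

A check like this would let the call site reject oversized inputs with a clear error rather than hanging inside the kernel.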

Tested:

python test/test_pallas.py -v -k PallasTest.test_ragged_paged_attention_wrapper

@yaochengji yaochengji merged commit b24e6e9 into pytorch:master Mar 27, 2025
23 checks passed
