[FlexAttention] Enable different qk and v head-dims #134043
drisspg wants to merge 14 commits into gh/drisspg/36/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/134043
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit 067c895 with merge base 5f3d22a. This comment was automatically generated by Dr. CI and updates every 15 minutes.
# Summary

Adds the option for the head dims to differ between the QK and V tensors. Local testing shows that when QK_HEAD_DIM > V head dim this works great for the forward pass; not when V > QK, still debugging.
# Summary

Adds the option for the head dims to differ between the QK and V tensors. Fixes issue: #133674
Review comments were left on these kernel lines:

```python
q_range = stride_qg * off_g[:, None, None] + stride_qm * off_m[None, :, None] + stride_qk * offs_d[None, None, :]
```

```python
Q_block_ptr = tl.make_block_ptr(
```
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
@pytorchbot merge -f "stucked ROCM jobs, flex attention unit tests only on CUDA" |
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.

@pytorchbot merge -f "stucked ROCM jobs, flex attention unit tests only on CUDA"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
@pytorchbot revert -m "Need to revert, in order to be able to revert #133373, feel free to reland this after solving conflicts" -c ghfirst
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit e847b6b. Reverted #134043 on behalf of https://github.com/jeanschmidt due to: Need to revert, in order to be able to revert #133373, feel free to reland this after solving conflicts.
@drisspg your PR has been successfully reverted. |
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
# Summary

Adds the option for the head dims to differ between the QK and V tensors. Fixes issue: pytorch#133674. V_DIM > QK_DIM is blocked on landing triton-lang/triton#4138 / triton-lang/triton#4540 into PyTorch's Triton branch.

Pull Request resolved: pytorch#134043
Approved by: https://github.com/Chillee
…134043)"

This reverts commit e847b6b. Reverted pytorch#134043 on behalf of https://github.com/jeanschmidt due to: Need to revert, in order to be able to revert pytorch#133373, feel free to reland this after solving conflicts.
Stack from ghstack (oldest at bottom):
Summary
Adds the option for the head dims to be different between QK and V tensors.
Fixes issue: #133674
V_DIM > QK_DIM is blocked on landing triton-lang/triton#4138 / triton-lang/triton#4540 into PyTorch's Triton branch.
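As a minimal sketch of why the two head dims are independent (plain-PyTorch reference attention, not the PR's FlexAttention kernel; shapes and names are illustrative): the QK head dim only enters the score matrix, while the output head dim is inherited from V.

```python
import torch

def attention_ref(q, k, v):
    # q, k: (B, H, S, D_qk); v: (B, H, S, D_v) -> output: (B, H, S, D_v)
    # Scores are (B, H, S, S) regardless of D_qk; the output head dim comes from V.
    scores = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
    return scores.softmax(dim=-1) @ v

# QK_HEAD_DIM > V_HEAD_DIM: the direction this PR supports today.
B, H, S, D_qk, D_v = 2, 4, 16, 128, 64
q = torch.randn(B, H, S, D_qk)
k = torch.randn(B, H, S, D_qk)
v = torch.randn(B, H, S, D_v)

out = attention_ref(q, k, v)
print(out.shape)  # torch.Size([2, 4, 16, 64])
```

The same shape contract is what `flex_attention` exposes after this change, with the fused kernel handling the differing dims internally.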
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang