Disable Flash Attention GQA support on ROCM by xinyazhang · Pull Request #133884 · pytorch/pytorch

xinyazhang · 2024-08-19T18:20:05Z

Flash Attention on ROCm does not currently support GQA (grouped-query attention), so this PR disables it there.

Partially addresses #133540
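For readers unfamiliar with the check being described, here is a minimal sketch of the gating logic in plain Python. It is not the code from this PR: `SdpaShapes`, `is_gqa`, and `can_use_flash_attention_sketch` are hypothetical names, and the real `can_use_flash_attention` check inspects many more conditions. The sketch assumes GQA means that the query tensor has more heads than the key/value tensors, with the query head count a multiple of the key/value head count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SdpaShapes:
    """Hypothetical stand-in for the head counts a dispatcher would inspect."""
    num_q_heads: int
    num_kv_heads: int


def is_gqa(shapes: SdpaShapes) -> bool:
    # Assumption: an input counts as GQA when the query and key/value
    # head counts differ.
    return shapes.num_q_heads != shapes.num_kv_heads


def can_use_flash_attention_sketch(shapes: SdpaShapes, is_rocm: bool) -> bool:
    """Illustrative gate only; the real check covers dtype, head_dim, etc."""
    if is_gqa(shapes):
        # Query heads must split evenly into groups sharing one KV head.
        if shapes.num_q_heads % shapes.num_kv_heads != 0:
            return False
        # The behavior this PR describes: report GQA as unsupported on ROCm
        # so the dispatcher can fall back to another backend.
        if is_rocm:
            return False
    return True
```

With this sketch, an input with 32 query heads and 8 key/value heads is rejected when `is_rocm` is true and accepted otherwise, while an input with equal head counts is accepted in both cases.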

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang

pytorch-bot · 2024-08-19T18:20:07Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133884


❌ 1 New Failure, 1 Unrelated Failure

As of commit 3246e80 with merge base f31404b:

NEW FAILURE - The following job has failed:

rocm / linux-focal-rocm6.1-py3.8 / test (default, 2, 6, linux.rocm.gpu.2) (gh)
test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16

FLAKY - The following job failed but was likely due to flakiness present on trunk:

rocm / linux-focal-rocm6.1-py3.8 / test (default, 3, 6, linux.rocm.gpu.2) (gh) (disabled by #134143)
inductor/test_b2b_gemm.py::B2BGEMMTest::test_b2b_gemm_trivial_right_assoc_good_shape

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jithunnair-amd · 2024-08-27T20:04:39Z

Closing this in favor of #134498, which will include the GQA-related changes.

xinyazhang · 2024-09-12T15:40:55Z

This is already part of the AOTriton 0.7b integration PR.

xinyazhang added 2 commits August 19, 2024 17:38

Skip enable_gqa=True tests

8eadde5

Claim GQA is not supported on ROCM in can_use_flash_attention

3246e80
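The first commit above skips the `enable_gqa=True` test variants. A minimal sketch of that pattern with the standard `unittest` module follows. It is an illustration under stated assumptions, not the change made in `test_transformers.py`: the `IS_ROCM` flag, the test class, and the `run_suite` helper are all hypothetical.

```python
import unittest

# Hypothetical flag; a real suite would detect the ROCm build itself.
IS_ROCM = True


class GqaFlashAttentionTest(unittest.TestCase):
    def _check(self, enable_gqa: bool) -> None:
        if enable_gqa and IS_ROCM:
            # Skip rather than fail: the feature is unsupported, not broken.
            self.skipTest("GQA is not supported by Flash Attention on ROCm")
        # Placeholder for the real comparison against a reference result.
        self.assertTrue(True)

    def test_enable_gqa_false(self) -> None:
        self._check(enable_gqa=False)

    def test_enable_gqa_true(self) -> None:
        self._check(enable_gqa=True)


def run_suite() -> unittest.TestResult:
    """Run both variants and return the collected result."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(GqaFlashAttentionTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Skipping inside the test body keeps the `enable_gqa=False` variants running on ROCm, so the rest of the Flash Attention coverage is unaffected.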

pytorch-bot bot added the "module: rocm" (AMD GPU support for Pytorch) label Aug 19, 2024

pytorchbot added the open source label Aug 19, 2024

pruthvistony added the "ciflow/rocm" (Trigger "default" config CI on ROCm), "rocm" (This tag is for PRs from ROCm team), and "rocm priority" (high priority ROCm PRs from performance or other aspects) labels Aug 19, 2024

jithunnair-amd approved these changes Aug 21, 2024


xinyazhang marked this pull request as ready for review August 21, 2024 15:01

xinyazhang mentioned this pull request Aug 23, 2024

[ROCM] Properly disable Flash Attention/Efficient Attention with environment variables #133866

Closed

jithunnair-amd requested a review from malfet August 27, 2024 18:24

jithunnair-amd marked this pull request as draft August 27, 2024 19:43

xinyazhang closed this Sep 12, 2024


xinyazhang wants to merge 2 commits into pytorch:main from ROCm:xinyazhang/nogqa-2.5main


4 participants
