[release/2.3] [ROCM] Properly disable Flash Attention/Efficient Attention with environment variables by xinyazhang · Pull Request #1571 · ROCm/pytorch

xinyazhang · 2024-09-03T21:35:27Z

With this change, `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` now compiles correctly.

This is a cherry-picked version of pytorch#133866.
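As a rough illustration of what disabling these flags means at the source level, the sketch below assumes the build system defines the `USE_FLASH_ATTENTION` and `USE_MEM_EFF_ATTENTION` macros only when the corresponding feature is enabled. The helper names (`flash_attention_compiled`, `mem_efficient_compiled`, `needs_aotriton`) are hypothetical and are not taken from the PyTorch sources.

```cpp
// Hypothetical helpers, not PyTorch APIs.
// Assumption: the build system passes -DUSE_FLASH_ATTENTION and
// -DUSE_MEM_EFF_ATTENTION only when the feature is enabled, so setting the
// environment variable to 0 leaves the macro undefined.
// Note: #ifdef is also true for -DUSE_FLASH_ATTENTION=0, which is why the
// macro must be left undefined rather than defined as 0.

// Reports whether Flash Attention kernels were compiled into this build.
constexpr bool flash_attention_compiled() {
#ifdef USE_FLASH_ATTENTION
  return true;
#else
  return false;
#endif
}

// Reports whether Efficient Attention kernels were compiled into this build.
constexpr bool mem_efficient_compiled() {
#ifdef USE_MEM_EFF_ATTENTION
  return true;
#else
  return false;
#endif
}

// The kernel library is only worth including when at least one of the two
// backends is enabled; with both disabled it can be skipped entirely.
constexpr bool needs_aotriton() {
  return flash_attention_compiled() || mem_efficient_compiled();
}
```

Compiled without either definition, all three helpers evaluate to `false`, which corresponds to the configuration this PR is meant to make buildable.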

aten/src/ATen/native/transformers/cuda/sdp_utils.cpp

jithunnair-amd · 2024-09-11T00:45:58Z

aten/src/ATen/native/transformers/cuda/sdp_utils.cpp

+                  "Mem Efficient attention was not compiled for current AMD GPU architecture. Attempting to run on architecture ", dprops->gcnArchName);
+      }
+      return false;
+  }


@pruthvistony Isn't this where we would need to add a return true to address the "Control reaching end of non-void function" error?

No, it's NOT here.

pruthvistony · 2024-09-11T15:49:44Z

For 2.3, mem_efficient should always be false.
@xinyazhang, please update the PR.
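The "Control reaching end of non-void function" error mentioned in this thread typically appears when every `return` sits inside a preprocessor branch that may be compiled out. The sketch below shows one way to avoid it, under stated assumptions: `mem_efficient_usable` and its `arch_supported` parameter are hypothetical, and the body is not the actual code in `sdp_utils.cpp`.

```cpp
#include <iostream>

// Hypothetical stand-in for a hardware-support check, not the PyTorch function.
// `arch_supported` replaces the real architecture lookup for illustration.
inline bool mem_efficient_usable(bool arch_supported, bool debug) {
#ifdef USE_MEM_EFF_ATTENTION
  // Feature compiled in: usable only on a supported architecture.
  if (arch_supported) {
    return true;
  }
  if (debug) {
    std::cerr << "Mem Efficient attention was not compiled for current "
                 "AMD GPU architecture.\n";
  }
#else
  // Feature compiled out: the argument is unused in this configuration.
  (void)arch_supported;
  if (debug) {
    std::cerr << "Mem Efficient attention was not compiled into this build.\n";
  }
#endif
  // Unconditional return placed outside the #ifdef, so the function returns
  // a value on every path in every build configuration.
  return false;
}
```

Keeping the final `return false;` outside the conditional block means the function stays well-formed whether or not the macro is defined, and a build without the macro always reports the backend as unavailable.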

xinyazhang added 3 commits September 3, 2024 21:28

Do not include aotriton if USE_FLASH_ATTENTION=0 AND USE_MEM_EFF_ATTENTION=0

1a8026e

Disable AOTriton in sdp_utils.cpp if USE_FLASH_ATTENTION=0 AND USE_MEM_EFF_ATTENTION=0

18c78cf

Fix circular dependency of USE_FLASH_ATTENTION/USE_MEM_EFF_ATTENTION in build system

988567c

pruthvistony reviewed Sep 9, 2024

aten/src/ATen/native/transformers/cuda/sdp_utils.cpp

jithunnair-amd changed the title ~~[ROCM] Properly disable Flash Attention/Efficient Attention with environment variables~~ [release/2.3] [ROCM] Properly disable Flash Attention/Efficient Attention with environment variables Sep 11, 2024

jithunnair-amd reviewed Sep 11, 2024

xinyazhang added 2 commits September 11, 2024 15:50

No ME on ROCM

59ff632

Disregard ME enablement since AOTriton 0.4.x does not have ME support

2c4693b

xinyazhang force-pushed the xinyazhang/internal-2.3-nofa branch from 56f999f to 2c4693b Compare September 11, 2024 15:56

Update to handle compilation break in debug mode

cb7696d

pruthvistony approved these changes Sep 11, 2024

pruthvistony merged commit 1b935e2 into release/2.3 Sep 11, 2024

pruthvistony deleted the xinyazhang/internal-2.3-nofa branch September 11, 2024 18:39

pruthvistony merged 6 commits into release/2.3 from xinyazhang/internal-2.3-nofa
