[varlen_attn for inference] add test to aot_inductor #175936
liangel-02 wants to merge 19 commits into gh/liangel-02/16/base
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175936
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (4 Unrelated Failures)
As of commit 2e18591 with merge base 4bc9d7f:
- FLAKY - The following jobs failed but were likely due to flakiness present on trunk.
- BROKEN TRUNK - The following jobs failed but were already failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
- UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
@unittest.skipIf(not SM90OrLater, "FA3 requires SM90+")
@unittest.skipIf("FA3" not in list_flash_attention_impls(), "FA3 not available")
def test_varlen_attn_paged_kv_cache(self):
    if self.device != GPU_TYPE:
```
The `self.device != GPU_TYPE` check is needed instead of `@requires_gpu` because otherwise the CPU variant of the test will still run.
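To illustrate why the in-body check matters, here is a minimal, hedged sketch of the pattern. `SM90OrLater`, `GPU_TYPE`, and the class name are stand-ins for the real harness values, not the actual test code:

```python
import unittest

SM90OrLater = False   # assumption: in reality derived from CUDA capability checks
GPU_TYPE = "cuda"     # assumption: stand-in for the harness's GPU device constant

class VarlenAttnSketch(unittest.TestCase):
    device = "cpu"    # the harness instantiates both a CPU and a GPU variant

    @unittest.skipIf(not SM90OrLater, "FA3 requires SM90+")
    def test_varlen_attn_paged_kv_cache(self):
        # In-body guard: class-level decorators are evaluated at import time and
        # cannot see self.device, so without this check the auto-generated CPU
        # variant of the test would still execute the GPU-only body.
        if self.device != GPU_TYPE:
            self.skipTest("GPU-only test")
```

Either the decorator or the in-body guard fires, so the test is skipped rather than failed on unsupported configurations.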
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy kadeng muchulee8 amjames chauhang aakhundov coconutruben jataylo
This reverts commit cc142e7. Reverted #175936 on behalf of https://github.com/zou3519 due to: sorry, I think this broke inductor ROCm (#175897 (comment)).

@liangel-02 your PR has been reverted as part of the stack under #175897.
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy kadeng muchulee8 amjames chauhang aakhundov coconutruben jataylo [ghstack-poisoned]
Starting merge as part of PR stack under #176723
`aten/src/ATen/native/transformers/cuda/attention.cu`
- renamed `_flash_attention_forward` to `_flash_attention_forward_impl`; this is now the core logic and takes an `optional<Tensor> out`
- `_flash_attention_forward` is the non-out variant: a thin wrapper that calls `_flash_attention_forward_impl` with `out=std::nullopt`
- `_flash_attention_forward_no_dropout_inplace` is the out variant and calls `_flash_attention_forward_impl` with a `Tensor& out`

`aten/src/ATen/native/native_functions.yaml`
- registered a new op, `_flash_attention_forward_no_dropout_inplace`

`torch/_meta_registrations.py`
- added a meta registration that calls `meta__flash_attention_forward` but doesn't return the out tensor

`torch/nn/attention/varlen.py`
- added public `varlen_attn_out` and private custom op `_varlen_attn_out` with `mutates_args={"out"}`

`test/test_varlen_attention.py`
- added the out variant to existing tests
Pull Request resolved: #176015
Approved by: https://github.com/drisspg
ghstack dependencies: #175897, #175924, #175936
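The `attention.cu` refactor described above follows a common out-variant pattern: one core implementation taking an optional output buffer, with two thin wrappers around it. A minimal plain-Python sketch of that shape (illustrative names only, not the actual ATen code):

```python
# Sketch of the out-variant refactor: a single core impl that takes an
# optional `out`, plus a non-out wrapper and an in-place (out) wrapper.
# `_forward_impl`, `forward`, and `forward_out` are hypothetical names.
def _forward_impl(x, out=None):
    result = [v * 2.0 for v in x]   # stand-in for the real attention math
    if out is None:
        return result               # non-out path: allocate a fresh output
    out[:] = result                 # out path: write into the caller's buffer
    return out

def forward(x):
    # mirrors the non-out variant (`_flash_attention_forward` in the PR):
    # thin wrapper that forwards with out=None (out=std::nullopt in C++)
    return _forward_impl(x, out=None)

def forward_out(x, out):
    # mirrors the out variant (`_flash_attention_forward_no_dropout_inplace`):
    # mutates the caller-provided buffer, matching mutates_args={"out"}
    return _forward_impl(x, out=out)
```

Keeping all logic in one impl means the two variants cannot drift apart; the out variant only adds the ability to reuse a preallocated buffer, which matters for inference-time KV-cache workloads.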
This reverts commit 388d61e. Reverted #175936 on behalf of https://github.com/huydhn due to: sorry for reverting your change, but a bunch of internal builds need to be updated to unblock this change, D95758397 (#175924 (comment)).

@liangel-02 your PR has been reverted as part of the stack under #175924.
@liangel-02 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Differential Revision: [D95996398](https://our.internmc.facebook.com/intern/diff/D95996398)
Starting merge as part of PR stack under #176723
Pull Request resolved: #176723
Approved by: https://github.com/drisspg
ghstack dependencies: #175924, #175936, #176015

ghstack-source-id: 968c9eb
Pull Request resolved: pytorch/pytorch#175936
Stack from ghstack (oldest at bottom):