
[varlen_attn for inference] add test to aot_inductor #175936

Closed
liangel-02 wants to merge 19 commits into gh/liangel-02/16/base from gh/liangel-02/16/head

Conversation

[ghstack-poisoned]

pytorch-bot bot commented Feb 27, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175936

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (4 Unrelated Failures)

As of commit 2e18591 with merge base 4bc9d7f:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

liangel-02 added a commit that referenced this pull request Feb 27, 2026
ghstack-source-id: d78ddbb
Pull Request resolved: #175936
```python
@unittest.skipIf(not SM90OrLater, "FA3 requires SM90+")
@unittest.skipIf("FA3" not in list_flash_attention_impls(), "FA3 not available")
def test_varlen_attn_paged_kv_cache(self):
    if self.device != GPU_TYPE:
```
liangel-02 (Contributor Author) commented:
This guard is needed instead of `@requires_gpu` because otherwise the CPU variant of the test would still run.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy kadeng muchulee8 amjames chauhang aakhundov coconutruben jataylo

[ghstack-poisoned]
liangel-02 added a commit that referenced this pull request Feb 27, 2026
ghstack-source-id: 4d5593d
Pull Request resolved: #175936
liangel-02 added a commit that referenced this pull request Feb 27, 2026
ghstack-source-id: 4cd7118
Pull Request resolved: #175936
@liangel-02 liangel-02 requested a review from drisspg February 27, 2026 00:22
liangel-02 added a commit that referenced this pull request Feb 27, 2026
ghstack-source-id: bff1fea
Pull Request resolved: #175936
liangel-02 added a commit that referenced this pull request Feb 27, 2026
ghstack-source-id: 8aa6652
Pull Request resolved: #175936
@liangel-02 liangel-02 changed the title add test to aot_inductor [varlen_attn for inference] add test to aot_inductor Mar 4, 2026
@liangel-02 liangel-02 added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 5, 2026
pytorchmergebot added a commit that referenced this pull request Mar 7, 2026
This reverts commit cc142e7.

Reverted #175936 on behalf of https://github.com/zou3519: "sorry I think this broke inductor rocm" (see comment on #175897)
pytorchmergebot (Collaborator):
@liangel-02 your PR has been reverted as part of the stack under #175897.

@pytorchmergebot pytorchmergebot added Reverted ci-no-td Do not run TD on this PR labels Mar 7, 2026
pytorchmergebot (Collaborator):

Starting merge as part of PR stack under #176723

1 similar comment

pytorchmergebot pushed a commit that referenced this pull request Mar 8, 2026
`aten/src/ATen/native/transformers/cuda/attention.cu`

- Renamed `_flash_attention_forward` to `_flash_attention_forward_impl`; this is now the core logic and takes an `optional<Tensor> out`.
- `_flash_attention_forward` is the non-out variant: a thin wrapper that calls `_flash_attention_forward_impl` with `out=std::nullopt`.
- `_flash_attention_forward_no_dropout_inplace` is the out variant and calls `_flash_attention_forward_impl` with a `Tensor& out`.

`aten/src/ATen/native/native_functions.yaml`

- Registered a new op, `_flash_attention_forward_no_dropout_inplace`.

`torch/_meta_registrations.py`

- Added a meta registration that calls `meta__flash_attention_forward` but doesn't return the out tensor.

`torch/nn/attention/varlen.py`

- Added the public `varlen_attn_out` and a private custom op `_varlen_attn_out` with `mutates_args={"out"}`.

`test/test_varlen_attention.py`

- Added out-variant coverage to the existing tests.

Pull Request resolved: #176015
Approved by: https://github.com/drisspg
ghstack dependencies: #175897, #175924, #175936
pytorchmergebot pushed a commit that referenced this pull request Mar 8, 2026
pytorchmergebot added a commit that referenced this pull request Mar 10, 2026
This reverts commit 388d61e.

Reverted #175936 on behalf of https://github.com/huydhn: "Sorry for reverting your change but a bunch of internal builds need to be updated to unblock this change D95758397" (see comment on #175924)
pytorchmergebot (Collaborator):
@liangel-02 your PR has been reverted as part of the stack under #175924.

liangel-02 (Contributor Author):
@liangel-02 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy kadeng muchulee8 amjames chauhang aakhundov coconutruben jataylo

Differential Revision: [D95996398](https://our.internmc.facebook.com/intern/diff/D95996398)

[ghstack-poisoned]
pytorchmergebot (Collaborator):

Starting merge as part of PR stack under #176723

1 similar comment

pytorchmergebot pushed a commit that referenced this pull request Mar 11, 2026
Pull Request resolved: #176015
Approved by: https://github.com/drisspg
ghstack dependencies: #175924, #175936
pytorchmergebot pushed a commit that referenced this pull request Mar 11, 2026
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
ghstack-source-id: 968c9eb
Pull Request resolved: pytorch/pytorch#175936

3 participants