[varlen_attn for inference] add page_table #175924
liangel-02 wants to merge 21 commits into gh/liangel-02/15/base
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175924
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (3 unrelated failures) As of commit 0767407 with merge base 4bc9d7f:
FLAKY - The following job failed but was likely due to flakiness present on trunk.
BROKEN TRUNK - The following job failed but was present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Attention! `native_functions.yaml` was changed. If you are adding a new function or defaulted argument to `native_functions.yaml`, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs: one that adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.
Attention! One of PyTorch's C-stable API files was changed. You MUST NOT change existing function declarations in this header, as it defines a stable C ABI. If you need to change the signature of a function, introduce a new v2 version of the function and modify code generation to target the new version.
`page_table` is an FA3 feature, so we need to modify the function definitions in `native_functions.yaml`. If this is used with FA2, we throw an error. [ghstack-poisoned]
```diff
  value: torch.Tensor,
  cu_seq_q: torch.Tensor,
- cu_seq_k: torch.Tensor,
+ cu_seq_k: torch.Tensor | None,
```
If using `page_table`, `cu_seq_k` needs to be None.
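The constraint in this comment can be sketched as a small Python-side validation. This is illustrative only — `check_paged_args` is a hypothetical helper, not the actual ATen check, and the rationale comment reflects one plausible reading of the paged path:

```python
from typing import Optional

import torch


def check_paged_args(
    page_table: Optional[torch.Tensor], cu_seq_k: Optional[torch.Tensor]
) -> None:
    # Sketch of the review comment's rule: when a page_table is supplied,
    # an explicit cu_seq_k must not be passed alongside it.
    if page_table is not None and cu_seq_k is not None:
        raise ValueError("cu_seq_k must be None when page_table is provided")


# OK: paged path, no explicit cu_seq_k.
check_paged_args(torch.zeros(2, 4, dtype=torch.int32), None)
# OK: non-paged path, cu_seq_k allowed.
check_paged_args(None, torch.tensor([0, 4], dtype=torch.int32))
```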
```python
@unittest.skipIf("FA3" not in list_flash_attention_impls(), "FA3 not available")
@parametrize("dtype", [torch.bfloat16, torch.float16])
@parametrize("page_size", [32, 64, 128])
@parametrize("compile", [False, True])
```
Adding a test for torch.compile.
```diff
  const std::optional<Tensor>& _seqused_k,
- const std::optional<Tensor>& _alibi_slopes
+ const std::optional<Tensor>& _alibi_slopes,
+ const std::optional<Tensor>& _page_table
```
OOC, FA2 does support this, right? https://github.com/Dao-AILab/flash-attention/blob/9a25eba569317708ae295e396aaac0050b28e52b/csrc/flash_attn/flash_api.cpp#L523
Perhaps I just didn't end up wiring it, or there was a semantic change?
test/test_varlen_attention.py
```python
page_table = torch.zeros(
    batch_size, max_pages_per_seq, device=device, dtype=torch.int32
)
for i in range(batch_size):
```
@claude what is a more efficient way to write this using more native PyTorch ops?
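One possible answer, assuming the loop assigns contiguous physical page ids per sequence (an assumption — the loop body is elided in the snippet above): a single `arange` plus `reshape` replaces the Python loop entirely.

```python
import torch

batch_size, max_pages_per_seq = 4, 8
device = "cpu"  # the test runs on CUDA; CPU here just for illustration

# Loop version, filling contiguous page ids per sequence (assumed loop body).
page_table_loop = torch.zeros(
    batch_size, max_pages_per_seq, device=device, dtype=torch.int32
)
for i in range(batch_size):
    page_table_loop[i] = torch.arange(
        i * max_pages_per_seq,
        (i + 1) * max_pages_per_seq,
        device=device,
        dtype=torch.int32,
    )

# Vectorized version: one kernel launch, no Python loop.
page_table_vec = torch.arange(
    batch_size * max_pages_per_seq, device=device, dtype=torch.int32
).reshape(batch_size, max_pages_per_seq)

assert torch.equal(page_table_loop, page_table_vec)
```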
torch/nn/attention/varlen.py
```text
is larger than the actual sequence. Inference-only (not supported in backward).
page_table (Tensor, optional): Page table mapping logical to physical pages for paged
```
We should add more description of the semantics here.
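For illustration, the usual paged-KV semantics read as: token position `t` of batch `b` lives at offset `t % page_size` within physical page `page_table[b, t // page_size]`. A minimal sketch of that mapping (the table values are made up):

```python
import torch

page_size = 4
# Hypothetical page table: 2 sequences, 3 logical pages each.
# Entry [b, p] is the physical page backing logical page p of sequence b.
page_table = torch.tensor([[5, 2, 7], [0, 9, 1]], dtype=torch.int32)


def physical_slot(batch_idx: int, token_pos: int) -> int:
    """Map a logical token position to its slot index in the paged KV cache."""
    logical_page = token_pos // page_size
    offset = token_pos % page_size
    physical_page = int(page_table[batch_idx, logical_page])
    return physical_page * page_size + offset


# Token 6 of sequence 0: logical page 1 -> physical page 2, offset 2 -> slot 10.
assert physical_slot(0, 6) == 10
```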
Starting merge as part of PR stack under #176723
Pull Request resolved: #175936
Approved by: https://github.com/drisspg
ghstack dependencies: #175897, #175924
`aten/src/ATen/native/transformers/cuda/attention.cu`
- renamed `_flash_attention_forward` to `_flash_attention_forward_impl`; this is now the core logic and takes an `optional<Tensor> out`
- `_flash_attention_forward` is the non-out variant: a thin wrapper that calls `_flash_attention_forward_impl` with `out=std::nullopt`
- `_flash_attention_forward_no_dropout_inplace` is the out variant and calls `_flash_attention_forward_impl` with `Tensor& out`
`aten/src/ATen/native/native_functions.yaml`
- registered a new op, `_flash_attention_forward_no_dropout_inplace`
`torch/_meta_registrations.py`
- added a meta registration that calls `meta__flash_attention_forward` but doesn't return the out tensor
`torch/nn/attention/varlen.py`
- added public `varlen_attn_out` and private custom op `_varlen_attn_out` with `mutates_args={"out"}`
`test/test_varlen_attention.py`
- added the out variant to existing tests
Pull Request resolved: #176015
Approved by: https://github.com/drisspg
ghstack dependencies: #175897, #175924, #175936
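The refactor above is the standard impl-plus-thin-wrappers pattern. A rough Python analogue of the shape of the change (function names mirror the summary, but the bodies — including the stand-in attention math — are illustrative, not the actual ATen code):

```python
from typing import Optional

import torch


def _attention_forward_impl(
    q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
    out: Optional[torch.Tensor] = None,
) -> torch.Tensor:
    # Core logic lives in one place; stand-in for the flash kernel.
    result = torch.softmax(q @ k.transpose(-2, -1), dim=-1) @ v
    if out is not None:
        out.copy_(result)  # out variant: write into the caller's buffer
        return out
    return result  # non-out variant: allocate a fresh tensor


def attention_forward(q, k, v):
    """Non-out variant: thin wrapper passing out=None."""
    return _attention_forward_impl(q, k, v, out=None)


def attention_forward_out(q, k, v, out):
    """Out variant: thin wrapper passing the caller-provided buffer."""
    return _attention_forward_impl(q, k, v, out=out)


q = k = v = torch.randn(2, 4, 8)
buf = torch.empty(2, 4, 8)
res = attention_forward(q, k, v)
res_out = attention_forward_out(q, k, v, buf)
assert res_out is buf and torch.allclose(res, buf)
```

The point of the pattern is that both public entry points share one code path; the out variant only mutates a caller-provided buffer, which is exactly what `mutates_args={"out"}` declares to the custom-op system.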
@pytorchbot revert -m 'Sorry for reverting your change but a bunch of internal builds need to be updated to unblock this change D95758397' -c ghfirst
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit 26dddb9. Reverted #176723 on behalf of https://github.com/huydhn due to: "Sorry for reverting your change but a bunch of internal builds need to be updated to unblock this change D95758397"
This reverts commit 492c742. Reverted #176015 on behalf of https://github.com/huydhn for the same reason.
This reverts commit 388d61e. Reverted #175936 on behalf of https://github.com/huydhn for the same reason.
This reverts commit 9b53dac. Reverted #175924 on behalf of https://github.com/huydhn for the same reason.
@liangel-02 your PR has been successfully reverted.
@liangel-02 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
`page_table` is an FA3 feature, so we need to modify the function definitions in `native_functions.yaml`. If this is used with FA2, we throw an error. Differential Revision: [D95996400](https://our.internmc.facebook.com/intern/diff/D95996400) [ghstack-poisoned]
Starting merge as part of PR stack under #176723
Pull Request resolved: #175936
Approved by: https://github.com/drisspg
ghstack dependencies: #175924
Pull Request resolved: #176015
Approved by: https://github.com/drisspg
ghstack dependencies: #175924, #175936
Pull Request resolved: #176723
Approved by: https://github.com/drisspg
ghstack dependencies: #175924, #175936, #176015
ghstack-source-id: ca17f5c Pull Request resolved: pytorch/pytorch#175924
`page_table` is an FA3 feature, so we need to modify the function definitions in `native_functions.yaml`. If this is used with FA2, we throw an error.
Stack from ghstack (oldest at bottom):