[inductor] Fix index_reduce_ on view inputs raising AssertionError in assert_functional_graph #176606

aorenste wants to merge 3 commits into gh/aorenste/207/base

Conversation
…in `assert_functional_graph`

The `_index_fill` decomposition used mutable `empty_like + copy_` to restore strides when `index_copy` returned a contiguous tensor, which broke the functional graph invariant. Replace with the functional `prims.copy_strided` prim that does the same thing as a single op.

Fixes #144846

Authored with Claude.

[ghstack-poisoned]
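For context, a minimal eager-mode sketch of the mutable stride-restoration pattern being removed (shapes here are invented for illustration; in eager mode `torch.index_copy` may preserve strides where the decomposed path did not):

```python
import torch

# Illustration of the mutable pattern the PR removes. A non-contiguous view
# goes through index_copy; the decomposition then restored the view's strides
# with an empty_like + copy_ pair.
base = torch.arange(12.0).reshape(4, 3)
view = base.t()                     # non-contiguous view: shape (3, 4), strides (1, 3)

index = torch.tensor([0, 2])
src = torch.full((2, 4), -1.0)
result = torch.index_copy(view, 0, index, src)

# empty_like (preserve_format) keeps the view's strides for dense tensors,
# but copy_ is an in-place mutation, which assert_functional_graph rejects.
# The fix replaces this pair with the single functional prims.copy_strided op.
restored = torch.empty_like(view)   # strides (1, 3) preserved
restored.copy_(result)

assert restored.stride() == view.stride()
assert torch.equal(restored, result)
```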
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/176606

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (7 unrelated failures) As of commit 766cfa8 with merge base d87ebee:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk.
BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
…in `assert_functional_graph` The `_index_fill` decomposition used mutable `empty_like + copy_` to restore strides when `index_copy` returned a contiguous tensor, which broke the functional graph invariant. Replace with the functional `prims.copy_strided` prim that does the same thing as a single op. Fixes #144846 Authored with Claude. ghstack-source-id: 8468ef5 Pull Request resolved: #176606
out2 = run_session(100, 16, 64, self.device)
self.assertEqual(out2.device.type, self.device)

def test_index_reduce_on_view_input(self):
Do we know why this causes a failure for the Pallas backend? Let's make sure those signals are good before landing.
According to Claude, the Pallas backend doesn't support `index_reduce_`, so the only solution was to mark the test as expected to fail.
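The usual way to encode that expectation is an expected-failure marker; here is a generic `unittest` sketch (the actual suite may use its own xfail helpers for the Pallas backend, and the test body below is a stand-in):

```python
import unittest

class PallasIndexReduceTest(unittest.TestCase):
    @unittest.expectedFailure
    def test_index_reduce_on_view_input(self):
        # Stand-in for the real test body: the Pallas backend has no
        # index_reduce_ support, so compiling/running it raises.
        raise NotImplementedError("index_reduce_ is not supported by Pallas")

# An expected failure does not make the run unsuccessful.
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(PallasIndexReduceTest)
)
print(result.wasSuccessful())
```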
@pytorchbot merge

Merge failed. Reason: This PR needs a label. If not, please add the label. To add a label, you can comment to pytorchbot. For more information, see the linked documentation.

Details for Dev Infra team: raised by workflow job.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This broke ROCm CI:

PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=4 PYTORCH_TEST_WITH_ROCM=1 python test/test_meta.py TestMetaCUDA.test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32

Failing test: test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32 (GH job link, HUD commit link)
@pytorchbot revert -c weird -m "trunk was passing pre-merge, but failures appeared post-merge only. see test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32 GH job link HUD commit link"
@pytorchbot successfully started a revert job. Check the current status here.
@jeffdaily already pinged the team, but feel free to find the commit that causes conflicts and revert it as well.
@pytorchbot revert -c weird -m "trunk was passing pre-merge, but failures appeared post-merge only. see test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32 GH job link HUD commit link"

@pytorchbot successfully started a revert job. Check the current status here.
Reverting PR 176606 failed. Reason: the revert command failed.

Details for Dev Infra team: raised by workflow job.
rip I thought I got the merge conflict
…hader codegen (#176436)" This reverts commit 4926192. Reverted #176436 on behalf of https://github.com/zou3519 due to sorry I need to revert this in order to revert #176606 ([comment](#176436 (comment)))
…degen (#176436)

Metal Shading Language rejects implicit float-to-bfloat conversions, so bare float literals like `0.0` in generated shaders cause compilation failures when the target variable is `bfloat` (or `half`). Three codegen methods were affected:

- `constant()` ignored its `dtype` parameter and returned raw literals.
- `masked()` assigned a bare literal in the else-branch (`} else tmp = 0.0;`).
- `where()` passed a bare literal through the ternary without casting.

All three now emit `static_cast<bfloat>(...)` / `static_cast<half>(...)` where needed. Tests added for half-precision constants, reductions, and conditionals.

Pull Request resolved: #176436
Approved by: https://github.com/malfet
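The literal-casting fix can be sketched in Python as a tiny codegen helper (the names and dtype mapping below are hypothetical, not the actual inductor MPS codegen, which has different signatures):

```python
# Hypothetical sketch of the fix described above: wrap bare float literals in
# static_cast when the target Metal type is half/bfloat, since Metal rejects
# implicit float-to-bfloat conversions.
METAL_TYPE = {"float16": "half", "bfloat16": "bfloat"}

def constant(value: float, dtype: str) -> str:
    """Emit a Metal literal, casting for half/bfloat targets."""
    target = METAL_TYPE.get(dtype)
    if target is not None:
        return f"static_cast<{target}>({value})"
    return repr(value)

print(constant(0.0, "bfloat16"))  # static_cast<bfloat>(0.0)
print(constant(0.0, "float32"))   # 0.0
```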
@pytorchbot revert -c weird -m "trunk was passing pre-merge, but failures appeared post-merge only. see test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32 GH job link HUD commit link"

@pytorchbot successfully started a revert job. Check the current status here.
…onError in `assert_functional_graph` (#176606)" This reverts commit 354b0ff. Reverted #176606 on behalf of https://github.com/zou3519 due to trunk was passing pre-merge, but failures appeared post-merge only. see test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32 [GH job link](https://github.com/pytorch/pytorch/actions/runs/22771325935/job/66057190983) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/354b0ff88ca5350b68da5d1dee8f3e32dcede563) ([comment](#176606 (comment)))
@aorenste your PR has been successfully reverted. |
…rtionError in `assert_functional_graph`" The `_index_fill` decomposition used mutable `empty_like + copy_` to restore strides when `index_copy` returned a contiguous tensor, which broke the functional graph invariant. Replace with the functional `prims.copy_strided` prim that does the same thing as a single op. Fixes #144846 Authored with Claude. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy kadeng muchulee8 amjames chauhang aakhundov coconutruben jataylo [ghstack-poisoned]
…in `assert_functional_graph` The `_index_fill` decomposition used mutable `empty_like + copy_` to restore strides when `index_copy` returned a contiguous tensor, which broke the functional graph invariant. Replace with the functional `prims.copy_strided` prim that does the same thing as a single op. Fixes #144846 Authored with Claude. ghstack-source-id: ad3ad51 Pull Request resolved: #176606
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…in `assert_functional_graph` (pytorch#176606) The `_index_fill` decomposition used mutable `empty_like + copy_` to restore strides when `index_copy` returned a contiguous tensor, which broke the functional graph invariant. Replace with the functional `prims.copy_strided` prim that does the same thing as a single op. Fixes pytorch#144846 Authored with Claude. Pull Request resolved: pytorch#176606 Approved by: https://github.com/Lucaskabela
…in `assert_functional_graph` (pytorch#176606) The `_index_fill` decomposition used mutable `empty_like + copy_` to restore strides when `index_copy` returned a contiguous tensor, which broke the functional graph invariant. Replace with the functional `prims.copy_strided` prim that does the same thing as a single op. Fixes pytorch#144846 Authored with Claude. Pull Request resolved: pytorch#176606 Approved by: https://github.com/Lucaskabela, https://github.com/mlazos
Stack from ghstack (oldest at bottom):
- #176606 [inductor] Fix `index_reduce_` on view inputs raising AssertionError in `assert_functional_graph`

The `_index_fill` decomposition used mutable `empty_like + copy_` to restore strides when `index_copy` returned a contiguous tensor, which broke the functional graph invariant. Replace with the functional `prims.copy_strided` prim that does the same thing as a single op.

Fixes #144846
Authored with Claude.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo