
[Inductor] fix performance regression caused by #173662 #176772

Closed
AmesingFlank wants to merge 9 commits into gh/AmesingFlank/6/base from gh/AmesingFlank/6/head

Conversation


@AmesingFlank AmesingFlank commented Mar 7, 2026

Stack from ghstack (oldest at bottom):

Tested by running

python benchmarks/dynamo/pr_time_benchmarks/benchmarks/mm_loop.py a d

before this PR:

collecting compile time instruction count for mm_loop_inductor_gpu
W0307 01:06:14.007000 19968 /home/dev/pytorch/torch/_inductor/utils.py:1720] [0/0] Not enough SMs to use max_autotune_gemm mode
compile time instruction count for iteration 0 is 18639759705
compile time instruction count for iteration 1 is 4313416006
compile time instruction count for iteration 2 is 4306442862
compile time instruction count for iteration 3 is 4312169006
compile time instruction count for iteration 4 is 4311841549
collecting compile time instruction count for mm_loop_inductor_dynamic_gpu
compile time instruction count for iteration 0 is 8210716786
compile time instruction count for iteration 1 is 8000993645
compile time instruction count for iteration 2 is 7997733533
compile time instruction count for iteration 3 is 7993982380
compile time instruction count for iteration 4 is 7994009181

with this PR:

collecting compile time instruction count for mm_loop_inductor_gpu
W0307 01:01:10.094000 14988 /home/dev/pytorch/torch/_inductor/utils.py:1720] [0/0] Not enough SMs to use max_autotune_gemm mode
compile time instruction count for iteration 0 is 18228833593
compile time instruction count for iteration 1 is 58028492104
compile time instruction count for iteration 2 is 3907665800
compile time instruction count for iteration 3 is 3903875384
compile time instruction count for iteration 4 is 3904861924
collecting compile time instruction count for mm_loop_inductor_dynamic_gpu
compile time instruction count for iteration 0 is 7664300088
compile time instruction count for iteration 1 is 7598450061
compile time instruction count for iteration 2 is 7596564537
compile time instruction count for iteration 3 is 7590008847
compile time instruction count for iteration 4 is 7589530612
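As a quick sanity check on the logs above, a short script comparing the mean steady-state instruction counts before and after this PR (iteration 0 is warm-up in both runs, and the 58-billion reading on iteration 1 of the "after" static run is treated as an outlier; both are excluded):

```python
# Compare steady-state compile-time instruction counts before/after this PR.
# All numbers are copied verbatim from the benchmark logs above.
before_static = [4313416006, 4306442862, 4312169006, 4311841549]
after_static = [3907665800, 3903875384, 3904861924]  # iteration-1 outlier dropped
before_dynamic = [8000993645, 7997733533, 7993982380, 7994009181]
after_dynamic = [7598450061, 7596564537, 7590008847, 7589530612]

def mean(xs):
    return sum(xs) / len(xs)

# Fractional reduction in mean compile-time instruction count.
static_reduction = 1 - mean(after_static) / mean(before_static)
dynamic_reduction = 1 - mean(after_dynamic) / mean(before_dynamic)
print(f"static:  {static_reduction:.1%} fewer instructions")   # ~9.4%
print(f"dynamic: {dynamic_reduction:.1%} fewer instructions")  # ~5.0%
```

So the PR recovers roughly 9% of compile-time instructions on the static benchmark and 5% on the dynamic one.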

Also, user triton kernel fusion isn't affected:

python -m pytest test/inductor/test_triton_kernels.py -k TestUserKernelEpilogueFusion

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @Lucaskabela

[ghstack-poisoned]
AmesingFlank added a commit that referenced this pull request Mar 7, 2026

pytorch-bot bot commented Mar 7, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/176772

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 5aecd88 with merge base 9774102:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.


pytorch-bot bot commented Mar 7, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

AmesingFlank added a commit that referenced this pull request Mar 7, 2026
AmesingFlank added a commit that referenced this pull request Mar 7, 2026
AmesingFlank added a commit that referenced this pull request Mar 7, 2026
AmesingFlank added a commit that referenced this pull request Mar 7, 2026
AmesingFlank added a commit that referenced this pull request Mar 7, 2026
AmesingFlank added a commit that referenced this pull request Mar 8, 2026
AmesingFlank added a commit that referenced this pull request Mar 8, 2026

@laithsakka laithsakka left a comment


i land as as

@laithsakka
Contributor

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 8, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job has failed: trunk / macos-py3-arm64 / test (mps, 1, 1, macos-m1-14)

Details for Dev Infra team · Raised by workflow job

AmesingFlank added a commit that referenced this pull request Mar 8, 2026
AmesingFlank added a commit that referenced this pull request Mar 8, 2026
@AmesingFlank
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job has failed: inductor / inductor-test / test (inductor_torchbench, 1, 2, linux.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team · Raised by workflow job

AmesingFlank added a commit that referenced this pull request Mar 8, 2026
AmesingFlank added a commit that referenced this pull request Mar 8, 2026
@AmesingFlank
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


AmesingFlank added a commit that referenced this pull request Mar 8, 2026
AmesingFlank added a commit that referenced this pull request Mar 8, 2026
@AmesingFlank
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


v0i0 added a commit to pytorch/helion that referenced this pull request Mar 11, 2026
The inductor NameError flake (buf1 not defined) that prompted the pin
has been fixed upstream by multiple PyTorch PRs:
- pytorch/pytorch#176772 (fix is_unfusable scheduler logic)
- pytorch/pytorch#176832 (guard get_read_writes behind config flag)
- pytorch/pytorch#177062 (fix MultiOutput write deps)

The root cause was pytorch/pytorch#173662 which overrode
UserDefinedTritonKernel.get_read_writes() even when the epilogue
fusion feature was disabled, breaking buffer scheduling for
TritonTemplateBuffer subclasses like Helion's HelionTemplateBuffer.

Verified the previously-failing test passes against current PyTorch main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
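The guarding pattern described in the commit message above (only override `get_read_writes()` when the epilogue-fusion feature is enabled, so subclasses that depend on the default buffer scheduling are unaffected) can be sketched as follows. All class names, the config flag, and the return values here are illustrative stand-ins, not the actual PyTorch identifiers:

```python
# Hedged sketch of the fix described above. The real PyTorch code operates on
# dependency objects from torch._inductor; strings stand in for those here.

class Config:
    # Hypothetical feature flag for user-defined Triton kernel epilogue fusion.
    enable_user_triton_epilogue_fusion = False

class TemplateBuffer:
    def get_read_writes(self):
        # Stand-in for the default read/write dependency analysis that buffer
        # scheduling relies on (e.g. for Helion's template buffer subclasses).
        return "default read/write deps"

class UserDefinedTritonKernel(TemplateBuffer):
    def get_read_writes(self):
        if not Config.enable_user_triton_epilogue_fusion:
            # Feature disabled: fall back to base behavior instead of the
            # override that broke buffer scheduling (the "buf1 not defined"
            # NameError mentioned above).
            return super().get_read_writes()
        return "epilogue-fusion-aware deps"
```

With the flag off, the subclass behaves exactly like its base class; only opting into the feature activates the specialized dependency computation.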
v0i0 added a commit to pytorch/helion that referenced this pull request Mar 12, 2026
v0i0 added a commit to pytorch/helion that referenced this pull request Mar 12, 2026
3 participants