[Inductor] Fallback to super().get_read_writes when epilogue_fusion_user_defined_triton_kernel is disabled by coufon · Pull Request #176832 · pytorch/pytorch

coufon · 2026-03-08T16:38:23Z

Summary: PR #173662 introduced a new fusion, `epilogue_fusion_user_defined_triton_kernel`. It overrides the `get_read_writes` method of `UserDefinedTritonKernel` even when `epilogue_fusion_user_defined_triton_kernel` is disabled, which may cause regressions for existing Triton kernel CUDA graphs in other models. This diff fixes the regression by falling back to the original `get_read_writes` when `epilogue_fusion_user_defined_triton_kernel` is disabled.
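The fallback described in the summary can be sketched roughly as follows. This is a toy model, not the actual Inductor code: the `Config` dataclass, the `BaseKernel` parent, and the string return values are hypothetical stand-ins. Only the names `UserDefinedTritonKernel`, `get_read_writes`, and `epilogue_fusion_user_defined_triton_kernel` come from the PR.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical stand-in for the Inductor config flag named in the PR.
    epilogue_fusion_user_defined_triton_kernel: bool = False


config = Config()


class BaseKernel:
    """Hypothetical parent class providing the original dependency analysis."""

    def get_read_writes(self) -> str:
        return "original read/writes"


class UserDefinedTritonKernel(BaseKernel):
    """Toy version of the overriding class; not the real Inductor class."""

    def get_read_writes(self) -> str:
        # When the fusion is disabled, defer to the parent implementation so
        # behavior matches what it was before the override existed.
        if not config.epilogue_fusion_user_defined_triton_kernel:
            return super().get_read_writes()
        # Only the opt-in path uses the fusion-aware analysis.
        return "fusion-aware read/writes"


kernel = UserDefinedTritonKernel()
disabled_result = kernel.get_read_writes()

config.epilogue_fusion_user_defined_triton_kernel = True
enabled_result = kernel.get_read_writes()
```

Checking the flag at the top of the override keeps the default (disabled) path identical to the pre-#173662 behavior, so only users who opt in to the fusion see the new analysis.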

Differential Revision: D95727036

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

pytorch-bot · 2026-03-08T16:38:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/176832

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (4 Unrelated Failures)

As of commit 904933b with merge base 82374f2:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

trunk / linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.gfx950.1) (gh) (similar failure)
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_profiler_mark_wrapper_call_cuda_gpu_wrapper

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

trunk / macos-py3-arm64 / test (default, 3, 3, macos-m1-stable) (gh) (trunk failure)
Build left local git repository checkout dirty
trunk / macos-py3-arm64 / test (mps, 1, 1, macos-m1-14) (gh) (trunk failure)
Build left local git repository checkout dirty

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

inductor / inductor-cpu-test / test (cpu_inductor_torchbench, 1, 2, linux.2xlarge.amx, unstable) (gh) (#174929)
detectron2_maskrcnn_r_50_fpn

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot · 2026-03-08T16:38:30Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

meta-codesync · 2026-03-08T16:38:35Z

@coufon has exported this pull request. If you are a Meta employee, you can view the originating Diff in D95727036.

…ser_defined_triton_kernel is disabled (pytorch#176832)

Summary: PR pytorch#173662 introduces a new fusion `epilogue_fusion_user_defined_triton_kernel`. It overrides the `get_read_writes` method of `UserDefinedTritonKernel`, even when `epilogue_fusion_user_defined_triton_kernel` is disabled. This may cause regression to existing Triton kernel CUDA graph for other models. This diff fixed the regression by falling back to the original `get_read_writes` when `epilogue_fusion_user_defined_triton_kernel` is disabled

Reviewed By: AmesingFlank

Differential Revision: D95727036

AmesingFlank · 2026-03-09T01:48:44Z

@pytorchmergebot merge

pytorchmergebot · 2026-03-09T01:50:56Z

Merge failed

Reason: This PR needs a release notes: label

pytorch-bot · 2026-03-09T01:51:00Z

This PR needs a `release notes:` label

AmesingFlank · 2026-03-09T01:52:00Z

@pytorchmergebot merge

pytorchmergebot · 2026-03-09T01:54:02Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

The inductor NameError flake (buf1 not defined) that prompted the pin has been fixed upstream by multiple PyTorch PRs:

- pytorch/pytorch#176772 (fix is_unfusable scheduler logic)
- pytorch/pytorch#176832 (guard get_read_writes behind config flag)
- pytorch/pytorch#177062 (fix MultiOutput write deps)

The root cause was pytorch/pytorch#173662, which overrode UserDefinedTritonKernel.get_read_writes() even when the epilogue fusion feature was disabled, breaking buffer scheduling for TritonTemplateBuffer subclasses like Helion's HelionTemplateBuffer.

Verified the previously-failing test passes against current PyTorch main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ser_defined_triton_kernel is disabled (pytorch#176832)

Summary: PR pytorch#173662 introduces a new fusion `epilogue_fusion_user_defined_triton_kernel`. It overrides the `get_read_writes` method of `UserDefinedTritonKernel`, even when `epilogue_fusion_user_defined_triton_kernel` is disabled. This may cause regression to existing Triton kernel CUDA graph for other models. This diff fixed the regression by falling back to the original `get_read_writes` when `epilogue_fusion_user_defined_triton_kernel` is disabled

Differential Revision: D95727036

Pull Request resolved: pytorch#176832

Approved by: https://github.com/AmesingFlank

pytorch-bot Bot added ciflow/inductor module: inductor labels Mar 8, 2026

meta-codesync Bot added fb-exported meta-exported labels Mar 8, 2026

AmesingFlank approved these changes Mar 8, 2026

pytorch-bot Bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Mar 8, 2026

coufon force-pushed the export-D95727036 branch from 951d081 to 220b4d0 Compare March 8, 2026 18:03

coufon force-pushed the export-D95727036 branch from 220b4d0 to 9b292c1 Compare March 8, 2026 18:03

coufon force-pushed the export-D95727036 branch from 9b292c1 to 904933b Compare March 8, 2026 18:07

pytorchmergebot added the merging label Mar 9, 2026

pytorchmergebot removed the merging label Mar 9, 2026

AmesingFlank added the topic: not user facing label (topic category) Mar 9, 2026

pytorchmergebot added the merging label Mar 9, 2026

pytorchmergebot closed this in a69ac73 Mar 9, 2026

pytorchmergebot added Merged and removed merging labels Mar 9, 2026

v0i0 mentioned this pull request Mar 11, 2026

Unpin H100 nightly torch and Triton versions pytorch/helion#1654

Merged

jjvraw mentioned this pull request Apr 29, 2026

[Inductor] User kernel pointer arg with no tl.load/tl.store incorrectly eliminated when epilogue fusion enabled #181864

Open

coufon wants to merge 1 commit into pytorch:main from coufon:export-D95727036

3 participants
