[BE] [Inductor] Re-Land Support TMA before strict 3.4 cutoff by njriasan · Pull Request #160747 · pytorch/pytorch

njriasan · 2025-08-15T16:00:57Z

Summary: Triton 3.4 is the variant of Triton most commonly used with Inductor, but anyone working with an alternative version of Triton may not match it. This change moves the check from requiring Triton 3.4 to accepting any variant that supports the TMA APIs.
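To illustrate the difference between a version check and a capability check, here is a minimal sketch. It is not the code from this PR: `supports_tma` is a hypothetical helper, and treating `make_tensor_descriptor` as the marker for TMA support is an assumption made for the example. Stand-in namespaces replace the real `triton.language` module so the snippet runs without Triton installed.

```python
from types import SimpleNamespace


def supports_tma(triton_language, required=("make_tensor_descriptor",)):
    """Feature-detect TMA support instead of comparing version numbers.

    `required` lists the attribute names assumed to signal TMA support.
    The default is an illustrative assumption, not Inductor's actual check.
    """
    return all(hasattr(triton_language, name) for name in required)


# Stand-ins for `triton.language` in two hypothetical Triton builds.
build_with_tma = SimpleNamespace(make_tensor_descriptor=lambda *args: None)
build_without_tma = SimpleNamespace()

# A build that exposes the API passes regardless of its version string.
print(supports_tma(build_with_tma))
# A build that lacks the API fails even if its version number looks new enough.
print(supports_tma(build_without_tma))
```

The point of the design is that a fork or custom build is judged by what it exposes rather than by the version it reports.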

Test Plan:
Testing the previously failing test `inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_welford_non_block_pointer_cuda`

Rollback Plan:

Differential Revision: D80348643

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

pytorch-bot · 2025-08-15T16:01:01Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160747

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 922e96a with merge base 58f9a3d:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

inductor / cuda12.8-py3.10-gcc9-sm86 / test (inductor_torchbench, 1, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (similar failure)
Process completed with exit code 1.
inductor / cuda12.8-py3.10-gcc9-sm86 / test (inductor_torchbench, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (similar failure)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-08-15T16:01:07Z

This pull request was exported from Phabricator. Differential Revision: D80348643

…#160747) Summary: Inductor's 3.4 Triton release is the most common used variant of Triton, but if someone is working with an alternative version of Triton this may not match. This moves the version check from 3.4 Triton to any variant that has support for the TMA APIs. Test Plan: Testing the previously failing test `inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_welford_non_block_pointer_cuda` Rollback Plan: Differential Revision: D80348643

facebook-github-bot · 2025-08-15T16:16:46Z

This pull request was exported from Phabricator. Differential Revision: D80348643

njriasan · 2025-08-17T00:27:26Z

@pytorchbot merge

pytorchmergebot · 2025-08-17T00:29:24Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

malfet · 2025-08-17T14:21:13Z

@pytorchbot revert -m "Looks like this breaks rocm, see https://hud.pytorch.org/hud/pytorch/pytorch/main/1?per_page=50&name_filter=rocm%20%2F%20linux-jammy-rocm-py3.10" -c nosignal

pytorchmergebot · 2025-08-17T14:22:39Z

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

…160747)" This reverts commit 8f43454. Reverted #160747 on behalf of https://github.com/malfet due to Looks like this breaks rocm, see https://hud.pytorch.org/hud/pytorch/pytorch/main/1?per_page=50&name_filter=rocm%20%2F%20linux-jammy-rocm-py3.10 ([comment](#160747 (comment)))

pytorchmergebot · 2025-08-17T14:22:51Z

@njriasan your PR has been successfully reverted.

njriasan · 2025-08-17T17:18:21Z

https://hud.pytorch.org/hud/pytorch/pytorch/main/1?per_page=50&name_filter=rocm%20%2F%20linux-jammy-rocm-py3.10

Thanks! I'll rerun this test on rocm. I expect this PR shouldn't impact rocm because TMA isn't supported on AMD, but that probably points to an existing bug.

njriasan · 2025-08-19T03:18:04Z

Doing a deeper dive it looks like this test is just broken. In particular I believe this check is setting expected_num_block_pointers=6 because of TMA, which only works for Nvidia and not AMD. I'll update this test to work better.

njriasan · 2025-08-19T03:24:19Z

> Doing a deeper dive it looks like this test is just broken. In particular I believe this check is setting expected_num_block_pointers=6 because of TMA, which only works for Nvidia and not AMD. I'll update this test to work better.

Actually, it seems these tests may need a more rigorous check to disable them on AMD. I'll add that check.

facebook-github-bot · 2025-08-19T03:25:57Z

This pull request was exported from Phabricator. Differential Revision: D80348643

facebook-github-bot · 2025-08-19T03:44:01Z

This pull request was exported from Phabricator. Differential Revision: D80348643

njriasan · 2025-08-19T03:44:02Z

@pytorchmergebot merge

pytorchmergebot · 2025-08-19T03:46:36Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

njriasan · 2025-08-19T14:42:54Z

I see rocm failures on CI, but they seem unrelated and I'm not convinced these tests should be running. Before reverting this PR, can we try fixing the CI issues in a followup PR? What I saw with the last failure is that it was broken on main but just didn't run for some reason.

njriasan · 2025-08-19T14:56:54Z

I added a fix here: #160974.

jeffdaily

Please fix in follow-up PR or I will revert this. Breaking ROCm CI with this typo.

jeffdaily · 2025-08-19T16:44:16Z

test/inductor/test_torchinductor_strided_blocks.py

+    not (
+        HAS_CUDA_AND_TRITON
+        and torch.cuda.get_device_capability()[0] >= 9
+        and torch.hip.version is None


Should be torch.version.hip not hip.version.

Fixed in #160981.
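A small sketch of why a typo like this can pass on some machines and fail on others. It assumes the misspelled attribute path raises `AttributeError` when evaluated, and it uses a stand-in namespace in place of `torch` so it runs without PyTorch. Both helper functions are hypothetical and only mirror the shape of the condition quoted above.

```python
from types import SimpleNamespace

# Stand-in for the `torch` module. Only `version.hip` is modelled, and there
# is deliberately no top-level `hip` attribute.
fake_torch = SimpleNamespace(version=SimpleNamespace(hip=None))


def tma_tests_enabled(has_cuda_and_triton, capability_major, torch_like):
    """Return True when the TMA tests should run (hypothetical helper)."""
    return (
        has_cuda_and_triton
        and capability_major >= 9
        and torch_like.version.hip is None  # corrected spelling
    )


def tma_tests_enabled_typo(has_cuda_and_triton, capability_major, torch_like):
    """Same check with the misspelled attribute path from the review."""
    return (
        has_cuda_and_triton
        and capability_major >= 9
        and torch_like.hip.version is None  # typo: raises only if evaluated
    )


# `and` short-circuits, so the typo is never reached when an earlier
# condition is already False.
print(tma_tests_enabled_typo(True, 8, fake_torch))

# When both earlier conditions hold, the bad attribute lookup is evaluated.
try:
    tma_tests_enabled_typo(True, 9, fake_torch)
except AttributeError as exc:
    print("typo surfaced:", exc)
```

In a real test file the arguments would come from values such as `torch.cuda.get_device_capability()[0]` and `torch.version.hip`. The sketch passes them in explicitly so the behavior can be checked on any machine.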

broke rocm inductor tests

broke rocm inductor tests Fixes #ISSUE_NUMBER Pull Request resolved: #160981 Approved by: https://github.com/jeffdaily, https://github.com/Skylion007 Co-authored-by: Jeff Daily <jeff.daily@amd.com>

…#160747) Summary: Inductor's 3.4 Triton release is the most common used variant of Triton, but if someone is working with an alternative version of Triton this may not match. This moves the version check from 3.4 Triton to any variant that has support for the TMA APIs. Test Plan: Testing the previously failing test `inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_welford_non_block_pointer_cuda` Rollback Plan: Differential Revision: D80348643 Pull Request resolved: pytorch#160747 Approved by: https://github.com/NikhilAPatel

pytorch-bot bot added ciflow/inductor module: inductor labels Aug 15, 2025

facebook-github-bot added the fb-exported label Aug 15, 2025

njriasan added the topic: not user facing label Aug 15, 2025

njriasan force-pushed the export-D80348643 branch from d6638f3 to c44b291 Compare August 15, 2025 16:16

njriasan added the ciflow/h100 and better-engineering labels Aug 15, 2025

njriasan requested a review from NikhilAPatel August 15, 2025 20:11

NikhilAPatel approved these changes Aug 15, 2025

pytorch-bot bot added the ciflow/trunk label Aug 15, 2025

pytorchmergebot added the merging label Aug 17, 2025

pytorchmergebot closed this in 8f43454 Aug 17, 2025

pytorchmergebot added Merged and removed merging labels Aug 17, 2025

pytorchmergebot added the Reverted and ci-no-td labels Aug 17, 2025

pytorchmergebot reopened this Aug 17, 2025

njriasan force-pushed the export-D80348643 branch from c44b291 to 73fec3a Compare August 19, 2025 03:25

njriasan force-pushed the export-D80348643 branch from 73fec3a to 922e96a Compare August 19, 2025 03:43

pytorchmergebot added the merging label Aug 19, 2025

pytorchmergebot closed this in df60736 Aug 19, 2025

pytorchmergebot removed the merging label Aug 19, 2025

jeffdaily reviewed Aug 19, 2025

jeffdaily mentioned this pull request Aug 19, 2025

[BE] Disable CudaGraphTreeTests on rocm #160974

Closed

jeffdaily added a commit to ROCm/pytorch that referenced this pull request Aug 19, 2025

forward fix pytorch#160747

8752ecb

broke rocm inductor tests

6 participants
