[ROCm] Enable StaticCudaLauncher for ROCm #166492
chinmaydk99 wants to merge 7 commits into pytorch:main
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166492
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 3 Unrelated Failures
As of commit 92ae60a with merge base 823edb4:
NEW FAILURE - The following job has failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
To add the ciflow label This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows. |
dce13f2 to d80e990
d80e990 to 4772f18
4772f18 to 49accbe
Merge failed
Reason: 11 jobs have failed, first few of them are: pull / linux-jammy-py3.14-clang12 / test (default, 3, 5, lf.linux.4xlarge), pull / linux-jammy-py3.10-clang12 / test (default, 5, 5, lf.linux.4xlarge), pull / linux-jammy-py3.10-gcc11 / test (default, 4, 5, lf.linux.2xlarge), pull / linux-jammy-py3.10-clang18-asan / test (default, 3, 7, lf.linux.4xlarge), rocm-mi300 / linux-noble-rocm-py3.12-mi300 / test (default, 1, 6, linux.rocm.gpu.gfx942.1.b)
Details for Dev Infra team: Raised by workflow job |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Rebase failed due to Command Raised by https://github.com/pytorch/pytorch/actions/runs/20824947914 |
|
@pytorchbot merge -f "only failures were flaky or broken trunk; meta internal diff also failed but we have approval to land anyway" |
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
|
@pytorchbot revert -m "sorry, a previous commit introduce the new test file breaks internal system, please rebase after #169121 's revert and reland" -c ghfirst |
|
@pytorchbot successfully started a revert job. Check the current status here. |
|
@chinmaydk99 your PR has been successfully reverted. |
|
@pytorchbot merge -f "linter infra error, other failures are known flaky" |
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failed
Reason: PR #166492 has not been reviewed yet |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Rebase failed due to Command Raised by https://github.com/pytorch/pytorch/actions/runs/21394151706 |
|
@pytorchbot merge -i |
|
This PR needs to be approved by an authorized maintainer before merge. |
|
@pytorchbot merge -i |
Merge started
Your change will be merged while ignoring the following 5 checks: pull / linux-jammy-py3.14t-clang15 / test (dynamo_wrapped, 1, 3, linux.2xlarge), rocm-mi300 / linux-noble-rocm-py3.12-mi300 / test (default, 2, 6, linux.rocm.gpu.gfx942.1), inductor / inductor-test-cuda13 / test (inductor_torchbench, 2, 2, linux.g5.4xlarge.nvidia.gpu), rocm-mi200 / linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.2), Meta Internal-Only Changes Check
Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
|
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command |
|
@pytorchbot merge -f "Dr CI unrelated failures; lint good" |
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
This PR enables ROCm/HIP support for PyTorch's StaticCudaLauncher, which provides static compilation and launching of Triton kernels. The implementation has been tested on AMD MI300 and MI200 hardware.
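For readers unfamiliar with how Python code can tell a ROCm build of PyTorch from a CUDA build, here is a minimal sketch. It is not code from this PR, and the PR's own detection logic is not shown in this description. It relies only on the `torch.version.hip` and `torch.version.cuda` attributes, which hold a version string on the matching build and `None` otherwise. The helper names `detect_gpu_backend` and `current_backend` are hypothetical, and the version strings in the stand-in objects are made up.

```python
from types import SimpleNamespace


def detect_gpu_backend(version_info):
    """Return "rocm", "cuda", or "cpu" for a torch.version-like object.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    # ROCm builds of PyTorch set torch.version.hip to a version string;
    # CUDA builds leave it as None and set torch.version.cuda instead.
    if getattr(version_info, "hip", None) is not None:
        return "rocm"
    if getattr(version_info, "cuda", None) is not None:
        return "cuda"
    return "cpu"


def current_backend():
    """Detect the backend of the installed PyTorch, if there is one."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return detect_gpu_backend(torch.version)


# Stand-ins for torch.version on different builds (version strings are made up).
rocm_build = SimpleNamespace(hip="0.0.0", cuda=None)
cuda_build = SimpleNamespace(hip=None, cuda="0.0")
cpu_build = SimpleNamespace(hip=None, cuda=None)
```

On a machine with PyTorch installed, `current_backend()` reports which of the three cases applies to the local build. Taking the version object as a parameter keeps the check testable without a GPU.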
Changes
Python (torch/_inductor/runtime/)
C++ (torch/csrc/)
Tests (test/inductor/)
Enabled Unit Tests
All tests in test/inductor/test_static_cuda_launcher.py now pass on ROCm:
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci @aditvenk @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @jataylo @hongxiayang @naromero77amd @jerrymannil @xinyazhang @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @dllehr-amd @chenyang78