[aarch64] CUDA 12.8 aarch64 builds to nightly binaries#146378
tinglvv wants to merge 8 commits into pytorch:main
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146378
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:
❌ 3 New Failures, 1 Cancelled Job, 1 Unrelated Failure
As of commit f8136e4 with merge base 0463cb6:
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
atalman left a comment
LGTM. Perhaps we should deprecate the CUDA 12.6 aarch64 builds once CUDA 12.8 is available.
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Merge failed
Reason: 1 jobs have failed, first few of them are: trunk / linux-focal-rocm6.3-py3.10 / test (default, 2, 2, linux.rocm.gpu.2)
Details for Dev Infra team: Raised by workflow job
The ROCm build failures are a known SEV. The Windows build failure should be unrelated? Restarted in https://github.com/pytorch/pytorch/actions/runs/13132625013/job/36681731706, but I am not sure we need to wait for the result, as this change doesn't concern Windows.
@pytorchbot merge -i "rocm failures are known"
❌ 🤖 pytorchbot command failed: Try
@pytorchbot merge -i
Merge started
Your change will be merged while ignoring the following 5 checks: windows-binary-wheel / wheel-py3_11-cuda11_8-build, linux-binary-libtorch-cxx11-abi / libtorch-rocm6_2_4-shared-with-deps-cxx11-abi-test, trunk / linux-focal-rocm6.3-py3.10 / test (default, 1, 2, linux.rocm.gpu.2), trunk / linux-focal-rocm6.3-py3.10 / test (default, 2, 2, linux.rocm.gpu.2), trunk / linux-focal-rocm6.3-py3.10 / test (distributed, 1, 1, linux.rocm.gpu.4)
Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
#145570
Add CUDA 12.8 and keep 12.6 for the SBSA build. Supported CUDA_ARCH: 9.0, 10.0, 12.0.
Refactor the binaries matrix for the CUDA SBSA build. Previously, cuda-aarch64 was hardcoded to CUDA 12.6. The matrix now reads both 12.6 and 12.8. Example of the new build naming: manywheel-py3_9-cuda-aarch64-12_8-build
TODO: once 12.8 is stable, remove 12.6 in sbsa
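The naming scheme described above can be illustrated with a minimal sketch. This is not the actual matrix-generation script from the PR; `build_name`, `generate_matrix`, and the Python version list are hypothetical names and values chosen for illustration. Only the two CUDA versions and the build-name pattern come from the description.

```python
from typing import Dict, List

# From the PR description: the SBSA build now covers both CUDA versions.
CUDA_AARCH64_VERSIONS: List[str] = ["12.6", "12.8"]
# Hypothetical subset of Python versions, for illustration only.
PYTHON_VERSIONS: List[str] = ["3.9", "3.10"]


def build_name(python_version: str, cuda_version: str) -> str:
    """Format a build name following the pattern quoted in the description."""
    py = python_version.replace(".", "_")
    cuda = cuda_version.replace(".", "_")
    return f"manywheel-py{py}-cuda-aarch64-{cuda}-build"


def generate_matrix(
    python_versions: List[str], cuda_versions: List[str]
) -> List[Dict[str, str]]:
    """Produce one matrix entry per (Python, CUDA) pair instead of a hardcoded CUDA version."""
    return [
        {
            "python_version": py,
            "cuda_version": cu,
            "build_name": build_name(py, cu),
        }
        for py in python_versions
        for cu in cuda_versions
    ]


if __name__ == "__main__":
    for entry in generate_matrix(PYTHON_VERSIONS, CUDA_AARCH64_VERSIONS):
        print(entry["build_name"])
```

Under these assumptions, dropping 12.6 later (per the TODO) would only require removing it from the version list.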
cc @atalman @malfet @ptrblck @nWEIdia