[CD] Add CUDA 13.0 x86 nightly builds#160956
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160956
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 3 Pending
As of commit b2ffcd6 with merge base 19c70c2:
NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@tinglvv, the current error is related to NVSHMEM and sm_75:

Please note that the Docker builds are using the correct version:

Does the NVSHMEM error occur only for sm_75? If so, that would suggest NVSHMEM 3.3.20 dropped support for this arch.

I don't see sm_75 mentioned here: https://docs.nvidia.com/nvshmem/release-notes-install-guide/release-notes/release-3320.html
Adding dependencies to unblock pytorch/pytorch#160956
https://download.pytorch.org/whl/nightly/cu130 has been updated following pytorch/test-infra#7038. Rerunning the test.
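For anyone reproducing the rerun locally, here is a minimal sketch of how the nightly index URL follows from the CUDA version. The helper names (`cuda_tag`, `nightly_index_url`, `pip_install_command`) are made up for illustration and are not part of the PyTorch build scripts; only the `cu130` index URL comes from this thread.

```python
def cuda_tag(cuda_version: str) -> str:
    """Turn a 'major.minor' CUDA version such as '13.0' into a tag such as 'cu130'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{int(major)}{int(minor)}"


def nightly_index_url(cuda_version: str) -> str:
    """Nightly wheel index for a CUDA version (URL pattern taken from the comment above)."""
    return f"https://download.pytorch.org/whl/nightly/{cuda_tag(cuda_version)}"


def pip_install_command(cuda_version: str) -> list[str]:
    """pip arguments for installing a nightly torch wheel from that index."""
    return ["pip", "install", "--pre", "torch",
            "--index-url", nightly_index_url(cuda_version)]


# Print the command rather than running it, so the sketch has no side effects.
print(" ".join(pip_install_command("13.0")))
```

Building the argument list instead of a shell string keeps the URL as a single argument if the command is later passed to `subprocess.run`.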
Related to pytorch/pytorch#160956 follow up for #7038 cc @atalman
Disabled sm_75 for NVSHMEM in the CUDA 13 build temporarily (3069af6); it will need to be re-enabled once the hotfix in 3.3.21 is released.
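A rough sketch of what the temporary workaround amounts to: drop sm_75 from the architecture list used for NVSHMEM when building against CUDA 13, and leave other CUDA versions alone. This is not the code from 3069af6; the function name, the semicolon-separated list format, and the sample arch values are assumptions for illustration.

```python
def nvshmem_arch_list(arch_list: str, cuda_version: str) -> str:
    """Return the arch list to use for NVSHMEM, without 7.5 on CUDA 13.

    arch_list is assumed to be a semicolon-separated string such as
    "7.5;8.0;9.0" (a hypothetical format chosen for this sketch).
    """
    archs = [a.strip() for a in arch_list.split(";") if a.strip()]
    cuda_major = int(cuda_version.split(".")[0])
    if cuda_major == 13:
        # Temporary: skip sm_75 until an NVSHMEM release with the fix is picked up.
        archs = [a for a in archs if a != "7.5"]
    return ";".join(archs)


print(nvshmem_arch_list("7.5;8.0;9.0", "13.0"))  # 8.0;9.0
print(nvshmem_arch_list("7.5;8.0;9.0", "12.8"))  # 7.5;8.0;9.0
```

Keeping the filter in one place should make it easy to remove once sm_75 can be enabled again.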
@pytorchmergebot merge -f "signal looks good"

Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team

Is this the only code change related to "disable nvshmem"? |
Hi @kwen2501, thanks for bringing this up. I updated to NVSHMEM 3.3.24 in a separate PR: #161321.
The other PR looks good! For ARM, both issues are fixed (glibc issue and download link fixed), so we can enable it back. I'll open a PR to put it back. |
Related to #159779
Adding CUDA 13.0 libtorch builds, followup after #160956
Removing CUDA 12.9 builds, see #159980
Pull Request resolved: #161916
Approved by: https://github.com/jeanschmidt, https://github.com/Skylion007
Co-authored-by: Ting Lu <tingl@nvidia.com>
pytorch#159779
CUDA 13.0.0
NVSHMEM 3.3.20
CUDNN 9.12.0.46
Adding x86 linux builds for CUDA 13.
Adding libtorch docker.
Package naming changed for CUDA 13 (removed postfix -cu13 for some packages).
Preparation checklist:
1. Update index https://download.pytorch.org/whl/nightly/cu130 with pypi packages
2. Update packaging name based on https://pypi.org/project/cuda-toolkit/ metadata
Pull Request resolved: pytorch#160956
Approved by: https://github.com/atalman
Co-authored-by: atalman <atalman@fb.com>
Undo changes introduced in #160956 as driver has been updated to 580 for both fleets
Undo changes introduced in #160956 as driver has been updated to 580 for both fleets Fixes #163342 Pull Request resolved: #163349 Approved by: https://github.com/seemethere
Undo changes introduced in #160956 as driver has been updated to 580 for both fleets Fixes #163342 Pull Request resolved: #163349 Approved by: https://github.com/seemethere Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
#159779
CUDA 13.0.0
NVSHMEM 3.3.20
CUDNN 9.12.0.46
Adding x86 Linux builds for CUDA 13.
Adding SBSA docker.
Adding libtorch docker.
Package naming changed for CUDA 13 (the -cu13 postfix was removed for some packages).
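To illustrate the naming change, here is a small sketch that builds dependency names per CUDA major version. Which packages keep the suffix is an assumption: the `KEEPS_SUFFIX` set and the sample package names are placeholders for illustration, and the real list should be checked against the PyPI metadata.

```python
# Placeholder set: packages assumed to keep the "-cu13" suffix on CUDA 13.
KEEPS_SUFFIX = {"nvidia-cudnn", "nvidia-nccl", "nvidia-nvshmem"}


def package_name(base: str, cuda_major: int) -> str:
    """Return the wheel dependency name for a base package and CUDA major version."""
    if cuda_major < 13 or base in KEEPS_SUFFIX:
        return f"{base}-cu{cuda_major}"
    # CUDA 13: some packages are published without the suffix.
    return base


for base in ("nvidia-cuda-runtime", "nvidia-cudnn"):
    print(package_name(base, 12), "->", package_name(base, 13))
```

An explicit allow-list like this is one possible design; it keeps the exceptions visible in a single place instead of scattering version checks through the requirement strings.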
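Preparation checklist:
1. Update index https://download.pytorch.org/whl/nightly/cu130 with pypi packages
2. Update packaging name based on https://pypi.org/project/cuda-toolkit/ metadata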
cc @ptrblck @nWEIdia @atalman @malfet @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta