Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/121684
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures
As of commit 456757f with merge base a03b9a2:
NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
@ptrblck @nWEIdia Why not with cuDNN 9.0? It's a major improvement for flash attention, and I saw that the headers were updated to cuDNN 9.0. |
|
@johnnynunez It's not available as described in the review: #121684 (comment) |
I see.. thanks |
|
When will it be merged? :) |
|
12.4 update 1 is out: |
|
@ptrblck @nWEIdia NVIDIA cuDNN 9 is now available: nvidia-cudnn-cu12 9.0.0.312 |
a02e1d9 to fb5fc09
GHA results show this is needed to fix errors in pytorch/pytorch#121684 Reference: pytorch#1374
|
@nWEIdia While the Windows AMI is not yet in place, we would need to add only the Linux part of things. |
GHA results show this is needed to fix errors in pytorch/pytorch#121684 Reference: #1374
|
Hi @malfet, for pip wheels, the build seems successful, but the test job failed: installing the wheel required the presence of the cu124/ AWS directory. https://github.com/pytorch/pytorch/actions/runs/8701283769/job/23866760262 |
reference: https://docs.nvidia.com/cuda/archive/12.4.0/cuda-toolkit-release-notes/index.html#id6
CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version
CUDA 12.4 GA | >=550.54.14 | >=551.61
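To make the driver requirement concrete, here is a minimal sketch of checking a driver version against the CUDA 12.4 GA minimums quoted from the release notes. The helper names (`parse_version`, `driver_is_sufficient`) and the dictionary are hypothetical and for illustration only; they are not part of the PyTorch CI scripts.

```python
# Minimum driver versions for CUDA 12.4 GA, copied from the release notes
# referenced in this thread. The dictionary name is hypothetical.
MIN_DRIVER_CUDA_12_4_GA = {"linux": "550.54.14", "windows": "551.61"}


def parse_version(text):
    """Turn a dotted version such as "550.54.15" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(driver, platform="linux"):
    """Return True if `driver` meets the CUDA 12.4 GA minimum for `platform`."""
    minimum = MIN_DRIVER_CUDA_12_4_GA[platform]
    # Tuples compare element by element, so 550.54.15 >= 550.54.14 holds.
    return parse_version(driver) >= parse_version(minimum)


print(driver_is_sufficient("550.54.15"))  # True
```

Under these assumptions, the 550.54.15 driver mentioned in this thread satisfies the Linux minimum of 550.54.14.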
This reverts commit aaa24b9.
driver-version: "550.54.15"
This reverts commit 0e23765.
550.54.15 since pytorch/test-infra@d5695df is in main branch of pytorch/test-infra
We still need to keep 12.1 here, since this is the default wheel we will be uploading to PyPI. 12.4 should still be an experimental build for now.
Co-authored-by: Andrey Talman <atalman@fb.com>
97c18d7 to 6d474e5
|
@pytorchmergebot merge -f "All required jobs are passing" |
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team |
Reference: #98492
Co-authored-by: Andrey Talman <atalman@fb.com>
Pull Request resolved: #121684
Approved by: https://github.com/atalman
Trying to keep a record of the steps before I lose track of it.
- 1st Commit: Similar to pytorch/builder#1720
- 2nd Commit: Update CUDA 12.4 CI CUDA versions from 12.4.0 to 12.4.1 mapping to changes in https://github.com/pytorch/pytorch/pull/125944/files
- 3rd Commit: update for aarch64 install_cuda_aarch64.sh docker step
- 4th Commit: aaa456e Related #121684
- Synchronization point: Meta helps uploading pypi cuda dependencies specified in .github/scripts/generate_binary_build_matrix.py
- The above pypi upload is done (thanks Andrey!), restarted jobs like https://github.com/pytorch/pytorch/actions/runs/10188203670/job/28369471321
- 7753234, use temporary docker containers (generated from a previous successful container build). If merged, these containers would be rebuilt, therefore testing them now. (5th commit)
- 6th commit 5f93c62: revert the 5th commit.
Update, done but have to debug seemingly irrelevant failures (rocm/xpu/mps)
Pull Request resolved: #132202
Approved by: https://github.com/Skylion007, https://github.com/eqy, https://github.com/atalman
Reference: #98492
cc @albanD @ptrblck @atalman @malfet