
[WIP] test cudnn8.7 #92971

Closed
ptrblck wants to merge 3 commits into pytorch:main from ptrblck:test_cudnn8.7

@ptrblck
Collaborator

ptrblck commented Jan 25, 2023

ptrblck requested a review from a team as a code owner January 25, 2023 07:03
@pytorch-bot

pytorch-bot bot commented Jan 25, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/92971

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 Failures

As of commit e45a727:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the topic: not user facing topic category label Jan 25, 2023
ptrblck added the ciflow/binaries Trigger all binary build and upload jobs on the PR label Jan 25, 2023
atalman added the ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR label Jan 25, 2023
ptrblck requested a review from jeffdaily as a code owner January 25, 2023 19:48
@ptrblck
Collaborator Author

ptrblck commented Jan 26, 2023

@atalman

The failing tests are:

conda-related

CondaError: Downloaded bytes did not match Content-Length
  url: https://conda.anaconda.org/nvidia/win-64/libcublas-dev-11.9.2.110-0.tar.bz2

## Package Plan ##

  target_path: C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\pkgs\libcublas-dev-11.9.2.110-0.tar.bz2
  Content-Length: 311278235
  downloaded bytes: 172816824
CondaError: Downloaded bytes did not match Content-Length
  url: https://conda.anaconda.org/nvidia/win-64/libcufft-dev-10.7.1.112-0.tar.bz2
  target_path: C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\pkgs\libcufft-dev-10.7.1.112-0.tar.bz2
  Content-Length: 262238290
  downloaded bytes: 53258682
CondaError: Downloaded bytes did not match Content-Length
  url: https://conda.anaconda.org/nvidia/win-64/nsight-compute-2022.4.0.15-0.tar.bz2
  target_path: C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\pkgs\nsight-compute-2022.4.0.15-0.tar.bz2
  Content-Length: 627707653
  downloaded bytes: 252242363
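
As a rough illustration of what these errors report, the sketch below parses the three failures and computes how much of each package arrived before the transfer stopped. It is not part of this PR or of conda: the function name `summarize_truncated_downloads` and the trimmed log excerpt (with the `target_path` lines omitted) are assumptions made for the example.

```python
# Log excerpt copied from the conda failures above (target_path lines omitted).
LOG = """\
CondaError: Downloaded bytes did not match Content-Length
  url: https://conda.anaconda.org/nvidia/win-64/libcublas-dev-11.9.2.110-0.tar.bz2
  Content-Length: 311278235
  downloaded bytes: 172816824
CondaError: Downloaded bytes did not match Content-Length
  url: https://conda.anaconda.org/nvidia/win-64/libcufft-dev-10.7.1.112-0.tar.bz2
  Content-Length: 262238290
  downloaded bytes: 53258682
CondaError: Downloaded bytes did not match Content-Length
  url: https://conda.anaconda.org/nvidia/win-64/nsight-compute-2022.4.0.15-0.tar.bz2
  Content-Length: 627707653
  downloaded bytes: 252242363
"""


def summarize_truncated_downloads(log):
    """Hypothetical helper: return one dict per failed download in the log.

    Each dict holds the package file name, the expected size (Content-Length),
    the number of bytes actually received, and the received/expected fraction.
    """
    results = []
    current = {}
    for line in log.splitlines():
        key, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # not a "key: value" line
        if key == "url":
            # A new url line starts a new failure record.
            current = {"package": value.rsplit("/", 1)[-1]}
        elif key == "Content-Length":
            current["expected"] = int(value)
        elif key == "downloaded bytes":
            current["received"] = int(value)
            if "package" in current and "expected" in current:
                current["fraction"] = current["received"] / current["expected"]
                results.append(current)
                current = {}
    return results


if __name__ == "__main__":
    for item in summarize_truncated_downloads(LOG):
        print(f"{item['package']}: {item['received']} of {item['expected']} bytes "
              f"({item['fraction']:.1%})")
```

Every download stops well short of its Content-Length, which is the pattern an interrupted transfer would produce rather than a packaging problem.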

rocm-related

+ g++ /builder/test_example_code/simple-torch-test.cpp -I/tmp/libtorch/include -I/tmp/libtorch/include/torch/csrc/api/include -D_GLIBCXX_USE_CXX11_ABI=1 -std=gnu++14 -L/tmp/libtorch/lib -Wl,-R/tmp/libtorch/lib -Wl,--no-as-needed -ltorch -ltorch_cpu -lc10 -o simple-torch-test
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `del_curterm@NCURSES6_TINFO_5.0.19991023'
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `setupterm@NCURSES6_TINFO_5.0.19991023'
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `tigetnum@NCURSES6_TINFO_5.0.19991023'
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `set_curterm@NCURSES6_TINFO_5.0.19991023'
collect2: error: ld returned 1 exit status
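
To make the linker output easier to read, here is a minimal sketch that groups the undefined references by library and symbol version. The helper name `group_undefined_references` and the regular expression are assumptions written for this example, not tooling used by the CI. All four symbols are terminfo functions, and the `NCURSES6_TINFO` version tag suggests that `libamd_comgr.so` expects a terminfo library from ncurses 6 that was not available at link time; that reading is an inference and is not confirmed in this thread.

```python
import re
from collections import defaultdict

# Linker errors copied from the ROCm build log above.
LD_OUTPUT = """\
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `del_curterm@NCURSES6_TINFO_5.0.19991023'
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `setupterm@NCURSES6_TINFO_5.0.19991023'
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `tigetnum@NCURSES6_TINFO_5.0.19991023'
/usr/bin/ld: /tmp/libtorch/lib/libamd_comgr.so: undefined reference to `set_curterm@NCURSES6_TINFO_5.0.19991023'
collect2: error: ld returned 1 exit status
"""

# Matches "<linker>: <library>: undefined reference to `symbol@VERSION'".
# The "@VERSION" part is optional, since unversioned symbols omit it.
UNDEFINED_REF = re.compile(
    r"^\S+: (?P<lib>\S+): undefined reference to "
    r"`(?P<symbol>[^@']+)(?:@(?P<version>[^']+))?'",
    re.MULTILINE,
)


def group_undefined_references(ld_output):
    """Hypothetical helper: map (library, symbol version) to missing symbols."""
    grouped = defaultdict(list)
    for match in UNDEFINED_REF.finditer(ld_output):
        key = (match.group("lib"), match.group("version"))
        grouped[key].append(match.group("symbol"))
    return dict(grouped)


if __name__ == "__main__":
    for (lib, version), symbols in group_undefined_references(LD_OUTPUT).items():
        print(f"{lib} needs {', '.join(symbols)} (version {version})")
```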

Let me know if you see any failing issues related to the cuDNN update.

@atalman
Contributor

atalman commented Jan 26, 2023

@ptrblck The failures above are not related to the cuDNN update. These look like flaky issues. cuDNN was not updated for 11.6. I think we are good to merge. I have also seen multiple failures like this on our nightly builds today: https://github.com/pytorch/pytorch/actions/runs/4013294779/jobs/6895536351

pytorchmergebot pushed a commit that referenced this pull request Jan 27, 2023
Add cudnn install 8.7.0.84 for CUDA 11.8.

Same as: #84964
Related to pytorch/builder#1271
Test PR: #92971
Pull Request resolved: #93086
Approved by: https://github.com/kit1980, https://github.com/malfet
atalman added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 27, 2023
@malfet
Contributor

malfet commented Jan 27, 2023

Small suggestion (if this is a work-in-progress change, or a PR needed to test builder changes): do you mind keeping it as a draft (and closing it if it was used to test a now-merged builder change)?

atalman marked this pull request as draft January 27, 2023 17:48
@atalman
Contributor

atalman commented Jan 27, 2023

@malfet, done, converted to draft.

pytorchmergebot pushed a commit that referenced this pull request Jan 30, 2023
Add cudnn install 8.7.0.84 for CUDA 11.8.

Same as: #84964
Related to pytorch/builder#1271
Test PR: #92971
Pull Request resolved: #93086
Approved by: https://github.com/kit1980, https://github.com/malfet
@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

github-actions bot added the Stale label Mar 28, 2023
github-actions bot closed this May 17, 2023

Labels

ciflow/binaries (Trigger all binary build and upload jobs on the PR)
ciflow/periodic (Trigger jobs ran periodically on master (periodic.yml) on the PR)
ciflow/trunk (Trigger trunk jobs on your pull request)
open source
Stale
topic: not user facing (topic category)

4 participants