cudatoolkit: introduce 11.8.0 #194705

Merged
samuela merged 6 commits into NixOS:master from dguibert:dg/cudatoolkit_11_8_0
Oct 20, 2022

Member

@dguibert dguibert commented Oct 6, 2022

Description of changes
Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.11 Release Notes (or backporting 22.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

Member Author

dguibert commented Oct 6, 2022

This PR also fixes the evaluation breakage of cudnn and tensorrt that occurs when cudaVersion is not listed among the supported versions of those libraries. #192958 partially fixes the problem.

@dguibert dguibert requested review from aidalgol and samuela October 6, 2022 06:02
@dguibert dguibert force-pushed the dg/cudatoolkit_11_8_0 branch from a9c3365 to 84ac150 Compare October 6, 2022 06:06
@ofborg ofborg bot added the 8.has: package (new), 10.rebuild-darwin: 11-100, and 10.rebuild-linux: 11-100 labels Oct 6, 2022
Comment on lines 16 to 18
Member

Can this use lib.versions.major/majorMinor?

Member Author

yes, I'll change it for simplicity
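
For reference, a minimal sketch of what the two helpers mentioned above return. It assumes `<nixpkgs>` is resolvable on `NIX_PATH`; the `cudaVersion` value is a placeholder and is not taken from this PR's diff.

```nix
# Sketch only, not the code under review.
# One way to evaluate it: nix-instantiate --eval --strict sketch.nix
# (sketch.nix is a hypothetical file name).
let
  lib = import <nixpkgs/lib>;

  # Placeholder version string, chosen for illustration.
  cudaVersion = "11.8.0";
in
{
  # First component of the version string: "11"
  major = lib.versions.major cudaVersion;

  # First two components joined by a dot: "11.8"
  majorMinor = lib.versions.majorMinor cudaVersion;
}
```

Either helper could stand in for hand-written string splitting wherever only the major or major.minor part of the CUDA version is needed.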

Member

Suggested change
- else throw "cudnn-${cuDnnDefaultVersion} does not support your cuda version ${cudaVersion} (or update supportedCudaVersions)"; };
+ else throw "cudnn-${cuDnnDefaultVersion} does not support cuda ${cudaVersion}"; };

Member

Suggested change
- else throw "tensorrt-${tensorRTDefaultVersion} does not support your cuda version ${cudaVersion} (or update supportedCudaVersions)"; };
+ else throw "tensorrt-${tensorRTDefaultVersion} does not support cuda ${cudaVersion}"; };

Contributor

Don't we want cudaPackages_11 to be an alias for the latest available version?

Member Author

Yes, but as the current version of cudnn is not compatible with version 11.8 of cuda, I've not updated the alias.

If there is consensus, I have no problem updating it.

Contributor

Ah, then yes, keep it back for now.

Contributor

Latest cudnn is 8.6.0, compatible with cuda 11.8

Contributor

(I'd do the bump in a separate commit, because everything else we can backport to 22.05)

Member Author

> Latest cudnn is 8.6.0, compatible with cuda 11.8

Thanks, I'd missed that. I'll push an update in this PR.

@SomeoneSerge SomeoneSerge added the 6.topic: cuda label Oct 6, 2022
Member

@samuela samuela left a comment

Hey @dguibert, thanks for putting this together! Just adding a few comments here in addition to what others have already pointed out.

Btw, please don't forget https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md when writing commit messages. I believe cudatoolkit: 1.7.0 -> 1.7.1 should be cudatoolkit: 11.7.0 -> 11.7.1 for example.

@dguibert dguibert force-pushed the dg/cudatoolkit_11_8_0 branch from 84ac150 to 342773d Compare October 7, 2022 11:35
@github-actions github-actions bot added the 6.topic: python label Oct 7, 2022
@dguibert dguibert force-pushed the dg/cudatoolkit_11_8_0 branch from 342773d to ec5627d Compare October 7, 2022 11:38
@github-actions github-actions bot removed the 6.topic: python label Oct 7, 2022
@dguibert dguibert requested a review from samuela October 7, 2022 11:44
Member Author

dguibert commented Oct 7, 2022

@SomeoneSerge, I just saw your PR #194791. This one should be OK now to poke the jetson support.

@dguibert dguibert force-pushed the dg/cudatoolkit_11_8_0 branch from ec5627d to bbf4951 Compare October 7, 2022 11:45
@samuela samuela changed the title from "cudatoolkit: introduce 1.8.0" to "cudatoolkit: introduce 11.8.0" Oct 8, 2022
@dguibert dguibert force-pushed the dg/cudatoolkit_11_8_0 branch from bbf4951 to ec8c0aa Compare October 11, 2022 16:08
Throw an error if the default version of tensorrt does not support the
cuda version.

Adding cudatoolkit 11.8 fails to evaluate
tensorrt.{tensorRTDefaultVersion} as "11.8" is not listed on any
supportedCudaVersions of any tensorRTVersions attributes.
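
The guard described in this commit message can be sketched roughly as follows. This is not the code from the PR: the version table, its contents, and the `tensorrtFor` helper are hypothetical and only illustrate the pattern. The error text follows the wording suggested in the review comments above.

```nix
# Sketch only: all data below is made up for illustration.
let
  # Hypothetical default version and support table.
  tensorRTDefaultVersion = "8.4.0";
  tensorRTVersions = {
    "8.4.0" = { supportedCudaVersions = [ "11.0" "11.1" "11.2" ]; };
  };

  # Hypothetical helper: return a package name if the default tensorrt
  # version lists the given CUDA version, otherwise throw a clear error.
  tensorrtFor = cudaVersion:
    if builtins.elem cudaVersion
         tensorRTVersions.${tensorRTDefaultVersion}.supportedCudaVersions
    then "tensorrt-${tensorRTDefaultVersion}"
    else throw "tensorrt-${tensorRTDefaultVersion} does not support cuda ${cudaVersion}";
in
{
  # Listed CUDA version: evaluates to the package name string.
  ok = tensorrtFor "11.2";

  # Unlisted CUDA version: evaluation aborts with the message above,
  # but only once this attribute is actually forced.
  unsupported = tensorrtFor "11.8";
}
```

Because Nix evaluates lazily, a `throw` placed this way should leave the rest of the package set usable and only fail when the unsupported attribute itself is requested.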
@dguibert dguibert force-pushed the dg/cudatoolkit_11_8_0 branch from ec8c0aa to af83b0c Compare October 14, 2022 10:10
@ofborg ofborg bot added the 10.rebuild-darwin: 1-10 label and removed the 10.rebuild-darwin: 11-100 label Oct 14, 2022
Member

@samuela samuela left a comment

LGTM, I'll go ahead and merge tomorrow unless anyone objects.

@samuela samuela merged commit 2fe7609 into NixOS:master Oct 20, 2022
Member

samuela commented Oct 20, 2022

I'm now seeing build failures with cuDNN:

❯ nix-build -A cudaPackages.cudnn
these 2 derivations will be built:
  /nix/store/jr4j41x50wfjvm141x6gd4dnpf5gbj28-cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz.drv
  /nix/store/jnn8s0baxw295hsvdxbbavpiqkyg50w9-cudatoolkit-11-cudnn-8.6.0.drv
building '/nix/store/jr4j41x50wfjvm141x6gd4dnpf5gbj28-cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz.drv'...

trying https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0/local_installers/11.7/cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0   445    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 404
error: cannot download cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz from any mirror
error: builder for '/nix/store/jr4j41x50wfjvm141x6gd4dnpf5gbj28-cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz.drv' failed with exit code 1;
       last 7 log lines:
       >
       > trying https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0/local_installers/11.7/cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz
       >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
       >                                  Dload  Upload   Total   Spent    Left  Speed
       >   0   445    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
       > curl: (22) The requested URL returned error: 404
       > error: cannot download cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz from any mirror
       For full logs, run 'nix log /nix/store/jr4j41x50wfjvm141x6gd4dnpf5gbj28-cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz.drv'.
error: 1 dependencies of derivation '/nix/store/jnn8s0baxw295hsvdxbbavpiqkyg50w9-cudatoolkit-11-cudnn-8.6.0.drv' failed to build

@dguibert could you take a look?

Member

samuela commented Oct 20, 2022

ok created #196983 to fix

Labels

6.topic: cuda, 8.has: package (new), 10.rebuild-darwin: 1-10, 10.rebuild-linux: 11-100

5 participants