Use pip install cu117#85097

Closed
atalman wants to merge 4 commits into pytorch:master from atalman:use_pip_install_cu117

atalman (Contributor) commented Sep 15, 2022

Creates a new wheel workflow specific to CUDA 11.7 that does not bundle cuDNN and cuBLAS. Instead, they are listed as extra pip install requirements of the wheel.

Workflow:
https://github.com/pytorch/pytorch/actions/runs/3094622781

New Package:
manywheel-py3_10-cuda11_7-with-pypi-cudnn | 843 MB

Old Package:
manywheel-py3_10-cuda11_7 | 1.65 GB

Testing workflow:

manywheel-py3_7-cuda11_7-with-pypi-cudnn-build / build:

Bundling without cudnn and cublas.
+ DEPS_LIST=("/usr/local/cuda/lib64/libcudart.so.11.0" "/usr/local/cuda/lib64/libnvToolsExt.so.1" "/usr/local/cuda/lib64/libnvrtc.so.11.2" "/usr/local/cuda/lib64/libnvrtc-builtins.so.11.7" "$LIBGOMP_PATH")
+ DEPS_SONAME=("libcudart.so.11.0" "libnvToolsExt.so.1" "libnvrtc.so.11.2" "libnvrtc-builtins.so.11.7" "libgomp.so.1")
.....
pytorch_extra_install_requirements: nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11, nvidia-cublas-cu11

manywheel-py3_7-cuda11_7-build / build

Bundling with cudnn and cublas.
+ DEPS_LIST=("/usr/local/cuda/lib64/libcudart.so.11.0" "/usr/local/cuda/lib64/libnvToolsExt.so.1" "/usr/local/cuda/lib64/libnvrtc.so.11.2" "/usr/local/cuda/lib64/libnvrtc-builtins.so.11.7" "/usr/local/cuda/lib64/libcudnn_adv_infer.so.8" "/usr/local/cuda/lib64/libcudnn_adv_train.so.8" "/usr/local/cuda/lib64/libcudnn_cnn_infer.so.8" "/usr/local/cuda/lib64/libcudnn_cnn_train.so.8" "/usr/local/cuda/lib64/libcudnn_ops_infer.so.8" "/usr/local/cuda/lib64/libcudnn_ops_train.so.8" "/usr/local/cuda/lib64/libcudnn.so.8" "/usr/local/cuda/lib64/libcublas.so.11" "/usr/local/cuda/lib64/libcublasLt.so.11" "$LIBGOMP_PATH")
+ DEPS_SONAME=("libcudart.so.11.0" "libnvToolsExt.so.1" "libnvrtc.so.11.2" "libnvrtc-builtins.so.11.7" "libcudnn_adv_infer.so.8" "libcudnn_adv_train.so.8" "libcudnn_cnn_infer.so.8" "libcudnn_cnn_train.so.8" "libcudnn_ops_infer.so.8" "libcudnn_ops_train.so.8" "libcudnn.so.8" "libcublas.so.11" "libcublasLt.so.11" "libgomp.so.1")

cc: @malfet @ptrblck
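
The difference between the two jobs in the logs comes down to which shared libraries end up in the bundled dependency list. A minimal Python sketch of that selection follows, assuming the build name decides it. The function name `select_deps` and the substring check are hypothetical; the real logic lives in the builder's shell scripts and is not shown in this description. Only the library names are copied from the logs.

```python
# Hypothetical sketch: library names come from the job logs, but the
# selection rule (a substring check on the build name) is an assumption.
BASE_DEPS = [
    "libcudart.so.11.0",
    "libnvToolsExt.so.1",
    "libnvrtc.so.11.2",
    "libnvrtc-builtins.so.11.7",
]
CUDNN_CUBLAS_DEPS = [
    "libcudnn_adv_infer.so.8",
    "libcudnn_adv_train.so.8",
    "libcudnn_cnn_infer.so.8",
    "libcudnn_cnn_train.so.8",
    "libcudnn_ops_infer.so.8",
    "libcudnn_ops_train.so.8",
    "libcudnn.so.8",
    "libcublas.so.11",
    "libcublasLt.so.11",
]
GOMP_DEPS = ["libgomp.so.1"]


def select_deps(build_name):
    """Return the sonames to bundle for a given build name."""
    if "with-pypi-cudnn" in build_name:
        # cuDNN and cuBLAS are expected to come from PyPI packages instead.
        return BASE_DEPS + GOMP_DEPS
    return BASE_DEPS + CUDNN_CUBLAS_DEPS + GOMP_DEPS
```

With this rule, the `with-pypi-cudnn` build bundles 5 libraries and the regular build bundles 14, which matches the two `DEPS_SONAME` arrays in the logs.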

@atalman atalman requested a review from jeffdaily as a code owner September 15, 2022 18:41
pytorch-bot bot commented Sep 15, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/85097

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 Failures

As of commit d60d7e4:

The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the topic: not user facing (topic category) label Sep 15, 2022
@atalman atalman added ciflow/binaries_wheel (Trigger binary build and upload jobs for wheel on the PR) and removed cla signed labels Sep 15, 2022
@atalman atalman force-pushed the use_pip_install_cu117 branch from 4e8a1a4 to 7d92f92 Compare September 15, 2022 18:43
@atalman atalman requested review from malfet and removed request for jeffdaily September 15, 2022 18:43
@atalman atalman marked this pull request as draft September 15, 2022 18:44
@atalman atalman force-pushed the use_pip_install_cu117 branch from a69724c to 73b212c Compare September 19, 2022 15:55
@atalman atalman force-pushed the use_pip_install_cu117 branch from 34962e3 to 2c73691 Compare September 19, 2022 19:52
@atalman atalman force-pushed the use_pip_install_cu117 branch from 801a314 to 931bd11 Compare September 20, 2022 19:03
Testing

Testing

Testing

Testing

Trying to build whl with pip

New workflow

Adding build name as environment

Using build name to determine how manywheel should be packaged

Fix typo

Fix typo

fix log

Adding new environment variable 1

Adding new env variable

Extra install req

Adding new environment

More new var setting

Adding binary env populate

Test

Fix if condition

testing

Testing

Refactor the code

Testing

Testing

Address comments

Fix lint issue
@atalman atalman force-pushed the use_pip_install_cu117 branch from 07978d1 to b631a9e Compare September 20, 2022 19:35
Refactoring code

Fix builder folder

Testing variables

Testing
@atalman atalman force-pushed the use_pip_install_cu117 branch from 9391c16 to 56b1a0e Compare September 20, 2022 21:59
@atalman atalman changed the title from "[WIP] Use pip install cu117" to "Use pip install cu117" Sep 21, 2022
@atalman atalman marked this pull request as ready for review September 21, 2022 01:55
@atalman atalman requested a review from a team as a code owner September 21, 2022 01:55
"package_type": package_type,
"pytorch_extra_install_requirements":
"nvidia-cuda-runtime-cu11;"
"nvidia-cudnn-cu11==8.5.0.96;"
atalman (Contributor Author):

Please note that I had to specify the full version for now. A partial version did not work.

@atalman atalman requested a review from malfet September 21, 2022 11:18
setup.py Outdated
extra_link_args += ['-g']

# special CUDA 11.7 package that requires installation of cuda runtime, cudnn and cublas
pytorch_extra_install_requirements = os.getenv('PYTORCH_EXTRA_INSTALL_REQUIREMENTS', '')
Reviewer (Contributor):

Nit: use double quotes. Also, why do you need a default value here?

Suggested change
- pytorch_extra_install_requirements = os.getenv('PYTORCH_EXTRA_INSTALL_REQUIREMENTS', '')
+ pytorch_extra_install_requirements = os.getenv("PYTORCH_EXTRA_INSTALL_REQUIREMENTS",)

atalman (Contributor Author):

Done, converted to double quotes. I think it is a good idea to keep the default value in case the environment variable is not set.
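
The point about the default value can be illustrated with a short sketch. `os.getenv` returns `None` when the variable is unset and no default is given, so any later string operation on the result would fail. An empty-string default avoids that. The sketch assumes the value is split on `;`, as the initial version of this PR did. The helper name `parse_extra_requirements` is hypothetical and not code from this PR.

```python
import os

ENV_VAR = "PYTORCH_EXTRA_INSTALL_REQUIREMENTS"


def parse_extra_requirements():
    """Read the env variable and return a list of requirement strings."""
    # The "" default keeps .split() safe when the variable is not set;
    # without it os.getenv would return None here.
    raw = os.getenv(ENV_VAR, "")
    # Drop empty entries so an unset variable or a trailing ";" is harmless.
    return [item.strip() for item in raw.split(";") if item.strip()]
```

An unset variable then yields an empty list, which can be appended to `install_requires` without a special case.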

atalman (Contributor Author) commented Sep 21, 2022

@pytorchmergebot -f "Already run through all tests"

pytorch-bot bot commented Sep 21, 2022

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: argument command: invalid choice: 'Already run through all tests' (choose from 'merge', 'revert', 'rebase', 'label')

usage: @pytorchbot [-h] {merge,revert,rebase,label} ...

Try @pytorchbot --help for more info.
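
The failure is what an argparse-style subcommand parser produces when the first positional token is not one of the known commands. The sketch below is not the bot's implementation. It is a hypothetical parser, built only from the usage line shown in the error, that reproduces the same kind of rejection.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage line in the bot's reply."""
    parser = argparse.ArgumentParser(prog="@pytorchbot")
    sub = parser.add_subparsers(dest="command", required=True)
    merge = sub.add_parser("merge")
    # -f belongs to the merge subcommand, not to the top-level parser.
    merge.add_argument("-f", "--force", metavar="MESSAGE")
    for name in ("revert", "rebase", "label"):
        sub.add_parser(name)
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejects the input."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints its error message to stderr and exits on bad input.
        return None
```

Under these assumptions, `try_parse(["-f", "Already run through all tests"])` is rejected because no subcommand was given, while `try_parse(["merge", "-f", "some reason"])` parses cleanly.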

atalman (Contributor Author) commented Sep 21, 2022

@pytorchbot --help

pytorch-bot bot commented Sep 21, 2022

PyTorchBot Help

usage: @pytorchbot [-h] {merge,revert,rebase,label} ...

In order to invoke the bot on your PR, include a line that starts with
@pytorchbot anywhere in a comment. That line will form the command; no
multi-line commands are allowed. 

Example:
    Some extra context, blah blah, wow this PR looks awesome

    @pytorchbot merge

optional arguments:
  -h, --help            Show this help message and exit.

command:
  {merge,revert,rebase,label}
    merge               Merge a PR
    revert              Revert a PR
    rebase              Rebase a PR
    label               Add label to a PR

Merge

usage: @pytorchbot merge [-g | -f MESSAGE | -l | -r [REBASE]]

Merge an accepted PR, subject to the rules in .github/merge_rules.json.
By default, this will wait for all required checks (lint, pull) to succeed before merging.

optional arguments:
  -g, --green           Merge when all status checks running on the PR pass. To add status checks, use labels like `ciflow/trunk`.
  -f MESSAGE, --force MESSAGE
                        Merge without checking anything. This requires a reason for auditing purposes, for example:
                        @pytorchbot merge -f 'Minor update to fix lint. Expecting all PR tests to pass'
  -l, --land-checks     Merge with land time checks. This will create a new branch with your changes rebased on viable/strict and run a majority of trunk tests _before_ landing to increase trunk reliability and decrease risk of revert. The tests added are: pull, Lint and trunk. Note that periodic is excluded. (EXPERIMENTAL)
  -r [REBASE], --rebase [REBASE]
                        Rebase the PR to re run checks before merging.  It will accept a branch name and will default to master if not specified.

Revert

usage: @pytorchbot revert -m MESSAGE -c
                          {nosignal,ignoredsignal,landrace,weird,ghfirst}

Revert a merged PR. This requires that you are a Meta employee.

Example:
  @pytorchbot revert -m="This is breaking tests on trunk. hud.pytorch.org/" -c=nosignal

optional arguments:
  -m MESSAGE, --message MESSAGE
                        The reason you are reverting, will be put in the commit message. Must be longer than 3 words.
  -c {nosignal,ignoredsignal,landrace,weird,ghfirst}, --classification {nosignal,ignoredsignal,landrace,weird,ghfirst}
                        A machine-friendly classification of the revert reason.

Rebase

usage: @pytorchbot rebase [-s | -b BRANCH]

Rebase a PR. Rebasing defaults to the default branch of pytorch (master).
You, along with any member of the pytorch organization, can rebase your PR.

optional arguments:
  -s, --stable          Rebase to viable/strict
  -b BRANCH, --branch BRANCH
                        Branch you would like to rebase to

Label

usage: @pytorchbot label labels [labels ...]

Adds label to a PR

positional arguments:
  labels  Labels to add to given Pull Request

atalman (Contributor Author) commented Sep 21, 2022

@pytorchbot merge -f "Full set of tests was executed, this one will fail anyway because of wheels 3.11"

pytorchmergebot (Collaborator):

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered with the force (-f) flag. This means your change will be merged immediately, bypassing any CI checks (ETA: 1-5 minutes). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

mehtanirav pushed a commit that referenced this pull request Oct 4, 2022
Creates new wheel workflow specific to CUDA 11.7 that does not bundle the cudnn and cublas.

Workflow:
https://github.com/pytorch/pytorch/actions/runs/3094622781

New Package:
manywheel-py3_10-cuda11_7-with-pypi-cudnn | 843 MB

Old Package:
manywheel-py3_10-cuda11_7 | 1.65 GB

Testing workflow:

[manywheel-py3_7-cuda11_7-with-pypi-cudnn-build / build](https://github.com/pytorch/pytorch/actions/runs/3091145546/jobs/5000867662#logs):
```
Bundling without cudnn and cublas.
+ DEPS_LIST=("/usr/local/cuda/lib64/libcudart.so.11.0" "/usr/local/cuda/lib64/libnvToolsExt.so.1" "/usr/local/cuda/lib64/libnvrtc.so.11.2" "/usr/local/cuda/lib64/libnvrtc-builtins.so.11.7" "$LIBGOMP_PATH")
+ DEPS_SONAME=("libcudart.so.11.0" "libnvToolsExt.so.1" "libnvrtc.so.11.2" "libnvrtc-builtins.so.11.7" "libgomp.so.1")
.....
pytorch_extra_install_requirements: nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11, nvidia-cublas-cu11
```

[manywheel-py3_7-cuda11_7-build / build](https://github.com/pytorch/pytorch/actions/runs/3091145546/jobs/5000863250#logs)

```
Bundling with cudnn and cublas.
+ DEPS_LIST=("/usr/local/cuda/lib64/libcudart.so.11.0" "/usr/local/cuda/lib64/libnvToolsExt.so.1" "/usr/local/cuda/lib64/libnvrtc.so.11.2" "/usr/local/cuda/lib64/libnvrtc-builtins.so.11.7" "/usr/local/cuda/lib64/libcudnn_adv_infer.so.8" "/usr/local/cuda/lib64/libcudnn_adv_train.so.8" "/usr/local/cuda/lib64/libcudnn_cnn_infer.so.8" "/usr/local/cuda/lib64/libcudnn_cnn_train.so.8" "/usr/local/cuda/lib64/libcudnn_ops_infer.so.8" "/usr/local/cuda/lib64/libcudnn_ops_train.so.8" "/usr/local/cuda/lib64/libcudnn.so.8" "/usr/local/cuda/lib64/libcublas.so.11" "/usr/local/cuda/lib64/libcublasLt.so.11" "$LIBGOMP_PATH")
+ DEPS_SONAME=("libcudart.so.11.0" "libnvToolsExt.so.1" "libnvrtc.so.11.2" "libnvrtc-builtins.so.11.7" "libcudnn_adv_infer.so.8" "libcudnn_adv_train.so.8" "libcudnn_cnn_infer.so.8" "libcudnn_cnn_train.so.8" "libcudnn_ops_infer.so.8" "libcudnn_ops_train.so.8" "libcudnn.so.8" "libcublas.so.11" "libcublasLt.so.11" "libgomp.so.1")
```

cc: @malfet @ptrblck
Pull Request resolved: #85097
Approved by: https://github.com/malfet
@atalman atalman mentioned this pull request Nov 8, 2022
pytorchmergebot pushed a commit that referenced this pull request Nov 18, 2022
Fixes #88049

#85097 added new extra dependencies on `nvidia-*`. They are Linux (GPU) only packages, but were not marked as such, causing issues installing PyTorch 1.13 via Poetry (and possibly other tools that follow PyPI's metadata API) on non-Linux systems. This "fixes" the issue by adding the `; platform_system == 'Linux'` marker on these dependencies, but the main problem of different metadata for different wheels is a [somewhat larger issue](#88049 (comment)).

#85097 used `;` as a delimiter for splitting the different deps, but that is the delimiter used in markers, so I changed to split on `|`.

Pull Request resolved: #88826
Approved by: https://github.com/neersighted, https://github.com/lalmei, https://github.com/malfet
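
To make the delimiter conflict concrete: a requirement with an environment marker already contains `;`, so splitting the whole string on `;` separates the marker from its package. A small sketch follows. The `split_requirements` helper is hypothetical and the example value is assumed for illustration, not taken from the workflow files.

```python
def split_requirements(raw, sep):
    """Split a delimited requirement string and drop empty entries."""
    return [part.strip() for part in raw.split(sep) if part.strip()]


# Assumed example value; environment markers compare with "==".
RAW = (
    "nvidia-cuda-runtime-cu11; platform_system == 'Linux' | "
    "nvidia-cublas-cu11; platform_system == 'Linux'"
)

# Splitting on ";" breaks each requirement apart from its marker.
broken = split_requirements(RAW, ";")

# Splitting on "|" keeps every "name; marker" pair intact.
fixed = split_requirements(RAW, "|")
```

Here `broken` has three fragments, none of which is a complete requirement with its marker, while `fixed` has the two intended entries.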
atalman pushed a commit to atalman/pytorch that referenced this pull request Nov 30, 2022
…8826)

atalman pushed a commit to atalman/pytorch that referenced this pull request Dec 6, 2022
…8826)

atalman added a commit that referenced this pull request Dec 6, 2022
…89924)


Co-authored-by: Jacob Hayes <jacob.r.hayes@gmail.com>
kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022
…8826)

pytorchmergebot pushed a commit that referenced this pull request Jan 31, 2023
…3066)

Like #89924 #91083

#85097 added new extra dependencies on `nvidia-*`. They are Linux x86_64 (GPU) only packages, but were not marked as such, causing issues installing PyTorch 1.13 via Poetry (and possibly other tools that follow PyPI's metadata API) on Linux aarch64 systems. This "fixes" the issue by adding the `and platform_machine == 'x86_64'` marker on these dependencies.

Pull Request resolved: #93066
Approved by: https://github.com/malfet
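
As a rough illustration of the combined marker, the sketch below attaches it to a requirement and mimics its evaluation using the standard `platform` module. In environment markers, `platform_system` corresponds to `platform.system()` and `platform_machine` to `platform.machine()`. The helper names `with_marker` and `marker_applies` are hypothetical, and real installers evaluate markers with their own logic rather than code like this.

```python
import platform

# Marker text assembled from the two fixes described in this thread.
LINUX_X86_64_MARKER = "platform_system == 'Linux' and platform_machine == 'x86_64'"


def with_marker(requirement, marker=LINUX_X86_64_MARKER):
    """Attach an environment marker to a requirement string."""
    return f"{requirement}; {marker}"


def marker_applies(system=None, machine=None):
    """Hand-rolled check of the marker above for a given platform."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Linux" and machine == "x86_64"
```

With this check, Linux aarch64 and macOS machines skip the `nvidia-*` packages, and only Linux x86_64 installs them.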

Labels

ciflow/binaries_wheel (Trigger binary build and upload jobs for wheel on the PR)
cla signed
Merged
topic: not user facing (topic category)
