Add sanity checks for NCCL detection.#22819

Closed
xuhdev wants to merge 8 commits into gh/xuhdev/14/base from gh/xuhdev/14/head

@xuhdev
Collaborator

@xuhdev xuhdev commented Jul 12, 2019

Stack from ghstack:

Check whether NCCL header files and libraries have the same NCCL version number.
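
As a rough illustration of the check described above (not the CMake code in this PR), the sketch below parses the `NCCL_MAJOR`, `NCCL_MINOR`, and `NCCL_PATCH` macros out of the text of `nccl.h` and compares them with a library version supplied by the caller. The helper names `header_version` and `check_versions` are hypothetical, and it is an assumption that the library version is already available as a tuple; a real check would obtain it from the library itself, for example by calling `ncclGetVersion`.

```python
import re

# Matches lines such as "#define NCCL_MAJOR 2" in the text of nccl.h.
_DEFINE_RE = re.compile(
    r"^\s*#\s*define\s+NCCL_(MAJOR|MINOR|PATCH)\s+(\d+)\b", re.MULTILINE
)


def header_version(header_text):
    """Return (major, minor, patch) parsed from the text of nccl.h.

    Hypothetical helper for illustration only.
    """
    found = {name: int(value) for name, value in _DEFINE_RE.findall(header_text)}
    missing = [key for key in ("MAJOR", "MINOR", "PATCH") if key not in found]
    if missing:
        raise ValueError("header lacks NCCL_" + ", NCCL_".join(missing))
    return (found["MAJOR"], found["MINOR"], found["PATCH"])


def check_versions(header_text, library_version):
    """Raise RuntimeError if the header and library versions disagree.

    `library_version` is assumed to be a (major, minor, patch) tuple that the
    caller obtained from the NCCL library by some other means.
    """
    from_header = header_version(header_text)
    if from_header != tuple(library_version):
        raise RuntimeError(
            "NCCL header version %s does not match library version %s"
            % (from_header, tuple(library_version))
        )
    return from_header


if __name__ == "__main__":
    sample = "#define NCCL_MAJOR 2\n#define NCCL_MINOR 4\n#define NCCL_PATCH 8\n"
    print(check_versions(sample, (2, 4, 8)))
```

Failing loudly on a mismatch, rather than continuing with a warning, mirrors the intent of a sanity check: a header from one NCCL release paired with a library from another is treated as a configuration error.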

Differential Revision: D16283037

Add sanity checks for NCCL detection.

gh-metadata: pytorch pytorch 22819 gh/xuhdev/14/head
@jerryzh168 jerryzh168 added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Jul 13, 2019
Add sanity checks for NCCL detection.

gh-metadata: pytorch pytorch 22819 gh/xuhdev/14/head
Contributor

@ezyang ezyang left a comment

Thank you, this is lovely.

xuhdev added 5 commits July 15, 2019 11:01
Add sanity checks for NCCL detection.

gh-metadata: pytorch pytorch 22819 gh/xuhdev/14/head
Add sanity checks for NCCL detection.

gh-metadata: pytorch pytorch 22819 gh/xuhdev/14/head
Add sanity checks for NCCL detection.

gh-metadata: pytorch pytorch 22819 gh/xuhdev/14/head
Add sanity checks for NCCL detection.

gh-metadata: pytorch pytorch 22819 gh/xuhdev/14/head
Add sanity checks for NCCL detection.

gh-metadata: pytorch pytorch 22819 gh/xuhdev/14/head
facebook-github-bot pushed a commit that referenced this pull request Jul 16, 2019
Summary: Pull Request resolved: #22819

Test Plan: Imported from OSS

Differential Revision: D16283037

Pulled By: ezyang

fbshipit-source-id: fc09c9443a568d9af1c78a847282a7d707c49dd6
@zou3519 zou3519 deleted the gh/xuhdev/14/head branch July 16, 2019 18:35
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 16, 2019
Summary: Pull Request resolved: pytorch/pytorch#22819

Test Plan: Imported from OSS

Differential Revision: D16281714

Pulled By: ezyang

fbshipit-source-id: 396bcbf099bd07b996cf779c6b43092096b52d90
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 16, 2019
Summary: Pull Request resolved: pytorch/pytorch#22819

Test Plan: Imported from OSS

Differential Revision: D16283037

Pulled By: ezyang

fbshipit-source-id: fc09c9443a568d9af1c78a847282a7d707c49dd6
@yf225
Contributor

yf225 commented Jul 16, 2019

This is breaking master CUDA builds with

Jul 16 20:03:50 CMake Error at /usr/share/cmake3/Modules/FindPackageHandleStandardArgs.cmake:137 (message):
Jul 16 20:03:50   Could NOT find NCCL (missing: NCCL_INCLUDE_DIRS NCCL_LIBRARIES)

(Example: https://circleci.com/gh/pytorch/pytorch/2204593?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link/console)
I am reverting it.

@xuhdev
Collaborator Author

xuhdev commented Jul 16, 2019

I think this should be merged only after #22818 is merged.

@xuhdev xuhdev restored the gh/xuhdev/14/head branch July 16, 2019 20:29
@facebook-github-bot
Contributor

@ezyang merged this pull request in e2046f8.

@xuhdev
Collaborator Author

xuhdev commented Jul 16, 2019

It looks like the issue is actually caused by #22818, not this one, but the CI run on #22818 did not show this failure. What could have changed in between?

@xuhdev xuhdev deleted the gh/xuhdev/14/head branch July 16, 2019 21:12

Labels

Merged
module: build (Build system issues)
module: nccl (Problems related to nccl support)
open source
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
