[Build] Use open-source NCCL2 in PyTorch #12312

Closed
teng-li wants to merge 2 commits into pytorch:master from teng-li:nccl_version

teng-li commented Oct 4, 2018

  • Remove the old nccl file
  • Make open-source NCCL a submodule
  • Use CMake to build NCCL itself

NCCL2 is now in the default build.

teng-li added the "oncall: distributed" and "module: build" labels Oct 4, 2018
teng-li requested a review from ezyang October 4, 2018 02:26

facebook-github-bot left a comment

teng-li has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ezyang commented Oct 4, 2018

Out of curiosity, why is it doubly nested in nccl/nccl?

orionr left a comment

My only question is whether we can get everything (CMakeLists.txt and .gitignore, it looks like) into the NCCL submodule repository itself. If so, that would make this easier. Note that adding submodules when bringing this into fbcode is somewhat of a pain, so let's go through that together. Let's make 100% sure we want the submodule at third_party/nccl/nccl before we work on that, but I'm approving to unblock.


  ADD_CUSTOM_COMMAND(
-   WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
+   WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/nccl

This comment was marked as off-topic.

ghost commented Oct 4, 2018

  • make install -j4
    Scanning dependencies of target nccl
    [100%] Generating lib/libnccl.so
    /bin/sh: 1: cd: can't cd to /home/***/000git/pytorch/third_party/nccl/nccl
    CMakeFiles/nccl.dir/build.make:60: recipe for target 'lib/libnccl.so' failed
    make[2]: *** [lib/libnccl.so] Error 2
    CMakeFiles/Makefile2:72: recipe for target 'CMakeFiles/nccl.dir/all' failed
    make[1]: *** [CMakeFiles/nccl.dir/all] Error 2
    Makefile:129: recipe for target 'all' failed
    make: *** [all] Error 2
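
The `can't cd` line in the log above says that third_party/nccl/nccl did not exist in that checkout. One possible cause, not confirmed anywhere in this thread, is that the newly added submodule had not been initialized after pulling. A minimal shell sketch for checking this is below. The helper name `nccl_submodule_status` is hypothetical, and the commented-out `git submodule` commands are the standard Git way to fetch submodules, not something this PR prescribes.

```shell
#!/bin/sh
# Hypothetical helper (not part of PyTorch): report whether the NCCL
# submodule directory under a given checkout root is populated.
# The path third_party/nccl/nccl is taken from the error message above.
nccl_submodule_status() {
    dir="$1/third_party/nccl/nccl"
    if [ ! -d "$dir" ]; then
        # Directory is absent, matching the "can't cd" failure in the log.
        echo "missing"
    elif [ -z "$(ls -A "$dir")" ]; then
        # Directory exists but the submodule contents were never fetched.
        echo "empty"
    else
        echo "present"
    fi
}

# Possible usage from the root of a PyTorch checkout. Left commented out
# so that the sketch itself has no side effects:
#
# if [ "$(nccl_submodule_status .)" != "present" ]; then
#     git submodule sync
#     git submodule update --init --recursive
# fi
```

If the helper prints `missing` or `empty`, fetching the submodules and rebuilding is a reasonable first thing to try. If it prints `present`, the failure has some other cause.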

zdevito pushed a commit to zdevito/ATen that referenced this pull request Oct 4, 2018
Summary:
- Removed the old nccl file
- Make open-source NCCL a submodule
- CMake to make NCCL itself

NCCL2 now is in the default build.
Pull Request resolved: pytorch/pytorch#12312

Differential Revision: D10190845

Pulled By: teng-li

fbshipit-source-id: 08d42253b774149a66919d194f88b34628c39bae
teng-li deleted the nccl_version branch October 4, 2018 23:34
ezyang added the merged label Jun 26, 2019