Skip to content

[Ready for Review] Better fix for NCCL + sccache#8829

Merged
soumith merged 4 commits intopytorch:masterfrom
yf225:nncl_sccache
Jun 25, 2018
Merged

[Ready for Review] Better fix for NCCL + sccache#8829
soumith merged 4 commits intopytorch:masterfrom
yf225:nncl_sccache

Conversation

@yf225
Copy link
Contributor

@yf225 yf225 commented Jun 23, 2018

sccache was stuck when compiling NCCL likely because NUM_JOBS was not set correctly. This PR tests this hypothesis.

@yf225
Copy link
Contributor Author

yf225 commented Jun 23, 2018

@pytorchbot retest this please

1 similar comment
@yf225
Copy link
Contributor Author

yf225 commented Jun 23, 2018

@pytorchbot retest this please

@ssnl
Copy link
Collaborator

ssnl commented Jun 24, 2018

Just to bring this line to attention, not sure if it matters:
NCCL CmakeList.txt's make command also sets -j at https://github.com/pytorch/pytorch/blob/master/third_party/nccl/CMakeLists.txt#L17 .

Also, I believe the original method of temporarily writing NVCC_CUDA_EXECUTABLE could be better done via env CUDA_NVCC_EXECUTABLE=$(which nvcc) before ${CMAKE_VERSION} (still in a same line).

@yf225 yf225 requested a review from orionr as a code owner June 25, 2018 00:08
@@ -217,19 +212,13 @@ function build_nccl() {
-DCMAKE_CXX_FLAGS="$C_FLAGS $CPP_FLAGS $USER_CFLAGS" \
-DCMAKE_SHARED_LINKER_FLAGS="$USER_LDFLAGS" \

This comment was marked as off-topic.

@yf225 yf225 changed the title [WIP] Better fix for NCCL + sccache [Ready for Review] Better fix for NCCL + sccache Jun 25, 2018
@yf225
Copy link
Contributor Author

yf225 commented Jun 25, 2018

Thanks @ssnl! I believe this should be the right fix now.

Copy link
Collaborator

@ssnl ssnl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks legit!

@soumith soumith merged commit e31ab99 into pytorch:master Jun 25, 2018
@soumith
Copy link
Collaborator

soumith commented Jun 25, 2018

cc: @orionr we modified third_party/nccl here

@ssnl ssnl mentioned this pull request Jun 27, 2018
facebook-github-bot pushed a commit that referenced this pull request Jun 27, 2018
Summary:
Changes (were merged) in #8834 and #8829 (cc yf225 ) were lost in 9ec0a2a#diff-6997846ce6daf0c271e2db9ef0508551. This PR resubmits them.
Closes #8948

Differential Revision: D8665760

Pulled By: SsnL

fbshipit-source-id: 15514021fa79e6b908ea665dd6cb464b3ea00ab0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants