[c10d] Make C10d support CPU only build by teng-li · Pull Request #11513 · pytorch/pytorch

teng-li · 2018-09-11T08:36:54Z

This makes torch.distributed work for CPU-only builds.

Also added one more CI test case to cover the MPI CPU build.
All CI tests should cover this change.

apaszke

I'm not an expert in our build system or c10d, but this looks OK after a cursory look. I have left some comments on things that could be improved. It's not very nice that we have to sprinkle that many ifdefs, and I'm not sure if that will work fine with our binaries, so it might be better to consult someone else too.
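One common way to reduce the number of scattered ifdefs is to test the macro in a single place and branch on an ordinary constant everywhere else. The following is only a minimal sketch of that idea, not code from this PR: `kBuiltWithCuda`, `deviceIndexFor`, and `buildDescription` are hypothetical names, and the `-1` "no device" convention is taken from the `deviceGuard.set_index` hunk in this review.

```cpp
#include <string>

// Hypothetical helper, not part of c10d: the only place that tests USE_CUDA.
constexpr bool kBuiltWithCuda =
#ifdef USE_CUDA
    true;
#else
    false;
#endif

// Returns the device index to activate for a tensor.
// -1 means "no CUDA device", which is always the answer on a CPU-only build.
inline int deviceIndexFor(bool tensorIsCuda, int requestedIndex) {
  return (kBuiltWithCuda && tensorIsCuda) ? requestedIndex : -1;
}

// Mirrors the wording of the CMake status message, for logging only.
inline std::string buildDescription() {
  return kBuiltWithCuda ? "C10D with CUDA support"
                        : "C10D without CUDA support";
}
```

This only helps where both branches still compile without CUDA headers. Members or calls whose types exist only in CUDA headers would still need a preprocessor guard.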

+endif()
+else()
+  message(STATUS "Building C10D without CUDA support")
+endif()


+  add_definitions(-DUSE_CUDA=1)
+endif()
+else()
+  message(STATUS "Building C10D without CUDA support")


+#ifdef USE_CUDA
+      ,
+      cuda_(false)
+#endif
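The hunk above guards a member initializer with `USE_CUDA`. As a rough illustration of how such a guard sits inside a complete class, here is a minimal sketch. `WorkSketch` and its members are hypothetical stand-ins, not the class touched by this PR. Only the guarded `cuda_(false)` initializer is taken from the diff.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a work object; names are illustrative only.
class WorkSketch {
 public:
  explicit WorkSketch(std::vector<int> inputs)
      : inputs_(std::move(inputs))
#ifdef USE_CUDA
        ,
        cuda_(false)  // the member exists only in CUDA-enabled builds
#endif
  {
  }

  // Callers can ask this in any build; CPU-only builds always answer false.
  bool usesCuda() const {
#ifdef USE_CUDA
    return cuda_;
#else
    return false;
#endif
  }

  std::size_t numInputs() const { return inputs_.size(); }

 private:
  std::vector<int> inputs_;
#ifdef USE_CUDA
  bool cuda_;
#endif
};
```

The leading comma has to live inside the guard, as in the diff, so that the initializer list stays well-formed when `USE_CUDA` is not defined.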


  for (size_t i = 0; i < srcSizes.size(); i++) {
+#ifdef USE_CUDA
    deviceGuard.set_index(key.type->is_cuda() ? key.devices[i] : -1);
+#else


 c10d_add_test(ProcessGroupGlooAsyncTest.cpp c10d c10d_cuda_test)
+if(DISTRIBUTED_NCCL_FOUND)
+  c10d_add_test(ProcessGroupNCCLTest.cpp c10d c10d_cuda_test)
+endif()


+if(USE_CUDA AND CUDA_FOUND)
 cuda_add_library(c10d_cuda_test CUDATest.cu)
 target_link_libraries(c10d_cuda_test c10d)
+endif()


facebook-github-bot

teng-li has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

* master: (165 commits)
  Aibench for asr decoder
  Explicitly set locale on docs build. (pytorch#11595)
  Documentation for debugging JIT
  Fused weightnorm for ATen (pytorch#10842)
  Move Type, Tensor, TensorMethods to core.
  Add reminder % to the jit
  Fix reloading modules back into python (pytorch#11552)
  Add trigonometry functions to docs/source/onnx.rst
  Add EndToEndHybridModel CUDA tests (pytorch#11544)
  minor formatting error log (pytorch#11528)
  Warn that export+import module always load onto the CPU (pytorch#11485)
  caffe2::StorageImpl use at::DataPtr (pytorch#11282)
  Sync all libnccl soversions, not just libnccl.so.1 (pytorch#11575)
  Document BatchNorm and update default behavior (pytorch#11484)
  Typo fix in randomness.rst (pytorch#11571)
  Move some bmm/baddbmm to ATen (pytorch#11292)
  Make c10d test work on CPU only build (pytorch#11567)
  Clean up some C++ cruftiness in the script lexer.
  Allow setting deletion constant
  Make C10d support CPU only build (pytorch#11513)
  ...

Summary:
This makes torch.distributed works for CPU only build.
Also added one more CI test case to cover MPI CPU build.
All CI tests should cover this change

Pull Request resolved: pytorch#11513

Differential Revision: D9784546

Pulled By: teng-li

fbshipit-source-id: 0976a6b0fd199670926f0273e17ad7d2805e42e7

teng-li added the "oncall: distributed" label Sep 11, 2018

teng-li requested a review from pietern September 11, 2018 08:36

teng-li requested review from apaszke, colesbury, ezyang, gchanan, soumith and zdevito as code owners September 11, 2018 08:36

[c10d] Make C10d support CPU only build

dabe40e

teng-li force-pushed the c10d_cpu branch from 204d5ca to dabe40e Compare September 11, 2018 08:47

ci env change

90416de

apaszke approved these changes Sep 12, 2018

Addressed comments, fixed indentation of cmakefile

61ae2bc

facebook-github-bot reviewed Sep 12, 2018

facebook-github-bot closed this in 6dcdbd3 Sep 12, 2018

ssnl mentioned this pull request Sep 12, 2018

Some distributed tests are flaky #11582

Closed

teng-li deleted the c10d_cpu branch October 4, 2018 23:43

ezyang added the merged label Jun 26, 2019

teng-li wants to merge 3 commits into pytorch:master from teng-li:c10d_cpu


4 participants
