Fix broadcast copying device[0] tensor when not using NCCL #8222

ssnl merged 4 commits into pytorch:master
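The fix in this PR is about broadcast not making a needless copy of the tensor that is already on device[0] when NCCL is not used. A minimal plain-Python sketch of that idea (this is illustrative only, not the actual PyTorch code; `broadcast_sketch` and its use of lists to stand in for device buffers are assumptions for the example):

```python
# Sketch (hypothetical, not PyTorch internals): when broadcasting to a
# set of devices that includes the source device, reuse the source
# buffer for that device instead of copying it.
def broadcast_sketch(src, devices, src_device=0):
    """Return one buffer per device; the src_device entry aliases src."""
    out = []
    for d in devices:
        if d == src_device:
            out.append(src)        # source device: no copy
        else:
            out.append(list(src))  # other devices: simulated copy
    return out

buf = [1, 2, 3]
results = broadcast_sketch(buf, devices=[0, 1])
assert results[0] is buf                          # device[0] aliases the input
assert results[1] == buf and results[1] is not buf  # device[1] got a copy
```

The key property is the identity check on the first result: before the fix, the device[0] output was a copy rather than the original tensor.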
Conversation
torch/csrc/utils/tensor_flatten.h (Outdated)
@pytorchbot retest this please

can I get a review on this please?

@apaszke can you take a look at this when you have time?
torch/csrc/cuda/comm.cpp (Outdated)
…tential extra copy in flatten_dense_tensors
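The commit message above refers to avoiding an extra copy in `flatten_dense_tensors`. A plausible reading is a fast path for the single-tensor case: concatenation always copies, but a lone contiguous buffer can be returned as-is. A hedged plain-Python sketch of that pattern (`flatten_sketch` is a made-up name; real PyTorch flattening operates on tensors, not lists):

```python
# Sketch (hypothetical, not the PyTorch implementation) of a fast path
# that skips concatenation when there is only one buffer to flatten,
# avoiding a needless copy.
def flatten_sketch(buffers):
    if len(buffers) == 1:
        return buffers[0]      # single buffer: return it directly, no copy
    flat = []
    for b in buffers:          # multiple buffers: concatenate (copies)
        flat.extend(b)
    return flat

a = [1.0, 2.0]
assert flatten_sketch([a]) is a                       # fast path aliases input
assert flatten_sketch([a, [3.0]]) == [1.0, 2.0, 3.0]  # general path concatenates
```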
Let me know how this looks to you. @apaszke

I noticed there was no test added for this.

adding one soon @ezyang