Conversation
This doesn't seem to compile for me.
Looks like lint is failing.
In order to split ATen's CPU/CUDA code into two separate libraries
that don't require a build flag (AT_CUDA_ENABLED) to separate them,
we need to be able to split source files based on whether they handle
only CPU functionality or also touch CUDA. Copy poses a unique
challenge here, because the naive implementation writes out a matrix
of all CPU/GPU combinations in a single file.
This PR splits up Copy.cpp into CPUCopy.cpp and CUDACopy.cpp, respecting
the following matrix:
to\from    CPU           CUDA
        +----------------------------
CPU     |  CPUCopy.cpp   CUDACopy.cpp
CUDA    |  CUDACopy.cpp  CUDACopy.cpp
When you run x.copy_(y) where x is CPU and y is CUDA, we do a second
virtual dispatch to copy_from(y, x) on y's type, so that we can get
from CPUCopy.cpp to CUDACopy.cpp.
The new autogenerated code for CPU looks like this:
Tensor & CPUByteType::s_copy_(Tensor & dst, const Tensor & src, bool non_blocking) const {
  // code generated by copy_wrapper
  checked_cast_tensor<CPUByteTensor>(dst.pImpl, "dst", 0, false);
  switch (src.type().ID()) {
    case TypeID::CPUByte:
      THByteTensor_copyByte(static_cast<CPUByteTensor*>(dst.pImpl)->tensor, static_cast<CPUByteTensor*>(src.pImpl)->tensor);
      break;
    case TypeID::CPUChar:
      THByteTensor_copyChar(static_cast<CPUByteTensor*>(dst.pImpl)->tensor, static_cast<CPUCharTensor*>(src.pImpl)->tensor);
      break;
    ...
    default:
      return src.type().s_copy_from(src, dst, non_blocking);
  }
  return dst;
}
Notice that the fall-through case goes to s_copy_from. s_copy_from is like
s_copy_, but with the arguments reversed.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
bf48b95 to 578200f
* Double-dispatch copy.
* Lintfix and no-CUDA fix
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Fix compilation error.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* CR
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This commit is a TEMPORARY state of affairs; when the multiple-dispatcher is online we can get rid of all of this goo.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>