Unify THStorage and THCStorage structs.#9087
Conversation
Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com>
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
| protected: | ||
| friend struct ${Type}; | ||
| ${THStorage} *storage; | ||
| THStorage *storage; |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| { | ||
| if(storage->flag & TH_STORAGE_FREEMEM) { | ||
| storage->allocator->free(storage->allocatorContext, storage->data_ptr); | ||
| static_cast<THAllocator*>(storage->allocatorVoidPtr)->free(storage->allocatorContext, storage->data_ptr); |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
|
|
||
| THStorage* THStorage_newWithSize(at::ScalarType scalar_type, ptrdiff_t size) | ||
| { | ||
| return THStorage_newWithAllocator(scalar_type, size, &THDefaultAllocator, NULL); |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| storage->flag = TH_STORAGE_REFCOUNTED | TH_STORAGE_RESIZABLE | TH_STORAGE_FREEMEM; | ||
| storage->allocatorVoidPtr = allocator; | ||
| storage->allocatorContext = allocatorContext; | ||
| storage->device = 0; // device is not meaningful on CPU |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| @@ -142,28 +135,34 @@ THStorage* THStorage_(newWithDataAndAllocator)(real* data, ptrdiff_t size, | |||
| THAllocator* allocator, | |||
| void* allocatorContext) { | |||
| THStorage *storage = static_cast<THStorage*>(THAlloc(sizeof(THStorage))); | |||
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| THCDeviceAllocator *allocator, void *allocatorContext) { | ||
| THCStorage *storage = (THCStorage*)THAlloc(sizeof(THCStorage)); | ||
| memset(storage, 0, sizeof(THCStorage)); | ||
| storage->backend = at::kCUDA; |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| #define TH_STORAGE_FREEMEM 4 | ||
|
|
||
| typedef struct THCStorage THCStorage; | ||
| // Blah. I tried to do this with a typedef but it didn't work |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| THCStorage_free(LIBRARY_STATE ptr); | ||
| void THPPointer<THStorage>::free() { | ||
| if (ptr) { | ||
| if (ptr->backend == at::kCPU) { |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
|
|
||
| void THStorage_(resize)(THStorage *storage, ptrdiff_t size) | ||
| { | ||
| AT_ASSERT(storage->backend == at::kCPU); |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| #include <TH/THGenerateHalfType.h> | ||
|
|
||
| #ifndef USE_CUDA | ||
| // NB: When USE_CUDA, the *full* implementation of |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
|
|
||
| #include "THCGeneral.h" | ||
|
|
||
| /* Global state to be held in the cutorch table. */ |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
gchanan
left a comment
There was a problem hiding this comment.
looks great! I'd prefer to make the device some ridiculously low number on CPU and to have a single definition of the storage free if those are easy to do.
Summary: Closes pytorch#9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes pytorch#9087 Differential Revision: D8712072 fbshipit-source-id: 6f2619d0c76f71e3de4b5b73b81910bf8ba3265b
|
Due to infra shenanigans, updated PR is at #9107 |
Summary: Closes #9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes #9087 Reviewed By: orionr Differential Revision: D8712072 Pulled By: ezyang fbshipit-source-id: c6e1ea236cd1df017b42a7fffb2dbff20d50a284
Summary: Closes pytorch/pytorch#9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes pytorch/pytorch#9087 Reviewed By: orionr Differential Revision: D8712072 Pulled By: ezyang fbshipit-source-id: c6e1ea236cd1df017b42a7fffb2dbff20d50a284
Summary: Closes pytorch/pytorch#9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes pytorch/pytorch#9087 Reviewed By: orionr Differential Revision: D8712072 Pulled By: ezyang fbshipit-source-id: c6e1ea236cd1df017b42a7fffb2dbff20d50a284
Summary: Closes pytorch#9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes pytorch#9087 Reviewed By: orionr Differential Revision: D8712072 Pulled By: ezyang fbshipit-source-id: c6e1ea236cd1df017b42a7fffb2dbff20d50a284
Some details about how this was done:
For now, the allocators for CPU and CUDA are different (unifying
the allocators is a bigger change to make, I'll contribute this in
a later patch). To smooth this over, the allocator field now
stores a void* instead of THAllocator* or THCDeviceAllocator*; to
make this clear the field is renamed to allocatorVoidPtr.
Some THStorage functions which were generated per-scalar are now
generalized, and thus moved out of the generic/ library. This way
they can be called directly from a non-code-generated at::Storage
THCState is moved into a C++ header. This is actually not really
related to this particular diff, but I'll need it soon to replace
THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't
mention it in a C header file.)
THPPointer needs to be adjusted, since there is no more type refinement
between THStorage/THCStorage for it to template match over. This
is a little tricky, because I can't refer to THCStorage_free unless
we actually compile with CUDA. So there's two copies of the function
now: one for the CPU build, one for the CUDA build. If we ever split
CUDA/non-CUDA Python builds, you will have to indirect this through some
dynamic dispatch.
I want to soon replace the THCDeviceAllocator pointers in
THCState with at::Allocator, but I can't reference a C++ namespaced type
from C code, so THCState needs to move.
Signed-off-by: Edward Z. Yang ezyang@fb.com