Unify THStorage and THCStorage structs. by ezyang · Pull Request #9087 · pytorch/pytorch

ezyang · 2018-07-02T01:41:57Z

Some details about how this was done:

For now, the allocators for CPU and CUDA are different (unifying
the allocators is a bigger change to make, I'll contribute this in
a later patch). To smooth this over, the allocator field now
stores a void* instead of THAllocator* or THCDeviceAllocator*; to
make this clear the field is renamed to allocatorVoidPtr.
Some THStorage functions which were generated per-scalar are now
generalized, and thus moved out of the generic/ library. This way
they can be called directly from a non-code-generated at::Storage
THCState is moved into a C++ header. This is actually not really
related to this particular diff, but I'll need it soon to replace
THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't
mention it in a C header file.)
THPPointer needs to be adjusted, since there is no more type refinement
between THStorage/THCStorage for it to template match over. This
is a little tricky, because I can't refer to THCStorage_free unless
we actually compile with CUDA. So there's two copies of the function
now: one for the CPU build, one for the CUDA build. If we ever split
CUDA/non-CUDA Python builds, you will have to indirect this through some
dynamic dispatch.

I want to soon replace the THCDeviceAllocator pointers in
THCState with at::Allocator, but I can't reference a C++ namespaced type
from C code, so THCState needs to move.

Signed-off-by: Edward Z. Yang ezyang@fb.com

Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com>

facebook-github-bot

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

aten/src/ATen/templates/StorageDerived.h

 protected:
  friend struct ${Type};
-  ${THStorage} *storage;
+  THStorage *storage;


aten/src/TH/THStorage.cpp

    {
      if(storage->flag & TH_STORAGE_FREEMEM) {
-        storage->allocator->free(storage->allocatorContext, storage->data_ptr);
+        static_cast<THAllocator*>(storage->allocatorVoidPtr)->free(storage->allocatorContext, storage->data_ptr);


aten/src/TH/THStorage.cpp

+
+THStorage* THStorage_newWithSize(at::ScalarType scalar_type, ptrdiff_t size)
+{
+  return THStorage_newWithAllocator(scalar_type, size, &THDefaultAllocator, NULL);


aten/src/TH/THStorage.cpp

+  storage->flag = TH_STORAGE_REFCOUNTED | TH_STORAGE_RESIZABLE | TH_STORAGE_FREEMEM;
+  storage->allocatorVoidPtr = allocator;
+  storage->allocatorContext = allocatorContext;
+  storage->device = 0;  // device is not meaningful on CPU


aten/src/TH/generic/THStorage.cpp

@@ -142,28 +135,34 @@ THStorage* THStorage_(newWithDataAndAllocator)(real* data, ptrdiff_t size,
                                               THAllocator* allocator,
                                               void* allocatorContext) {
  THStorage *storage = static_cast<THStorage*>(THAlloc(sizeof(THStorage)));


aten/src/THC/generic/THCStorage.cpp

  THCDeviceAllocator *allocator, void *allocatorContext) {
  THCStorage *storage = (THCStorage*)THAlloc(sizeof(THCStorage));
  memset(storage, 0, sizeof(THCStorage));
+  storage->backend = at::kCUDA;


aten/src/THC/generic/THCStorage.h

 #define TH_STORAGE_FREEMEM    4

-typedef struct THCStorage THCStorage;
+// Blah.  I tried to do this with a typedef but it didn't work


torch/csrc/cuda/Storage.cpp

-    THCStorage_free(LIBRARY_STATE ptr);
+void THPPointer<THStorage>::free() {
+  if (ptr) {
+    if (ptr->backend == at::kCPU) {


aten/src/TH/generic/THStorage.cpp


 void THStorage_(resize)(THStorage *storage, ptrdiff_t size)
 {
+  AT_ASSERT(storage->backend == at::kCPU);


torch/csrc/Storage.cpp

 #include <TH/THGenerateHalfType.h>

+#ifndef USE_CUDA
+// NB: When USE_CUDA, the *full* implementation of


aten/src/THC/THCGeneral.hpp

+
+#include "THCGeneral.h"
+
+/* Global state to be held in the cutorch table. */


gchanan

looks great! I'd prefer to make the device some ridiculously low number on CPU and to have a single definition of the storage free if those are easy to do.

Summary: Closes pytorch#9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes pytorch#9087 Differential Revision: D8712072 fbshipit-source-id: 6f2619d0c76f71e3de4b5b73b81910bf8ba3265b

ezyang · 2018-07-02T23:41:58Z

Due to infra shenanigans, updated PR is at #9107

Summary: Closes #9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes #9087 Reviewed By: orionr Differential Revision: D8712072 Pulled By: ezyang fbshipit-source-id: c6e1ea236cd1df017b42a7fffb2dbff20d50a284

Summary: Closes pytorch/pytorch#9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes pytorch/pytorch#9087 Reviewed By: orionr Differential Revision: D8712072 Pulled By: ezyang fbshipit-source-id: c6e1ea236cd1df017b42a7fffb2dbff20d50a284

Summary: Closes pytorch#9107 Some details about how this was done: - For now, the allocators for CPU and CUDA are different (unifying the allocators is a bigger change to make, I'll contribute this in a later patch). To smooth this over, the allocator field now stores a void* instead of THAllocator* or THCDeviceAllocator*; to make this clear the field is renamed to allocatorVoidPtr. - Some THStorage functions which were generated per-scalar are now generalized, and thus moved out of the generic/ library. This way they can be called directly from a non-code-generated at::Storage - THCState is moved into a C++ header. This is actually not really related to this particular diff, but I'll need it soon to replace THAllocator/THCDeviceAllocator with at::Allocator (C++, so I can't mention it in a C header file.) - THPPointer needs to be adjusted, since there is no more type refinement between THStorage/THCStorage for it to template match over. This is a little tricky, because I can't refer to THCStorage_free unless we actually compile with CUDA. So there's two copies of the function now: one for the CPU build, one for the CUDA build. If we ever split CUDA/non-CUDA Python builds, you will have to indirect this through some dynamic dispatch. I want to soon replace the THCDeviceAllocator pointers in THCState with at::Allocator, but I can't reference a C++ namespaced type from C code, so THCState needs to move. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Closes pytorch#9087 Reviewed By: orionr Differential Revision: D8712072 Pulled By: ezyang fbshipit-source-id: c6e1ea236cd1df017b42a7fffb2dbff20d50a284

ezyang requested a review from gchanan July 2, 2018 01:41

ezyang requested review from apaszke, colesbury, soumith and zdevito as code owners July 2, 2018 01:41

facebook-github-bot reviewed Jul 2, 2018

View reviewed changes

ezyang mentioned this pull request Jul 2, 2018

Unify THAllocator and THCDeviceAllocator with at::Allocator #9089

Closed

gchanan reviewed Jul 2, 2018

View reviewed changes

apaszke reviewed Jul 2, 2018

View reviewed changes

aten/src/THC/THCGeneral.hpp

#include "THCGeneral.h"

/* Global state to be held in the cutorch table. */

This comment was marked as off-topic.

Sign in to view

gchanan approved these changes Jul 2, 2018

View reviewed changes

ezyang mentioned this pull request Jul 2, 2018

Unify THStorage and THCStorage structs. (#9087) #9107

Closed

ezyang closed this Jul 2, 2018


		#include "THCGeneral.h"

		/* Global state to be held in the cutorch table. */

Conversation

ezyang commented Jul 2, 2018

Uh oh!

facebook-github-bot left a comment

Choose a reason for hiding this comment

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

This comment was marked as off-topic.

Uh oh!

gchanan left a comment

Choose a reason for hiding this comment

Uh oh!

ezyang commented Jul 2, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants