Skip to content

Make some basic THC operations thread-safe#519

Merged
soumith merged 1 commit intotorch:masterfrom
colesbury:thread_local
Sep 30, 2016
Merged

Make some basic THC operations thread-safe#519
soumith merged 1 commit intotorch:masterfrom
colesbury:thread_local

Conversation

@colesbury
Copy link
Contributor

Switching the device, setting the stream, and switching BLAS handles is
now thread-safe. Some other operations, like reserveStreams, are still
not thread-safe.

Switching the device, setting the stream, and switching BLAS handles is
now thread-safe. Some other operations, like reserveStreams, are still
not thread-safe.
colesbury added a commit to colesbury/cunn that referenced this pull request Sep 29, 2016
Copy link
Contributor

@apaszke apaszke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general looks good to me 👍


THCState* state = (THCState*)malloc(sizeof(THCState));
memset(state, 0, sizeof(THCState));
THCState* state = THCState_alloc();
Copy link
Contributor

@apaszke apaszke Sep 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THCState_new would be more consistent with existing APIs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's alloc because there's a separate initialization function. The _new functions initialize their data structure.


/* Allocator using cudaMallocHost. */
THAllocator* cudaHostAllocator;
THCDeviceAllocator cudaDeviceAllocator;
Copy link
Contributor

@apaszke apaszke Sep 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the device allocator a member, while the host allocator is referenced via a pointer? Wouldn't it be more consistent to have both of them point to static structs with function pointers (like in TH)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THCDeviceAllocator can't be a pointer to a static struct because it has a state variable. Each THCState has a different THCDeviceAllocator.state.

state->rngState = (THCRNGState*)malloc(sizeof(THCRNGState));
THCRandom_init(state, numDevices, device);

state->cudaHostAllocator = (THAllocator*)malloc(sizeof(THAllocator));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a static struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you would be calling THCAllocator_init() on a static (global) variable. All of THC is per-THCState

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(you could make it a static struct that lives in THCAllocator.c, but this is all unrelated changes to THC. The short answer is that this is how it was before this PR)

#endif
}

void* THCThreadLocal_get(THCThreadLocal local)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an internal class, and we're using C++, so can we just use the templates to determine the held type. I understand that we might save on one or two malloc/free calls, but I'm not sure if it's worth complicating the code. Let's just store an int* pointing to a single value and return it a properly typed pointer/reference from get. If we really want to avoid the malloc, we can just specialize this class for small types (and hide all intptr_t casts inside this class).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not using C++ here or in THCGeneral.c. Adding templates doesn't seem like it'll reduce code complication, especially since you can't really specialize the POSIX or Windows thread-local API. They both take void*, not arbitrary types.

} THCState;
typedef struct THCState THCState;

THC_API THCState* THCState_alloc();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new?

#ifdef _WIN32
typedef DWORD THCThreadLocal;
#else
#include <pthread.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to add something like this to CMakeLists.txt if using pthread.

@soumith soumith merged commit 9032205 into torch:master Sep 30, 2016
soumith added a commit to torch/cunn that referenced this pull request Sep 30, 2016
colesbury added a commit to colesbury/pytorch-old that referenced this pull request Sep 30, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants