Manually call lazyInitCUDA in structured CUDA calls #61882
ezyang wants to merge 1 commit into gh/ezyang/1048/base
If you directly call the native implementation, you bypass the initialization, which is bad! This probably slows things down a little, though... Fixes the problem uncovered by #61642. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
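To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is not PyTorch code: `FakeBackend`, `native_add`, `dispatched_add`, and the two `structured_add_*` functions are hypothetical stand-ins, and the plain `bool` guard is a single-threaded simplification of whatever a real lazy initializer would do.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a backend that needs one-time setup.
// Single-threaded sketch: a real lazy initializer would need to be thread-safe.
struct FakeBackend {
  bool initialized = false;
  int init_count = 0;  // how many times the expensive setup actually ran

  // Idempotent: the setup body runs only on the first call.
  void lazy_init() {
    if (!initialized) {
      ++init_count;
      initialized = true;
    }
  }
};

// "Native" implementation: assumes the backend is ready and never initializes it.
int native_add(const FakeBackend& backend, int a, int b) {
  if (!backend.initialized) {
    throw std::runtime_error("backend used before initialization");
  }
  return a + b;
}

// Dispatched entry point: initializes first, then forwards to the native code.
int dispatched_add(FakeBackend& backend, int a, int b) {
  backend.lazy_init();
  return native_add(backend, a, b);
}

// Wrapper that calls the native implementation directly and so skips initialization.
int structured_add_unpatched(FakeBackend& backend, int a, int b) {
  return native_add(backend, a, b);
}

// Same wrapper with the manual initialization call added in front.
int structured_add_patched(FakeBackend& backend, int a, int b) {
  backend.lazy_init();  // one extra branch per call once initialization has happened
  return native_add(backend, a, b);
}
```

In this sketch, calling `structured_add_unpatched` on a fresh `FakeBackend` throws, while `structured_add_patched` works and runs the setup exactly once no matter how many times it is called. The per-call cost of the patched version is the guard check, which is the kind of small overhead the description alludes to.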
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Do you know by how much it slows things down?
I haven't measured the performance. It's no worse than calling the fully dispatched

However, your comment got me thinking about whether initializing CUDA in structured functions really is necessary. After all, if you are given a CUDA tensor, the invariant ought to be that CUDA is already initialized. Indeed, #61642 only tickles the problem through a very particular case: what I bet is happening is that
Drat! I can't easily force the initialization in the constructor, because the constructor is in ATen/core and the context is in ATen.
bhosmer left a comment
Makes sense. Is it worth checking the slowdown before landing, or do we not really have a choice if we want to fix it?
I think the correct terminal state is to not initialize here, and fix the caffe2-to-aten constructor. I'm going to go ahead and land this for now to unblock the other PRs though.
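For readers following along, the alternative end state discussed here (establish the "CUDA is already initialized" invariant wherever a device tensor is created, so kernels never check) can be sketched roughly as follows. This is an illustration only, not the actual caffe2-to-aten constructor or any real ATen API: `FakeRuntime`, `FakeTensor`, `make_tensor`, `wrap_foreign`, and `kernel_negate` are all hypothetical names.

```cpp
// Hypothetical stand-in for a device runtime with one-time setup.
// Single-threaded sketch: a real implementation would need to be thread-safe.
struct FakeRuntime {
  bool initialized = false;
  int init_count = 0;

  void ensure_initialized() {
    if (!initialized) {
      ++init_count;
      initialized = true;
    }
  }
};

enum class Device { CPU, Accelerator };

struct FakeTensor {
  Device device;
  int value;
};

// Ordinary factory: creating a device tensor establishes the invariant.
FakeTensor make_tensor(FakeRuntime& runtime, Device device, int value) {
  if (device == Device::Accelerator) {
    runtime.ensure_initialized();
  }
  return FakeTensor{device, value};
}

// Conversion path that wraps data coming from another framework.
// It has to establish the same invariant, otherwise it is a back door
// through which an uninitialized runtime can reach the kernels.
FakeTensor wrap_foreign(FakeRuntime& runtime, Device device, int value) {
  if (device == Device::Accelerator) {
    runtime.ensure_initialized();
  }
  return FakeTensor{device, value};
}

// Kernel: performs no initialization and relies on the invariant that any
// accelerator tensor it receives was created through one of the paths above.
int kernel_negate(const FakeTensor& tensor) {
  return -tensor.value;
}
```

The trade-off in this sketch is that the check moves from every kernel call to every tensor-creation path, which only works if no creation path is missed.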
Differential Revision: D29783856