Move stream to thread local #13080
Conversation
Summary: This is the first step to untangle this logic:
- moves the stream id to a thread local mechanically
- relies on the fact that the thread-local value is valid in conjunction with a CUDAContext only until the next SwitchToDevice call
- we should move to proper RAII in the following diffs

Follow-up diffs are going to move more state out of CUDAContext (by making gpu_id thread local too) and simplify CopyFrom.

The only expected change in behavior: CopyFrom used to copy on stream logical id 0 if the context was created on the fly; now it copies on the current stream. Since it blocks explicitly either way, I don't think it matters much.

Also, observers were semi-broken by waiting on a potentially wrong stream. That can be fixed later; I renamed the method to avoid abuse.

Differential Revision: D10525134
fbshipit-source-id: 7810c458cc6beef94f255fe11a044252cc38f4f9
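To make the mechanics concrete, here is a minimal sketch of the pattern this diff introduces. This is not the actual Caffe2 source; the name `current_stream_id_tls`, the member layout, and the constructor signature are assumptions:

```cpp
#include <cuda_runtime.h>

namespace caffe2 {

// The logical stream id now lives in a thread_local rather than in a
// CUDAContext member, so it is per-thread state shared by all contexts
// on that thread.
static thread_local int current_stream_id_tls = 0;

class CUDAContext {
 public:
  explicit CUDAContext(int gpu_id = 0) : gpu_id_(gpu_id) {}

  // SwitchToDevice is the only writer: as the summary notes, the
  // thread_local value is only meaningful in conjunction with this
  // context until the next SwitchToDevice call (hence the plan to
  // move to proper RAII in follow-up diffs).
  void SwitchToDevice(int stream_id) {
    current_stream_id_tls = stream_id;
    cudaSetDevice(gpu_id_);
  }

  // Readers resolve the stream id through the thread_local on each call.
  int stream_id() const { return current_stream_id_tls; }

 private:
  int gpu_id_;
};

} // namespace caffe2
```

The design tradeoff this sketch exposes is exactly the one the summary flags: per-thread state is cheap and mechanical to introduce, but it silently couples every context on the thread, which is what proper RAII scoping would later fix.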
ezyang left a comment:
Thank you, this is very helpful.
This diff broke the Caffe2 ROCm tests: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/629//console. It might just be a failure to apply this code change to the HIP code as well.
@ezyang Yep, the context_gpu code change needs to be applied to context_hip as well (the test is automatically hipified, but the context_* source code is not). I'm working on hipifying caffe2/core in #13148; I just rebased my PR to include this change and triggered a test run here: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-trigger-test/3525/. Let's see if the failure goes away. New run with #gpu checks: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-trigger-test/3609/
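For illustration, a hedged sketch of what mirroring the change into context_hip could look like (identifiers assumed, not the actual context_hip source; the point is that this file is maintained by hand, so the automatic hipification of the tests alone cannot pick up the change):

```cpp
#include <hip/hip_runtime.h>

namespace caffe2 {

// Same pattern as in context_gpu: the logical stream id becomes
// per-thread state instead of a HIPContext member.
static thread_local int hip_stream_id_tls = 0;

class HIPContext {
 public:
  explicit HIPContext(int gpu_id = 0) : gpu_id_(gpu_id) {}

  void SwitchToDevice(int stream_id) {
    hip_stream_id_tls = stream_id;
    hipSetDevice(gpu_id_);  // HIP analogue of cudaSetDevice
  }

  int stream_id() const { return hip_stream_id_tls; }

 private:
  int gpu_id_;
};

} // namespace caffe2
```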
Two review comments on the following lines of the test diff were marked as off-topic; the excerpt they referenced:

    auto before_stream = context_outer.cuda_stream();
    ...
    // try to mess up current device
    CUDAContext context_different_device(1);
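For context, a hypothetical reconstruction of the kind of check these lines belong to (gtest-style shape assumed; this is not the actual test from the diff): the stream recorded through one context should survive another context being constructed on a different device:

```cpp
#include <gtest/gtest.h>
#include "caffe2/core/context_gpu.h"

TEST(CUDAContextTest, StreamSurvivesOtherDeviceContext) {
  caffe2::CUDAContext context_outer(0);
  context_outer.SwitchToDevice(1);  // pick a non-default logical stream

  auto before_stream = context_outer.cuda_stream();

  // try to mess up current device
  caffe2::CUDAContext context_different_device(1);

  // With the stream id held in a thread_local, reading it back through
  // the outer context should still yield the stream recorded above.
  EXPECT_EQ(before_stream, context_outer.cuda_stream());
}
```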