
Run cached graph api should grab device lock first#4195

Merged
JackCaoG merged 1 commit into master from jackcao/run_cached_graph_with_lock
Nov 15, 2022

Conversation


@JackCaoG commented Nov 12, 2022

We need to grab the device lock before executing the computation in _run_cached_graph. This prevents more than one graph from being executed at the same time.

In the current implementation we can only grab the lock within tensor.cpp, hence this change.

FYI @shunting314 @wconstab
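The "grab the device lock first" pattern described above can be sketched as follows. This is a minimal illustration assuming a per-device mutex registry; DeviceLockRegistry and RunCachedGraph are hypothetical names for this sketch, not the actual torch_xla API.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical sketch: one mutex per device, locked before a cached
// computation runs, so at most one graph executes on a given device
// at a time.
class DeviceLockRegistry {
 public:
  std::unique_lock<std::mutex> Lock(const std::string& device) {
    std::mutex* device_mutex;
    {
      // Guard the map itself while looking up (or creating) the entry.
      std::lock_guard<std::mutex> guard(map_mutex_);
      device_mutex = &locks_[device];
    }
    // Block here until no other graph is running on this device.
    return std::unique_lock<std::mutex>(*device_mutex);
  }

 private:
  std::mutex map_mutex_;
  std::map<std::string, std::mutex> locks_;
};

int RunCachedGraph(DeviceLockRegistry& registry, const std::string& device,
                   const std::vector<int>& arguments) {
  // Grab the device lock before executing; it is released when `lock`
  // goes out of scope, even if execution throws.
  std::unique_lock<std::mutex> lock = registry.Lock(device);
  int result = 0;
  for (int arg : arguments) result += arg;  // stand-in for real execution
  return result;
}
```

Note that std::map is used (rather than an unordered container) because its nodes are stable, so the contained std::mutex objects never need to move.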


@alanwaketan left a comment


LGTM.

Comment thread on torch_xla/csrc/tensor.cpp
torch::lazy::ComputationPtr computation,
c10::ArrayRef<torch::lazy::BackendDataPtr> arguments,
const torch::lazy::BackendDevice& device) {
std::vector<xla::util::ExceptionCleanup> unlocker;
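The `unlocker` vector above suggests an RAII cleanup pattern: each element runs a cleanup action when destroyed, so every lock it holds is released when the vector goes out of scope, even if the computation throws. A minimal sketch of such a helper, under that assumption (ScopedCleanup is a hypothetical stand-in, not the real xla::util::ExceptionCleanup):

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical scope-exit helper: runs its cleanup function on
// destruction. Collecting one per acquired lock in a vector releases
// all the locks when the vector is destroyed, exception or not.
class ScopedCleanup {
 public:
  explicit ScopedCleanup(std::function<void()> cleanup)
      : cleanup_(std::move(cleanup)) {}

  // Transfer ownership of the cleanup so it runs exactly once.
  ScopedCleanup(ScopedCleanup&& other) noexcept
      : cleanup_(std::move(other.cleanup_)) {
    other.cleanup_ = nullptr;
  }

  ~ScopedCleanup() {
    if (cleanup_) cleanup_();
  }

 private:
  std::function<void()> cleanup_;
};
```

With this shape, the execute path can acquire locks, push a cleanup per lock into the vector, run the computation, and rely on destruction order to release everything.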

One more thing to account for in the LTC migration. haha.

@JackCaoG JackCaoG merged commit 08a9c5d into master Nov 15, 2022