
[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device #49711

Closed
wayi1 wants to merge 3 commits into gh/SciPioneer/41/base from gh/SciPioneer/41/head

Conversation


@wayi1 (Contributor) commented Dec 21, 2020

Stack from ghstack:

`torch.cuda.synchronize` waits on the current device by default. This change specifies the device explicitly for better readability.

Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202

Differential Revision: D25672267
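
As an illustration only (not the exact diff from this PR), the change amounts to something like the sketch below; `device` is a hypothetical stand-in for the device of the gradient tensor the communication hook is processing:

import torch

# Before: implicitly waits on whatever the current CUDA device is.
torch.cuda.synchronize()

# After: pass the device explicitly so the scope of the wait is
# obvious at the call site. Behavior is unchanged when `device` is
# the current device; the point is readability.
device = torch.device("cuda", torch.cuda.current_device())
torch.cuda.synchronize(device)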

[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device

`torch.cuda.synchronize` uses the current device by default. Explicitly specify this device for better readability.

Differential Revision: [D25672267](https://our.internmc.facebook.com/intern/diff/D25672267/)

[ghstack-poisoned]

@facebook-github-bot (Contributor) commented Dec 21, 2020

💊 CI failures summary and remediations

As of commit b288aee (more details on the Dr. CI page):


  • 4/5 failures possibly* introduced in this PR
    • 1/4 non-CircleCI failure(s)
  • 1/5 broken upstream at merge base 838d1f6 on Dec 21 from 1:01pm to 7:47pm

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (1/3)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed; fix conflicts and then commit the result.
  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at b288aee57d Update on "[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device"
+ git reset --hard b288aee57d1e940572635509a8133c59e82f82ad
HEAD is now at b288aee57d Update on "[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device"
+ git merge --allow-unrelated-histories --no-edit --no-ff 7b4a7661d6de659c8423015a2f3e93308eb83850
Auto-merging torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py
CONFLICT (content): Merge conflict in torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1

See CircleCI build pytorch_linux_xenial_cuda9_2_cudnn7_py3_gcc5_4_build (2/3)

Step: "(Optional) Merge target branch"

Automatic merge failed with the same conflict in torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py; exited with code 1.

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (3/3)

Step: "(Optional) Merge target branch"

Automatic merge failed with the same conflict in torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py; exited with code 1.


1 job timed out:

  • pytorch_linux_bionic_py3_8_gcc9_coverage_test1

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

Check out the recency history of this "viable master" tracking branch.


This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

This comment has been revised 9 times.

Update on "[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device"


`torch.cuda.synchronize` uses the current device by default. Explicitly specify this device for better readability.

Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202

Differential Revision: [D25672267](https://our.internmc.facebook.com/intern/diff/D25672267/)

[ghstack-poisoned]
Update on "[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device"


`torch.cuda.synchronize` uses the current device by default. Explicitly specify this device for better readability.

Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202

Differential Revision: [D25672267](https://our.internmc.facebook.com/intern/diff/D25672267/)

[ghstack-poisoned]
@facebook-github-bot (Contributor)

This pull request has been merged in 88c33ff.

@facebook-github-bot deleted the gh/SciPioneer/41/head branch December 26, 2020 15:18
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device (pytorch#49711)

Summary:
Pull Request resolved: pytorch#49711

`torch.cuda.synchronize` uses the current device by default. Explicitly specify this device for better readability.

Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression pytorch#47202
ghstack-source-id: 119017654

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_powerSGD_ddp_comm_hook_nccl

buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_DistributedDataParallel_powerSGD_ddp_comm_hook

Reviewed By: rohan-varma

Differential Revision: D25672267

fbshipit-source-id: 62a2266727a2ea76175f3c438daf20951091c771

Labels

cla signed · Merged · oncall: distributed


3 participants