[TestZeroRedundancyOptimizer] Add multi gpu checker #53564

Closed
jaglinux wants to merge 1 commit into pytorch:master from jaglinux:origin/test_zero_redundancy_optimizer

@jaglinux jaglinux commented Mar 8, 2021

The test test_collect_shards fails on a single-GPU setup.
This PR enables the multi-GPU checker for that test.

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
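
For readers without the diff at hand, a multi-GPU checker of this kind can be sketched as a decorator that skips a test when too few devices are visible. This is a minimal sketch under stated assumptions, not the code from this PR: `skip_if_fewer_gpus` and its `device_count` parameter are hypothetical names, and the device count is injected as a callable so the example runs without PyTorch installed. In a real test one would presumably pass `torch.cuda.device_count` instead of the lambda.

```python
import functools
import unittest


def skip_if_fewer_gpus(required, device_count):
    """Hypothetical decorator: skip the test unless enough GPUs are visible.

    `device_count` is any zero-argument callable returning the number of
    available devices (an assumption made so the sketch needs no PyTorch).
    """
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            available = device_count()
            if available < required:
                # unittest records SkipTest as a skip, not as a failure.
                raise unittest.SkipTest(
                    f"requires {required} GPUs, found {available}"
                )
            return test_fn(*args, **kwargs)
        return wrapper
    return decorator


class ShardTest(unittest.TestCase):
    # The lambda simulates a single-GPU host.
    @skip_if_fewer_gpus(2, device_count=lambda: 1)
    def test_collect_shards(self):
        self.fail("should not run on a single GPU")


# Run the one test and collect the outcome without exiting the interpreter.
result = unittest.TestResult()
ShardTest("test_collect_shards").run(result)
```

With the simulated single-GPU host, the test body never executes and the run is recorded in `result.skipped` rather than `result.failures`.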

@facebook-github-bot added the "cla signed" and "oncall: distributed" labels Mar 8, 2021

facebook-github-bot commented Mar 8, 2021

💊 CI failures summary and remediations

As of commit 4b0df0b (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-scanned failure(s)

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.


jaglinux commented Mar 8, 2021

cc @jeffdaily @arindamroy-eng


codecov Bot commented Mar 9, 2021

Codecov Report

Merging #53564 (4b0df0b) into master (067ad31) will decrease coverage by 0.42%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #53564      +/-   ##
==========================================
- Coverage   77.64%   77.22%   -0.43%     
==========================================
  Files        1869     1869              
  Lines      182321   182321              
==========================================
- Hits       141570   140793     -777     
- Misses      40751    41528     +777     


@rohan-varma rohan-varma left a comment

Thanks! Rocm failure is test_reference_numerics_normal_tanh_cuda_complex64, looks unrelated


@facebook-github-bot facebook-github-bot left a comment

@rohan-varma has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@jeffdaily
Collaborator

> Thanks! Rocm failure is test_reference_numerics_normal_tanh_cuda_complex64, looks unrelated

Yes, test is unrelated. A different PR made ROCm changes and was merged before passing CI. It has since been reverted. But I'd rather not go through a rebase and prolong this current PR landing.

@rohan-varma
Contributor

@jeffdaily sounds good, I'm landing the PR shortly.

@facebook-github-bot
Contributor

@rohan-varma merged this pull request in 2cf9098.

xsacha pushed a commit to xsacha/pytorch that referenced this pull request Mar 31, 2021
Summary:
The test test_collect_shards fails on single GPU setup.
Enabling the multi gpu checker.

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

Pull Request resolved: pytorch#53564

Reviewed By: H-Huang

Differential Revision: D26952325

Pulled By: rohan-varma

fbshipit-source-id: e8956f9277c7320024bece129767e83fbdf02b2c

malfet pushed a commit that referenced this pull request Mar 4, 2022
…zer.test_collect_shards` (#72923)

* [TestZeroRedundancyOptimizer] Add multi gpu checker (#53564)

Summary:
The test test_collect_shards fails on single GPU setup.
Enabling the multi gpu checker.

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

Pull Request resolved: #53564

Reviewed By: H-Huang

Differential Revision: D26952325

Pulled By: rohan-varma

fbshipit-source-id: e8956f9277c7320024bece129767e83fbdf02b2c

* fix skip_if_not_multigpu

Co-authored-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Labels: cla signed, Merged, oncall: distributed, open source
