Skip to content

[fix] OSS restore state to proper device#46

Merged
blefaudeux merged 5 commits intomasterfrom
oss_restore_device_state
Aug 20, 2020
Merged

[fix] OSS restore state to proper device#46
blefaudeux merged 5 commits intomasterfrom
oss_restore_device_state

Conversation

@blefaudeux
Copy link
Copy Markdown
Contributor

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes the state pull and push behavior to being something more easily predictable, when a state is pulled it gets moved to CPU (failsafe if this is a reasonably big model). Upon restore the param_groups were not restored to the proper device. It looks like there are still probable duplicates in that field though

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@blefaudeux blefaudeux self-assigned this Aug 20, 2020
@blefaudeux blefaudeux requested a review from msbaines August 20, 2020 01:09
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 20, 2020
@blefaudeux blefaudeux requested a review from min-xu-ai August 20, 2020 01:09
@codecov
Copy link
Copy Markdown

codecov bot commented Aug 20, 2020

Codecov Report

Merging #46 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #46   +/-   ##
=======================================
  Coverage   94.18%   94.18%           
=======================================
  Files          35       35           
  Lines        2065     2065           
=======================================
  Hits         1945     1945           
  Misses        120      120           
Flag Coverage Δ
#Python 94.18% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
fairscale/optim/oss.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9d6c7b6...7120d57. Read the comment docs.


# Restore the global param_groups
self.param_groups = state_dict["param_groups"]
self.param_groups = recursive_copy_to_device(state_dict["param_groups"], non_blocking=True, device=self._device)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line too long?

Is there a test for this already? If so, should there be some assert in the test?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no test, it's a good point, I'll add that

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

@blefaudeux blefaudeux Aug 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's supposed to be automatic with vscode+black, it's pure FB tooling and set to 120 cols, this formatting crap is so annoying.. I'll fix that in the next PR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah no ok, black is happy with that indeed, it's 121 cols but for some reason this passes (same for the init above, also 121). Anyway, all good then

Copy link
Copy Markdown
Contributor

@min-xu-ai min-xu-ai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@blefaudeux blefaudeux merged commit c2d6f4b into master Aug 20, 2020
@blefaudeux blefaudeux deleted the oss_restore_device_state branch September 2, 2020 17:43
myleott pushed a commit that referenced this pull request Feb 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants