[feat] Simple macro OSS benchmark by blefaudeux · Pull Request #47 · facebookresearch/fairscale

blefaudeux · 2020-08-20T18:41:03Z

Before submitting

Was this discussed/approved via a GitHub issue? (no need for typos, doc improvements)
Did you read the contributor guideline?
[ ] Did you make sure to update the docs?
Did you write any new necessary tests?

What does this PR do?

Adds some numbers behind #42, for a workload relevant for computer vision.

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Yes that was pretty fun 🙃

blefaudeux · 2020-08-20T20:22:04Z

Example output:

```
Benchmark OSS
[0] : Epoch 0 - processed 104.37 img per sec
[0] : Epoch 1 - processed 106.04 img per sec
[0] : Epoch 2 - processed 104.27 img per sec
[0] : Epoch 3 - processed 105.76 img per sec
[0] : Epoch 4 - processed 107.96 img per sec
[0] : Epoch 5 - processed 107.64 img per sec
[0] : Epoch 6 - processed 106.97 img per sec
[0] : Epoch 7 - processed 107.19 img per sec
[0] : Epoch 8 - processed 106.15 img per sec
[0] : Epoch 9 - processed 105.56 img per sec
[1] Peak memory: 8252.5MiB
[0] : Training done. 105.76 img per sec overall
[0] Peak memory: 8247.7MiB
Benchmark vanilla SGD
[0] : Epoch 0 - processed 88.46 img per sec
[0] : Epoch 1 - processed 87.95 img per sec
[0] : Epoch 2 - processed 87.04 img per sec
[0] : Epoch 3 - processed 89.04 img per sec
[0] : Epoch 4 - processed 88.58 img per sec
[0] : Epoch 5 - processed 89.56 img per sec
[0] : Epoch 6 - processed 89.06 img per sec
[0] : Epoch 7 - processed 89.05 img per sec
[0] : Epoch 8 - processed 89.05 img per sec
[0] : Epoch 9 - processed 89.00 img per sec
[1] Peak memory: 8335.7MiB
[0] : Training done. 88.60 img per sec overall
```

codecov · 2020-08-20T21:29:01Z

Codecov Report

Merging #47 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master      #47   +/-   ##
=======================================
  Coverage   94.18%   94.18%           
=======================================
  Files          35       35           
  Lines        2065     2065           
=======================================
  Hits         1945     1945           
  Misses        120      120

Flag	Coverage Δ
#Python	`94.18% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
fairscale/optim/adam.py	`93.04% <ø> (ø)`
fairscale/optim/oss.py	`100.00% <ø> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c2d6f4b...a20fa51. Read the comment docs.

blefaudeux · 2020-08-20T23:46:14Z

ping @min-xu-ai or @msbaines, this should be simple enough (no change to the core lib, just an outsider benchmark)

msbaines · 2020-08-21T16:45:31Z

This looks great. Would you mind adding an assertion at the end so that we can use this as a regression test? To create the assertion, run the test 5 times (you can do this directly from CircleCI), then calculate the mean and standard deviation. Then set the threshold to mean - 3*standard deviation.

Or you could just use the mean/stdev for the 10 epochs and set the threshold that way.
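
The thresholding suggested above can be sketched roughly as follows. This is only an illustration, not the code that was merged: the function names `regression_threshold` and `check_throughput` are hypothetical, the sample values are the per-epoch OSS numbers from the example output earlier in this thread, and the sample standard deviation is assumed (the comment does not say which variant is meant).

```python
import statistics

# Per-epoch throughput (img/s) copied from the OSS run in the example output.
oss_epoch_throughput = [
    104.37, 106.04, 104.27, 105.76, 107.96,
    107.64, 106.97, 107.19, 106.15, 105.56,
]


def regression_threshold(samples, num_std=3.0):
    """Return mean - num_std * sample standard deviation (hypothetical helper)."""
    return statistics.mean(samples) - num_std * statistics.stdev(samples)


def check_throughput(measured, threshold):
    """Fail loudly when the measured throughput falls below the threshold."""
    assert measured > threshold, (
        f"Throughput regression: {measured:.2f} img/s is below {threshold:.2f} img/s"
    )


threshold = regression_threshold(oss_epoch_throughput)
# The overall OSS figure reported in the example output should pass the check.
check_throughput(105.76, threshold)
```

Hardware differences between machines (for instance a local workstation versus the CI rigs) would shift these numbers, so a threshold derived this way is only meaningful on the machine type it was measured on.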

min-xu-ai

LGTM, and I like Mandeep's suggestions.

blefaudeux added 9 commits August 19, 2020 20:49

initial commit, dummy training loop, pure pytorch but not DDP

4ed074b

probably slightly broken, but rough DDP benchmark run

a167289

adding the torchvision requirement for testing

20b981d

brainfart

8a2377c

reduce the loss, do something slightly distributed

41dcf69

Some cleanup, distributing the training on two GPUs

b212dee

Merge remote-tracking branch 'upstream/master' into oss_benchmark

b149113

some cleanup + adding a vanilla run, still not good to go

b5cacbd

less silly defaults, gtg for a start I think

928791e

blefaudeux self-assigned this Aug 20, 2020

blefaudeux requested review from min-xu-ai and msbaines August 20, 2020 18:41

facebook-github-bot added the CLA Signed label Aug 20, 2020

smaller batch to fit the smaller gpus used in the circleci rigs

e6a4756

blefaudeux mentioned this pull request Aug 20, 2020

[feat] batch broadcast requests into a configurable buffer #43

Closed

4 tasks

blefaudeux linked an issue Aug 20, 2020 that may be closed by this pull request

Faster OSS #42

Closed

msbaines reviewed Aug 21, 2020

min-xu-ai approved these changes Aug 21, 2020

blefaudeux and others added 3 commits August 21, 2020 15:03

Adding some options for the benchmark, and regression testing

0e64306

[test] set torch seed for Adam tests (#49)

bfbca2e

Set the torch seed for tests. xfail mixed precision and memory-efficient mixed-precision state_dict tests due to their states being cast to FP16 and back to FP32 during load_state_dict. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>

linting, I really need to automate this isort insanity

a20fa51

blefaudeux merged commit 46c3776 into master Aug 21, 2020

blefaudeux deleted the oss_benchmark branch August 21, 2020 22:21

blefaudeux merged 13 commits into master from oss_benchmark