[refactor] AsyncPipe: do not sub-class MultiProcessPipe by msbaines · Pull Request #370 · facebookresearch/fairscale

msbaines · 2021-02-06T00:10:01Z

Before submitting

Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
Did you read the contributor guideline?
Did you make sure to update the docs?
Did you write any new necessary tests?

What does this PR do?

Fixes # (issue).

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

msbaines · 2021-02-08T18:41:49Z

ping

anj-s · 2021-02-08T19:03:37Z

fairscale/nn/pipe/async_pipeline.py



-class AsyncPipeline(MultiProcessPipeline):
+class AsyncPipeline:


Why do we not need to subclass this anymore?

msbaines · 2021-02-08

I want to break the cyclic dependency between multiprocess_pipeline and async_schedule. The class hierarchy is a bit of a mess currently. I also don't like that the run function of *Pipeline takes a different argument type in the two classes.

Removing the inheritance should enable simplification of MultiProcessPipe, which makes it easier to understand what is really going on. I may add the inheritance back later, or add a common interface base class, if it makes sense, but right now it just gets in the way.
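For illustration only, here is a minimal sketch of the "common interface base class" option mentioned above, as opposed to having one pipeline subclass the other. All names (`PipelineBase`, `MultiProcessPipelineSketch`, `AsyncPipelineSketch`) and the integer "batches" are hypothetical stand-ins and are not fairscale's actual classes or signatures.

```python
from abc import ABC, abstractmethod
from typing import List


class PipelineBase(ABC):
    """Hypothetical shared interface; not part of fairscale."""

    @abstractmethod
    def run(self, batches: List[int]) -> List[int]:
        """Process a list of micro-batches and return the results."""


class MultiProcessPipelineSketch(PipelineBase):
    """Stand-in for a multi-process pipeline (toy behaviour: double each item)."""

    def run(self, batches: List[int]) -> List[int]:
        return [b * 2 for b in batches]


class AsyncPipelineSketch(PipelineBase):
    """Stand-in for an async pipeline.

    It implements the shared interface directly instead of inheriting
    from MultiProcessPipelineSketch, so the two modules need not import
    each other.
    """

    def run(self, batches: List[int]) -> List[int]:
        return [b + 1 for b in batches]


# Both classes accept the same argument type in run(), and neither
# depends on the other.
pipelines = [MultiProcessPipelineSketch(), AsyncPipelineSketch()]
results = [p.run([1, 2, 3]) for p in pipelines]
```

With this shape, each concrete class can live in its own module and depend only on the base class, which is one way a cyclic import between the two could be avoided.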

anj-s · 2021-02-08T19:04:55Z

fairscale/nn/pipe/async_pipe.py

+        model = Pipe(model, balance=[1, 1, 1, 1], chunks=8)
+        output = model(input)
+
+    .. _Pipe: https://arxiv.org/abs/1811.06965


It would be good for this docstring to mention what is different between MultiProcessPipe and AsyncPipe. I don't know if this docstring is relevant anymore.

anj-s · 2021-02-08T19:06:15Z

fairscale/nn/pipe/async_pipe.py

+            yield from partition.module
+
+    def forward(self, input: TensorOrTensors, *, event=None) -> TensorOrTensors:  # type: ignore
+        """:class:`MultiProcessPipe` is a fairly transparent module wrapper. It doesn't


change to Async?

sidgoyal78

Looks good. Thanks for the PR.

* [chore] Fix lint errors that broke master (#348)
  authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
* [fix] ShardedDDP - cpu testfix - remove Gloo/CPU (#350)
  * no idea about the root issue, but it proved to be fairly narrowed (gloo+cpu+python3.8+no cuda installed) so I guess that's out of scope for fairscale
* [feat][OSS] elastic and pytorch compatible checkpoints (#310)
  * adding a test to prove the inter operability with upstream pytorch
  * updating the changelog
  * eager state pruning
  * pytorch 1.5 compat
* [fix] ShardedDDP - properly handle post device change (#353)
  * adding the .to(device) support + unit testing
  * doc update
* [feat] Add AdaScaleWrapper (#347)
  * [feat] Add AdaScaleWrapper
    - This enables a different API for wrapping an optimizer with AdaScale.
    - This also enables AdaScale to be wrapped by OSS.
    - However, OSS wrapping AdaScale results in different optimization, which future research will be needed to study its effects.
    testing: add unit tests.
  * addressed comment: typo
* [refactor] Refactor and enable multiprocess nn.Pipe benchmarks. (#319)
  * mp cleanup
  * round of multiprocess refactoring
  * test golden run
  * print cuda stats
  * fix lint errors
  * enable multiprocess pipe benchmarks
  * set world size to be available gpus
  * more changes
  * use synthetic loaders for intermediate pipeline stages
  * merged master
  * fix for the devices property
  * dataloader fix
  * modify rank check
  * print wps stats
  * enable verification
  * fix logging
  * fix flag name
  * fix flag name
  * check for rank
  * fix indent
  * pass args
  * pass args
  * modify golden data
  * remove unused print messsage
  * fix lint errors
  * add comments
  * fix benchmarks
  Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
* [refactor] pipe: simplify balance and module checks (#346)
* [chore] v0.1.5 (#355)
* [chore] disheartening switch off of a OSS cpu test (#356)
  * precise skip, only if agent has only cpu
* [feat][minor] OSS Benchmark - regression test + background testing new optims (#352)
  * restoring the regression test, adding a test of the for_each optims
  * fix the regression test on circleci
  * removing unused flags
* [refactor] multiprocess_pipe: cleanup __init__ (#357)
* [refactor] multiprocess_pipe: remove retain_graph __init__ param (#358)
  It is not currently being used so we can simplify the interface by removing it.
* [refactor] multiprocess_pipe: focus on LazyModule usage (#360)
* [feat] ShardedDDP : Adding a proper DDP parity / AMP unit test, overdue (#361)
  * Adding a proper ddp parity / AMP unit test, overdue
  * catch non-AMP pytorch
* [perf][OSS] Clip grad norm : minor obvious speedup (#363)
  cache this iterator, easy speed up
* [refactor] multiprocess_pipe: remove pipelined_backward (#362)
* [perf] ShardedDDP - small memory use reduction - minor speedup (#366)
  * minor
  * minor
* [fix] repro+fix (#365)
  fix a broken earlier commit, only worked for the first step
* [refactor] OSS only use flat buffers (#371)
  * flat params all along, way simpler
  * updating the docstring
* [refactor] AsyncPipe: do not sub-class MultiProcessPipe (#370)
* [refactor] remove multiprocess dependency on async (#373)
* [fix] Workaround need for pip --no-build-isolation (#375)
* Add fairscale.nn.misc.checkpoint_activations (#376)
  * Add fairscale.utils.containers
    Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
  * Add fairscale.nn.misc.checkpoint_activations
    Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
  Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* [chore] v0.1.6 (#377)
  * v0.1.6

Co-authored-by: anj-s <32556631+anj-s@users.noreply.github.com>
Co-authored-by: Benjamin Lefaudeux <blefaudeux@users.noreply.github.com>
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
Co-authored-by: msbaines <35972327+msbaines@users.noreply.github.com>
Co-authored-by: Leonard Lausen <leonard@lausen.nl>
Co-authored-by: Myle Ott <myleott@fb.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

facebook-github-bot added the CLA Signed label Feb 6, 2021

msbaines force-pushed the mppipe branch from b585cdb to 37e3793 February 6, 2021 00:24

[refactor] AsyncPipe{,line}: do not sub-class MultiProcessPipe{,line}

1027ab8

msbaines force-pushed the mppipe branch from 37e3793 to 1027ab8 February 6, 2021 00:30

msbaines requested review from anj-s, min-xu-ai and sidgoyal78 and removed request for min-xu-ai February 6, 2021 00:44

msbaines marked this pull request as ready for review February 6, 2021 00:45

anj-s reviewed Feb 8, 2021

anj-s approved these changes Feb 8, 2021

Address @anj-s feedback

946734d

sidgoyal78 approved these changes Feb 8, 2021

msbaines merged commit 08c1099 into master Feb 8, 2021

msbaines deleted the mppipe branch February 8, 2021 20:07

msbaines merged 2 commits into master from mppipe


5 participants


