[FX] Added fuser tutorial #1356
Conversation

Deploy preview for pytorch-tutorials-preview ready! Built with commit 14a7913: https://deploy-preview-1356--pytorch-tutorials-preview.netlify.app
index.rst
Outdated
    .. Code Transformations with FX
    .. customcarditem::
       :header: Building a Convolution/Batch Norm fuser in FX
       :card_description: Build a simple FX interpreter to record the runtime of op, module, and function calls and report statistics
Are the card description and image correct? They look like they belong to other tutorials.
The description is wrong, but I'm not sure what to put for the image. @jamesr66a, is there any reason you chose this image for the FX performance profiling tutorial? https://github.com/pytorch/tutorials/pull/1319/files#diff-54a294a5d016e1a8e98bc95668ed84a99a9edd5c10394d9a2b1ee848006e98a7R223
    .. Code Transformations with FX
    .. customcarditem::
       :header: Building a Convolution/Batch Norm fuser in FX
I've seen this technique more commonly referred to as "folding" but both make sense (https://towardsdatascience.com/speed-up-inference-with-batch-normalization-folding-8a45a83a89d8, https://arxiv.org/abs/1611.09842 calls it "absorbing").
Might be nice to use different terminology in case we want to add a "first class" fusion tutorial later that e.g. directly calls into NNC.
I agree the terminology is confusing, but I think fusion is an acceptable (and more widely understood) term. If we add a fusion tutorial later I'd be glad to rename it to something to avoid name conflicts.
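For reference, whatever we call it, the transformation being discussed absorbs the BatchNorm affine parameters into the preceding conv's weight and bias. A minimal scalar sketch of that arithmetic (plain Python with illustrative names, not the tutorial's actual code, which operates on `nn.Conv2d`/`nn.BatchNorm2d` tensors per output channel):

```python
import math

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters (gamma, beta) and running stats (mean, var)
    into a preceding conv's weight w and bias b, so that
    bn(conv(x)) == conv_folded(x) for every input x."""
    scale = gamma / math.sqrt(var + eps)
    w_folded = w * scale
    b_folded = (b - mean) * scale + beta
    return w_folded, b_folded

# With identity BN statistics, folding leaves the conv unchanged:
w2, b2 = fold_bn_into_conv(w=2.0, b=0.5, gamma=1.0, beta=0.0,
                           mean=0.0, var=1.0, eps=0.0)
# w2 == 2.0, b2 == 0.5
```

In the real module version, `scale` is a per-channel vector and `w_folded` multiplies it along the conv weight's output-channel dimension; the algebra is otherwise identical.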
index.rst
Outdated
    .. Code Transformations with FX
    .. customcarditem::
       :header: Building a Convolution/Batch Norm fuser in FX
       :card_description: Build a simple FX interpreter to record the runtime of op, module, and function calls and report statistics
    # accessing the computational graph. FX resolves this problem by symbolically
    # tracing the actual operations called, so that we can track the computations
    # through the `forward` call, nested within Sequential modules, or wrapped in
    # an user-defined module.
🤔 Shouldn't it be "an" before "user", since "user" starts with a vowel?
The rule is based on the sound, not the letter: "user" starts with a consonant /j/ sound, so it takes "a". (So this should actually be "a user-defined module".)
    fused_model = fuse(model)
    print(fused_model.code)
    inp = torch.randn(5, 1, 1, 1)
Should we run this on a more realistic input shape?
I just wrote all the conv/batch norm modules to operate on a [1, 1, 1] shape. We're not measuring the performance of this module here, so I don't think the input shape matters.
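To make the point concrete: what this snippet checks is numerical equivalence, which holds for every input by construction, so the shape is irrelevant. A self-contained scalar sketch (illustrative names, not the tutorial's `fuse()`; the real check compares `nn.Module` outputs):

```python
import math

def conv(x, w, b):
    # Scalar stand-in for a convolution: an affine map.
    return w * x + b

def bn(y, gamma, beta, mean, var, eps=1e-5):
    # Scalar stand-in for batch norm in eval mode.
    return gamma * (y - mean) / math.sqrt(var + eps) + beta

def fused_conv(x, w, b, gamma, beta, mean, var, eps=1e-5):
    # Conv with the BN folded into its weight and bias.
    scale = gamma / math.sqrt(var + eps)
    return conv(x, w * scale, (b - mean) * scale + beta)

# The fused and unfused paths agree for any input value:
for x in [-3.0, 0.0, 0.25, 10.0]:
    assert abs(bn(conv(x, 2.0, 0.5), 1.5, 0.1, 0.2, 0.9)
               - fused_conv(x, 2.0, 0.5, 1.5, 0.1, 0.2, 0.9)) < 1e-9
```

Since the equivalence is algebraic, passing on a tiny input is as convincing as passing on an ImageNet-sized one.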
* Update build.sh
* Update audio tutorial for release pytorch 1.8 / torchaudio 0.8 (#1379)
* [1.8 release] Switch to the new datasets in torchtext 0.9.0 release - text classification tutorial (#1352)
* [1.8 release] Switch to LM dataset in torchtext 0.9.0 release (#1349)
* [WIP][FX] CPU Performance Profiling with FX (#1319)
* [FX] Added fuser tutorial (#1356): added fuser tutorial, updated index.rst, fixed conclusion, responded to comments
* Update numeric_suite_tutorial.py
* Tutorial combining DDP with Pipeline Parallelism to Train Transformer models (#1347)
  Summary: places a Pipe on GPUs 0 and 1 and another Pipe on GPUs 2 and 3; both pipe replicas are replicated via DDP, with one process driving GPUs 0 and 1 and another driving GPUs 2 and 3. Polished docs, added thumbnail, addressed comments.
* More updates to numeric_suite
* Update numeric_suite_tutorial.py
* Update build.sh

Co-authored-by: moto <855818+mthrok@users.noreply.github.com>
Co-authored-by: Guanheng George Zhang <6156351+zhangguanheng66@users.noreply.github.com>
Co-authored-by: Guanheng Zhang <zhangguanheng@devfair0197.h2.fair>
Co-authored-by: James Reed <jamesreed@fb.com>
Co-authored-by: Horace He <horacehe2007@yahoo.com>
Co-authored-by: Pritam Damania <9958665+pritamdamania87@users.noreply.github.com>
Co-authored-by: pritam <pritam.damania@fb.com>
Co-authored-by: Nikita Shulga <nshulga@fb.com>
Co-authored-by: Brian Johnson <brianjo@fb.com>

Not sure how to test it in notebook format.
Also, perhaps I'd like to bake in the output somehow? It would be somewhat embarrassing if the fused version was slower due to noise :)
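One way to reduce the risk of a noisy timing flipping the result: repeat each measurement several times and report the minimum, which is much less noise-sensitive than a single run. A stdlib-only sketch (hypothetical helper, not something the tutorial currently does):

```python
import timeit

def robust_ms(fn, repeats=5, number=100):
    """Time fn by running it `number` times per trial, over `repeats`
    trials, and report the best trial in milliseconds per call.
    The minimum discards scheduler hiccups and cache-cold outliers."""
    times = timeit.repeat(fn, repeat=repeats, number=number)
    return min(times) / number * 1000.0

# Usage: compare the two variants with the same harness.
baseline_ms = robust_ms(lambda: sum(range(1000)))
```

Even with this, baking a fixed measured number into the rendered notebook is risky; printing both timings side by side at build time at least guarantees they came from the same machine.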