Add flops profiler tutorial #682

cli99 · 2021-01-20T09:22:31Z

Added the flops profiler tutorial, configuration, and feature to the website. Also fixed names of some flops profiler function parameters.

samyam · 2021-02-03T20:30:45Z

deepspeed/profiling/flops_profiler/README.md

+        )
+        (transformer): ParallelTransformer(
+          12.61 M, 32.43% Params, 103.62 GMACs, 100.00% MACs, 4.4 ms, 13.22% time, 4.7e+01 TFLOPS,
+          (layers): ModuleList(


Why are the time percentage and TFLOPS 0 for ModuleList?
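
For reference, the throughput figure on the ParallelTransformer line in the excerpt above can be cross-checked from the MAC count and the latency printed next to it. This is a minimal sketch, assuming the common convention that one multiply-accumulate counts as two floating-point operations; the helper name `tflops_from_macs` is hypothetical and not part of the profiler.

```python
def tflops_from_macs(gmacs: float, latency_ms: float) -> float:
    """Convert a MAC count (in GMACs) and a latency (in ms) to TFLOPS.

    Assumption: 1 MAC = 2 FLOPs. This is a convention, not something
    stated in this thread.
    """
    flops = 2.0 * gmacs * 1e9      # total floating-point operations
    seconds = latency_ms * 1e-3    # latency in seconds
    return flops / seconds / 1e12  # scale to tera-operations per second


# Figures from the ParallelTransformer line: 103.62 GMACs in 4.4 ms.
transformer_tflops = tflops_from_macs(103.62, 4.4)
print(f"{transformer_tflops:.1f} TFLOPS")
```

Under that assumption the result is about 47.1 TFLOPS, which is consistent with the `4.7e+01 TFLOPS` shown for that module.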

samyam · 2021-02-03T20:31:48Z

deepspeed/profiling/flops_profiler/README.md

+    model = models.alexnet()
    batch_size = 256
-    macs, params, steps = get_model_profile(model, # the PyTorch model to be profiled
+    macs, params = get_model_profile(model=model, # model


It would be good to show how the model is called without profiling so that input_res and input_constructor are clearer. Maybe have:
if profile:
    macs, params = get_model_profile
else:
    output/loss = model(....)
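
The suggested pattern can be sketched roughly as follows. To stay runnable without PyTorch or DeepSpeed, the sketch uses a toy callable model and a hypothetical stand-in named `toy_get_model_profile`. It only mimics the idea that `input_res` describes the input and that an optional `input_constructor` turns it into what the model would normally receive. It is not DeepSpeed's implementation or signature.

```python
from typing import Callable, Optional, Tuple


class ToyModel:
    """Stand-in for a model: one dense layer applied to a batch of rows."""

    def __init__(self, in_features: int, out_features: int):
        self.in_features = in_features
        self.out_features = out_features
        self.macs = 0  # MACs accumulated by forward calls

    def __call__(self, batch):
        # One MAC per (row, input feature, output feature) triple.
        self.macs += len(batch) * self.in_features * self.out_features
        return [[0.0] * self.out_features for _ in batch]


def default_input_constructor(input_res: Tuple[int, int]):
    """Build a dummy batch of zeros from an input shape (rows, features)."""
    rows, features = input_res
    return [[0.0] * features for _ in range(rows)]


def toy_get_model_profile(model, input_res,
                          input_constructor: Optional[Callable] = None):
    """Hypothetical stand-in: run one forward pass on constructed inputs."""
    build = input_constructor or default_input_constructor
    model.macs = 0
    model(build(input_res))
    params = model.in_features * model.out_features
    return model.macs, params


batch_size, features = 256, 8
model = ToyModel(in_features=features, out_features=4)
profile = True

if profile:
    # Profiling path: the inputs are constructed from input_res.
    macs, params = toy_get_model_profile(model, input_res=(batch_size, features))
else:
    # Normal path: the caller builds the batch and calls the model directly.
    output = model(default_input_constructor((batch_size, features)))
```

Placing the two branches side by side makes it visible that the profiling call needs only the input shape, while the normal call needs the actual inputs.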

samyam · 2021-02-03T20:33:10Z

deepspeed/profiling/flops_profiler/README.md

-    macs, params, steps = get_model_profile(
+    batch_size = 5
+    seq_len = 128
+    macs, params = get_model_profile(


It would be good to show how the model is called without profiling so that input_res and input_constructor are clearer. Maybe have:
if profile:
    macs, params = get_model_profile
else:
    output/loss = model(....)

Made changes as suggested

samyam · 2021-02-03T20:34:56Z

deepspeed/profiling/flops_profiler/README.md

-# Output:
-# Number of multiply-adds:        21.74 GMACs
-# Number of parameters:           109.48 M
+Below is an example of this usage in a typical training workflow.


In this mode, does the profiler capture only the forward pass, or forward, backward, and step? Can we make this more explicit?

The profiler captures only the forward pass. I clarified this throughout the README.
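
To illustrate what "forward only" means in a training loop, here is a minimal sketch with a toy trainer and a hypothetical `ForwardOnlyProfiler` that wraps `forward()` and nothing else. All names are made up for illustration, and this is not how the DeepSpeed profiler is implemented. It only shows that the backward pass and the optimizer step run without being recorded.

```python
class ToyTrainer:
    """Toy training loop whose phases each log that they ran."""

    def __init__(self):
        self.log = []

    def forward(self):
        self.log.append("forward")

    def backward(self):
        self.log.append("backward")

    def step(self):
        self.log.append("step")


class ForwardOnlyProfiler:
    """Hypothetical profiler that instruments forward() only."""

    def __init__(self, trainer):
        self.captured = []
        original_forward = trainer.forward

        def wrapped_forward():
            # Record the call, then delegate to the real forward pass.
            self.captured.append("forward")
            return original_forward()

        # backward() and step() are left untouched on purpose.
        trainer.forward = wrapped_forward


trainer = ToyTrainer()
profiler = ForwardOnlyProfiler(trainer)

# One training iteration: all three phases run, only one is captured.
trainer.forward()
trainer.backward()
trainer.step()

print(trainer.log)        # every phase that ran
print(profiler.captured)  # what the profiler saw
```

Any cost attributed by such a profiler therefore covers the forward pass alone, so numbers for a full training step would have to be estimated separately.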

docs/_tutorials/flops-profiler.md

deepspeed/profiling/flops_profiler/README.md

…, udpate readme

* Dist testing backend fixes, etc. (deepspeedai#708)
* set_batch_fn and remove old sanity check (deepspeedai#712)
* properly set engine.local_rank if it's set to -1
* Add executable permission to `ds_elastic` and `ds_report` in `bin`. (deepspeedai#711)
* Add executable permission to `ds_elastic` and `ds_report` in `bin`.
* Automatic `ds_elastic` formatting

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

* local rank of -1 means not set (deepspeedai#720)
* bump to 0.3.11
* [launcher] look ma, no more zombies (deepspeedai#714)

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

* Improve starred expressions (deepspeedai#696)
* Improve starred expressions

`deepspeed/profiling/flops_profiler/profiler.py` uses starred expressions that are no longer valid with [PEP 617][1]. The new Python parser is in 3.9, and this change allows DeepSpeed to run with the newest Python version. I have not checked all locations that has this issue. However, this change allows me to run simple examples.

[1]: https://www.python.org/dev/peps/pep-0617/

* Match style for "Improve starred expressions", although readability suffers

The style guide might need to be updated for this new use case of expressions. Python [Issue 40631][1] includes more discussion on the change.

[1]: https://bugs.python.org/issue40631

Co-authored-by: Cheng Li <pistasable@gmail.com>

* Fixed typo in Readme. (deepspeedai#737)
* 1bit_adam dependencies (deepspeedai#742)
* Clickable screenshots (deepspeedai#746)
* Fix docstring
* Make screenshots clickable for easier viewing
* Add flops profiler tutorial (deepspeedai#682)
* work on flops profiler tutorial
* update flops profiler tutorial
* add flops profiler tutorial and fix names
* work on flops profiler tutorial
* update flops profiler tutorial
* add flops profiler tutorial and fix names
* fix tailing ws
* fix names
* remove multistep profiling and update docs
* fix cases where functionals and submodules coexist in a parent module, update readme
* fix typo
* always invoke post hook function
* fix module flops sum and update tests
* update tutorial
* Only initialize distributed if required (deepspeedai#734)

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Jon Eyolfson <eyolfson@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: TheDudeFromCI <thedudefromci@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Sean Naren <sean@grid.ai>

Nuclear6 · 2023-12-22T11:16:16Z

Can I add a performance analysis tutorial for Llama 2 to the inference module?

cli99 added 30 commits October 19, 2020 20:09

add flops count profiler

86a33ca

add run script for the small test model

eedb939

fix flops sum and add batch counter

0b0b727

remove hook handles

558ad13

add conv2d flops

22ae752

add flops compute for major functionals and rnn modules

b093b71

work

5ce1d1f

fix flops count cal in post hook

2575dd3

add flops of embedding and dropout as 0

f5d1892

fix

d4d332e

add duration and throughput

b3914c7

add time and throughput

dbc24c8

add basic tracer for wall clock time breakdown

3ad7bc1

Added top module info summary

e8bb9f4

refactor

e67c650

refactor and add readme

4c9db1a

reorg folders

01db943

update readme

f91b552

fix xsp init import

964c689

update readme and rename batch to step

68b5260

fix multiple steps calc

9f38c8d

update readme

83ba5ed

rename pytorch-profiler to flops-profiler

9bd30d3

rename pytorch-profiler to flops-profiler

a061bf3

fix steps calc and update readme

c6a6a3f

update readme

9c4f3ad

update ds

10aee86

remove tracer code

aecccaf

fix incorrect merging

c6dfa07

fix formatting

b48ed30

cli99 requested review from RezaYazdaniAminabadi, ShadenSmith, minjiaz, niumanar, samyam and tjruwase as code owners January 20, 2021 18:57

cli99 added 2 commits January 27, 2021 08:14

remove multistep profiling and update docs

629bc8d

Merge branch 'master' into cheng/flops-profiler-tutorial

de8e56c

samyam reviewed Feb 3, 2021

docs/_tutorials/flops-profiler.md (outdated, resolved)

samyam reviewed Feb 3, 2021

deepspeed/profiling/flops_profiler/README.md (outdated, resolved)

cli99 added 2 commits February 5, 2021 02:42

fix cases where functionals and submodules coexsit in a parent module…

a137037

…, udpate readme

Merge branch 'master' into cheng/flops-profiler-tutorial

04026ac

cli99 mentioned this pull request Feb 9, 2021

Flops Profiler breaks after running #739

Closed

cli99 added 4 commits February 9, 2021 12:11

Merge branch 'master' into cheng/flops-profiler-tutorial

70317e4

fix typo

c5689d3

always invoke post hook function

c22f1b7

fix module flops sum and update tests

9db4b1a

samyam approved these changes Feb 10, 2021

cli99 added 2 commits February 11, 2021 00:54

update tutorial

cea03ed

Merge branch 'master' into cheng/flops-profiler-tutorial

cd88be1

cli99 merged commit e2dfe0d into deepspeedai:master Feb 11, 2021

cli99 deleted the cheng/flops-profiler-tutorial branch March 25, 2021 21:17
