Add the flops profiler which measures the time, number of estimated flops and parameters of each module in a model #544
Conversation
jeffra left a comment
I see DeepSpeedExamples (DSE) has lots of changes here. Is this because we want to point to an updated version of DSE that uses the profiler, or just to a different commit than what the current master branch points to?
```
@@ -0,0 +1,43 @@
from functools import partial
```
I wonder if we should move these examples to DSE? Or maybe keep them in the DeepSpeed repo but not in the deepspeed package? I am not sure we want them to be installed with the library itself (e.g., hidden away like /usr/local/lib/python3.6/dist-packages/deepspeed), where they are hard to find and run directly.
I moved the examples to the README and tests/unit/test_flops_profiler.py.
```
@@ -0,0 +1,43 @@
from functools import partial
```
I love that we have these examples. We might want to move them to the DSE repo? And we could also include them in a page on readthedocs if you think that would be good.
Removed the examples. I will add them to DSE and readthedocs.
tests/small_model_debugging/run.sh
Outdated
```
@@ -0,0 +1,12 @@
#! /bin/bash
```
Can we move this into our unit tests?
Done
tests/small_model_debugging/run.sh
Outdated
```
export WORLD_SIZE=$(($GPUS_PER_NODE*$NNODES))

# python test_model.py
python test_model.py --flops_profiler true --profile_start_step=5 --profile_end_step=8
```
It would be great to have some unit tests with models that we know the expected flop counts for. Then we could run the flops profiler to ensure our counts stay consistent?
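A sketch of what such a test could look like, assuming a hypothetical FlopsProfiler-style API (the import path and method names here are assumptions, not necessarily this PR's final interface):

```python
import torch
import torch.nn as nn
from deepspeed.profiling.flops_profiler import FlopsProfiler  # assumed path


def test_linear_flops():
    # A 10 -> 20 Linear on a batch of 4 performs exactly
    # 4 * 10 * 20 = 800 multiply-accumulates, so the profiler's
    # count can be asserted precisely.
    model = nn.Linear(10, 20)
    prof = FlopsProfiler(model)
    prof.start_profile()
    model(torch.randn(4, 10))
    flops = prof.get_total_flops()  # hypothetical accessor
    prof.end_profile()
    assert flops == 800
```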
```
parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
parser.add_argument('--zero', type=int, default=0)
parser.add_argument('--flops_profiler', type=bool, default=False)
```
Same comment, it would be awesome to move this into our unit tests (tests/unit/).
```
@@ -0,0 +1 @@
from .profiler import get_model_profile, add_profile_methods, print_model_profile, print_model_aggregated_profile, flops_to_string, duration_to_string, params_to_string
```
With this many exports, we could also just from .profiler import * and then in profiler be sure to follow public/private naming conventions. Methods starting with _ won't be included.
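For illustration, the convention described above might look like this (a sketch; _patch_functionals is a hypothetical private helper):

```python
# flops_profiler/profiler.py
def get_model_profile(model):
    """Public: picked up by `from .profiler import *`."""


def _patch_functionals():
    """Private: the leading underscore keeps it out of the star-import."""


# flops_profiler/__init__.py then shrinks to a single line:
# from .profiler import *
```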
I put all the functions in a FlopsProfiler class, which will be the only thing to import.
```
# FC
F.linear = wrapFunc(F.linear, linear_flops_compute)
```
Will the below wraps occur each time the module is imported? Can things get double-wrapped?
I moved wrapping the functionals into profiler.start() and added unwrapping in profiler.end().
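A minimal, self-contained sketch of that pattern (the bookkeeping dict and helper bodies here are simplified stand-ins for the diff's wrapFunc and linear_flops_compute, not the PR's actual code):

```python
import torch
import torch.nn.functional as F

_originals = {}
total_macs = 0


def linear_flops_compute(input, weight, bias=None):
    # MAC count as in the diff: numel(input) * out_features.
    return torch.numel(input) * weight.shape[0]


def wrapFunc(func, flops_compute):
    # Simplified form of the diff's wrapper: accumulate the count,
    # then call through to the real functional.
    def wrapped(*args, **kwargs):
        global total_macs
        total_macs += flops_compute(*args, **kwargs)
        return func(*args, **kwargs)
    return wrapped


def _wrap_functionals():
    # Save the unwrapped functional first; the membership test also
    # guards against double-wrapping if start() is called twice.
    if 'linear' not in _originals:
        _originals['linear'] = F.linear
        F.linear = wrapFunc(F.linear, linear_flops_compute)


def _unwrap_functionals():
    # Restore the saved original so the patch is active only while
    # profiling runs.
    if 'linear' in _originals:
        F.linear = _originals.pop('linear')
```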
```
F.linear = wrapFunc(F.linear, linear_flops_compute)

# convolutions
F.conv1d = wrapFunc(F.conv1d, conv_flops_compute)
```
Do we also unwrap these functionals after profiling is complete? The polite thing might be to only wrap while profiling is active, if that's possible. I see we do that with modules in end_profile?
```
F.avg_pool2d = wrapFunc(F.avg_pool2d, pool_flops_compute)
F.avg_pool3d = wrapFunc(F.avg_pool3d, pool_flops_compute)
F.max_pool1d = wrapFunc(F.max_pool1d, pool_flops_compute)
F.max_pool2d = wrapFunc(F.max_pool2d, pool_flops_compute)
```
It would be cool to have a data structure that maps a list of functionals to the compute wrapper, and then at wrap time the engine code just traverses that data structure to do this work.
Monkey patching does not work if the target function is referenced through a loop variable: reassigning the item rebinds the loop variable rather than replacing the attribute on the module.
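To illustrate the pitfall (a sketch; make_counting_wrapper is a stand-in for the diff's wrapFunc):

```python
import torch.nn.functional as F


def make_counting_wrapper(func):
    # Stand-in for the diff's wrapFunc.
    def wrapped(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapped


# BROKEN: reassigning fn only rebinds the loop variable, not the
# attribute on torch.nn.functional, so nothing is actually patched.
for fn in [F.linear, F.conv1d]:
    fn = make_counting_wrapper(fn)

# Patching by attribute *name* is one possible workaround, at the
# cost of stringly-typed names:
for name in ['linear', 'conv1d']:
    setattr(F, name, make_counting_wrapper(getattr(F, name)))
```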
```
return logits, probs


if __name__ == "__main__":
```
This looks left over from development.
Removed
deepspeed/runtime/engine.py
Outdated
```
from ..ops.adam import DeepSpeedCPUAdam
from ..ops.adam import FusedAdam

from deepspeed.profiling.flops_profiler.profiler import add_profile_methods, print_model_profile, print_model_aggregated_profile, flops_to_string, params_to_string
```
Since we manage exports within each subpackage, we can just do from deepspeed.profiling import * to expose the high-level API, I think. That's a bit nicer than having a long list like this.
Since we do have this pattern of exporting a lot of methods, what about wrapping them up in a FlopsProfiler class?
Done
samyam left a comment
I want to just leave a high-level review here based on some of our earlier discussion. I think it would be very cool to do the profiling using profiler.start() and profiler.stop() instead of get_model_profile(...). This is aligned more closely with general training workflows, and additionally it would not require the user to create an input constructor. Overall, it would be a more natural and seamless way to use the profiler. What do you think @cli99?
This would also go towards addressing the import side effect we chatted about, @cli99. It might be a good opportunity to use contextlib?
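One way the contextlib idea could look (a sketch; start_profile/end_profile are the method names assumed elsewhere in this thread):

```python
from contextlib import contextmanager


@contextmanager
def flops_profiling(profiler):
    # Patching happens only inside the with-block, which also avoids
    # the import-time side effect mentioned above.
    profiler.start_profile()
    try:
        yield profiler
    finally:
        profiler.end_profile()


# usage:
#   with flops_profiling(prof):
#       model(inputs)
```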
```
return p


def linear_flops_compute(input, weight, bias=None):
```
Do we wanna account for the bias flops when it is not None?
What would be the flops for computing the bias? I thought it was zero.
You can count the dimension of the bias (i.e., out_features), which is the number of additions. But this will probably be very small, so it's okay not to count it.
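Concretely, counting those bias additions might look like this (a sketch extending the diff's linear_flops_compute; folding additions into a MAC count is a judgment call, per the discussion below):

```python
import torch


def linear_flops_compute(input, weight, bias=None):
    # MACs for the matmul: each input element is multiplied against
    # out_features rows of the weight.
    out_features = weight.shape[0]
    macs = torch.numel(input) * out_features
    # The bias adds one addition per output element -- tiny relative
    # to the matmul, which is why it is often ignored.
    bias_adds = 0
    if bias is not None:
        bias_adds = (torch.numel(input) // input.shape[-1]) * out_features
    return macs + bias_adds
```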
```
def linear_flops_compute(input, weight, bias=None):
    out_features = weight.shape[0]
    return torch.numel(input) * out_features
```
Shouldn't we multiply this by 2 to account for multiply-add?
The flops profiler outputs MACs, and the user can multiply that by 2 or by whatever factor they prefer. The throughput of a module is computed in TFLOPS, where I do MACs * 2 / time.
I see, but I was thinking that this is not always a MAC. I am not sure if we want the profiler to take into account point-wise operations like bias-add or normalize.
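For reference, the MACs-to-TFLOPS conversion described above works out to:

```python
def throughput_tflops(macs, duration_s):
    # FLOPs ~= 2 * MACs (one multiply plus one add), so
    # TFLOPS = MACs * 2 / time / 1e12.
    return macs * 2 / duration_s / 1e12
```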
```
for step, batch in enumerate(data_loader):
    # start profiling at training step "profile_step"
    if step == profile_start_step:
        model.start_profile()
```
I'm wondering if it's more transparent to not overwrite the model into a profiler, and to keep the model intact? Something like:

```
model_profiler = prof.add_profile_method(model)
model_profiler.start()
y = model(x)
model_profiler.stop()
```

Also, for getting the info we can then do:

```
model_profiler.print_all()
flops = model_profiler.get_flops()
```

I actually don't know which one is better. Maybe we should have a discussion on this?
The README shown here is out of date; the latest code does this. Happy to chat more about it.
Closing this one after rebasing/squashing, moved to #664.