CUDA Fuser instrumentation #324

Merged
tlemo merged 29 commits into 20_8_18_devel from perf_instrumentation
Sep 17, 2020
Conversation

@tlemo (Collaborator) commented Aug 25, 2020

A prototype for a lightweight Fuser instrumentation.

#define FUSER_MACRO_CONCAT(a, b) FUSER_MACRO_CONCAT2(a, b)
#define FUSER_ANONYMOUS(prefix) FUSER_MACRO_CONCAT(prefix, __COUNTER__)

#define FUSER_PERF_SCOPE(name) \
Collaborator:

Can we attach a debug level to the markers, so we can easily turn markers at a given level on or off?

Collaborator (Author):

Can you please elaborate? Do you mean levels for individual markers? I'm curious to understand the use cases you have in mind.

@tlemo tlemo marked this pull request as ready for review September 15, 2020 20:39
@tlemo tlemo changed the title [WIP] Fuser instrumentation prototype CUDA Fuser instrumentation Sep 15, 2020
@jjsjann123 (Collaborator) left a comment:

LGTM, thanks for putting all the instrumentation in already.

#include <torch/csrc/jit/codegen/cuda/lower2device.h>

#include <c10/util/Optional.h>
#include <c10/util/flat_hash_map.h>
Collaborator:

I don't see where we are using this.

Collaborator (Author):

Leftover from a local experiment; removed (thanks for catching it).

(flat_hash_map is potentially a much faster alternative to std::unordered_map)

Comment thread .gitignore Outdated
/build_*
.build_debug/*
.build_release/*
.build_profile/*
Collaborator:

Any reason we are adding an entry to ignore?

Collaborator (Author):

It's useful for RelWithDebInfo builds.
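As a sketch of how such a directory matching the new `/build_*` ignore pattern might come about (hypothetical commands, assuming an out-of-tree CMake build of the repository):

```shell
# Create a separate build tree for an optimized-with-symbols build;
# the directory name matches the /build_* .gitignore entry above.
mkdir -p build_profile && cd build_profile
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
```

RelWithDebInfo gives optimized binaries with debug symbols, which is what you want when profiling instrumentation like this PR adds.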

Collaborator:

Wasn't this supposed to be separated from a fuser upstream PR? Or, have we decided to sneak it in?

Collaborator (Author):

I have an upstream PR (pytorch#44399). It's approved and imported (although I'm not sure what "imported" implies), but I can't merge it due to unrelated CI failures.

In the meantime, I have this change in multiple branches, but I'll try to clean it up so it doesn't show up in PRs (I'll remove it here as well).

Comment thread torch/csrc/jit/codegen/cuda/instrumentation.h
void Trace::logEvent(char ph, const char* name, char sep) {
const std::chrono::duration<double> d = Clock::now() - start_timestamp_;
const double elapsed = d.count() * 1e6;
const unsigned int pid = 0;
Collaborator:

pid and tid have not been looked up.

Collaborator (Author):

Yes, they are just placeholders for now. I've added a TODO to support tracing multi-process and multi-threaded execution (which is not critical for us at this point, and it requires a bit of research to see whether we have any PyTorch helpers for portable TIDs/PIDs).

@tlemo tlemo merged commit ee6a20a into 20_8_18_devel Sep 17, 2020