refactor profiling optional #47667
Conversation
💊 Dr. CI failures summary: as of commit 9ceca35, 2 extra GitHub checks failed.
eellison left a comment:
This looks good. I'm a little wary of making changes just because of how easy it is to break something on the backward pass without knowing that you did it. Do you mind running a set of benchmarks to validate that this doesn't regress anything?
Ideally that infrastructure would be automated, but not yet...
```cpp
}

void InsertProfileNodesForSpecializeAutogradZero(ProfilingRecord* pr) {
  insertProfileNodesForSpecializeAutogradZero_(
```
nit: remove the `_` suffix, I don't think that follows the naming convention
```cpp
auto copy = graph->copy();
runProfilingInsensitiveOptimizations(copy);
pr_ = ProfilingRecord::instrumentGraph(copy);
// TODO: order for now is important; we should make
```
What was the context of this?
My bad on this one during rebase.
The comment should be added when we add a second pass that inserts profile_ivalue nodes for nvfuser: https://github.com/pytorch/pytorch/pull/47668/files#diff-4b6a31b50172ed0fce36027a913fab9e59e496954264ad5e021324c74355a9c2R498-R500
cc'ing @Krovatkin
@eellison
If I remember correctly, if we run InsertProfileNodesForCUDAFuser first, it will insert a profile node on a list input (which is also a graph input) to aten::grad_sum_to_size. That in turn breaks the logic in InsertProfileNodesForSpecializeAutogradZero, which checks that the graph input is used directly by aten::grad_sum_to_size: https://github.com/pytorch/pytorch/pull/47667/files#diff-d606514ce9e63ab23356691f3f6df0616222cd50b8a41239add4601f14a53392R16
Ideally, different passes could insert profile nodes in whatever order, but I wasn't quite convinced of the merit of that approach, so I just decided to document the required order.
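To make the ordering issue concrete, here is a minimal sketch of the kind of direct-use check being described; the helper name and the symbol lookup are illustrative assumptions, not the actual code in this PR:

```cpp
#include <torch/csrc/jit/ir/ir.h>

using namespace torch::jit;

// Hypothetical helper: does a graph input directly feed _grad_sum_to_size?
// If InsertProfileNodesForCUDAFuser runs first, the input's user becomes a
// prim::profile_ivalue node instead, and this check no longer fires.
static bool hasDirectGradSumToSizeUse(Value* graph_input) {
  const auto grad_sum_to_size = c10::Symbol::aten("_grad_sum_to_size");
  for (const Use& use : graph_input->uses()) {
    if (use.user->kind() == grad_sum_to_size) {
      return true; // direct use: no profile node in between
    }
  }
  return false;
}
```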
```cpp
  for (auto it = b->nodes().begin(); it != b->nodes().end(); it++) {
-   if (it->kind() == prim::profile || it->kind() == prim::profile_optional) {
+   if (it->kind() == prim::profile || it->kind() == prim::profile_optional ||
+       it->kind() == prim::profile_ivalue) {
```
Could we remove all references to profile_optional?
I quickly tried this with a very simple graph: inputs x/y with and without broadcast, and I can verify the behavior and performance. The refactor in this PR only changes the symbol from prim::profile_optional to prim::profile_ivalue.
eellison left a comment:
Looks good, thanks for doing this!
Mostly small comments/nits, but I would like to retain the original flow of specialize_autogradzero before landing. For context, there have been seemingly-benign changes that completely destroyed perf for hard-to-detect reasons, e.g. #43900.
The backwards graph doesn't have great visibility or testing, and I would prefer not to make changes unrelated to this PR.
```cpp
- // we only check prim::profile and not prim::profile_optional bc profile
- // is inserted on each use, while profile_optional is inserted on the def
+ // we only check prim::profile and not prim::profile_ivalue bc profile
```
The reason profile_optional was inserted on the def and not per use is that we were just checking whether or not a value is None, which stays constant across uses.
General ivalue profiling doesn't have the same property, since you could profile mutable values like List, which can change at each use. So this comment will be out of date once we have more uses of profile_ivalue.
I also wonder whether there's some value in sharing the implementation but having a different node name; I guess it depends on how many different kinds of ivalue profiling we end up with.
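To illustrate the def-site vs. use-site distinction, here is a hedged sketch of the two insertion strategies (helper names are made up; the real passes differ), assuming `profile` was created with one input — the profiled value — and one output:

```cpp
#include <torch/csrc/jit/ir/ir.h>

using namespace torch::jit;

// Def-site profiling: one profile node right after the definition.
// Sound for properties that cannot change between uses, e.g. whether an
// Optional is None. (For graph inputs, the node would be prepended to the
// block instead.)
static void profileOnDef(Value* v, Node* profile) {
  profile->insertAfter(v->node());
  // Route all downstream uses through the profile node's output; the
  // profile node itself keeps v as its input.
  v->replaceAllUsesAfterNodeWith(profile, profile->output());
}

// Use-site profiling: a separate profile node in front of each consumer.
// Needed for mutable values (e.g. List) whose contents may differ per use.
static void profileOnUse(const Use& use, Node* profile) {
  profile->insertBefore(use.user);
  use.user->replaceInput(use.offset, profile->output());
}
```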
I agree that we need to update the comment (and the checks as well?).
> sharing implementation, but having a different node name
I think this is suggesting to have `profile_ivalue` as an extension to `profile_optional`, which is similar to what I had in the original PR prior to Nick's refactor.
IIUC, the refactor is to have a unique profile_ivalue node, with which we would hopefully work towards a coherent profiling strategy where the different passes that deploy/exploit profiled ivalues can work with each other. Granted, that is not what we have today; at the least, the order of the passes has to be arranged in a certain way to work, and it's tricky to cover all the corner cases.
I think the suggested middle ground of having different node names could make it easier, for now, to prevent profiling nodes inserted by different passes from breaching into each other's territory. We can merge them at a later point, once we figure out a universal profiling strategy.
> i also wonder whether there's some value in sharing implementation, but having a different node name. depends i guess on the number of different ivalue profiling we have
At the very least, it would give better readability and a slightly easier check, since you only have to look for a specific symbol (rather than a symbol + attribute).
On the other hand, given the way we are adding profiling nodes, switching to different node names would result in a small explosion of copy-pasted classes in ir.h.
Also, if we decide in the future to merge profiling-node chains into a single profiling node carrying all the attributes, I think the benefit of having just one node per use outweighs differently-named nodes.
I would hold off on this approach.
> Granted that this is not what we have today, at least the order of passes has to be arranged in certain way to work
I would like to emphasize that this is in no way a limitation of the current design! We could update how SpecializeAutogradZero inserts profiling nodes to account for the possibility that some other pass has already inserted profiling nodes in between. Given that we only have 3 passes that insert profiling nodes, I decided to just document the order.
If we can figure out a way to have different node names without a ton of copy-pasta, that's the best. I don't really know how to do that; maybe someone really good at C++ can.
> Also, if we decide in future to merge profiling node chains into a single profiling node with all the attributes, I think, the benefits of having just one single node on a use outweighs differently named nodes.
I don't think this is coming anytime soon (if at all). I don't think we should over-consider it.
> Granted that this is not what we have today, at least the order of passes has to be arranged in certain way to work
I think most things should be profiled per-use, which composes pretty well; specialize_autograd_zero is pretty unique.
```cpp
  }
}

static Node* getUseWithAttribute(Value* inp, Symbol kind) {
```
nit: getUseWithAttribute doesn't fully reflect the implementation, since we're also filtering out non-profile_ivalue uses
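For context, a rough sketch of what a helper like this plausibly does (an assumption based on the review comment, not the PR's exact code); note the two conditions, which is what the naming nit is about:

```cpp
#include <torch/csrc/jit/ir/ir.h>

using namespace torch::jit;

// Sketch only: return a use of `inp` that is a prim::profile_ivalue node
// carrying the given attribute, or nullptr if there is none.
static Node* getUseWithAttribute(Value* inp, Symbol attr) {
  for (const Use& use : inp->uses()) {
    Node* user = use.user;
    // Two filters: the node kind AND the attribute -- so the name
    // "getUseWithAttribute" understates what the function checks.
    if (user->kind() == prim::profile_ivalue && user->hasAttribute(attr)) {
      return user;
    }
  }
  return nullptr;
}
```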
```cpp
    checks.push_back(check->output());
    profiled_none_.insert(inp);
  }
  removeProfiledOptionalUses(inp);
```
I'm not convinced that moving removeProfiledOptionalUses(inp) to after guardSpecializations doesn't have some edge-case effect somewhere. Could we keep the code here as it was, to minimize changes?
I think there were multiple reasons why I moved it after guardSpecializations. One of them was that there could be arbitrarily long chains of prim::profile nodes on some graph inputs. Even for a very simple case I considered, where both TE and SpecializeAutogradZero want to profile the optional list argument to aten::grad_sum_to_size, rewriting removeProfiledOptionalUses looked very ugly, and I hit some edge cases there.
In contrast, walking the graph and removing the nodes after guardSpecializations is done looked much simpler, and it's quite similar to what we do elsewhere.
I tested this with fastrnns and our CI tests and didn't hit any regressions or crashes.
If you feel strongly that removeProfiledOptionalUses(inp) should stay in this specific spot, we could try figuring it out.
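For reference, a minimal sketch of the kind of post-guardSpecializations cleanup walk being described (illustrative; not necessarily the exact code in the PR):

```cpp
#include <torch/csrc/jit/ir/ir.h>

using namespace torch::jit;

// Recursively strip prim::profile_optional nodes from a block once
// guardSpecializations has consumed their information: each node's output
// is replaced by its input, then the node is destroyed.
static void removeProfileOptionalNodes(Block* block) {
  for (auto it = block->nodes().begin(); it != block->nodes().end();) {
    Node* n = *it;
    ++it; // advance first so destroying n does not invalidate the iterator
    for (Block* inner : n->blocks()) {
      removeProfileOptionalNodes(inner);
    }
    if (n->kind() == prim::profile_optional) {
      n->output()->replaceAllUsesWith(n->input());
      n->destroy();
    }
  }
}
```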
If "rewriting removeProfiledOptionalUses looked very ugly", that is a good sign that pass writing is becoming more difficult as a result of the overloaded profile_ivalue. cc @jjsjann123
I don't feel that strongly, but I do prefer it as it was. IMO, when there isn't a large overhead to doing so, we should clean the graph as we manipulate it in the pass. It's the same reason I added …: it was extremely confusing to look at the pass midway through and see three copies of the graph. Why was it there? Every time I debugged the pass I had to re-remind myself and mentally edit out the dead copy of the graph. Similarly, after we've made the decision whether or not to specialize on None values, why not remove the nodes right there? They are just visual noise for the duration of guardSpecializations.
The logic right now isn't quite right, either:
```
def foo(x):
    x = profile_ivalue[None=1, Present=1]
    return x
```
Here we won't insert any specializations, so the profile_ivalue nodes will remain in the graph. Granted, this isn't likely to happen, but it's a fallout of delaying the logic past when it needs to run.
I am maybe more opinionated about this than I should be, because in the past I've had to deal with passes that had incorrect IR (like when we never updated the requires_grad property) and with other passes that similarly left things in confusing states throughout the pass.
This is a cherry-pick from upstream PR pytorch#47668 (profile ivalue for nvfuser). We tried to go around PR pytorch#47667 (refactor profiling optional), since upstream is still working on it at this time.
- createConditionalConstant supports profiled ivalues, including bool, int_list, and size
- New guard to check conditional constants at runtime
- size_eq_guard op to facilitate comparison of dynamic sizes
- sum_to_size & _grad_sum_to_size added in integration
Krovatkin left a comment:
Looks like we addressed @eellison's feedback (e.g., moving removeProfiledOptionalUses back to its original spot and double-checking we aren't seeing any obvious regressions on fastrnns).
@jjsjann123 could you please fix the clang-format failures and rebase this PR, and I'll try to merge it?
clang-format seems to be broken; I'll try rebasing again tomorrow to see if someone has fixed it.
Hmm, even though my local clang-format is still failing, CI magically works :)
The clang-tidy error is not on code modified by this PR; should I try to tidy it up here? The ROCm failure doesn't seem relevant. @Krovatkin, feel free to yell at me to fix clang-tidy; otherwise, this one is good to go.
@jjsjann123 that should be fine. @eellison wanted to take one last look; he should be getting to it shortly.
LGTM!
@Krovatkin merged this pull request in dd1c2a0.
Summary: Pull Request resolved: pytorch#47667
Test Plan: Imported from OSS
Reviewed By: anjali411, ngimel
Differential Revision: D25255572
Pulled By: Krovatkin
fbshipit-source-id: d0152c9ef5b1994e27be9888bcb123dca3ecd88f