Use dynamo fake tensor mode in aot_autograd, move aot_autograd compilation to lowering time [Merger of 89672 and 89773] #90039
Conversation
… aot_autograd compilation to lowering time

After all of the preparatory commits, this is a subset of the changes in #89392 that actually change us to propagating fake tensors to backends.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Squashed fixup commits: [WIP] Commit to run CI on vaguely sus C++ changes uncovered during fixing upstream prototype · Wip · Random stuff · Fix tests · Fix char · make shape prop fake tensor friendly · Fix · Test fixes · Feedback, test fixes, xla shenanigans · xla madness · xla madness · Feedback · undo · undo · Fix test · rm stupid stuff

[ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90039
Note: links to docs will display an error until the docs builds have been completed.
❌ 1 Failure, 2 Pending as of commit fb21875. This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: 13f3e9b
Pull Request resolved: #90039
test/dynamo/test_dynamic_shapes.py
Outdated
```python
# Cannot call sizes() on tensor with symbolic sizes/strides
)

# torch._subclasses.fake_tensor.UnsupportedFakeTensorException: meta converter nyi
# due to custom subclass (TensorProxy)
@unittest.expectedFailure
```
I have a fix for this in a local patchset; this xfail is OK.
```python
self.assertIsNotNone(r1)
self.assertTrue(same(r1, r2))
self.assertTrue(same(r1, r3))
```
Why don't you just assert same(r2, r3) and call it a day lol
we care about r1 as well
torch/fx/passes/shape_prop.py
Outdated
```python
self.module = self.fake_module

result = super().run_node(n)
self.module = self.real_module
```
If run_node raises an exception this won't reset the module.
There's no need to do a swap like this. The only thing you need to override is a few ops (get_attr, call_module I think?) to fetch from fake module rather than real module.
> If run_node raises an exception this won't reset the module.
good call
> There's no need to do a swap like this. The only thing you need to override is a few ops (`get_attr`, `call_module` I think?) to fetch from fake module rather than real module.
Yeah but we already override this, this feels simpler. No strong opinion tho.
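For illustration, a minimal sketch of the try/finally fix this thread is converging on. The class and attribute names follow the quoted diff, but the interpreter call is stubbed out, so this is a stand-in for the real torch.fx code, not the actual patch:

```python
class FakeShapeProp:
    """Minimal stand-in for the shape-prop interpreter in the quoted diff."""

    def __init__(self, real_module, fake_module):
        self.real_module = real_module
        self.fake_module = fake_module
        self.module = real_module

    def _interpret(self, n):
        # Stand-in for super().run_node(n) in the real torch.fx interpreter.
        return n

    def run_node(self, n):
        # Swap in the fake module only for the duration of this node;
        # try/finally restores the real module even if interpretation
        # raises, which is exactly the bug pointed out above.
        self.module = self.fake_module
        try:
            return self._interpret(n)
        finally:
            self.module = self.real_module
```

The alternative suggested above (overriding only `get_attr`/`call_module` to read from the fake module) avoids the swap entirely, at the cost of a couple more overridden methods.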
```python
# TODO: this is questionable
if isinstance(x, torch._subclasses.FakeTensor):
    # this func fails on fake tensors in __torch_dispatch__
    return x
```
I'll put up with this because I think my pending refactor will resolve the confusion here
torch/_dynamo/utils.py
Outdated
```python
if fake_mode is None:
    fake_mode = flat_input.fake_mode
else:
    assert fake_mode == flat_input.fake_mode

def fwd(*args):
    nonlocal compiled_graph
    model = subgraph.model
```
nit: this will keep subgraph permanently alive, whereas previously it could have been GC'ed after compilation. You should `del subgraph` once compilation is done.
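A sketch of the lifetime fix this nit is asking for: pull what the closure needs out of `subgraph` before defining it, then drop the reference. `make_fwd` and `compile_fn` are hypothetical names for illustration; the point is that the closure captures only `model`, so `subgraph` stays collectable:

```python
def make_fwd(subgraph, compile_fn):
    compiled_graph = None
    model = subgraph.model  # capture only what fwd actually needs

    def fwd(*args):
        nonlocal compiled_graph
        if compiled_graph is None:
            compiled_graph = compile_fn(model, args)
        return compiled_graph(*args)

    # fwd closes over `model`, not `subgraph`, so the subgraph can be
    # GC'ed after compilation; the del makes the intent explicit.
    del subgraph
    return fwd
```

If `fwd` instead did `model = subgraph.model` inside its body (as in the quoted diff), the closure would capture `subgraph` itself and pin it for the lifetime of the compiled function.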
```diff
     self.compiler = compiler

-    def compile_submod(self, submod, args, kwargs):
+    def compile_submod(self, input_mod, args, kwargs):
```
The variable renaming makes it harder for code reviewers
like, afaict, there's literally no change from lines 211 to 250 but I had to carefully audit to make sure there weren't typos lol
This is from your PR?
```python
def run_node(self, n: Node) -> Any:
    with fx_traceback.append_stack_trace(n.stack_trace):
        args, kwargs = self.fetch_args_kwargs_from_env(n)
```
It would be really nice to have a comment here explaining what's going on
```python
if isinstance(arg, torch.Tensor) and not isinstance(
    arg, torch._subclasses.FakeTensor
):
    new_args.append(fake_mode.from_tensor(arg))
```
It would be much better if we could assume all the args are already fakeified (and we just maintain the invariant that all the intermediates are fake tensors). In particular, fakeifying a tensor can trigger the allocation of new symints, but if they're not properly associated as inputs we may not be able to instantiate those variables on subsequent runs.
tl;dr I suspect arg is always a fake tensor here
I'm not sure we can; I'm pretty sure we saw non-fakes in here, but maybe that was in an intermediate state of this PR. I can check again.
You can check this by pushing a separate PR that tightens the invariant and seeing if it fails or not.
Merge started: your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
@pytorchbot successfully started a revert job. Check the current status here.
@voznesenskym your PR has been successfully reverted.
…d compilation to lowering time [Merger of 89672 and 89773] (#90039)" This reverts commit ef0c7ec. Reverted #90039 on behalf of https://github.com/clee2000 due to broke xla tests https://hud.pytorch.org/pytorch/pytorch/commit/ef0c7ec958439caf44a98fb7b70d920c6c2264b9 https://github.com/pytorch/pytorch/actions/runs/3606308473/jobs/6077646142
I'd rather we had forward-fixed; it's a one-liner to fix. Alas.
The way to correctly land XLA changes is to update the hash in the PR to your branch on xla. Once landed, merge the xla branch into master. The automatic head-updating process will eventually reset xla's commit id to master.
Yes. But I did it in a way that should not need XLA changes.
@pytorchbot merge -f "weird unrelated py3.7 pip install bug in manywheel" |
Merge started: your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
@pytorchmergebot / @pytorchbot merged the wrong commit >:(
❌ 🤖 pytorchbot command failed: Try …
…ation to lowering time [Merger of 89672 and 89773] (#90039) After all of the preparatory commits, this is a subset of the changes in #89392 that actually change us to propagating fake tensors to backends. Signed-off-by: Edward Z. Yang <ezyangfb.com> This is the merger of Ed's PR #89672, which is a rewrite of an older PR of mine (#89392), with CI Fixes on top of it (#89773) Pull Request resolved: #90039 Approved by: https://github.com/ezyang fix
…ation to lowering time [Merger of 89672 and 89773] (pytorch#90039) After all of the preparatory commits, this is a subset of the changes in pytorch#89392 that actually change us to propagating fake tensors to backends. Signed-off-by: Edward Z. Yang <ezyangfb.com> This is the merger of Ed's PR pytorch#89672, which is a rewrite of an older PR of mine (pytorch#89392), with CI Fixes on top of it (pytorch#89773) Pull Request resolved: pytorch#90039 Approved by: https://github.com/ezyang
…d compilation to lowering time [Merger of 89672 and 89773] (pytorch#90039)" This reverts commit ef0c7ec. Reverted pytorch#90039 on behalf of https://github.com/clee2000 due to broke xla tests https://hud.pytorch.org/pytorch/pytorch/commit/ef0c7ec958439caf44a98fb7b70d920c6c2264b9 https://github.com/pytorch/pytorch/actions/runs/3606308473/jobs/6077646142
Stack from ghstack (oldest at bottom):
After all of the preparatory commits, this is a subset of the
changes in #89392 that actually
change us to propagating fake tensors to backends.
Signed-off-by: Edward Z. Yang <ezyangfb.com>
This is the merger of Ed's PR #89672, which is a rewrite of an older PR of mine (#89392), with CI Fixes on top of it (#89773)
cc @mlazos @soumith @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire