
Add FakeTensorMode #77972

Closed

eellison wants to merge 16 commits into gh/eellison/295/base from gh/eellison/295/head

Conversation

@eellison
Contributor

@eellison eellison commented May 20, 2022

Stack from ghstack (oldest at bottom):

This adds a mode which will intercept calls to `__torch_dispatch__` even if the inputs are not already `FakeTensor`s. This mimics the convenient [prior existing usage](https://pytorch.org/torchdistx/latest/fake_tensor.html). It does so by wrapping input tensors in `FakeTensor`s and then continuing to run the operators.

Not Yet Implemented:

I still need to memoize conversion of non-fake tensors to fake tensors (and internally, to `meta` devices), following along with the [class here](https://github.com/pytorch/pytorch/blob/master/test/test_meta.py#L70).

One open question is what the duration of the `FakeTensorConverter` should be. IMO, it would make sense and be convenient for it to live for the duration of the `FakeTensorMode`. Since we shouldn't be allocating any new tensors with actual data (just on `meta` devices), it is probably fine for those tensors to live for the duration of the `FakeTensorMode`.

If that is not sufficient, we could try using a `weakref.WeakKeyDictionary` mapping tensors to their fake equivalents. I looked into this a bit and there are at least a few incompatibilities that need to be dealt with.
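The `weakref.WeakKeyDictionary` idea can be sketched in plain Python. The `RealTensor` and `FakeTensor` classes below are hypothetical stand-ins for illustration, not the PyTorch types:

```python
import weakref

class FakeTensor:
    """Stand-in for a fake tensor: records only metadata, no real data."""
    def __init__(self, shape, device):
        self.shape = shape
        self.device = device

class RealTensor:
    """Hypothetical stand-in for a real torch.Tensor."""
    def __init__(self, shape, device="cpu"):
        self.shape = shape
        self.device = device

class FakeTensorConverter:
    """Memoizes real-to-fake conversion so each real tensor maps to one fake tensor.

    A WeakKeyDictionary lets each cache entry die with its real tensor, so the
    converter can live for the duration of the mode without pinning inputs alive.
    """
    def __init__(self):
        self._cache = weakref.WeakKeyDictionary()

    def from_real(self, t):
        if t not in self._cache:
            self._cache[t] = FakeTensor(t.shape, t.device)
        return self._cache[t]

conv = FakeTensorConverter()
x = RealTensor((2, 3))
assert conv.from_real(x) is conv.from_real(x)  # memoized: same fake both times
```

This is only a sketch of the caching shape under those assumptions; the incompatibilities mentioned above (e.g. which tensor objects are weak-referenceable) are exactly what the real implementation has to resolve.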

@facebook-github-bot
Contributor

facebook-github-bot commented May 20, 2022


❌ 1 New Failures

As of commit 791b39f (more details on the Dr. CI page):

  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-focal-py3.7-gcc7-mobile-lightweight-dispatch-build / build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

2022-05-31T14:14:47.7623462Z ##[error]Process completed with exit code 137.
2022-05-31T14:12:32.4887724Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/x8-lut/gen/lut-avx512skx-vpshufb-x256.c.o
2022-05-31T14:12:32.5971334Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2-k-over-64.c.o
2022-05-31T14:12:32.6515761Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2-k-over-2048.c.o
2022-05-31T14:12:32.7050982Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-4.c.o
2022-05-31T14:12:32.7556644Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-8.c.o
2022-05-31T14:12:32.8127663Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-16.c.o
2022-05-31T14:12:32.8659333Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-64.c.o
2022-05-31T14:12:32.9194429Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-2048.c.o
2022-05-31T14:12:32.9933254Z [ 98%] Built target all_microkernels
2022-05-31T14:12:32.9999721Z [ 98%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCodegenUnboxedKernels_7.cpp.o
2022-05-31T14:14:47.7623462Z ##[error]Process completed with exit code 137.
2022-05-31T14:14:47.8232475Z Prepare all required actions
2022-05-31T14:14:47.8314724Z ##[group]Run ./.github/actions/teardown-linux
2022-05-31T14:14:47.8315081Z with:
2022-05-31T14:14:47.8315340Z env:
2022-05-31T14:14:47.8315615Z   IN_CI: 1
2022-05-31T14:14:47.8315896Z   IS_GHA: 1
2022-05-31T14:14:47.8316167Z ##[endgroup]
2022-05-31T14:14:47.8360161Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2022-05-31T14:14:47.8360635Z .github/scripts/wait_for_ssh_to_drain.sh
2022-05-31T14:14:47.9517213Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.


eellison pushed a commit that referenced this pull request May 20, 2022
ghstack-source-id: cf2089f
Pull Request resolved: #77972
@eellison eellison requested review from Chillee and ezyang May 20, 2022 16:18
eellison pushed a commit that referenced this pull request May 20, 2022
ghstack-source-id: 62ea367
Pull Request resolved: #77972
```python
    return common_device

class FakeTensorMode(FakeTensor):
    context = no_dispatch
```
Contributor

Can we do a modern style mode instead pretty please :)

Contributor

You didn't actually use context, AFAICT?

Contributor Author

@eellison eellison May 23, 2022

What is a modern style mode?

Contributor

inherit from TorchDispatchMode

Contributor Author

Will do... mind linking me to the differences between the two, and why one should use the modern style over the existing one?

Contributor

the main difference is you can store instance variables on the mode, since it is an actual object not a class
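The difference can be sketched in plain Python. The `DispatchMode` class below is a hypothetical stand-in, not the real `TorchDispatchMode` API: an instance-style mode is an object used as a context manager, so per-use state such as a converter cache can live on the instance rather than on the class:

```python
# Hypothetical sketch: a "modern style" mode is an instance, not a class,
# so each use gets its own state (e.g. a FakeTensorConverter cache).
class DispatchMode:
    def __init__(self):
        self.converter_cache = {}   # instance state, impossible on a class-level mode
        self.active = False

    def __enter__(self):
        self.active = True
        return self

    def __exit__(self, *exc):
        self.active = False
        return False

with DispatchMode() as mode:
    mode.converter_cache["x"] = "fake_x"
    assert mode.active

assert not mode.active
assert mode.converter_cache["x"] == "fake_x"  # state survives on the instance
```

Two independent `DispatchMode()` instances would each carry their own cache, which is the practical payoff over a class-level mode.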

Contributor Author

nice... I probably don't need `setup_mode` then

```python
# TODO: no real reason to restrict multiple outputs
return (
    len(schema.returns) == 1 and schema.returns[0].type is torch._C.TensorType.get()
)
```
Contributor

Tag opportunity :) cc @anjali411

```python
    func, args=args, kwargs=kwargs, normalize_to_only_use_kwargs=True
)
# cpu is default device if none is specified
out_device = new_kwargs.pop("device", torch.device("cpu"))
```
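The default-device fallback above follows the ordinary `dict.pop` pattern; a minimal sketch with a hypothetical helper and plain strings standing in for `torch.device`:

```python
# Hypothetical helper illustrating the pattern: pop an optional "device"
# kwarg, falling back to a default when the caller did not specify one.
def resolve_out_device(new_kwargs, default="cpu"):
    return new_kwargs.pop("device", default)

kwargs = {"dtype": "float32"}
assert resolve_out_device(kwargs) == "cpu"    # default when unspecified
kwargs = {"device": "cuda"}
assert resolve_out_device(kwargs) == "cuda"   # explicit device wins
```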
Contributor

technically it's `torch.get_default_tensor_type()` lol but ok

Contributor Author

which is immutable 😛

Elias Ellison added 4 commits May 23, 2022 13:28
@eellison
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@eellison
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Differential Revision: [D36618464](https://our.internmc.facebook.com/intern/diff/D36618464)
eellison pushed a commit that referenced this pull request May 24, 2022
ghstack-source-id: 00439da
Pull Request resolved: #77972
@eellison
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@eellison
Contributor Author

I actually think the current behavior makes sense. The easiest mental model IMO is that everything is simulated, and that no real tensors will be affected. I think allowing in-place ops, views, and other things to actually affect the input tensors would be a mistake. If you did, you would get into a situation where `input.t_()` does affect your input, but `input.add_(fake_tensor)` doesn't...

Additionally, the behavior above with `resize_` should just work, because the first use of that tensor will get converted to fake, and then for any subsequent uses, the cached version with the `resize_` applied will be used, which correctly simulates the compute you would have done.
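The caching behavior described above can be sketched in plain Python (hypothetical `Fake` class and cache dict, not the real converter): in-place ops mutate only the cached fake, the real input is untouched, and later uses of the same input observe the simulated `resize_` through the cache:

```python
class Fake:
    """Hypothetical fake-tensor stand-in holding only shape metadata."""
    def __init__(self, shape):
        self.shape = shape

cache = {}

def to_fake(real_id, shape):
    # First use converts and caches; subsequent uses hit the cache.
    if real_id not in cache:
        cache[real_id] = Fake(shape)
    return cache[real_id]

f = to_fake("input", (2, 3))
f.shape = (4, 4)                                  # simulated resize_ on the fake only
assert to_fake("input", (2, 3)).shape == (4, 4)   # a subsequent use sees it
```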

Elias Ellison added 3 commits May 24, 2022 15:59
@ezyang
Contributor

ezyang commented May 25, 2022

We discussed this in person and we decided that for torchdynamo the easiest thing will be to just wrap all the tensors as fake tensors before running the computation. That means we don't have to support in place modifying non-fake tensors.
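That decision can be sketched in plain Python (hypothetical names throughout): wrap every input as a fake before calling the function, so the computation only ever mutates fakes and real inputs never need in-place support:

```python
class Fake:
    """Hypothetical wrapper standing in for a fake tensor."""
    def __init__(self, value):
        self.value = value

def run_fakeified(fn, inputs):
    # Wrap everything up front; fn only ever sees fakes.
    fakes = [x if isinstance(x, Fake) else Fake(x) for x in inputs]
    return fn(*fakes)

def double_inplace(t):
    t.value = t.value * 2   # the "in-place" op touches only the fake
    return t

real = 21
out = run_fakeified(double_inplace, [real])
assert out.value == 42
assert real == 21  # the real input is untouched
```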

@eellison eellison mentioned this pull request May 31, 2022
@eellison
Contributor Author

@pytorchbot merge this please

@github-actions
Contributor

Hey @eellison.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

dzdang added a commit that referenced this pull request May 31, 2022
Pull Request resolved: #77972

Approved by: https://github.com/ezyang
ghstack-source-id: de497bb
facebook-github-bot pushed a commit that referenced this pull request Jun 1, 2022
Summary:
Pull Request resolved: #77972

Approved by: https://github.com/ezyang

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/cea7dd1646ab147edac8f0e22f0aa85cf3136fef

Reviewed By: seemethere

Differential Revision: D36784784

Pulled By: seemethere

fbshipit-source-id: 55175d158483e4b388402a4ddcc273b69ef403c7
```python
def _is_tensor_constructor(func: OpOverload):
    assert isinstance(func, OpOverload)
    schema = func._schema
    if any(contains_tensor_types(arg.type) for arg in schema.arguments):
```
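A plain-Python analogue of the predicate above (the `is_tensor_constructor` helper over lists of type names is hypothetical, not the real schema objects): a tensor constructor takes no tensor arguments but returns a tensor. Note that a `*_like` op does take a tensor argument, so this check rejects it, which is the gap raised in the review below:

```python
def is_tensor_constructor(arg_types, return_types):
    # Any tensor argument disqualifies the op (this is why *_like ops fail here).
    if "Tensor" in arg_types:
        return False
    # Single tensor return, mirroring the single-output restriction above.
    return return_types == ["Tensor"]

assert is_tensor_constructor([], ["Tensor"])              # e.g. a randn-style op
assert not is_tensor_constructor(["Tensor"], ["Tensor"])  # e.g. an ones_like-style op
```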
Contributor

what about `_like` ops?

Contributor Author

those got added later

Contributor

@anjali411 anjali411 Jun 14, 2022

I mean, `_is_tensor_constructor` would still return `False` for them, but it should return `True`, right?

Contributor Author

ya. `_no_tensor_arg_constructor` is a better name
