functionalize(): make "additionally removing views" toggleable #678
Conversation
```cpp
    }
  );
}
return tensor;
```
Apparently you need to compile with the -Werror=return-type flag, or else you get silent UB if you forget to return from a non-void function (aka this bug) :(
:(. We used to have `-Werror` turned on for all warnings, but it turns out PyTorch doesn't, and PyTorch changes would introduce warnings.
… view_copy operators"
This PR splits the functionalization codegen into 2 pieces:
(1) Vanilla functionalization will now always turn view ops into "view_copy" ops.
(2) For functorch to "reapply views underneath the pass", I added a new dispatch key, "FunctionalizeAddBackViews". I codegen a kernel to that key for every view_copy operator that just calls back into the view op. All other ops get a fallthrough kernel.
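As a rough mental model of item (2), the AddBackViews layer can be pictured as a dispatch table that has a kernel only for each `view_copy` op (redirecting to the matching view op) and falls through for everything else. This is an illustrative sketch, not the real dispatcher: the names `ADD_BACK_VIEWS_KERNELS` and `dispatch_op` are invented for this example.

```python
# Hypothetical sketch of the "add back views" dispatch layer.
# Kernels registered to the AddBackViews key: one per view_copy op,
# each redirecting back into the corresponding view op.
ADD_BACK_VIEWS_KERNELS = {
    "view_copy": "view",
    "transpose_copy": "transpose",
    "squeeze_copy": "squeeze",
}

def dispatch_op(op, add_back_views_in_tls):
    """Route an op name through the (simplified) dispatcher."""
    if add_back_views_in_tls and op in ADD_BACK_VIEWS_KERNELS:
        # AddBackViews kernel: reapply the view underneath the pass.
        return ADD_BACK_VIEWS_KERNELS[op]
    # All other ops (and the key-disabled case): fallthrough to the next key.
    return op

print(dispatch_op("view_copy", add_back_views_in_tls=True))   # view
print(dispatch_op("view_copy", add_back_views_in_tls=False))  # view_copy
print(dispatch_op("add", add_back_views_in_tls=True))         # add
```

The point of the fallthrough is that non-view ops pay essentially nothing for the extra key being in the TLS include set.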
Also - the codegen will now unconditionally register CompositeImplicitAutograd kernels directly to the functionalization keys, so we "always decompose" before hitting functionalization. Otherwise, we might break and accidentally send "view" calls to the backend, if we decompose an op into a view underneath the functionalization pass.
The important changes are in `gen.py` and `gen_functionalization_type.py` - most of the other changes are just plumbing `{view}_copy` everywhere. I also updated `test_functionalization.py`, and added expecttests for the "add back views" case.
One thing about the `AddBackViews` key - right now, I add it into the TLS include set. The other option would be to try to add it directly to the tensors, but that's kind of hard: putting it on the `FunctionalTensorWrapper` doesn't help, because the functionalization pass will unwrap when it calls back into the dispatcher, and run on the "inner tensor" (maybe we could modify the inner tensor's keyset and add the `AddBackViews` key when functionalization happens, instead?)
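The "TLS include set" option above amounts to a thread-local set of dispatch keys plus a scoped guard that adds and removes the key. A minimal sketch of that mechanism, with invented names (`tls_include_set`, `include_dispatch_key`) standing in for the real C++ TLS machinery:

```python
# Minimal model of a thread-local dispatch-key include set with a scoped guard.
import threading
from contextlib import contextmanager

_tls = threading.local()

def tls_include_set():
    # Lazily create the per-thread include set.
    if not hasattr(_tls, "keys"):
        _tls.keys = set()
    return _tls.keys

@contextmanager
def include_dispatch_key(key):
    # RAII-style guard: add the key on entry, restore the old state on exit.
    keys = tls_include_set()
    added = key not in keys
    if added:
        keys.add(key)
    try:
        yield
    finally:
        if added:
            keys.discard(key)

with include_dispatch_key("FunctionalizeAddBackViews"):
    assert "FunctionalizeAddBackViews" in tls_include_set()
assert "FunctionalizeAddBackViews" not in tls_include_set()
```

Because the set is thread-local rather than attached to any tensor, it sidesteps the unwrapping problem described above: the key stays visible even after functionalization unwraps to the inner tensor.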
I also have an accompanying functorch change here: pytorch/functorch#678
Differential Revision: [D35419652](https://our.internmc.facebook.com/intern/diff/D35419652)
[ghstack-poisoned]
```cpp
Tensor _unwrap_functional_tensor(const Tensor& self) {
  auto* functional = dynamic_cast<FunctionalTensorWrapper*>(self.unsafeGetTensorImpl());
```
sorry for missing this the first time around :)
zou3519
left a comment
Code LGTM.
This needs a rebase, and I want to bikeshed the API and defaults a little more (maybe there shouldn't be a default yet).
Force-pushed 299da47 to d505afc
Pushed some more changes. Updates:
(1) API bikeshed. I liked your suggestion of …
(2) I ended up killing the …
CI will fail on this PR though until my stack lands from pytorch/pytorch#75913
zou3519
left a comment
LGTM, you probably want to change the PR body (it still mentions FunctionalizeAddBackViews). Also it would be good to wait until the CI turns green.
Force-pushed d505afc to 2660ec9
Force-pushed 2660ec9 to fb4ea78
@bdhirsh btw, functorch CI runs off of the PyTorch nightly binary, so we have two options: …
Lol, you probably saw me blindly kicking off CI again hoping something would happen (although at least …). I'm ok with either, depending on how urgently @ZolotukhinM would like this change for mobile.
Force-pushed fb4ea78 to 4d72d1b
CI is red, but I think I'm seeing the same set of test failures on main: https://app.circleci.com/pipelines/github/pytorch/functorch/2432/workflows/4657bb2a-cba6-4ce4-b409-e72e5229e3c4/jobs/14516/tests
@bdhirsh feel free to merge, the errors look pre-existing and we'll figure them out as we go along today.
…eable (pytorch/functorch#678)
* functionalize() move AddBackViews logic to a separate key
* make functionalize() toggleable when adding back views
* fix unnecessary view reapply, add tests for out=
* fix
* change functionalize() API, also use the new internal TLS
* rebase and fix tests
This PR goes with the core-side change here: pytorch/pytorch#75302, which updates the functionalization pass to be able to turn view ops into view_copy ops. Mobile mentioned that they would like to be able to use the `functionalize()` transform to trace models for running on mobile, with view ops removed (cc @ZolotukhinM).

This PR updates `functionalize()` to be toggleable:
- `functionalize(remove='mutations')` (the default) will remove mutations but preserve views.
- `functionalize(remove='mutations_and_views')` will remove mutations, and additionally convert `view` operators into their corresponding `view_copy` operators.

Some extra stuff in the PR:
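To make the difference between the two `remove=` modes concrete, here is a torch-free toy model of the rewrite the pass performs on a recorded program. This is NOT the real functorch implementation; `functionalize_trace` and the op-name strings are invented for illustration, using the convention that a trailing underscore marks an in-place op:

```python
# Hypothetical mini-model of the functionalize(remove=...) toggle.
def functionalize_trace(ops, remove="mutations"):
    assert remove in ("mutations", "mutations_and_views")
    out = []
    for op in ops:
        if op.endswith("_"):
            # Mutations are always removed: replace the in-place op
            # with its functional counterpart (e.g. "add_" -> "add").
            out.append(op.rstrip("_"))
        elif op == "view" and remove == "mutations_and_views":
            # Only in this mode do view ops become their copy variants.
            out.append("view_copy")
        else:
            out.append(op)
    return out

program = ["view", "add_", "mul"]
print(functionalize_trace(program))                                # ['view', 'add', 'mul']
print(functionalize_trace(program, remove="mutations_and_views"))  # ['view_copy', 'add', 'mul']
```

The default mode matters for eager backends that support views; the `mutations_and_views` mode is what mobile wants, since the resulting trace contains no aliasing ops at all.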