use codegen'd inplace kernels, and delete manually written inplace ke…#2962
Merged
Conversation
JackCaoG (Collaborator) reviewed on May 25, 2021:

Mostly LGTM, some minor questions.
test/cpp/test_tensor.cpp (outdated):

      at::Tensor input = at::zeros({32, 20, 4, 4}, at::TensorOptions(at::kFloat));
      at::Tensor one = at::tensor(1.0, at::TensorOptions(at::kFloat));
    - at::Tensor output = input.view({-1, 320});
    + at::Tensor output = input.view({-1, 8});
JackCaoG (Collaborator):

Out of curiosity, why change 320 -> 8? Is it just a performance issue?
Author (Contributor):

Yeah, that was a mistake. I changed it while debugging and forgot to put it back 😛 It's fixed now.
JackCaoG approved these changes on May 25, 2021.

Base automatically changed from `make_codegen_backend_agnostic_minus_fallbacks` to `master` on May 26, 2021 at 20:23.
…e kernels when possible
This PR deletes the inplace implementations of all lowerings that can be auto-generated from the codegen, removing ~1000 LoC. It applies to any operator that XLA already has a functional lowering for (e.g. `add.Tensor`), and that has a "trio" of operators in the pytorch codebase (in this case, `add.Tensor`, `add_.Tensor`, and `add.out` are all valid pytorch ops). For example, it doesn't apply to `relu_`, because we don't have a `relu.out` operator in PyTorch.

@JackCaoG Other than a small test change that I had to make, all of the tests are still passing. One thing that's worth double checking after this lands is that you don't see any major perf divergences in the performance dashboard. I don't think we should expect a perf change, since the generated kernels are all implemented by just calling the functional operator, followed by a call to `at::_copy_from()` to move the result into `self`. But it's definitely worth confirming.

For each inplace lowering, I removed the corresponding entries from:
- `xla_native_functions.yaml`
- `aten_xla_type.cpp`
- `tensor.h`
- `tensor_methods.cpp`

You can see the full list of inplace kernels that I removed in `xla_native_functions.yaml`, but the list is: