Adding view and reduction tags #153342
AlonSardas wants to merge 2 commits into pytorch:viable/strict from
Conversation
test/test_ops.py
Outdated
Looks great! Would it be possible to add a test along the lines of https://github.com/pytorch/pytorch/pull/90029/files?show-viewed-files=true&file-filters%5B%5D=#diff-d183f2afc51d6a59bc70094e8f476d2468c45e415500f6eb60abad955e065156R155-R237? wdyt?
Yes, I added similar tests for reduction and view.
For reduction, the test checks that the output shape is reduced as expected (I needed the reduced_shape function for that).
For view, the test checks that the output tensor is indeed a view of the base tensor.
I also added a similar test for views on regular tensors (not in fake mode), because I didn't see such a test in other places.
What do you think?
There just seems to be a large overhead for these tests (about 1min each) since they iterate over all the operators and not just the relevant view/reduction ops.
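A minimal sketch (not the PR's actual test code, assuming a standard torch install) of the two properties being checked: a view op's output aliases its input, and a reduction drops the reduced dimension:

```python
import torch

x = torch.randn(2, 3)

y = x.transpose(0, 1)        # transpose is a pure view op
assert y._is_view()
assert y._base is x          # the view's base is the original tensor

s = x.sum(dim=0)             # reducing dim 0 leaves shape (3,)
assert s.shape == torch.Size([3])
```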
I renamed reduced_shape -> compute_reduced_shape since it conflicted with a variable name inside test_reductions.py. I also added missing documentation for the treat_empty_dim_as_none argument.
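For reference, a hypothetical sketch of what a compute_reduced_shape helper could look like (the name comes from the comment above; this is not the PR's actual implementation):

```python
def compute_reduced_shape(shape, dim=None, keepdim=False):
    """Return the expected output shape of a reduction over `dim`."""
    if dim is None:
        dims = set(range(len(shape)))          # reduce over everything
    elif isinstance(dim, int):
        dims = {dim % len(shape)}              # normalize negative dims
    else:
        dims = {d % len(shape) for d in dim}
    out = []
    for i, s in enumerate(shape):
        if i in dims:
            if keepdim:
                out.append(1)                  # reduced dims become size 1
        else:
            out.append(s)                      # other dims pass through
    return tuple(out)

assert compute_reduced_shape((2, 3, 4), dim=1) == (2, 4)
assert compute_reduced_shape((2, 3, 4), dim=1, keepdim=True) == (2, 1, 4)
assert compute_reduced_shape((2, 3), dim=None) == ()
```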
- tag: view
  desc: |
    This tag indicates that the operator creates a pure view/alias Tensor and has an explicit
    derivative formula in derivatives.yaml.
hmm, the view tag is probably less critical / a bit redundant, given that OpOverload objects already tell you whether they are a view based on their schema (you can check op_overload.is_view, see link)
I guess the counterargument is that tags are more of a general-purpose way of grouping ops into categories, and it's easier for a pass writer to use tags for everything rather than tags for some metadata and schema info for other metadata. So I'm not too opinionated, although we should at least ensure that our OpInfo tests assert that the tag always matches the op_overload.is_view "source of truth". cc @eellison
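A quick sketch of querying both kinds of metadata, assuming a standard torch install (the existing pointwise tag is used for illustration):

```python
import torch

# Schema alias annotations in native_functions.yaml drive is_view:
assert torch.ops.aten.view.default.is_view
assert not torch.ops.aten.sum.default.is_view

# Tags are queried through op.tags, e.g. the existing pointwise tag:
assert torch.Tag.pointwise in torch.ops.aten.add.Tensor.tags
```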
I think there should be one source of truth for views. That source of truth should be the schema, not tags. People are not going to add the tags to custom operators.
I'm still concerned about operators whose schemas imply they are views but don't really behave like pure views. This includes reshape, contiguous, resolve_conj, resolve_neg, copy, lift_fresh (and maybe a few more).
Perhaps we should change their schemas to reflect that they're not pure view operators?
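To illustrate the concern with one of the listed ops, assuming a standard torch install: reshape sometimes aliases its input and sometimes copies, so its schema alone overstates its "view-ness":

```python
import torch

x = torch.arange(6.)
v = x.reshape(2, 3)      # contiguous input: reshape returns a view
assert v._base is x

nc = v.t()               # non-contiguous 3x2 transpose of the view
c = nc.reshape(6)        # cannot be expressed as a view, so it copies
assert c._base is None   # the result is a fresh tensor, not an alias
```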
@eellison it would also be nice to clean up this big hardcoded list of ops in the min-cut partitioner around what is recompute-friendly. I think it is basically a collection of pointwise/reduction/view ops, although someone would probably need to go through the list carefully to ensure that if we port it over to tags we don't change behavior: https://github.com/pytorch/pytorch/blob/main/torch/_functorch/partitioners.py#L1910
@bdhirsh - agreed, we could probably get rid of a lot of those, given what we have added in torch/_functorch/partitioners.py (line 2040 at 3b38989)
zou3519 left a comment
I'm not convinced we need a view tag. There should be a single source of truth for whether an operator is a view, and that is currently the schema.
] # Fix linalg.vector_norm

# Additional reduction operators
reduction_op_names += [
nit: instead of all of the string parsing here, can we use OpOverload objects directly? That way you can just check `func in reduction_ops` instead of having to parse the names:

```python
reduction_op_names += [
    torch.ops.aten.norm.ScalarOpt_dtype,
    torch.ops.aten.norm.Scalar,
]
```

The downside is that you need to manually write out each overload too, although I'd argue this is good since it makes things more explicit (you needed to manually add the tag for each OpOverload anyway).
def test_view_tag_coverage(self):
    # These operators have the inferred property is_view according to their declaration in native_functions.yaml
    # but they are not pure view operators since they create a copy under certain conditions
    not_view_operators = ["to", "copy", "lift_fresh"]
There is the overload `copy.t` with `copy.t.is_view == True`, which is registered in register_prim_ops.cpp
manually_registered_overloads = [
    "select.t",
    "numpy_T.a",
    "split.default",
I did a quick audit, and I don't see split.default in that file? I do see a split.str though: https://github.com/pytorch/pytorch/blob/main/torch/csrc/jit/runtime/register_prim_ops.cpp#L1778
split.default does have alias annotations. Should this test have failed?
Mentioned above, but I think a nice addition to the test would be to assert that `op.is_view == (Tag.view in op.tags)` for every op that we have OpInfo testing for!
- I think that `split.default` is registered in
- During the discussion we said we want to target only the view operators that are CompositeExplicitAutograd (Expand Tag Set: views & reductions #129020 (comment)).
  This is why the test asserts that `op.is_view == (Tag.view in op.tags)` but also `not overload.has_kernel_for_dispatch_key(DispatchKey.CompositeImplicitAutograd)`.
eellison left a comment
Could we land this with just the reduction tag? I have a use case for it now.
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as
@AlonSardas mind rebasing with just the reduction changes? Otherwise, I can push to the PR.
I'm currently busy and won't be able to rebase it. Feel free to use just the reduction changes.
Add a new 'reduction' tag to tags.yaml and apply it to 98 reduction
operator variants across 21 operator families (sum, mean, min, max,
argmin, argmax, amin, amax, aminmax, prod, all, any, norm, var, std,
std_mean, var_mean, nansum, logsumexp, count_nonzero, linalg_vector_norm).
This tag categorizes operators that perform reduction operations,
computing aggregate values across one or more dimensions of input
tensor(s).
Based on PR #153342
ghstack-source-id: 6c9c42f
Pull Request resolved: #165146
Add a new 'reduction' tag to tags.yaml and apply it to 98 reduction
operator variants across 21 operator families (sum, mean, min, max,
argmin, argmax, amin, amax, aminmax, prod, all, any, norm, var, std,
std_mean, var_mean, nansum, logsumexp, count_nonzero, linalg_vector_norm).
This tag categorizes operators that perform reduction operations,
computing aggregate values across one or more dimensions of input
tensor(s). This categorization can be useful for analysis, optimization,
and compilation tasks.
Note: Only dimensional reduction variants (e.g., min.dim, max.dim) are
tagged. Simple unary aggregations (min(Tensor), max(Tensor)) and
binary/elementwise operations (min.other, max.other) are excluded.
Based on PR #153342
ghstack-source-id: c0e25de
Pull Request resolved: #165155
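To illustrate the Note above, assuming a standard torch install: the three kinds of aten::min overloads it distinguishes all exist today, and only the dimensional one behaves like a reduction over an axis:

```python
import torch

dim_variant = torch.ops.aten.min.dim        # dimensional reduction: tagged
full_variant = torch.ops.aten.min.default   # full aggregation: excluded
binary_variant = torch.ops.aten.min.other   # elementwise binary: excluded

x = torch.tensor([[1., 5.], [3., 2.]])
values, indices = dim_variant(x, 1)         # per-row minima and their indices
assert values.tolist() == [1.0, 2.0]
assert indices.tolist() == [0, 1]
```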
Add a new 'reduction' tag to tags.yaml and apply it to 98 reduction
operator variants across 21 operator families (sum, mean, min, max,
argmin, argmax, amin, amax, aminmax, prod, all, any, norm, var, std,
std_mean, var_mean, nansum, logsumexp, count_nonzero, linalg_vector_norm).
This tag categorizes operators that perform reduction operations,
computing aggregate values across one or more dimensions of input
tensor(s).
Based on PR #153342 - co-written with @AlonSardas.
Just as we have the pointwise tag, this can be useful for compiler passes,
or for opting into sharding rules.
Pull Request resolved: #165155
Approved by: https://github.com/ezyang, https://github.com/zou3519, https://github.com/mlazos
Fixes #129020
Here are the lists of operator names annotated with the view and reduction tags:
`view` overloads: as_strided, detach, view_as_real, view_as_complex, diagonal, select, slice, transpose, split, t, expand, view, unsqueeze, unfold, squeeze, permute, unbind, split_with_sizes, alias
`reduction` overloads: sum, mean, amin, amax, argmin, argmax, prod, all, norm, var, std, aminmax, nansum, logsumexp, any, std_mean, var_mean, count_nonzero, linalg_vector_norm, max, min
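A small hypothetical helper (not from the PR) for consuming these tags in a version-tolerant way: it looks a tag up by name, so the same code runs on torch builds that predate the new tags. The existing pointwise tag is used in the asserts since it is present in current builds:

```python
import torch

def has_tag(op, tag_name):
    """Return True iff this torch build defines the tag and `op` carries it."""
    tag = getattr(torch.Tag, tag_name, None)
    return tag is not None and tag in op.tags

assert has_tag(torch.ops.aten.add.Tensor, "pointwise")
assert not has_tag(torch.ops.aten.sum.dim_IntList, "pointwise")
```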