
Adding view and reduction tags #153342

Closed
AlonSardas wants to merge 2 commits into pytorch:viable/strict from AlonSardas:pr129020-reduction-view-tags

Conversation


@AlonSardas AlonSardas commented May 11, 2025

Fixes #129020

Here are the lists of operator names annotated with the view and reduction tags:
view overloads:
as_strided, detach, view_as_real, view_as_complex, diagonal, select, slice, transpose, split, t, expand, view, unsqueeze, unfold, squeeze, permute, unbind, split_with_sizes, alias

reduction overloads:
sum, mean, amin, amax, argmin, argmax, prod, all, norm, var, std, aminmax, nansum, logsumexp, any, std_mean, var_mean, count_nonzero, linalg_vector_norm, max, min
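For context, tags are queryable at runtime on OpOverload objects. A quick sketch (the pointwise tag already ships with PyTorch; the reduction tag proposed here only exists on builds that include it, hence the defensive lookup):

```python
import torch

# Tags are enum members on torch.Tag and are attached to OpOverload objects.
op = torch.ops.aten.add.Tensor
assert torch.Tag.pointwise in op.tags  # add.Tensor is tagged pointwise

# The reduction tag from this PR is only present on builds that include it,
# so look it up defensively rather than assuming it exists.
reduction = getattr(torch.Tag, "reduction", None)
if reduction is not None:
    assert reduction in torch.ops.aten.sum.dim_IntList.tags
```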


pytorch-bot bot commented May 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/153342

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@eellison (Contributor) left a comment:

nice!

test/test_ops.py (Outdated)

@AlonSardas (Author):

Yes, I added similar tests for reduction and view.
For reduction, the test checks that the output shape is reduced as expected (I needed the reduced_shape function for that).
For view, the test checks that the output tensor is indeed a view of the base tensor.
I also added a similar test for views on regular tensors (not in fake mode), because I didn't see such a test in other places.
What do you think?
There just seems to be a large overhead for these tests (about 1min each) since they iterate over all the operators and not just the relevant view/reduction ops.
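The reduced-shape helper described above isn't shown in this excerpt; a minimal, hypothetical sketch of what such a helper could look like (the exact signature and keepdim handling are assumptions, not the PR's code):

```python
from typing import Optional, Sequence, Tuple

def compute_reduced_shape(
    shape: Tuple[int, ...],
    dim: Optional[Sequence[int]] = None,
    keepdim: bool = False,
) -> Tuple[int, ...]:
    """Return the shape produced by reducing `shape` over `dim`.

    dim=None reduces over all dimensions; negative dims are normalized.
    """
    if dim is None:
        dims = set(range(len(shape)))
    else:
        dims = {d % len(shape) for d in dim}  # normalize negative dims
    out = []
    for i, size in enumerate(shape):
        if i in dims:
            if keepdim:
                out.append(1)  # reduced dims are kept as size-1
        else:
            out.append(size)
    return tuple(out)
```

For example, reducing a (2, 3, 4) tensor over dim 1 yields (2, 4), and over dim -1 with keepdim=True yields (2, 3, 1).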

@AlonSardas (Author):

I renamed reduced_shape -> compute_reduced_shape since it conflicted with a variable name inside test_reductions.py. I also added missing documentation for the treat_empty_dim_as_none argument.

@AlonSardas AlonSardas force-pushed the pr129020-reduction-view-tags branch from 75477ec to 9c14c55 Compare May 20, 2025 15:12
@AlonSardas AlonSardas force-pushed the pr129020-reduction-view-tags branch from 9c14c55 to 0d5fb10 Compare May 21, 2025 18:40
@eellison (Contributor) left a comment:

Looks great! Thank you for doing this! Maybe let's get one more PyTorch review before landing. cc @bdhirsh, @zou3519, mind weighing in?

@eellison eellison requested a review from zou3519 May 21, 2025 21:40
- tag: view
  desc: |
    This tag indicates that the operator creates a pure view/alias Tensor and has an explicit
    derivative formula in derivatives.yaml.
Collaborator:

hmm, the view tag is probably less critical / a bit redundant, given that OpOverload objects already tell you whether they are a view based on their schema (you can run op_overload.is_view; see link)

I guess the counterargument is that tags are a more general-purpose way of grouping ops into categories, and it's easier for a pass writer to use tags for everything rather than tags for some metadata and schema info for other metadata. So I'm not too opinionated, although we should at least ensure that our OpInfo tests assert that the tag always matches the op_overload.is_view "source of truth". cc @eellison
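The is_view property referenced above can be checked directly. A small sketch of the suggested consistency check (since the view tag from this PR never landed upstream, the tag half is guarded on its existence):

```python
import torch

# OpOverload.is_view is derived from the schema's alias annotations:
# aten::view returns a Tensor that aliases its input, so is_view is True;
# add.Tensor allocates a fresh output, so it is False.
assert torch.ops.aten.view.default.is_view
assert not torch.ops.aten.add.Tensor.is_view

# The suggested OpInfo assertion: the tag should always match the schema.
# torch.Tag.view is hypothetical here, so only check when it exists.
view_tag = getattr(torch.Tag, "view", None)
if view_tag is not None:
    for op in (torch.ops.aten.view.default, torch.ops.aten.add.Tensor):
        assert op.is_view == (view_tag in op.tags)
```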

Contributor:

I think there should be one source of truth for views. That source of truth should be the schema, not tags. People are not going to add the tags to custom operators.

@AlonSardas (Author) commented Jun 4, 2025:

I'm still concerned about operators whose schemas imply they are views but don't really behave like pure views. This includes reshape, contiguous, resolve_conj, resolve_neg, copy, lift_fresh (and maybe a few more).

Perhaps we should change their schemas to reflect that they're not pure view operators?

@bdhirsh (Collaborator) commented May 28, 2025:

@eellison it would also be nice to clean up this big hardcoded list of ops in the min-cut partitioner around what is recompute-friendly. I think it is basically a collection of pointwise/reduction/view ops, although someone would probably need to go through the list carefully to ensure that if we port it over to tags we don't change behavior: https://github.com/pytorch/pytorch/blob/main/torch/_functorch/partitioners.py#L1910
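A tag-driven replacement for that hardcoded list might look roughly like this. This is a sketch only: the pointwise tag ships with PyTorch today, the reduction tag landed later in #165155, and the view tag is hypothetical, so the lookup is defensive:

```python
import torch
from torch._ops import OpOverload

def is_recompute_friendly(op: OpOverload) -> bool:
    # Sketch of classifying partitioner-recomputable ops by tag instead of a
    # hardcoded list; only tags present in this build are considered.
    candidate_tags = [
        t
        for t in (
            getattr(torch.Tag, name, None)
            for name in ("pointwise", "reduction", "view")
        )
        if t is not None
    ]
    return any(t in op.tags for t in candidate_tags)

assert is_recompute_friendly(torch.ops.aten.add.Tensor)      # pointwise
assert not is_recompute_friendly(torch.ops.aten.mm.default)  # matmul: none of the three
```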

@eellison (Contributor):

@bdhirsh - agreed, we could probably get rid of a lot of those, given that we have added

default_recomputable_ops += pointwise_ops()

@zou3519 (Contributor) left a comment:

I'm not convinced we need a view tag. There should be a single source of truth for whether an operator is a view, and that is currently the schema.

] # Fix linalg.vector_norm

# Additional reduction operators
reduction_op_names += [
Collaborator:

nit: instead of all of the string parsing here, can we use OpOverload objects directly? That way you can just check func in reduction_ops instead of having to parse the names:

reduction_op_names += [
    torch.ops.aten.norm.ScalarOpt_dtype,
    torch.ops.aten.norm.Scalar,
]

The downside is that you need to manually write out each overload too, although I'd argue this is good since it makes things more explicit (you needed to manually add the tag for each OpOverload anyway).

def test_view_tag_coverage(self):
    # These operators have the inferred property is_view according to their declaration in native_functions.yaml,
    # but they are not pure view operators since they create a copy under certain conditions
    not_view_operators = ["to", "copy", "lift_fresh"]
@AlonSardas (Author):

There is the overload copy.t with copy.t.is_view==True, which is registered in register_prim_ops.cpp

manually_registered_overloads = [
    "select.t",
    "numpy_T.a",
    "split.default",
Collaborator:

I did a quick audit, and I don't see split.default in that file? I do see a split.str though: https://github.com/pytorch/pytorch/blob/main/torch/csrc/jit/runtime/register_prim_ops.cpp#L1778

Collaborator:

split.default does have alias annotations. Should this test have failed?

Mentioned above, but I think a nice addition to the test would be to assert that op.is_view == (Tag.view in op.tags) for every op that we have OpInfo testing for!

@AlonSardas (Author) commented Jun 4, 2025:

1. I think that split.default is registered in register_prim_ops.cpp as:

       "aten::split(Tensor(a -> *) self, int[] split_sizes, int dim=0) -> Tensor(a)[]"

2. During the discussion we said we want to target only the view operators that are CompositeExplicitAutograd (Expand Tag Set: views & reductions #129020 (comment)).
   This is why the test asserts op.is_view == (Tag.view in op.tags) but also not overload.has_kernel_for_dispatch_key(DispatchKey.CompositeImplicitAutograd).

@eellison (Contributor) left a comment:

Could we land this with just the reduction tag? I have a use case for it now.

@github-actions (bot):

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Sep 22, 2025
@eellison (Contributor):

@AlonSardas mind rebasing with just the reduction changes? Otherwise, I can push to the PR.

@AlonSardas
Copy link
Author

I'm currently busy and won't be able to rebase it. Feel free to use just the reduction changes.

eellison added a commit that referenced this pull request Oct 10, 2025
Add a new 'reduction' tag to tags.yaml and apply it to 98 reduction
operator variants across 21 operator families (sum, mean, min, max,
argmin, argmax, amin, amax, aminmax, prod, all, any, norm, var, std,
std_mean, var_mean, nansum, logsumexp, count_nonzero, linalg_vector_norm).

This tag categorizes operators that perform reduction operations,
computing aggregate values across one or more dimensions of input
tensor(s).

Based on PR #153342

ghstack-source-id: 6c9c42f
Pull Request resolved: #165146
eellison added a commit that referenced this pull request Oct 10, 2025
Add a new 'reduction' tag to tags.yaml and apply it to 98 reduction
operator variants across 21 operator families (sum, mean, min, max,
argmin, argmax, amin, amax, aminmax, prod, all, any, norm, var, std,
std_mean, var_mean, nansum, logsumexp, count_nonzero, linalg_vector_norm).

This tag categorizes operators that perform reduction operations,
computing aggregate values across one or more dimensions of input
tensor(s). This categorization can be useful for analysis, optimization,
and compilation tasks.

Note: Only dimensional reduction variants (e.g., min.dim, max.dim) are
tagged. Simple unary aggregations (min(Tensor), max(Tensor)) and binary/
elementwise operations (min.other, max.other) are excluded.

Based on PR #153342

ghstack-source-id: c0e25de
Pull Request resolved: #165155
eellison added a commit that referenced this pull request Oct 10, 2025
ghstack-source-id: cbb7930
Pull Request resolved: #165155
eellison added a commit that referenced this pull request Oct 21, 2025
ghstack-source-id: 2f58603
Pull Request resolved: #165155
pytorchmergebot pushed a commit that referenced this pull request Oct 21, 2025
Add a new 'reduction' tag to tags.yaml and apply it to 98 reduction
operator variants across 21 operator families (sum, mean, min, max,
argmin, argmax, amin, amax, aminmax, prod, all, any, norm, var, std,
std_mean, var_mean, nansum, logsumexp, count_nonzero, linalg_vector_norm).

This tag categorizes operators that perform reduction operations,
computing aggregate values across one or more dimensions of input
tensor(s).

Based on PR #153342 - co-written with @AlonSardas.

Just as we have pointwise tag - this can be useful for compiler passes, or for opting into sharding rules.

Pull Request resolved: #165155
Approved by: https://github.com/ezyang, https://github.com/zou3519, https://github.com/mlazos
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
Pull Request resolved: pytorch#165155
Approved by: https://github.com/ezyang, https://github.com/zou3519, https://github.com/mlazos
zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Oct 22, 2025
Pull Request resolved: pytorch#165155
Approved by: https://github.com/ezyang, https://github.com/zou3519, https://github.com/mlazos
@github-actions github-actions bot closed this Nov 1, 2025
@CloseChoice CloseChoice mentioned this pull request Dec 9, 2025