
Fix tensor subclass + dynamic shapes in torch.compile + aot autograd#125941

Closed
guilhermeleobas wants to merge 75 commits into gh/guilhermeleobas/48/base from gh/guilhermeleobas/48/head

Conversation


pytorch-bot bot commented May 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125941

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit c9db0c5 with merge base 3b0f393 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

guilhermeleobas added a commit that referenced this pull request May 10, 2024

guilhermeleobas commented May 10, 2024

This is pretty much a work in progress; I just want to see what is currently breaking.

Fixes issue: #124619

Changes

This PR addresses a bug in the interaction between tensor subclasses and symbolic shapes. For each subclass, it appends the subclass's sizes to the flat list of arguments and recomputes the shapes from them at runtime.

Most of the changes are in the unwrap_tensor_subclasses function, which now takes two extra flags: append_extra and is_runtime. While tracing, if append_extra is true and we are tracing the forward graph, the extra size arguments are appended.

An extra field (flat_tensor_extra_sizes_offset) is introduced in SubclassCreationMeta. It stores the offset, counted from right to left, of the sizes associated with a tensor subclass. At runtime, the sizes can be recovered as args[len(args) - offset : len(args) - offset + num_sizes], where offset is the new field and num_sizes is the number of sizes for the given subclass.
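For illustration, the right-to-left offset lookup described above can be sketched with plain Python lists (the function and variable names here are made up for the example, not taken from the PR):

```python
# Sketch of the right-to-left offset lookup described above.
# Names are illustrative only, not the actual PR code.

def sizes_from_args(args, offset, num_sizes):
    # `offset` counts positions from the right end of the flat
    # argument list; the subclass sizes occupy `num_sizes` slots
    # starting there.
    start = len(args) - offset
    return args[start : start + num_sizes]

# Flat runtime arguments: two inner tensors (strings stand in for
# them here), followed by the appended sizes for two subclasses.
args = ["t_a", "t_b", 3, 4, 5, 6]

print(sizes_from_args(args, offset=4, num_sizes=2))  # [3, 4]
print(sizes_from_args(args, offset=2, num_sizes=2))  # [5, 6]
```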

Test plan

Add tests for two different subclasses: TwoTensor and DoubleTensor. The
latter is a wrapper that behaves as if the inner tensor were twice its
original size.

The set of tests is composed of functions that return a mix of subclasses
and plain tensors.

[ghstack-poisoned]
[ghstack-poisoned]
guilhermeleobas added a commit that referenced this pull request May 16, 2024
@guilhermeleobas guilhermeleobas requested a review from bdhirsh May 16, 2024 16:30
@guilhermeleobas guilhermeleobas marked this pull request as ready for review May 16, 2024 16:30

ezyang commented May 17, 2024

What exactly is the algorithmic strategy here?

guilhermeleobas added a commit that referenced this pull request May 23, 2024
@guilhermeleobas guilhermeleobas marked this pull request as draft May 23, 2024 13:30
@pytorch-bot pytorch-bot bot added the release notes: fx release notes category label May 28, 2024
guilhermeleobas added a commit that referenced this pull request May 29, 2024
guilhermeleobas added a commit that referenced this pull request May 30, 2024

bdhirsh commented Oct 24, 2024

@guilhermeleobas the new recursive size/stride handling looks mostly good to me. I was still a bit worried about edge cases, so I tried a few tests locally and I got a failure involving nested subclasses:

(I wanted to stress test a bit so I used a version of TwoTensor that relaxes the constraint that the shapes of the two inner tensors are the same)

import torch
import torch.utils._pytree as pytree


# A simple tensor subclass that holds two tensors internally, and runs every op on both tensors.
class TwoTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, a, b, outer_size=None, outer_stride=None):
        if outer_size is None:
            outer_size = a.size()
        if outer_stride is None:
            outer_stride = a.stride()

        assert (
            a.device == b.device
            and a.layout == b.layout
            and a.requires_grad == b.requires_grad
            and a.dtype == b.dtype
        )
        shape = outer_size
        kwargs = {}
        kwargs["strides"] = outer_stride
        kwargs["storage_offset"] = a.storage_offset()
        kwargs["device"] = a.device
        kwargs["layout"] = a.layout
        kwargs["requires_grad"] = a.requires_grad
        kwargs["dtype"] = a.dtype
        out = torch.Tensor._make_wrapper_subclass(cls, shape, **kwargs)

        return out

    def __init__(self, a, b, outer_size=None, outer_stride=None):
        self.a = a
        self.b = b

    def __repr__(self):
        a_repr = repr(self.a)
        b_repr = repr(self.b)
        return f"TwoTensor({a_repr}, {b_repr})"

    def __tensor_flatten__(self):
        return ["a", "b"], None

    @staticmethod
    def __tensor_unflatten__(inner_tensors, meta, outer_size, outer_stride):
        assert meta is None
        a, b = inner_tensors["a"], inner_tensors["b"]
        if type(a) is torch.Tensor:
            assert outer_size is not None
            assert outer_stride is not None
        return TwoTensor(a, b, outer_size, outer_stride)

    @classmethod
    def __torch_dispatch__(cls, func, types, args, kwargs):
        if kwargs is None:
            kwargs = {}
        args_a = pytree.tree_map_only(TwoTensor, lambda x: x.a, args)
        args_b = pytree.tree_map_only(TwoTensor, lambda x: x.b, args)

        kwargs_a = pytree.tree_map_only(TwoTensor, lambda x: x.a, kwargs)
        kwargs_b = pytree.tree_map_only(TwoTensor, lambda x: x.b, kwargs)

        out_a = func(*args_a, **kwargs_a)
        out_b = func(*args_b, **kwargs_b)
        out_a_flat, spec = pytree.tree_flatten(out_a)
        out_b_flat = pytree.tree_leaves(out_b)
        # for aten ops that return non-tensors, just assume that
        # our two inner tensors return the same value
        out_flat = [
            TwoTensor(o_a, o_b) if isinstance(o_a, torch.Tensor) else o_a
            for o_a, o_b in zip(out_a_flat, out_b_flat)
        ]
        out = pytree.tree_unflatten(out_flat, spec)
        return out

@torch.compile(dynamic=True)
def f(x, y):
    tmp1 = x.sin()
    tmp2 = y.sin()
    return tmp1.sum(), tmp2.sum()


x = TwoTensor(
    TwoTensor(
        torch.randn(3, 4),
        torch.randn(5, 6, 7),
    ),
    TwoTensor(
        torch.randn(4),
        torch.randn(2, 3),
    )
)

y = TwoTensor(
    torch.randn(2, 3, 4, 5),
    TwoTensor(
        torch.randn(3, 4),
        torch.randn(5),
    )
)

out = f(x, y)

This fails for me, with:

  File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 630, in inner_fn
    unwrapped_args = runtime_unwrap_tensor_subclasses(
  File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/subclass_utils.py", line 239, in runtime_unwrap_tensor_subclasses
    xs_inner.extend(flatten_subclass(typing.cast(Tensor, x), meta))
  File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/subclass_utils.py", line 206, in flatten_subclass
    tensors_and_sizes.extend(flatten_subclass(inner_tensor, inner_meta))
  File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/subclass_utils.py", line 221, in flatten_subclass
    assert len(stride) == len(symint_placeholders)
AssertionError

@guilhermeleobas
Collaborator Author

This fails for me, with:
File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 630, in inner_fn
unwrapped_args = runtime_unwrap_tensor_subclasses(
File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/subclass_utils.py", line 239, in runtime_unwrap_tensor_subclasses
xs_inner.extend(flatten_subclass(typing.cast(Tensor, x), meta))
File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/subclass_utils.py", line 206, in flatten_subclass
tensors_and_sizes.extend(flatten_subclass(inner_tensor, inner_meta))
File "/home/hirsheybar/local/c/pytorch/torch/_functorch/_aot_autograd/subclass_utils.py", line 221, in flatten_subclass
assert len(stride) == len(symint_placeholders)
AssertionError

Oops. It was just one small mistake where I mistyped the name of a variable. Should be good now.

@guilhermeleobas
Collaborator Author

I just noticed that @IvanKobzarev's PR (#138498) does a micro-optimization on unwrap_tensor_subclasses, and since this PR changes that function a lot, I can adapt the code to include his changes. What do you think?


bdhirsh commented Oct 25, 2024

ah yes that would be great. @IvanKobzarev has been looking into subclass runtime overhead, and it would be nice if we can avoid this PR making it too much worse

@guilhermeleobas
Collaborator Author

ah yes that would be great. @IvanKobzarev has been looking into subclass runtime overhead, and it would be nice if we can avoid this PR making it too much worse

@IvanKobzarev, did you use any code to benchmark #138498? If so, can you share it with me?


 if subclass_metas is None:
-    xs_inner.extend(get_plain_tensors(typing.cast(Tensor, x)))
+    get_plain_tensors(typing.cast(Tensor, x), out_append_list=xs_inner)

-    subclass: Tensor, out_append_list: Optional[List[Tensor]] = None
-) -> List[Tensor]:
+    subclass: Tensor, out_append_list: Optional[List[Union[Tensor, int, SymInt]]] = None
+) -> List[Union[Tensor, int, SymInt]]:

hmm, the type signature here is a bit confusing, since we never actually append ints/SymInts to the list in this function. I guess you needed this because in the out_append_list= case, the list we pass in might have symints in it already?

Instead, what do you think of: just refactoring this function to always accept an output list to append to, and mandating that anybody using this API must pass in their own list (from a quick grep there are only 2 call sites of this function, both within AOTAutograd)
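To illustrate the suggested refactor, an always-caller-provided output list could look roughly like this (a hedged sketch with stand-in types, not the actual AOTAutograd code; FakeSubclass is invented for the example):

```python
from typing import Any, List

# Stand-in for a wrapper tensor subclass; the real code would walk
# __tensor_flatten__ on torch.Tensor subclasses instead.
class FakeSubclass:
    def __init__(self, *children: Any) -> None:
        self.children = children

def get_plain_tensors(subclass: FakeSubclass, out: List[Any]) -> None:
    # Callers must always pass their own accumulator list, so the
    # function never allocates one, and the accumulator may already
    # contain non-tensor entries (e.g. ints/SymInts).
    for inner in subclass.children:
        if isinstance(inner, FakeSubclass):
            get_plain_tensors(inner, out)  # recurse into nested subclasses
        else:
            out.append(inner)

acc: List[Any] = [99]  # pretend a size was already appended
get_plain_tensors(FakeSubclass("a", FakeSubclass("b", "c")), acc)
print(acc)  # [99, 'a', 'b', 'c']
```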

@bdhirsh left a comment

left a few more comments, but otherwise I think this is ready to land. Thanks for all the hard work!

@IvanKobzarev
Contributor

ah yes that would be great. @IvanKobzarev has been looking into subclass runtime overhead, and it would be nice if we can avoid this PR making it too much worse

@IvanKobzarev, did you use any code to benchmark #138498? If so, can you share it with me?

Hi, sorry for the delay in replying.

At the moment I use two things for profiling:
1. The not-yet-landed PR #136478, which uses James's profiling; in the test I can then take the times from the logger.
2. Manually averaging time.time_ns() in a global variable inside unwrap_tensor_subclasses() in runtime_wrappers.py.

@guilhermeleobas
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.


mlazos commented Nov 5, 2024

Hi @mlazos, it is. But there's one test that fails if I remove the maybe_enable_thunkify call. I'll sync with @bdhirsh tomorrow.

@guilhermeleobas can that call be removed now? I think it's still there with a note that it can be removed after this PR is closed.


bdhirsh commented Nov 5, 2024

oh @mlazos our current hypothesis is that this context manager was only needed because there were some tests that did tensor * nested_int compute in a compiled region, which @jbschlosser has since banned as part of #138496 (independently of this PR). So I think it's worth a try to kill that code and see if CI is green


mlazos commented Nov 5, 2024

Awesome I will try that


Labels

ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), Merged, module: cpu (CPU specific problem, e.g. perf, algorithm), module: dynamo, module: inductor, oncall: distributed (Add this issue/PR to distributed oncall triage queue), open source, release notes: fx (release notes category)


Development

Successfully merging this pull request may close these issues.

torch.compile + dynamic shapes + tensor subclass graph output is broken

9 participants