preserve node stacktraces from compiled autograd through AOTDispatcher, due to GmWrapper #133574
bdhirsh wants to merge 1 commit into gh/bdhirsh/605/base
Conversation
…r, due to GmWrapper [ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133574
Note: Links to docs will display an error until the docs builds have been completed.
❌ 14 New Failures, 1 Unrelated Failure as of commit 50d5811 with merge base 454713f.
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but was present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
CI is unhappy - from a quick look, I'm failing to ensure that the other args to the compiled backward (like hooks) are properly accounted for in the graph
```python
# to ensure args are boxed.
assert params_len == 0
assert len(kwargs) == 0
out = PropagateUnbackedSymInts(mod_).run(args)
```
there's some logic in GmWrapper.forward that we'll need here: pytorch/torch/_dynamo/utils.py, lines 2907 to 2909 (at 90d2593)
You should probably just have a "middleware" wrapper that uniformly takes care of unwrapping GmWrapper and modifying the calling convention, should be cleaner.
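The "middleware" suggestion could look something like the sketch below: one adapter that uniformly unwraps the wrapper and converts between the boxed and flat calling conventions. `GmWrapperLike` and `unwrap_middleware` are hypothetical stand-ins for illustration, not `torch._dynamo.utils.GmWrapper` or any real PyTorch API.

```python
# Hypothetical middleware sketch: uniformly unwrap a boxed-convention wrapper
# and expose a flat-args call, so downstream code never sees the wrapper.

class GmWrapperLike:
    """Stand-in for a GmWrapper-style module: takes one mutable list of args."""

    def __init__(self, inner_fn):
        self.inner_fn = inner_fn

    def __call__(self, boxed_args):
        args = list(boxed_args)
        boxed_args.clear()  # boxed convention: consume the caller's list
        return self.inner_fn(*args)


def unwrap_middleware(mod):
    """Return (inner, call) where call(*flat_args) invokes `mod` correctly
    whether or not it is wrapped, and `inner` is the unwrapped callable."""
    if isinstance(mod, GmWrapperLike):
        inner = mod.inner_fn
        call = lambda *args: mod(list(args))  # re-box for the wrapper
    else:
        inner = mod
        call = mod
    return inner, call


inner, call = unwrap_middleware(GmWrapperLike(lambda x, y: x * y))
```

With this shape, the isinstance checks and calling-convention fixups live in one place instead of being scattered through the tracing code.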
@xmfan is there a reason we HAVE to have a GmWrapper? Shouldn't custom GraphModule prelude/postlude be enough here?
What is the prelude/postlude? We use GmWrapper to work around the fact that the dynamo GraphModule needs boxed inputs, but AOTDispatcher always traces the GraphModule with flat inputs
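The boxed-inputs point above can be illustrated with a minimal sketch. `BoxedWrapper` here is a hypothetical name for illustration, not the real GmWrapper; the key property it models is that the caller hands over a single mutable list, which the wrapper empties after unpacking, so the caller's references die early and activations can be freed.

```python
# Minimal sketch of the boxed calling convention (illustrative, not PyTorch).

class BoxedWrapper:
    """Adapts a flat-args function to a boxed convention: the caller passes
    one mutable list, which is cleared after unpacking so the caller no
    longer holds references to the inputs."""

    def __init__(self, flat_fn):
        self.flat_fn = flat_fn

    def __call__(self, boxed_args):
        args = list(boxed_args)
        boxed_args.clear()  # drop the caller's handles to the inputs
        return self.flat_fn(*args)


def flat_fn(x, y):
    return x + y


box = [3, 4]
out = BoxedWrapper(flat_fn)(box)
# after the call, `box` is empty: the inputs are no longer reachable via it
```

A flat-convention tracer, by contrast, sees `flat_fn(x, y)` directly, which is the mismatch GmWrapper papers over.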
I can look into it, but we probably still need GmWrapper for non-dynamo frontends that are passing in non-overridden graphs
```python
# https://github.com/pytorch/pytorch/issues/103569

def functional_call(*args, **kwargs):
    nonlocal mod
```
Why nonlocal? Are you assigning over mod?
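For context on the `nonlocal` question: the declaration is only required when the closure *rebinds* the name; merely reading the object, or mutating it in place, needs no declaration. A minimal illustration (the names here are made up for the example):

```python
# nonlocal is needed only for rebinding, not for reads or in-place mutation.

def outer():
    mod = {"calls": 0}

    def reads_and_mutates():
        mod["calls"] += 1       # in-place mutation: no nonlocal needed
        return mod

    def rebinds():
        nonlocal mod            # required: we assign over `mod` itself
        mod = {"calls": 99}
        return mod

    reads_and_mutates()
    rebinds()
    return mod["calls"]

# outer() returns 99: the rebinding in rebinds() replaced the outer dict
```

So if `functional_call` never assigns `mod = ...`, the `nonlocal` is dead weight and can be dropped.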
```diff
         mod_, pytree.tree_unflatten(args[:params_len], params_spec)
     ), maybe_disable_thunkify():
-        if isinstance(mod, torch.fx.GraphModule):
+        if isinstance(mod, (torch.fx.GraphModule, torch._dynamo.utils.GmWrapper)):
```
Why not do the test here on mod_?
@bdhirsh I believe this would be super useful for compiled autograd debugging in general!
Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as
Fixes #133567
New log output from the repro:
The problem was that we expect the input to AOTAutograd to be a GraphModule in order to do all of the fancy stacktrace-preservation logic, but we now need to handle compiled autograd passing in a GmWrapper instead (which it uses to preserve input boxing, so inductor can properly free activations).

Stack from ghstack (oldest at bottom):
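The stacktrace-preservation idea in the description can be sketched roughly as follows. `SimpleNode` is a hypothetical stand-in for `torch.fx.Node` (which records a `stack_trace` string per node); this is an illustration of the general technique, not the actual AOTDispatcher code.

```python
# Sketch: when retracing a graph, carry each original node's recorded user
# stack trace over to the corresponding new node, so debugging tools can
# still point at the user's source line.

class SimpleNode:
    """Stand-in for an fx node carrying an optional stack_trace string."""

    def __init__(self, name, stack_trace=None):
        self.name = name
        self.stack_trace = stack_trace


def retrace_preserving_traces(old_nodes):
    new_nodes = []
    for old in old_nodes:
        new = SimpleNode(old.name)         # "retraced" node, trace lost
        new.stack_trace = old.stack_trace  # restore the user-code location
        new_nodes.append(new)
    return new_nodes


orig = [SimpleNode("add", 'File "train.py", line 12, in backward')]
retraced = retrace_preserving_traces(orig)
```

The bug this PR fixes is precisely that this copying step was skipped when the incoming object was a GmWrapper rather than a bare GraphModule.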