
[invoke_subgraph] Don't run the graph twice when autograd enabled #167245

Closed
angelayi wants to merge 9 commits into gh/angelayi/133/base from gh/angelayi/133/head

Conversation

@angelayi
Contributor

@angelayi angelayi commented Nov 6, 2025

In the previous PR we found that the eager autograd impl of invoke_subgraph calls the subgraph twice. If the subgraph contains effects, those effects will run twice, which is bad. This PR fixes the issue by getting the output metadata from the subgraph's `node.meta` if it exists.
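The idea behind the fix can be sketched without torch at all. This is a minimal, illustrative sketch (names like `Node`, `output_meta`, and `autograd_impl` are hypothetical, not the actual PyTorch implementation): if output metadata was already recorded on the call node's `meta`, reuse it instead of running the subgraph an extra time just to discover the output types, so the subgraph's side effects fire only once.

```python
class Node:
    """Stand-in for a torch.fx.Node with its .meta dict."""
    def __init__(self):
        self.meta = {}

effect_count = 0

def subgraph(x):
    global effect_count
    effect_count += 1          # stand-in for an effectful op inside the subgraph
    return x + 1

def autograd_impl(node, x):
    if "output_meta" in node.meta:
        # Fixed path: metadata was recorded during tracing, no extra run.
        out_meta = node.meta["output_meta"]
    else:
        # Old path: one extra run only to discover the output metadata,
        # which duplicates any effects inside the subgraph.
        out_meta = type(subgraph(x))
    out = subgraph(x)          # the single "real" execution
    return out, out_meta

node = Node()
node.meta["output_meta"] = int  # pretend tracing recorded the metadata
out, meta = autograd_impl(node, 3)
print(effect_count)  # → 1: the effect ran once, not twice
```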

Stack from ghstack (oldest at bottom):

Differential Revision: D87392740

@pytorch-bot

pytorch-bot bot commented Nov 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167245

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

angelayi added a commit that referenced this pull request Nov 6, 2025
@angelayi angelayi marked this pull request as draft November 6, 2025 22:39
@angelayi angelayi requested a review from anijain2305 November 6, 2025 22:39
Contributor

@anijain2305 anijain2305 left a comment


Nice!

angelayi added a commit that referenced this pull request Nov 14, 2025
This was referenced Nov 14, 2025
angelayi added a commit that referenced this pull request Nov 17, 2025
@angelayi angelayi added the `topic: not user facing` label Nov 18, 2025
@angelayi
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the `ciflow/trunk` label Nov 18, 2025
@angelayi
Contributor Author

@pytorchbot merge -f "failure looks unrelated"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@yangw-dev
Contributor

@pytorchbot revert -m "the base pr is broken internal tests in the stack" -c ghfirst

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Nov 20, 2025
…bled (#167245)"

This reverts commit 789240b.

Reverted #167245 on behalf of https://github.com/yangw-dev due to the base pr is broken  internal tests in the stack ([comment](#167245 (comment)))
@pytorchmergebot
Collaborator

@angelayi your PR has been successfully reverted.

@pytorchmergebot pytorchmergebot added the `Reverted` and `ci-no-td` labels Nov 20, 2025
@angelayi
Contributor Author

@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@angelayi
Contributor Author

@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #167363

pytorchmergebot pushed a commit that referenced this pull request Nov 24, 2025
Updates the implementation of `unlift_tokens` to handle unlifting invoke_subgraph.

Some context on `unlift_tokens`: currently, tokens are threaded as inputs and outputs of the toplevel graph produced by AOTAutograd. However, we don't want the Inductor-traced graph to have any notion of effects/tokens; the tokens should only introduce some extra dependency behavior. So we unlift the tokens from the toplevel graph: instead of coming from placeholder nodes, tokens come from a `_make_token` call, and instead of being outputs of the graph, all tokens are sunk into `_sink_tokens`.

Similarly, we want the invoke_subgraph subgraph to not have any notion of tokens, so we also remove the tokens from the inputs of the invoke_subgraph subgraph. However, we still need a way to mark the invoke_subgraph call as effectful at the toplevel module, to prevent invoke_subgraph calls from being reordered, so I wrap the invoke_subgraph call with `with_effects`.

Before:
```
def forward(self, token, x):
    repeated_subgraph0 = self.repeated_subgraph0
    invoke_subgraph = torch.ops.higher_order.invoke_subgraph(repeated_subgraph0, 'subgraph_0', token, x)
    getitem = invoke_subgraph[0]  # output token
    getitem_1 = invoke_subgraph[1]
    return (getitem, getitem_1)

def repeated_subgraph(self, token, x):
    with_effects = torch.ops.higher_order.with_effects(token, torch.ops.mylib.record_memory.default, 'forward', 'N')
    getitem = with_effects[0]  # output token
    add = torch.ops.aten.add(x, x)
    return (getitem, add)
```

After:
```
def forward(self, x):
    token = torch.ops.prims._make_token.default()
    repeated_subgraph0 = self.repeated_subgraph0
    invoke_subgraph = torch.ops.higher_order.with_effects(
        token, torch.ops.higher_order.invoke_subgraph, repeated_subgraph0, 'subgraph_0', token, x
    )
    getitem = invoke_subgraph[0]  # output token
    getitem_1 = invoke_subgraph[1]
    _ = torch.ops.prims._sink_tokens.default([getitem])
    return (getitem_1,)

def repeated_subgraph(self, x):
    token = torch.ops.prims._make_token.default()
    with_effects = torch.ops.higher_order.with_effects(token, torch.ops.mylib.record_memory.default, 'forward', 'N')
    getitem = with_effects[0]  # output token
    add = torch.ops.aten.add(x, x)
    _ = torch.ops.prims._sink_tokens.default([getitem])
    return (add,)
```
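The transformation above can be sketched as a tiny graph pass over a toy node list. This is an illustrative sketch, not the real torch.fx pass: nodes are plain dicts, and token values are identified by name; the targets `_make_token` and `_sink_tokens` mirror the ops shown in the listings.

```python
def unlift_tokens(nodes, token_names):
    # Materialize a fresh token at the top instead of taking it as input.
    new = [{"op": "call_function", "name": "token", "target": "_make_token", "args": ()}]
    for n in nodes:
        if n["op"] == "placeholder" and n["name"] in token_names:
            continue  # drop the lifted token input
        if n["op"] == "output":
            toks = tuple(a for a in n["args"] if a in token_names)
            data = tuple(a for a in n["args"] if a not in token_names)
            # Sink tokens instead of returning them.
            new.append({"op": "call_function", "name": "_",
                        "target": "_sink_tokens", "args": toks})
            new.append({"op": "output", "name": None, "target": None, "args": data})
            continue
        new.append(n)
    return new

# The "Before" repeated_subgraph from the listing above, flattened:
before = [
    {"op": "placeholder", "name": "token", "target": None, "args": ()},
    {"op": "placeholder", "name": "x", "target": None, "args": ()},
    {"op": "call_function", "name": "with_effects", "target": "with_effects",
     "args": ("token", "record_memory", "forward", "N")},
    {"op": "call_function", "name": "getitem", "target": "getitem",
     "args": ("with_effects", 0)},
    {"op": "call_function", "name": "add", "target": "aten.add", "args": ("x", "x")},
    {"op": "output", "name": None, "target": None, "args": ("getitem", "add")},
]
# Both the input token and the output token from with_effects are tokens.
after = unlift_tokens(before, token_names={"token", "getitem"})
print(after[0]["target"], after[-2]["target"], after[-1]["args"])
# → _make_token _sink_tokens ('add',)
```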

Differential Revision: [D87668981](https://our.internmc.facebook.com/intern/diff/D87668981)
Pull Request resolved: #167363
Approved by: https://github.com/fxdawnn
ghstack dependencies: #167231, #167245
zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Nov 25, 2025
…torch#167245)

In the [previous PR](https://github.com/pytorch/pytorch/pull/167231/files#diff-e2b74af5d8b538a7d07d18507d27010703742ddad5f819992b55f5abc6d9a502R964-R966) we found that the autograd eager impl of invoke_subgraph calls the subgraph twice. If the subgraph contains effects then effects will be run twice, which is bad. This PR fixes the issue by getting the output metadata from `subgraph`'s `node.meta` if it exists.

Differential Revision: [D87392740](https://our.internmc.facebook.com/intern/diff/D87392740)
Pull Request resolved: pytorch#167245
Approved by: https://github.com/anijain2305
ghstack dependencies: pytorch#167231
zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Nov 25, 2025
JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
tiendatngcs pushed a commit to tiendatngcs/pytorch-Dec25 that referenced this pull request Dec 10, 2025
tiendatngcs pushed a commit to tiendatngcs/pytorch-Dec25 that referenced this pull request Dec 10, 2025
@github-actions github-actions bot deleted the gh/angelayi/133/head branch December 25, 2025 02:20

Labels

ci-no-td (Do not run TD on this PR)
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
Reverted
topic: not user facing (topic category)

4 participants