[invoke_subgraph] Don't run the graph twice when autograd enabled #167245
angelayi wants to merge 9 commits into gh/angelayi/133/base
Conversation
In the [previous PR](https://github.com/pytorch/pytorch/pull/167231/files#diff-e2b74af5d8b538a7d07d18507d27010703742ddad5f819992b55f5abc6d9a502R964-R966) we found that the eager autograd implementation of invoke_subgraph calls the subgraph twice. If the subgraph contains effects, those effects run twice, which is incorrect. This PR fixes the issue by reading the output metadata from `subgraph`'s `node.meta` when it exists, instead of running the subgraph an extra time.
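The idea behind the fix can be sketched without PyTorch. This is a minimal illustration, not the actual implementation: `FakeNode`, `invoke_with_autograd`, and the `"val"` key layout are stand-ins (FX nodes do cache fake-tensor output values under `node.meta["val"]`, but the real code paths differ).

```python
# Hedged sketch of the fix's idea, using plain Python stand-ins.
# Before the fix, the autograd wrapper ran `subgraph` an extra time just to
# discover the outputs' metadata. The fix reads that metadata from the FX
# node's `meta["val"]` when available, so the subgraph runs only once.

class FakeNode:
    """Stand-in for an FX node carrying cached output metadata."""
    def __init__(self, meta):
        self.meta = meta

run_count = 0

def subgraph(x):
    global run_count
    run_count += 1          # a side effect: running twice would repeat it
    return x * 2

def invoke_with_autograd(subgraph, node, x):
    # Prefer cached output metadata over an extra trial run.
    if node is not None and "val" in node.meta:
        out_meta = node.meta["val"]       # reuse cached metadata
    else:
        out_meta = subgraph(x)            # fallback: one extra run
    result = subgraph(x)                  # the single "real" run
    return result, out_meta

node = FakeNode(meta={"val": "cached-output-metadata"})
result, meta = invoke_with_autograd(subgraph, node, 3)
print(run_count)  # the subgraph ran exactly once
```

With the metadata cached on the node, any effects inside the subgraph fire once per invocation rather than twice.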
@pytorchbot merge

@pytorchbot merge -f "failure looks unrelated"

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
@pytorchbot revert -m "the base pr is broken internal tests in the stack" -c ghfirst

@pytorchbot successfully started a revert job.
Revert "[invoke_subgraph] Don't run the graph twice when autograd enabled (#167245)"

This reverts commit 789240b. Reverted #167245 on behalf of https://github.com/yangw-dev because the base PR broke internal tests in the stack.
@angelayi your PR has been successfully reverted.
Differential Revision: [D87392740](https://our.internmc.facebook.com/intern/diff/D87392740)
@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Starting merge as part of PR stack under #167363
Updates the implementation of `unlift_tokens` to handle unlifting invoke_subgraph.
For context on `unlift_tokens`: currently, tokens are threaded as inputs and outputs of the top-level graph produced by AOTAutograd. However, we don't want the Inductor-traced graph to have any notion of effects/tokens, only the extra dependency behavior that tokens introduce. So we unlift the tokens from the top-level graph: instead of placeholder nodes, the tokens come from a `_make_token` call, and instead of being returned as outputs, all tokens are sunk into `_sink_tokens`.
Similarly, we don't want the invoke_subgraph subgraph to have any notion of tokens, so we also remove the tokens from its inputs. However, we still need a way to mark the invoke_subgraph call as effectful at the top-level module, to prevent invoke_subgraph calls from being reordered, so the invoke_subgraph is wrapped with `with_effects`.
Before:
```
def forward(self, token, x):
    repeated_subgraph0 = self.repeated_subgraph0
    invoke_subgraph = torch.ops.higher_order.invoke_subgraph(repeated_subgraph0, 'subgraph_0', token, x)
    getitem = invoke_subgraph[0]  # output token
    getitem_1 = invoke_subgraph[1]
    return (getitem, getitem_1)

def repeated_subgraph(self, token, x):
    with_effects = torch.ops.higher_order.with_effects(token, torch.ops.mylib.record_memory.default, 'forward', 'N')
    getitem = with_effects[0]  # output token
    add = torch.ops.aten.add(x, x)
    return (getitem, add)
```
After:
```
def forward(self, x):
    token = torch.ops.prims._make_token.default()
    repeated_subgraph0 = self.repeated_subgraph0
    invoke_subgraph = torch.ops.higher_order.with_effects(
        token, torch.ops.higher_order.invoke_subgraph, repeated_subgraph0, 'subgraph_0', x
    )
    getitem = invoke_subgraph[0]  # output token
    getitem_1 = invoke_subgraph[1]
    _ = torch.ops.prims._sink_tokens.default([getitem])
    return (getitem_1,)

def repeated_subgraph(self, x):
    token = torch.ops.prims._make_token.default()
    with_effects = torch.ops.higher_order.with_effects(token, torch.ops.mylib.record_memory.default, 'forward', 'N')
    getitem = with_effects[0]  # output token
    add = torch.ops.aten.add(x, x)
    _ = torch.ops.prims._sink_tokens.default([getitem])
    return (add,)
```
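The transformation above can be sketched as a small standalone pass. This is an illustrative model only, not the real `unlift_tokens` implementation: the graph is modeled as plain input/output name lists, and the `_make_token` / `_sink_tokens` names simply echo the example above.

```python
# Hedged sketch of the unlifting idea on a toy graph representation:
# a graph is just (input names, output names). Token placeholders are
# dropped from the inputs and replaced by a `_make_token` call in a
# prologue; token outputs are routed into `_sink_tokens` in an epilogue
# instead of being returned to the caller.

def unlift_tokens(inputs, outputs, token_names):
    # Drop token placeholders from the graph inputs...
    new_inputs = [i for i in inputs if i not in token_names]
    # ...and materialize each token with a call at the start of the body.
    prologue = [("_make_token", name) for name in token_names]
    # Token outputs are sunk rather than returned.
    sunk = [o for o in outputs if o in token_names]
    new_outputs = [o for o in outputs if o not in token_names]
    epilogue = [("_sink_tokens", sunk)] if sunk else []
    return new_inputs, prologue, new_outputs, epilogue

# Mirrors the Before -> After example: forward(token, x) -> forward(x).
new_in, pro, new_out, epi = unlift_tokens(
    inputs=["token", "x"], outputs=["token", "add"], token_names=["token"]
)
```

Applied to both the top-level graph and the subgraph, this leaves no token plumbing in the signatures, while the `with_effects` wrapper at the call site preserves the ordering constraint.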
Differential Revision: [D87668981](https://our.internmc.facebook.com/intern/diff/D87668981)
Pull Request resolved: #167363
Approved by: https://github.com/fxdawnn
ghstack dependencies: #167231, #167245
[invoke_subgraph] Don't run the graph twice when autograd enabled (#167245)

Differential Revision: [D87392740](https://our.internmc.facebook.com/intern/diff/D87392740)
Pull Request resolved: #167245
Approved by: https://github.com/anijain2305
ghstack dependencies: #167231