[hoo] Invoke subgraph + effect #167231
angelayi wants to merge 10 commits into gh/angelayi/132/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167231
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR adds support for effectful ops within `invoke_subgraph`.
* Most of the logic is in `invoke_subgraph.py_functionalize_impl`.
* In the functionalization metadata-collection phase, we record the tokens before going further down the dispatcher, and again after coming back from the dispatcher. If the invoke_subgraph subgraph contains effectful nodes, then either the number of effects will change, or the token used for an effect will.
* We store this effect difference in the `InvokeSubgraphCache`, where the key is the subgraph identifier and the value is the effect. For now we only support one effect within a subgraph.
* During the tracing part of AOTAutograd, we then wrap the subgraph to take in and output a token.
Before:
```
def forward(self, x):
repeated_subgraph0 = self.repeated_subgraph0
invoke_subgraph = torch.ops.higher_order.invoke_subgraph(repeated_subgraph0, 'subgraph_0', x)
return invoke_subgraph
def repeated_subgraph(self, x):
record_memory = torch.ops.mylib.record_memory.default("forward", "N")
add = torch.ops.aten.add(x, x)
return add
```
After:
```
def forward(self, token, x):
repeated_subgraph0 = self.repeated_subgraph0
invoke_subgraph = torch.ops.higher_order.invoke_subgraph(repeated_subgraph0, 'subgraph_0', token, x)
getitem = invoke_subgraph[0] # output token
getitem_1 = invoke_subgraph[1]
return (getitem, getitem_1)
def repeated_subgraph(self, token, x):
with_effects = torch.ops.higher_order.with_effects(token, torch.ops.mylib.record_memory.default, 'forward', 'N')
getitem = with_effects[0] # output token
add = torch.ops.aten.add(x, x)
return (getitem, add)
```
* Additional logic in `_remove_effect_tokens` handles removing the effect tokens from the invoke_subgraph subgraph.
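The token threading in the example above can be sketched in plain Python. This is an illustrative stand-in, not the actual PyTorch machinery: `Token`, `with_effects`, and `record_memory` below are hypothetical, and in the real implementation the tokens carry no data at all; their only job is to create a data dependency that orders effectful calls.

```python
class Token:
    """Opaque value whose only purpose is to order effectful calls."""
    pass

log = []

def record_memory(phase, tag):
    # Stand-in for an effectful op such as mylib.record_memory.
    log.append((phase, tag))

def with_effects(token, op, *args):
    # Run the effectful op, then return a fresh token alongside the
    # result; the next effect must consume this token, which serializes
    # effects via a data dependency rather than a lock.
    result = op(*args)
    return Token(), result

def repeated_subgraph(token, x):
    # Mirrors the "After" subgraph: consume a token, emit a new one.
    token, _ = with_effects(token, record_memory, "forward", "N")
    return token, x + x

token, out = repeated_subgraph(Token(), 3)
print(out, log)  # 6 [('forward', 'N')]
```

Because the output token of one effect is the input token of the next, a graph pass that preserves data dependencies cannot reorder or eliminate the effectful calls.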
cc ezyang EikanWang jgong5 wenzhe-nrv
```
assert all(
    isinstance(o, (torch.Tensor, int, torch.SymInt, torch.Generator))
    for o in operands
    if o is not None
)
```
**Reviewer:** when do you see None as input?
**Author:** The effect tokens are passed in as None here since we will eventually just discard these inputs.
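The exchange above can be reproduced with a minimal stand-in (pure Python so it runs without torch; `FakeTensor` is hypothetical and plays the role of `torch.Tensor`): the `if o is not None` filter lets the discarded effect-token slots pass through while every real operand is still type-checked.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor in this sketch."""
    pass

# One discarded effect-token slot (None) followed by real operands.
operands = [None, FakeTensor(), 3]

ok = all(
    isinstance(o, (FakeTensor, int))
    for o in operands
    if o is not None  # effect tokens arrive as None and are skipped
)
print(ok)  # True
```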
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit f49833d. Reverted #167231 on behalf of https://github.com/yangw-dev because the diff breaks tests internally.
@angelayi your PR has been successfully reverted.
@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Starting merge as part of PR stack under #167363
In the [previous PR](https://github.com/pytorch/pytorch/pull/167231/files#diff-e2b74af5d8b538a7d07d18507d27010703742ddad5f819992b55f5abc6d9a502R964-R966) we found that the eager autograd impl of invoke_subgraph calls the subgraph twice. If the subgraph contains effects, then the effects will run twice, which is bad. This PR fixes the issue by getting the output metadata from the `subgraph`'s `node.meta` if it exists.

Differential Revision: [D87392740](https://our.internmc.facebook.com/intern/diff/D87392740)
Pull Request resolved: #167245
Approved by: https://github.com/anijain2305
ghstack dependencies: #167231
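The bug and its fix can be condensed into a small sketch. The function names here are illustrative, not the real API; the actual fix reads cached output metadata from the subgraph node's `node.meta` instead of re-running the subgraph.

```python
calls = 0

def subgraph(x):
    # Stands in for an effectful subgraph: each call fires the effect.
    global calls
    calls += 1
    return x + x

def invoke_buggy(fn, x):
    # Buggy eager autograd path: one call just to learn the output
    # structure for autograd bookkeeping, then the real call, so the
    # effect fires twice.
    meta = type(fn(x))
    return fn(x)

def invoke_fixed(fn, x, cached_meta):
    # Fixed path: metadata was recorded at trace time (node.meta),
    # so the subgraph runs exactly once and the effect fires once.
    out = fn(x)
    assert isinstance(out, cached_meta)
    return out

invoke_buggy(subgraph, 1.0)
print(calls)  # 2

calls = 0
invoke_fixed(subgraph, 1.0, cached_meta=float)
print(calls)  # 1
```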
Updates the implementation of `unlift_tokens` to handle unlifting invoke_subgraph.
For context, `unlift_tokens` exists because tokens are currently threaded as inputs and outputs of the toplevel graph produced by AOTAutograd. However, we don't want the Inductor-traced graph to have any notion of effects/tokens, just the extra dependency ordering that the tokens introduce. So we unlift the tokens from the toplevel graph: instead of placeholder nodes, the tokens come from a `_make_token` call, and instead of outputting the tokens, we sink all of them into `_sink_tokens`.
Similarly, we don't want the invoke_subgraph subgraph to have any notion of tokens, so we also remove the tokens from the inputs of the invoke_subgraph subgraph. However, we still need a way to mark the invoke_subgraph call as effectful at the toplevel module, to prevent invoke_subgraph calls from being reordered, so we wrap the invoke_subgraph call in `with_effects`.
Before:
```
def forward(self, token, x):
repeated_subgraph0 = self.repeated_subgraph0
invoke_subgraph = torch.ops.higher_order.invoke_subgraph(repeated_subgraph0, 'subgraph_0', token, x)
getitem = invoke_subgraph[0] # output token
getitem_1 = invoke_subgraph[1]
return (getitem, getitem_1)
def repeated_subgraph(self, token, x):
with_effects = torch.ops.higher_order.with_effects(token, torch.ops.mylib.record_memory.default, 'forward', 'N')
getitem = with_effects[0] # output token
add = torch.ops.aten.add(x, x)
return (getitem, add)
```
After:
```
def forward(self, x):
token = torch.ops.prims._make_token.default()
repeated_subgraph0 = self.repeated_subgraph0
invoke_subgraph = torch.ops.higher_order.with_effects(
token, torch.ops.higher_order.invoke_subgraph, repeated_subgraph0, 'subgraph_0', token, x
)
getitem = invoke_subgraph[0] # output token
getitem_1 = invoke_subgraph[1]
_ = torch.ops.prims._sink_tokens.default([getitem])
return (getitem_1,)
def repeated_subgraph(self, x):
token = torch.ops.prims._make_token.default()
with_effects = torch.ops.higher_order.with_effects(token, torch.ops.mylib.record_memory.default, 'forward', 'N')
getitem = with_effects[0] # output token
add = torch.ops.aten.add(x, x)
_ = torch.ops.prims._sink_tokens.default([getitem])
return (add,)
```
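The unlifted shape above can be mimicked in plain Python. `make_token`, `sink_tokens`, and `with_effects` below are illustrative stand-ins for `prims._make_token`, `prims._sink_tokens`, and `higher_order.with_effects`; the point is only that tokens are now created and consumed entirely inside each graph instead of flowing through its signature.

```python
log = []

def make_token():
    # Stand-in for prims._make_token: fabricate a token inside the
    # graph instead of receiving it as a placeholder input.
    return object()

def sink_tokens(tokens):
    # Stand-in for prims._sink_tokens: consume tokens so they never
    # escape the graph as outputs.
    assert all(t is not None for t in tokens)

def with_effects(token, op, *args):
    result = op(*args)
    return object(), result  # fresh output token plus the result

def record_memory(phase, tag):
    log.append((phase, tag))

def repeated_subgraph(x):
    # Mirrors the unlifted "After" subgraph: token-free signature.
    token = make_token()
    token, _ = with_effects(token, record_memory, "forward", "N")
    add = x + x
    sink_tokens([token])
    return (add,)

print(repeated_subgraph(2))  # (4,)
```

Callers of `repeated_subgraph` no longer see a token anywhere in its signature, which is exactly what the unlift pass is after.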
Differential Revision: [D87668981](https://our.internmc.facebook.com/intern/diff/D87668981)
Pull Request resolved: #167363
Approved by: https://github.com/fxdawnn
ghstack dependencies: #167231, #167245
Differential Revision: [D87392741](https://our.internmc.facebook.com/intern/diff/D87392741)
Pull Request resolved: pytorch#167231
Approved by: https://github.com/anijain2305
Previously #167231 modified `ep.module()` such that it deepcopies `ep` before applying `ep.module()`, so that `ep.module()` does not affect the original `ep`. However, we didn't deepcopy the graph signature, which caused some issues; this PR addresses that. We also add an additional `remove_effect_tokens` pass before returning the new `ep` so that this `ep` can be serialized by sigmoid.

Differential Revision: D89209276
Pull Request resolved: #170461
Approved by: https://github.com/ydwu4
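The follow-up bug above is the classic shallow-copy pitfall, sketched here with hypothetical stand-in classes (`Program` playing the role of the exported program and `Signature` its graph signature): copying the container without deep-copying the signature lets `module()` mutate the original.

```python
import copy

class Signature:
    def __init__(self):
        self.inputs = ["x"]

class Program:
    def __init__(self):
        self.signature = Signature()

def module_buggy(prog):
    # Shallow copy: the signature object is shared with the original.
    new = copy.copy(prog)
    new.signature.inputs.append("token")
    return new

def module_fixed(prog):
    # Deep copy: the signature is duplicated, so the original is safe.
    new = copy.deepcopy(prog)
    new.signature.inputs.append("token")
    return new

p = Program()
module_buggy(p)
print(p.signature.inputs)  # ['x', 'token']: the original was mutated

q = Program()
module_fixed(q)
print(q.signature.inputs)  # ['x']: the original is untouched
```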