Conversation
dcfe61d to
6abe7ce
Compare
| fx_graph = torch.fx.symbolic_trace(emptyLike()) | ||
| unexpected_ops_seen, expected_ops_unseen = lower_graph_testing( | ||
| fx_graph, | ||
| inputs, | ||
| expected_ops=expected_ops, | ||
| unexpected_ops=unexpected_ops, | ||
| min_block_size=1, | ||
| ) |
There was a problem hiding this comment.
Could you show a printout of what the original and final graphs look like in this case? I want to verify that there is not a circular issue where empty_permuted generates empty_like, and vice versa
There was a problem hiding this comment.
With the empty_permute decomposition the graph is this
Pre-AOT Autograd graph:=============
graph():
%l_x_ : torch.Tensor [num_users=1] = placeholder[target=L_x_]
%add : [num_users=2] = call_function[target=torch.ops.aten.add](args = (%l_x_, %l_x_), kwargs = {})
%empty_like_default : [num_users=1] = call_function[target=torch.ops.aten.empty_like.default](args = (%add,), kwargs = {})
%add_1 : [num_users=1] = call_function[target=operator.add](args = (%empty_like_default, %add), kwargs = {})
return (add_1,)
Post-AOT Autograd graph:=======
graph():
%arg0_1 : [num_users=1] = placeholder[target=arg0_1]
%clone : [num_users=1] = call_function[target=torch.ops.aten.clone.default](args = (%arg0_1,), kwargs = {})
%add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%clone, %clone), kwargs = {})
%empty : [num_users=1] = call_function[target=torch.ops.aten.empty.memory_format](args = ([3, 2],), kwargs = {dtype: torch.float32,
layout: torch.strided, device: cuda:0, pin_memory: False})
%permute : [num_users=1] = call_function[target=torch.ops.aten.permute.default](args = (%empty, [0, 1]), kwargs = {})
%add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%permute, %add), kwargs = {})
return (add_1,)
Graph after constant folding:
graph():
%arg0_1 : [num_users=1] = placeholder[target=arg0_1]
%add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
%_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
%add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
return (add_1,)
Post-lowering passes Autograd graph:=======
graph():
%arg0_1 : [num_users=1] = placeholder[target=arg0_1]
%add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
%_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
%add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
return (add_1,)
Without the decomposition, the graph is
Pre-AOT Autograd graph:=============
graph():
%l_x_ : torch.Tensor [num_users=1] = placeholder[target=L_x_]
%add : [num_users=2] = call_function[target=torch.ops.aten.add](args = (%l_x_, %l_x_), kwargs = {})
%empty_like_default : [num_users=1] = call_function[target=torch.ops.aten.empty_like.default](args = (%add,), kwargs = {})
%add_1 : [num_users=1] = call_function[target=operator.add](args = (%empty_like_default, %add), kwargs = {})
return (add_1,)
Post-AOT Autograd graph:=======
graph():
%arg0_1 : [num_users=1] = placeholder[target=arg0_1]
%clone : [num_users=1] = call_function[target=torch.ops.aten.clone.default](args = (%arg0_1,), kwargs = {})
%add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%clone, %clone), kwargs = {})
%empty_permuted : [num_users=1] = call_function[target=torch.ops.aten.empty_permuted.default](args = ([3, 2], [0, 1]), kwargs = {dt
ype: torch.float32, layout: torch.strided, device: cuda:0, pin_memory: False})
%add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%empty_permuted, %add), kwargs = {})
return (add_1,)
Graph after constant folding:
graph():
%arg0_1 : [num_users=1] = placeholder[target=arg0_1]
%add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
%_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
%add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
return (add_1,)
Post-lowering passes Autograd graph:=======
graph():
%arg0_1 : [num_users=1] = placeholder[target=arg0_1]
%add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
%_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
%add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
return (add_1,)
So empty_like decomposes into empty_permute which decomposes into empty.memory_format. The above test does not give error, even though empty.memory_format is not supported since constant folding removes the op.
I am working on empty.memory_format in PR #2745
There was a problem hiding this comment.
In the above example, the Pre-AOT graph shows:
%empty_like_default : [num_users=1] = call_function[target=torch.ops.aten.empty_like.default](args = (%add,), kwargs = {})
Since there is only one argument in args, what is empty_permute = args[1] defined as in the decomposition for that case?
There was a problem hiding this comment.
In the above case with the AOT decomposition, the above operation decomposes to
%empty_permuted : [num_users=1] = call_function[target=torch.ops.aten.empty_permuted.default](args = ([3, 2], [0, 1]), kwargs = {dtype: torch.float32, layout: torch.strided, device: cuda:0, pin_memory: False})
The args[1] in this case is[0,1] since it keeps the shapes in the original form.
Not sure how it gets the [0,1] exact, but I assume it must be the internal AOT lowering heuristics?
gs-olive
left a comment
There was a problem hiding this comment.
Overall looks good to me - added one clarifying question
| fx_graph = torch.fx.symbolic_trace(emptyLike()) | ||
| unexpected_ops_seen, expected_ops_unseen = lower_graph_testing( | ||
| fx_graph, | ||
| inputs, | ||
| expected_ops=expected_ops, | ||
| unexpected_ops=unexpected_ops, | ||
| min_block_size=1, | ||
| ) |
There was a problem hiding this comment.
In the above example, the Pre-AOT graph shows:
%empty_like_default : [num_users=1] = call_function[target=torch.ops.aten.empty_like.default](args = (%add,), kwargs = {})
Since there is only one argument in args, what is empty_permute = args[1] defined as in the decomposition for that case?
This is an extension to support aten::empty_like.