fix: Add special cases for clone and to_copy where input of graph is output#2265
fix: Add special cases for clone and to_copy where input of graph is output#2265
clone and to_copy where input of graph is output#2265Conversation
4f4556f to
bcfa6c1
Compare
|
@gs-olive is this only for graphs like: Is it worth even converting these into engines? |
bcfa6c1 to
cbd22ac
Compare
|
@narendasan - This change is primarily for cases encountered in detectron-style models where inputs are also outputs of the engine. It might be something like: graph(x: Tensor, y: Tensor):
# A lot of logic/operations applied to y
...
return x, y_new |
|
I see |
| # dtype but we don't have a way to detect whether it makes sense for the | ||
| # scalar to be float or half. Hence we go with the lhs dtype. | ||
| if is_lhs_trt_tensor and isinstance(rhs_val, (float, int)): | ||
| rhs_val = torch.tensor([rhs_val], dtype=lhs_dtype) |
There was a problem hiding this comment.
Why is this getting converted to np array in place of the torch.Tensor. Is that because the torch.Tensor was leading to clones getting produced in the graph?
There was a problem hiding this comment.
Creating torch.Tensor objects within the torch.compile scope causes them to be "Fake-ified" which removes the data contained within them.
| and any(user.op == "output" for user in list(node.users.keys())) | ||
| ) | ||
|
|
||
|
|
There was a problem hiding this comment.
Why is there any used here? I believe that the above is for to_copy and clone cases having one node arg only? Or are there different cases for this?
There was a problem hiding this comment.
any is used here because we are checking if any of the users of the node are outputs, meaning that the node is the only function between a placeholder and an output. We are effectively search for subgraphs where an input is followed by a function is followed by an output.
cbd22ac to
a57c97f
Compare
a57c97f to
a055858
Compare
- TRT does not allow inputs of graphs to be outputs as well, however many of the scenarios encountered in real models can have this situation come up, especially in cases where the input is cloned or copied and then returned - The current converters will register these operators as a no-op, causing TRT engine building to fail on such inputs - Instead of requiring creation of an identity layer for every case of a clone or copy node, we instead check if that node is the only operator on a placeholder (input) and then insert the identity layer or not, accordingly - Coalesce implementations of clone and to_copy, which are effectively the same operator - Add test cases to validate new behavior - Add new boilerplate converter validator utility to support this case
a055858 to
892f7d0
Compare
Description
Addresses bug in #1565
Type of change
Checklist: