Adds utility to replace Q/DQ ops with torchao quantized linear ops #1967
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1967
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit 74f209e with merge base 3bbf42a, one job has failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
eager_results = model(activations)
unwrap_tensor_subclass(model)
Let me double check. I thought without it, the exported graph didn't decompose into the Q/DQ ops.
Removed unwrap_tensor_subclass
    return pattern, replacement


def replace_q_dq_with_torchao_quantized_linear_ops(
Can you add a bit of context to the function's docstring on when this is used?
Added context in the docstring.
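For readers unfamiliar with the `pattern, replacement` idea in the hunk above, a pattern-based rewrite can be sketched on a toy list of ops. This is only an illustration under stated assumptions: the op names and the `rewrite_ops` helper are hypothetical, and this is not the torchao or torch.fx API, which operates on exported graphs rather than flat lists.

```python
from typing import List, Sequence


def rewrite_ops(
    ops: Sequence[str], pattern: Sequence[str], replacement: Sequence[str]
) -> List[str]:
    """Replace each non-overlapping occurrence of `pattern` in `ops` with `replacement`.

    Hypothetical helper for illustration; a real graph rewrite matches
    subgraphs (nodes and their inputs), not flat sequences.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    out: List[str] = []
    n = len(pattern)
    i = 0
    while i < len(ops):
        if list(ops[i:i + n]) == list(pattern):
            # Matched the whole Q/DQ + linear sequence: emit the fused op instead.
            out.extend(replacement)
            i += n
        else:
            out.append(ops[i])
            i += 1
    return out


# Hypothetical op names standing in for what an exported Q/DQ graph might contain.
exported = ["quantize_act", "dequantize_act", "dequantize_weight", "linear", "relu"]
pattern = ["quantize_act", "dequantize_act", "dequantize_weight", "linear"]
replacement = ["fused_quantized_linear"]

fused = rewrite_ops(exported, pattern, replacement)
print(fused)
```

Ops that are not part of a match (here `relu`) pass through unchanged, which is the property a fusion pass needs so that the rest of the exported program is left alone.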
Force-pushed from ca5e688 to 74f209e.
jerryzh168 left a comment:
Looks good. Maybe also add some context in the summary about when we use this, and why people first use QDQLayout and then do the fusion instead of generating these ops directly with some other layout.
…1967) * up * up * up * up
This utility is for export scenarios in which people quantize with a Q/DQ layout and later want to fuse the resulting Q/DQ ops into torchao quantized linear ops.
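A rough numerical sketch of why such a fusion can preserve results: the Q/DQ form dequantizes both operands and does a float dot product, while a fused op can accumulate in integers and apply the scales once. The code below assumes symmetric per-tensor quantization with made-up scales and values; the function names are hypothetical, and the actual torchao kernels (for example with groupwise scales or low-bit weights) differ in detail.

```python
from typing import List, Sequence


def quantize(xs: Sequence[float], scale: float, qmin: int = -128, qmax: int = 127) -> List[int]:
    """Symmetric quantization: round to the nearest integer step and clamp."""
    return [max(qmin, min(qmax, round(x / scale))) for x in xs]


def dequantize(qs: Sequence[int], scale: float) -> List[float]:
    """Map integer codes back to floats."""
    return [q * scale for q in qs]


def linear_qdq(x: Sequence[float], w_q: Sequence[int], x_scale: float, w_scale: float) -> float:
    """Q/DQ form: quantize then dequantize activations, dequantize weights, float dot product."""
    x_dq = dequantize(quantize(x, x_scale), x_scale)
    w_dq = dequantize(w_q, w_scale)
    return sum(a * b for a, b in zip(x_dq, w_dq))


def linear_fused(x: Sequence[float], w_q: Sequence[int], x_scale: float, w_scale: float) -> float:
    """Fused form: integer dot product, with both scales applied once at the end."""
    x_q = quantize(x, x_scale)
    acc = sum(a * b for a, b in zip(x_q, w_q))  # integer accumulation
    return acc * x_scale * w_scale


# Illustrative values only (one output feature, three input features).
x = [0.5, -1.0, 0.25]
w_q = [3, -2, 5]
x_scale, w_scale = 0.25, 0.5

qdq_out = linear_qdq(x, w_q, x_scale, w_scale)
fused_out = linear_fused(x, w_q, x_scale, w_scale)
print(qdq_out, fused_out)
```

Because the two forms compute the same quantity, a graph exported in Q/DQ form stays portable and readable, and the fused op can be swapped in afterwards wherever a matching kernel is available.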