Remove preserve_zero and zero_point_domain from choose_qparams_affine #2149
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2149
Note: Links to docs will display an error until the docs builds have been completed.
❌ 7 New Failures as of commit 214e704 with merge base 212d912.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 85936a5 to 9780257
    quant_min = 0
    quant_max = 15
    zero_point_domain = ZeroPointDomain.FLOAT
    # zero_point_domain is ZeroPointDomain.FLOAT
I thought to keep it for now, as it can be an indicator of the previous implementation.
    block_size=block_size,
    target_dtype=target_dtype,
    _layout=_layout,
    scale_dtype=torch.float32,
should this be reverted?
jerryzh168 left a comment
looks good, thanks @jainapurva for carefully working through this!
    zero_point_domain is optional specifies how we quantize the floating point to quantized data:
    INT: quantized_val = (float_val / scale) (integer) + zero_point (integer)
    FLOAT: quantized_val = (float_val - (zero_point (float) - scale * mid_point)) / scale
    None: quantized_val = (float_val / scale) | this is primarily used for floatx quantization
    Where we do not want to round values to nearest integer and instead scale and cast.
nit: we can just leave the one that is relevant
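
For reference, the three zero_point_domain formulas quoted in the docstring above can be sketched in plain Python on scalar values. This is only an illustrative sketch, not torchao's implementation: the function names are hypothetical, the clamp to [quant_min, quant_max] and the use of round() for the "(integer)" step are assumptions, and mid_point is assumed to be (quant_max + quant_min + 1) / 2.

```python
def clamp(value, lo, hi):
    """Restrict value to the closed interval [lo, hi]."""
    return max(lo, min(hi, value))


def quantize_int_zp(float_val, scale, zero_point, quant_min, quant_max):
    """INT domain: round(float_val / scale) + integer zero_point."""
    return clamp(round(float_val / scale) + zero_point, quant_min, quant_max)


def quantize_float_zp(float_val, scale, zero_point, quant_min, quant_max):
    """FLOAT domain: the zero_point is a float in the original value range."""
    # Assumption: mid_point is the midpoint of the quantized range.
    mid_point = (quant_max + quant_min + 1) / 2
    min_val = zero_point - scale * mid_point
    return clamp(round((float_val - min_val) / scale), quant_min, quant_max)


def quantize_no_zp(float_val, scale, quant_min, quant_max):
    """NONE domain: scale and clamp only, with no rounding and no zero_point."""
    return clamp(float_val / scale, quant_min, quant_max)


if __name__ == "__main__":
    # uint4-style range, as in the quant_min = 0 / quant_max = 15 snippet.
    print(quantize_int_zp(1.0, 0.25, 8, 0, 15))      # 12
    print(quantize_float_zp(1.0, 0.25, 0.0, 0, 15))  # 12
    # Hypothetical float range; the value is scaled but not rounded.
    print(quantize_no_zp(1.125, 0.25, -448.0, 448.0))  # 4.5
```

With these example inputs the INT and FLOAT variants agree, because an integer zero point of 8 and a float zero point of 0.0 describe the same offset when mid_point is 8; the NONE variant keeps the fractional value 4.5, which is the "scale and cast" behaviour the docstring describes.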
    raise ValueError("Please use ZeroPointDomain.NONE instead of None")
    elif zero_point_domain is ZeroPointDomain.NONE and zero_point is not None:
    raise ValueError("zero_point should be None when zero_point_domain is NONE")
    # if zero_point_domain is None:
nit: please remove the commented code before landing
    quant_max: Union[int, float],
    output_dtype: torch.dtype = torch.float32,
    ) -> torch.Tensor:
    """This function converts AQT tensors to their high precision floating point representation
should we only have doc for non-private helper functions?
jerryzh168 left a comment
I think the docs have to be updated a bit; commented inline
This pull request focuses on refactoring and simplifying quantization-related code by removing unused or redundant functionality and introducing specialized methods for handling specific cases. The most important changes include removing the `preserve_zero` and `zero_point_domain` parameters from many functions, introducing new specialized quantization and dequantization methods, and modifying use-cases accordingly.

Refactoring and Simplification:
Removed the `preserve_zero` and `zero_point_domain` parameters from `choose_qparams_affine`, `quantize_affine`, and `dequantize_affine` calls across multiple files, while introducing specialized methods to handle specific quantization scenarios.

The following table contains the new methods:
Notable updates related to the changes:
The following list contains AOBaseConfigs, along with the corresponding choose_qparams_affine function calls made by the backend for each configuration: