[quantization] Store bias in PackedLinearWeight struct in fbgemm #25428
supriyar wants to merge 15 commits into gh/supriyar/7/base
Conversation
Added bias as an optional param to the quantized_linear_prepack function. Bias is quantized at runtime using the input scale and weight scale. Differential Revision: [D17121304](https://our.internmc.facebook.com/intern/diff/D17121304/)
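The runtime bias quantization described here follows standard affine-quantization math: the bias is quantized to int32 with scale equal to `input_scale * weight_scale` and zero point 0, so it can be added directly into the int32 matmul accumulator. A minimal NumPy sketch of that idea (illustrative only, not the fbgemm implementation):

```python
import numpy as np

def quantize_bias(bias_fp32, input_scale, weight_scale):
    # Bias scale is the product of the input and weight scales, so the
    # quantized bias lives in the same fixed-point domain as the int32
    # accumulator of the quantized matmul.
    bias_scale = input_scale * weight_scale
    q = np.round(bias_fp32 / bias_scale).astype(np.int64)
    # Saturate to the int32 range before casting.
    return np.clip(q, -2**31, 2**31 - 1).astype(np.int32)

bias = np.array([0.05, -0.125, 0.0], dtype=np.float32)
print(quantize_bias(bias, input_scale=0.02, weight_scale=0.005))
# -> [  500 -1250     0]
```

Because `input_scale` is only known once an input arrives, the bias must stay in fp32 at prepack time and be quantized per call, which is exactly why this PR keeps it in the packed struct rather than quantizing it eagerly.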
jamesr66a left a comment:
ROCm failure is a true positive. The function signature at qlinear.cpp:218 needs to be updated.
I have approved it. Please take care of the tests.
```cpp
/*ld=*/K,
/*pmat=*/nullptr, // PackBMatrix manages ownership of pmat
/*groups=*/1),
bias_contig,
```
How does this work when bias is None?
It is stored as an optional tensor, similar to the current linear op, which takes bias as an input argument. If it is None, that case is handled accordingly.
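The optional-bias storage described in this reply can be sketched as follows. This is a hypothetical Python stand-in for the C++ PackedLinearWeight struct; the names `prepack` and `packed_weight` here are illustrative, not the actual fbgemm API:

```python
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class PackedLinearWeight:
    # Stand-in for the C++ struct: the packed weight matrix plus an
    # optional fp32 bias, kept unquantized until scales are known.
    packed_weight: np.ndarray
    bias: Optional[np.ndarray]  # None when the layer has no bias

def prepack(weight, bias=None):
    # Bias stays in fp32 here; it is quantized later, at run time,
    # once input_scale is available alongside weight_scale.
    return PackedLinearWeight(packed_weight=weight, bias=bias)

pw = prepack(np.ones((2, 3), dtype=np.float32))
print(pw.bias is None)
# -> True
```

Storing `Optional` inside the packed struct keeps the op signature unchanged for the bias-free case while letting the runtime quantization path branch on `bias is None`.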
We also need to make similar changes to: https://github.com/pytorch/pytorch/blob/9d06a984f866289c2acb28a379f62378c5e70454/torch/nn/quantized/functional.py
raghuramank100 left a comment:
Looks great, a few suggested changes.
Removed self.bias from the modules. Added …
The updated API for …

This pull request has been merged in 9d2d31e.
Summary:
Pull Request resolved: pytorch/pytorch#25428

Added bias as an optional param to the quantized_linear_prepack function. Bias is quantized during runtime using input scale and weight scale.

ghstack-source-id: 89601399

Test Plan: python test/run_test.py --exclude nn --verbose --bring-to-front quantization quantized quantized_tensor quantized_nn_mods quantizer

Differential Revision: D17121304

fbshipit-source-id: 8adb0e55e4aed0a5430aaa2c8639c8ad1639c85a
Stack from ghstack:
Added bias as an optional param to the quantized_linear_prepack function.
Bias is quantized during runtime using input scale and weight scale.
Differential Revision: D17121304