
Add per_tensor_quantize to int8 quantize #21372

Merged
alalek merged 3 commits into opencv:4.x from zihaomu:dnn_quantize_per_tensor on Jul 5, 2022

Conversation

@zihaomu (Member) commented Dec 31, 2021

Hi, the purpose of this PR is to add per-tensor quantization to the model quantization part of OpenCV dnn.

The difference between per-channel and per-tensor quantization:

The existing quantization method in opencv/dnn is per-channel quantization, which can achieve better accuracy than per-tensor quantization. But on some hardware, especially NPU chips, per-tensor quantization is much easier to optimize for speed.

For example, on the NPU of the TIM-VX backend:
ResNet50 int8 (per-channel) takes 525.298 ms on Khadas Vim3,
and ResNet50 int8 (per-tensor) takes 20.01 ms on Khadas Vim3.

Compared with per-channel quantization, per-tensor quantization has the disadvantage of lower accuracy. Some of the original unit tests in dnn/test/test_int8_layers.cpp could not pass because of the lower accuracy, so I relaxed the thresholds in some unit test cases.
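To make the accuracy trade-off concrete, here is a small self-contained sketch (illustrative only, not OpenCV sources) of how a symmetric int8 scale could be chosen per-tensor versus per-channel; a channel with small weights gets a much finer scale under per-channel quantization, which is where the accuracy gap comes from:

```cpp
// Illustrative sketch (not OpenCV code): symmetric int8 scale selection
// for per-tensor vs per-channel quantization. The weight tensor is
// represented as a vector of output channels.
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Symmetric scale: map the largest |value| onto the int8 maximum, 127.
inline double scaleFor(const std::vector<double>& vals)
{
    double m = 0.0;
    for (double v : vals)
        m = std::max(m, std::fabs(v));
    return m > 0.0 ? m / 127.0 : 1.0;
}

// Per-tensor: one scale shared by the whole tensor.
inline double perTensorScale(const std::vector<std::vector<double>>& w)
{
    std::vector<double> flat;
    for (const std::vector<double>& ch : w)
        flat.insert(flat.end(), ch.begin(), ch.end());
    return scaleFor(flat);
}

// Per-channel: one scale per output channel (tighter fit per channel).
inline std::vector<double> perChannelScales(const std::vector<std::vector<double>>& w)
{
    std::vector<double> scales;
    for (const std::vector<double>& ch : w)
        scales.push_back(scaleFor(ch));
    return scales;
}
```

With weights {{0.1, -0.2}, {5.0, -4.0}}, the per-tensor scheme forces the small-valued first channel to use the coarse scale 5.0/127 dictated by the second channel, while per-channel gives it the much finer scale 0.2/127.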

Related PR.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • The feature is well documented and sample code can be built with the project CMake
  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.

@vpisarev (Contributor)

👍 @zihaomu, please resolve the merge conflicts and this PR can be merged.

@zihaomu zihaomu force-pushed the dnn_quantize_per_tensor branch from 96f6a62 to 955ec56 on March 28, 2022
@zihaomu zihaomu requested review from alalek and rogday March 30, 2022 02:18
@rogday (Member) left a comment

Thank you!

@zihaomu zihaomu force-pushed the dnn_quantize_per_tensor branch from a2f4c13 to 266522a on March 31, 2022
@zihaomu (Member, Author) commented Mar 31, 2022

@rogday I have removed some of the duplicated code, but not that much.

@rogday (Member) commented Mar 31, 2022

> I have removed some of the duplicated code, but not that much.

I was thinking more along these lines. Take a look, feel free to disagree. It has the major disadvantage of taking a lot of parameters, but no code duplication.

@zihaomu (Member, Author) commented Apr 1, 2022

> I was thinking more along these lines. Take a look, feel free to disagree. It has the major disadvantage of taking a lot of parameters, but no code duplication.

Thanks for your reply. My concern is that a function with a lot of parameters may make the code more complicated and harder to understand.

@alalek (Member) commented Apr 1, 2022

Please rebase to resolve merge conflict: modules/dnn/test/test_int8_layers.cpp

@zihaomu zihaomu force-pushed the dnn_quantize_per_tensor branch from 266522a to 22ead8d on April 1, 2022
@asmorkalov (Contributor)

@zihaomu @rogday Friendly reminder.

@rogday (Member) left a comment

Looks good. We have a few duplicated lines, but given the alternative of a function with lots of parameters, it's fine.

@asmorkalov (Contributor)

@zihaomu Friendly reminder.

1 similar comment from @asmorkalov.

@zihaomu (Member, Author) commented Jun 10, 2022

@asmorkalov Thanks for the reminder; at the moment I am fully focused on the fast Conv work. I'll update this PR early next week.

@zihaomu zihaomu force-pushed the dnn_quantize_per_tensor branch from bc7a942 to f52a4cb on June 13, 2022
@zihaomu zihaomu requested a review from rogday June 13, 2022 02:46
@zihaomu (Member, Author) commented Jun 13, 2022

Hi @rogday, the code has been updated.

@rogday (Member) left a comment

LGTM! 👍

Comment on lines +35 to +38
Net Net::Impl::quantize(InputArrayOfArrays calibData, int inputsDtype, int outputsDtype)
{
return quantize(calibData, inputsDtype, outputsDtype, false);
}
Member

Do we need this method? (in internal class)

Member Author

Thanks for the code review.
How about removing Net Net::Impl::quantize(InputArrayOfArrays calibData, int inputsDtype, int outputsDtype) and just keeping Net Net::Impl::quantize(InputArrayOfArrays calibData, int inputsDtype, int outputsDtype, bool perTensor = false)?

Member

We could remove this if it is not used anymore.

Member Author

Fixed.

testDarknetModel(config_file, weights_file, ref.rowRange(0, N0), scoreDiff, iouDiff, confThreshold);

// per-tensor quantize
testDarknetModel(config_file, weights_file, ref.rowRange(0, N0), scoreDiff, 0.16, 0.7, 0.4, true);
Member

BTW, it makes sense to create a dedicated test, or at least use SCOPED_TRACE(); (here and above).

Member Author

Ok, I will try to update it.

Member Author

Hi, SCOPED_TRACE() is now used in every per-tensor quantization test case.

@zihaomu zihaomu force-pushed the dnn_quantize_per_tensor branch from f52a4cb to ca2711f on June 23, 2022
@zihaomu (Member, Author) commented Jun 23, 2022

@alalek There is an API check error because we removed the original Net Net::quantize(InputArrayOfArrays calibData, int inputsDtype, int outputsDtype). CI Report.

@alalek (Member) commented Jun 23, 2022

> because we removed the original Net Net::Impl::quantize(InputArrayOfArrays calibData, int inputsDtype, int outputsDtype)

Net::Impl is internal API; it is not checked by the API compatibility tool.

The tool complains about the public Net::quantize(cv::InputArrayOfArrays calibData, int inputsDtype, int outputsDtype) (we still need a thin wrapper for it over the extended overload).

I believe we could add this call to the "skip" list of this tool, as the quantization API is experimental (and we don't need to maintain compatibility overloads here). I will do this later if there are no objections.
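The compatibility pattern described here, keeping the old public signature as a thin wrapper that forwards to the extended overload, can be sketched generically (hypothetical Net type with stand-in logic, not the actual cv::dnn::Net):

```cpp
// Generic sketch of the "thin wrapper" compatibility pattern discussed
// above (hypothetical Net type, not the actual cv::dnn::Net).
#include <cassert>

struct Net
{
    // Extended overload carrying the new parameter.
    int quantize(int dtype, bool perChannel)
    {
        return perChannel ? dtype * 2 : dtype;  // stand-in for real work
    }

    // Old public signature kept so the exported symbol (and existing
    // callers) survive; it just forwards with the old default behavior.
    int quantize(int dtype)
    {
        return quantize(dtype, /*perChannel=*/true);
    }
};
```

Existing callers of the two-effective-argument form keep compiling and linking unchanged, while new code can opt into the extra flag.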

@vpisarev (Contributor)

@zihaomu, thank you! What about the case when we already have a quantized model stored in ONNX format and load it: does it recognize the "per-tensor" case?

@zihaomu (Member, Author) commented Jun 24, 2022

> @zihaomu, thank you! What about the case when we already have a quantized model stored in ONNX format and load it: does it recognize the "per-tensor" case?

This PR only affects models quantized on the fly. As for pre-quantized ONNX models: if the original model is per-tensor, it will run in the per-tensor way, and if it is per-channel, it will run in the per-channel way.

@zihaomu (Member, Author) commented Jun 26, 2022

Hi @vpisarev @alalek and @rogday, I have changed the API flag from perTensor to perChannel, because ONNX uses the same quantize flag in quantize_static(..., per_channel, ...).
In addition, the onnx_importer can now recognize the quantization type at the QConv and QMatMul layers.


// FIXIT drop from inference API
Net Net::quantize(InputArrayOfArrays calibData, int inputsDtype, int outputsDtype)
Net Net::quantize(InputArrayOfArrays calibData, int inputsDtype, int outputsDtype, bool perTensor = false)
Member

> bool perTensor = false

Default values work properly from .hpp files only. They are useless in .cpp files.
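The point about defaults can be shown in a few lines (hypothetical names, stand-in logic): the default argument has to sit on the declaration callers see, which normally lives in the .hpp; a default written only on the out-of-line .cpp definition is invisible at the call site.

```cpp
// Sketch of the review point (hypothetical names): the default argument
// belongs on the declaration (as in a .hpp); the out-of-line definition
// (as in a .cpp) must not repeat it.
#include <cassert>

struct Net
{
    int quantize(int dtype, bool perChannel = true);  // default lives here
};

// Definition: no default repeated here.
int Net::quantize(int dtype, bool perChannel)
{
    return perChannel ? dtype : -dtype;  // stand-in for real work
}
```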

Member Author

Thanks, fixed.

@zihaomu zihaomu requested a review from alalek June 29, 2022 09:06
@alalek alalek merged commit a80fcac into opencv:4.x Jul 5, 2022
@alalek alalek mentioned this pull request Aug 21, 2022
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
Add per_tensor_quantize to int8 quantize

* add per_tensor_quantize to dnn int8 module.

* change api flag from perTensor to perChannel, and recognize quantize type and onnx importer.

* change the default to hpp


5 participants