
dnn: int8 quantized layers support in onnx importer #20535

Merged
alalek merged 9 commits into opencv:master from jebastin-nadar:onnx-q on Oct 4, 2021

Conversation

@jebastin-nadar (Contributor) commented Aug 11, 2021

merge with extra : opencv/opencv_extra#896

Final PR for the GSoC'21 project on 8-bit quantization support in the dnn module. This PR adds the new layers currently supported by ONNX quantization.
Docs - https://onnxruntime.ai/docs/how-to/quantization.html
Supported layers - https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/python/tools/quantization/registry.py#L34-L53
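For background, the QLinear ops listed in that registry all share the same affine (scale/zero-point) quantization scheme. A minimal sketch in plain Python (the helper names are illustrative, not part of this PR):

```python
def quantize_linear(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to int8 using the affine scheme ONNX QLinear ops use."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))  # saturate to the int8 range

def dequantize_linear(q, scale, zero_point):
    """Recover an approximate float from the quantized value."""
    return (q - zero_point) * scale

scale, zp = 1.0 / 127, 0
vals = [-1.0, 0.0, 0.25, 1.0]
quantized = [quantize_linear(v, scale, zp) for v in vals]
print(quantized)  # [-127, 0, 32, 127]
# Dequantizing recovers each value to within one quantization step (= scale)
print([dequantize_linear(q, scale, zp) for q in quantized])
```

Per-tensor scale and zero point are stored alongside each quantized tensor in the ONNX graph, which is what the new importer code has to read back.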

resolves : #20188
replaces #20264

TODO:

  • tests for new layers
  • fallback to FP32 for unsupported cases. Automatic fallback does not look possible
  • add support for unsupported cases - eltwise scalar input, concat layer, int8 resize layer
  • add quantized resnet50 onnx test - Add quantized resnet50 model onnx/models#460
  • tutorial

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under the Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@vpisarev vpisarev self-assigned this Aug 18, 2021
    int depth = CV_32F;
    checkQuantizedLayer(node_proto, layerParams, depth);
    if (depth == CV_8S)
        CV_Error(Error::StsNotImplemented, "Int8 resize layer is not supported");
Member
IMHO, we should fall back for such cases - reuse the 32F implementation, which will always have better coverage.

BTW, #20228 should NOT block implementation of new layers without the "int8" stuff. That experimental code must be optional.

Contributor Author
Regarding fallback for current layers without int8 support and for new layers added in the future:

  1. Functions in #20228 can fall back to the FP32 version automatically if the int8 version of a layer is unavailable. The tests that have been added check this logic. Adding an int8 version of a new layer is optional.
  2. Nodes in this PR which don't have an int8 version do not fall back to the FP32 version; that should probably be added. Will try to add it in the coming days. The logic is slightly tricky, as quantize/dequantize nodes have to be inserted before and after the unsupported layer.
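The idea in point 2 can be sketched as follows — a hypothetical helper that emulates inserting DequantizeLinear before and QuantizeLinear after a node that has no int8 implementation (all names and scales here are illustrative):

```python
def quantize_linear(x, scale, zp):
    """Affine float -> int8 mapping, saturated to the int8 range."""
    return max(-128, min(127, round(x / scale) + zp))

def dequantize_linear(q, scale, zp):
    """Affine int8 -> float mapping."""
    return (q - zp) * scale

def run_with_fp32_fallback(fp32_op, q_inputs, scale, zp):
    """Dequantize the int8 inputs, run the FP32 implementation of the
    unsupported node, then requantize its outputs."""
    x = [dequantize_linear(q, scale, zp) for q in q_inputs]
    y = fp32_op(x)
    return [quantize_linear(v, scale, zp) for v in y]

# e.g. an eltwise op with a scalar operand, one of the unsupported cases
doubled = run_with_fp32_fallback(lambda xs: [2 * v for v in xs],
                                 [10, 20, 30], scale=0.1, zp=0)
print(doubled)  # [20, 40, 60]
```

The extra dequantize/quantize round trip is also why this fallback costs both accuracy and speed compared to a native int8 implementation.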

@jebastin-nadar
Contributor Author

Marking the pull request as ready for review as the coding part is done. Only a short tutorial on using quantization and quantized ONNX models remains.

@alalek Automatic FP32 fallback for an unsupported INT8 node does not look possible right now. Instead, adding an INT8 path for that unsupported node seems much easier. That's what I did with the resize layer.

The INT8 path for the resize layer has been added in the FP32 version of the layer itself to avoid code duplication. But looking at it now, maybe the changes are too invasive and may affect the performance of the FP32 layer. If required, I can move the int8 implementation to int8layers/resize_layer.cpp and revert the changes in layers/resize_layer.cpp.
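As an aside, nearest-neighbor resize is a case where sharing one implementation is cheap: it only copies values by index and never mixes them, so it works unchanged on int8 data with no requantization. A minimal 1-D sketch of the idea (illustrative, not the OpenCV code):

```python
def resize_nearest(row, out_size):
    """Nearest-neighbor 1-D resize: pure index arithmetic, so the same
    code handles int8 and float elements identically."""
    in_size = len(row)
    return [row[min(in_size - 1, int(i * in_size / out_size))]
            for i in range(out_size)]

int8_row = [-128, -1, 0, 127]
print(resize_nearest(int8_row, 8))  # [-128, -128, -1, -1, 0, 0, 127, 127]
```

Linear interpolation is the harder case, since blending two int8 values requires accumulating in a wider type and requantizing the result.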

@jebastin-nadar jebastin-nadar marked this pull request as ready for review September 8, 2021 10:09
@jebastin-nadar
Contributor Author

Any idea how to solve this build failure?

    OpenCV tests: Can't find required data file: dnn/onnx/models/resnet50_int8.onnx in function 'findData'
    " thrown in the test body

I have modified download_models.py in opencv/opencv_extra#896 to download quantized ResNet50 from onnx modelzoo.

@alalek
Member

alalek commented Sep 8, 2021

download_models.py is not triggered automatically by CI. The testdata share now needs to be updated.

    testONNXModels("conv_resize_pool_1d");
    }

    TEST_P(Test_ONNX_layers, Quantized_Convolution)
Member
Test_ONNX_layers

This test subset is parameterized to run on all available backends/targets.
This doesn't make sense for now.
Create a separate test fixture to test on the OpenCV/CPU target only and move all "quantized" cases there.

Contributor Author
This doesn't make sense for now

While it's true that other backends don't have quantized layer implementations, we still need to ensure quantized networks fall back to a supported backend. Keeping the tests available for all backends helps test this fallback.

    [ RUN      ] Test_ONNX_nets.ResNet50_Int8/0, where GetParam() = NGRAPH/CPU
    [ WARN:0] global /home/jebastin/opencv_build/opencv/modules/dnn/src/dnn.cpp (4562) setPreferableBackend DNN: Only default backend supports quantized networks
    FALLBACK: Layer [Quantize]:[data_quantized] is expected to has backend implementation
    .
    .
    [       OK ] Test_ONNX_nets.ResNet50_Int8/0 (168 ms)

Internally, the code falls back to the OpenCV/CPU target for unsupported backends, so I don't see a point in limiting the backends for these tests (except reducing the total time for running the dnn module tests).

If you still think a new test fixture is needed, I will start working on it soon (will probably need some reference code)
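The fallback shown in the log above boils down to a simple dispatch rule; a rough sketch (backend names are plain strings here, not the real cv::dnn enum values):

```python
def effective_backend(requested, net_is_quantized):
    """Quantized nets silently fall back to the default (OpenCV/CPU)
    backend, mirroring the setPreferableBackend warning in the log."""
    if net_is_quantized and requested != "OCV/CPU":
        print("DNN: Only default backend supports quantized networks")
        return "OCV/CPU"
    return requested

print(effective_backend("NGRAPH/CPU", True))   # falls back to "OCV/CPU"
print(effective_backend("NGRAPH/CPU", False))  # stays "NGRAPH/CPU"
```

Because the fallback happens inside the library, running the quantized tests under every backend parameter still exercises the same OpenCV/CPU code path, which is the trade-off being discussed.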

    testONNXModels("resnet50v1", pb, default_l1, default_lInf, true, target != DNN_TARGET_MYRIAD);
    }

    TEST_P(Test_ONNX_nets, ResNet50_Int8)
Member

Test_ONNX_nets

The same.
We need separate test fixture for quantized tests with limited set of backends.

Member

@alalek alalek left a comment
Let's put it in to have some usable tests for the int8 feature.
Thank you 👍

@alalek alalek merged commit cce78cc into opencv:master Oct 4, 2021
@alalek alalek mentioned this pull request Dec 24, 2021
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
dnn : int8 quantized layers support in onnx importer

* added quantized layers support in onnx importer

* added more cases in eltwise node, some more checks

* added tests for quantized nodes

* relax thresholds for failed tests, address review comments

* refactoring based on review comments

* added support for unsupported cases and pre-quantized resnet50 test

* relax thresholds due to int8 resize layer
Successfully merging this pull request may close these issues.

Add support for quantized ONNX networks

4 participants