Add support for quantized ONNX networks #20188
Description
It's possible to quantize ONNX networks to reduce storage requirements and accelerate inference: https://www.onnxruntime.ai/docs/how-to/quantization.html
However, OpenCV 4.x/pre-5.0 is unable to load such networks because it lacks support for the QLinearConv and QLinearMatMul layers that they contain.
It would be nice to add support for these layers to OpenCV. By default, the weights can be converted to FP32 (or perhaps FP16), but the original INT8 weights should be preserved as well, since we will be adding fixed-point paths to our implementations of the convolution and fully-connected layers.
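To sketch what the FP32 conversion involves: ONNX linear quantization stores a tensor as INT8 values plus a scale and a zero point, so recovering the real-valued weights of a QLinearConv node amounts to computing `(q - zero_point) * scale`. The helper and sample values below are purely illustrative, not part of OpenCV's API:

```python
import numpy as np

def dequantize(q, scale, zero_point):
    # ONNX linear dequantization: real = (q - zero_point) * scale.
    # Widen to int32 first so the subtraction cannot overflow int8.
    return (q.astype(np.int32) - np.int32(zero_point)).astype(np.float32) * scale

# Illustrative INT8 weight tensor with a per-tensor scale/zero-point,
# as a QLinearConv initializer would store it (values made up for the example)
w_int8 = np.array([[-128, 0], [64, 127]], dtype=np.int8)
w_scale, w_zero_point = 0.02, 0

w_fp32 = dequantize(w_int8, w_scale, w_zero_point)  # FP32 weights for a float conv path
```

A loader following the approach proposed here would run this conversion once at import time for the default FP32 path, while keeping `w_int8`, `w_scale`, and `w_zero_point` around for the future fixed-point kernels.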
For testing, here is the original ONNX model:
https://drive.google.com/file/d/1JW6_zrgzjeSZQcKKEDhTvp3aseNu0pe9/view?usp=sharing
and its quantized variant:
https://drive.google.com/file/d/1RHkF8pGMfo0covNR0_GQhB11JvrzogFO/view?usp=sharing
(provided by @SamFC10)