Fix performance on resnet50 quantized models #7670
ilyachur merged 11 commits into openvinotoolkit:master
Conversation
LP transformations won't work on the model unless the last 4 inputs to FakeQuantize are constants. To meet that requirement, we need to perform constant folding for those inputs in the QuantizeLinear ONNX operator. Ticket: 65375
@@ -0,0 +1,126 @@
ir_version: 6
I don't like the approach of storing models under git. Also, what is the license of the model? Is there a way not to store this model as a file under git?
We have been using this approach for a long time, for example:
https://github.com/openvinotoolkit/openvino/tree/master/inference-engine/tests/functional/inference_engine/onnx_reader/models
https://github.com/openvinotoolkit/openvino/tree/master/ngraph/test/models
In the general case, we could keep the model as a C++ string and then read the model from a stream. That should avoid having additional files on the file system.
That doesn't work anymore, since we support only protobuf-lite. The current approach is to keep models in prototxt and convert them to ONNX at build time.
using namespace ONNXTestsDefinitions;

INSTANTIATE_TEST_SUITE_P(ONNXQuantizedModels, QuantizedModelsTests,
Can we enable these tests for other plugins?
I think that can be checked later. For now, I wanted to make sure that the CPU plugin works with ONNX low-precision models.
Please create a ticket to enable tests for other plugins in this case.
@mateusztabaka, do you have any progress on this PR? When do you think it can be merged?
From my perspective it's ready. I'm just waiting for the tests to be green.