DNN: Add qgemm and squeeze op13 supported on ONNXImporter#22306
DNN: Add qgemm and squeeze op13 supported on ONNXImporter#22306opencv-pushbot merged 1 commit intoopencv:4.xfrom
Conversation
| if (constBlobs.find(node_proto.input(3)) == constBlobs.end()) | ||
| CV_Error(Error::StsNotImplemented, "Variable weights is not supported"); |
There was a problem hiding this comment.
inp_sc/inp_zp below can also be non-constant - so I think we should probably have new overload of getGlob that accepts error message and does this CV_Error(Error::StSNotImplemented) stuff under the hood.
There was a problem hiding this comment.
That's a good idea. In the near future, we do not have a plan to load this type ONNX op (like QGemm with non-constant inp_sc/inp_zp). I will try to do it.
| else // ninputs == 8 | ||
| { | ||
| out_sc = getBlob(node_proto, 6); | ||
| bias = Mat::zeros(1, outCn, CV_32S); | ||
| } |
There was a problem hiding this comment.
Again, we assume that input C is skipped, but that might not be the case. And I think there should be "7" either way - the sixth tensor is just empty, no?
There was a problem hiding this comment.
Currently, I can only support 8 or 9 inputs.
If it has 9 inputs, there are two different cases: the sixth tensor is empty or not.
If it has 8 inputs, we think the C is empty, and we get the output_scale at sixth tensor.
This is not a complete solution, but it will handle most cases.
The example has 9 inputs, and with empty tensor at the sixth tensor, so we can only get the output_scale from the seventh tensor by getBlob().
There was a problem hiding this comment.
I think we should test the "else" case, because as far as I understand this paragraph, out_sc should still be the seventh blob. Either way, if we expect something to be missing - we should add an assertion for that.
In case of 8 inputs I think we don't have y_zero_point, and in case of 9 it seems that we ignore it even if we have it.
There was a problem hiding this comment.
Agree, now I get your points.
There was a problem hiding this comment.
Thank you! I see that C and y_scale tensors are being handled correctly now. But what about y_zero_point? I think we should assert that either ninputs == 8 or node_proto.input(8).empty(), or that 8'th blob contains scalar zero.
There was a problem hiding this comment.
Hi @rogday, I have checked the detail of the quantized structure in DNN. It was my fault, we have uniformly set the output_scale and output_zeropoint here. So we don't need to set them again during the node parsing stage. And in some cases, we need to calculate the outputMultiplier, that's why we need to get the output_scale.
There was a problem hiding this comment.
It seems unreasonable to calculate the outputMultiplier in the parse stage, maybe we can move it to the initialization stage of every quantized node.
There was a problem hiding this comment.
That function seems very fragile. It relies on the names the producer decided, not the official spec. I think your PR can be merged, but we need to fix this function in the future.
|
the operations are needed to supported quantized version of text recognition model that one of GSoC students is working on. |
|
@zihaomu Friendly reminder |
1 similar comment
|
@zihaomu Friendly reminder |
|
@zihaomu Please squash the commits before merge. |
1dfe8b6 to
2d837ef
Compare
|
Hi @asmorkalov, done. |
Squeeze node of ONNX has moved the axes from attribute to input.
Test Data
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.