DNN: Reduce Layer (add dynamic batch and ReduceSum support) #22199
asmorkalov merged 3 commits into opencv:4.x from
Conversation
rogday
left a comment
Thank you for your contribution!
I think it would be great if we had a test covering this new functionality.
CV_Error(Error::StsNotImplemented, "Unsupported " + layer_type + " operation of opset 13, please try to "
                                   "re-export the onnx model with opset 11.");
We should leave this error, but change the message: we don't support the case of non-constant second input.
// Set batchsize dim as dynamic to be compatible with batch size >= 2.
if (targetShape[0] == 1 && targetShape.size() > 1)
{
    std::vector<int> dynamicAxes = {0}; // The index of batchsize dim is 0.
    std::vector<int> inputIndices = {0};

    layerParams.set("has_dynamic_shapes", true);
    layerParams.set("dynamic_axes", DictValue::arrayInt(dynamicAxes.data(), dynamicAxes.size()));
    layerParams.set("input_indices", DictValue::arrayInt(inputIndices.data(), inputIndices.size()));
}
That seems a bit off to me. Could you clarify what is happening here?
Thanks for the code review, @rogday. This part of the code handles dynamic batch size. The test can be found at reduce_sum_axis_dynamic_batch.
In the current implementation, parseReduce consists of a ReduceLayer and a ReshapeLayer.
The ReduceLayer supports dynamic batch size by default, but for the ReshapeLayer to support it, the index of the dynamic dimension must be specified manually.
So I first check that targetShape.size() > 1, and then mark index 0 as a dynamic shape.
Alternatively, we could change the judgment logic of the if at line 1306, or simply not support dynamic batch size in parseReduce.
There was a problem hiding this comment.
For example, the model input shape is [1, 4, 4, 4], and the ONNX model has one ReduceSum layer with axis 2 and keepdims = 1, but the shape of the input blob is [2, 4, 4, 4]. In parseReduce, the ReduceSum layer can output the shape [2, 4, 4], but the Reshape output is fixed to [1, 4, 1, 4] without a dynamic batchsize dim, and we get an error.
In this case, if we set index 0 as a dynamic shape, the Reshape layer will re-compute the output shape and produce [2, 4, 1, 4].
BTW, we use the Reshape layer in parseReduce because when keepdims = 1, we need to preserve the dimension of size 1.
Thank you very much for the clarification! But shouldn't it be an error to pass different-shaped inputs? If the batch size is dynamic, we should see [0, 4, 4, 4] as the model input (instead of [1, 4, 4, 4]), as far as I know.
Hi @rogday, you're right. If the user has a dynamic shape at the batchsize dimension, then the input shape should be [0, 4, 4, 4]. To my knowledge, if the input shape of the model is [1, 4, 4, 4] but the shape of the given input is [2, 4, 4, 4], we should still support such a dynamic batchsize (or multi-batch size).
To check this, I have generated two different regression tests for dynamic batchsize shapes:
- [1, 4, 4, 4] reduce_sum_axis_dynamic_batch.
- [?, 4, 4, 4] google drive
Both test cases need the code from lines 1305 to 1314. Should I upload these two regression tests to opencv_extra?
@zihaomu, I suggest adding a test verifying that reduce_sum.onnx with shape (2, 3, 4, 2) fails as it should; right now it doesn't, because of these lines:
// Support dynamic shape of Batch size.
// Note that: when there are multiple dynamic inputs, we will give an error.
if (total(outShape) != total(outShapeTmp))
{
if (outShape[0] != outShapeTmp[0])
outShape[0] = outShapeTmp[0];
}
CV_Assert(total(outShape) == total(outShapeTmp));
The LSTM problem can be fixed inside lstm_add_reshape: set has_dynamic_shapes of the Reshape layer to true if the input shape contains zeros, but this sounds fragile.
I ran all tests from the DNN test suite after commenting out the setting of batch to one, and everything passed (including whole networks) except for LSTM_Activations. If you have a use case that our tests do not cover, please add it to the tests.
I would suggest removing hasDynamicShapes altogether, but as of now it's used by SliceLayer, PoolingLayer and ReshapeLayer. Possibly we could go ahead and come up with something more robust than guessing which dimensions can and cannot represent the number of batches.
@vpisarev, can you share your thoughts?
Hi @rogday, since [0, 4, 4, 4] is currently parsed as [1, 4, 4, 4], that's why I wrote that code. Do you have any better suggestions for this? Maybe we should remove it and not support dynamic batch size in this PR. After you successfully remove this part, we can come back and support it.
I suggest adding a test that reduce_sum.onnx with shape (2, 3, 4, 2) fails as it should.
We have to make a choice: support dynamic batch size in ReduceLayer, or remove this part of the code and support dynamic batch size later.
I would suggest removing hasDynamicShapes altogether.
I agree. This is a big change; we can do it in another PR.
Hi @rogday, I have another interesting discovery. In ONNX, if we have a model with a fixed input shape like [1, 4, 4, 4] and we give the model an input whose shape is [2, 4, 4, 4], it should fail at the shape-matching stage. But for now, OpenCV only raises this type of error inside some layers' getMemoryShapes. I mean, it should fail earlier if the input shape is not correct. More specifically, this code never works for ONNX models.
In my opinion, if we want OpenCV to raise an error for reduce_sum.onnx with shape (2, 3, 4, 2), it should do so at the input shape-matching stage, instead of inside the Reduce layer.
@zihaomu, I propose to discuss it further at the weekly meeting. I agree that we should fail earlier, but sanity checks also come in handy sometimes.
Hi @asmorkalov, any update?
rogday
left a comment
Can be merged as is, we'll fix dynamic shapes on the engine level in the future.
Agree.
Link: #21078
Related ReduceSum issue #22195
Related Dynamic Batch of Reduce Layer issue #22086
In this PR, we support the two-input form of the ReduceSum layer and dynamic batch size in the Reduce Layer of ONNX_importer.cpp. Regression test.
The previous PR on Dynamic Batch of Reduce Layer has been closed.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.