Skip to content

changed names of permutations if Reshpe is in NHWC#22448

Merged
asmorkalov merged 1 commit intoopencv:4.xfrom
Ichini24:reshape-permutations-fix
Sep 13, 2022
Merged

changed names of permutations if Reshpe is in NHWC#22448
asmorkalov merged 1 commit intoopencv:4.xfrom
Ichini24:reshape-permutations-fix

Conversation

@Ichini24
Copy link
Copy Markdown

@Ichini24 Ichini24 commented Aug 30, 2022

Fixes #21290
opencv_extra: opencv/opencv_extra#1004

Pull Request Readiness Checklist

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Original issue:
#21290

Found out, why my tf models wasn't supported by latest version of OpenCV. The problem was, that in graph layers with their id's were looking like:
0: input
1: reshape
2: conv

But while building graph in OpenCV it was looking like:
0: input
1: reshape//nhwc
2: reshape
3: reshape/nchw
4: conv

The problem is while getting memory shapes and connecting nodes with each other, the algorithm is connecting them by names, so the chain of layers ids was not: 0 -> 1 -> 2 -> 3 -> 4, but 0 -> 1 -> 2 -> 4.
There's 2 ways of solving this issue:

  1. Place the original name as the latest in this permutation chain, which may be not 100% correct, but it fixes bug.
  2. Change the input layer name on conv layer(see above in example). But there's no interface for such a fix.

@fengyuentau fengyuentau added category: dnn pr: needs test New functionality requires minimal tests set labels Aug 31, 2022
@fengyuentau fengyuentau self-assigned this Aug 31, 2022
@fengyuentau
Copy link
Copy Markdown
Member

Thank you for your contribution. Could you provide a test for this patch?

@Ichini24
Copy link
Copy Markdown
Author

@fengyuentau sure, but how do I do it? Should I place small pb graph to opencv_extra and write test for this patch in dnn module?

@asmorkalov
Copy link
Copy Markdown
Contributor

@Ichini24 Yes, you are right. You need to create small model with several layers to reproduce the issue, submit PR with the same branch name to extra and add test case to DNN test set. Extra patch should include input and expected output tensors for check. You can use on of onnx tests as reference.

@Ichini24
Copy link
Copy Markdown
Author

Ichini24 commented Sep 1, 2022

Sorry for the delay, tried to understand, how there all work. So, I prepared little model with 28x28x3 input shape, so the graph size is 19.2 kB. on the disk. I have several questions about test.

  1. Cause the model is small, not sure, that it would be correct to await for any output result to compare it with predefined tensor. In my fix, not getting an exception is already the prove, that fix is working. You can see the original issue, everything is covered there. Will it be ok just to Assert that output is not empty?
  2. I've watched the test_tf_importer.cpp and placed my test there. Can I use Test_TensorFlow_layers class for my issue?

@fengyuentau
Copy link
Copy Markdown
Member

Since the issue happens in forwarding, I suggest you should provide the reference data for output checking as well, especially when you want to reuse Test_TensorFlow_layers which needs reference data.

Copy link
Copy Markdown
Member

@fengyuentau fengyuentau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix. Your changes look good to me. We can move on when you provide the test case.

Comment on lines -1129 to +1137
addPermuteLayer(order, name + "/nchw", inpId);

setName = changedType ? name : name + "/nchw";
addPermuteLayer(order, setName, inpId);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reviewers:
This issue is triggered when this if condition is entered. addPermuteLayer sets the output name to name + "/nchw", leading to the incorrect input fetched by the next layer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need name + "/nchw" node at the end? I don't think other layers know this node by name.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fengyuentau, it's a refactoring PR that broke history a little bit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I was investigating it already, but didn't get the answer.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rogday I've investigated this. So, there's a sample in tests, where the next condition is not met:
https://github.com/opencv/opencv/pull/22448/files#diff-f5872c76075aa1f06f357d1dc27448bf1a02a28b79e62d92e8b7aa47311b57dfR1105

It's not met cause newShape has values: [1, 2, 4, 3].
So, there will not be permutation before the actual Reshape node.
The test sample is: reshape_layer with the dnn/tensorflow/reshape_layer_net.pb graph. Which has only input and Reshape layer(look at the screenshot).
Screenshot from 2022-09-06 19-20-01

The reference output has shape: output shape: (1, 3, 2, 4).
So, the loaded graph in opencv will look like:
input->Reshape->Reshape/nchw.

Also this behaviour can be observed in not_implemented_layer_net.pb graph, which has 2 inputs, 1 is going to not_implemented_layer, the 2nd is going to reshape, after that outputs from this layers are multiplied.

So, the problem, that you've covered may be defined like:
1). Input layer of reshape has NHWC layout.
2). By some reason(the question here is why don't we reshape, if condition newShape.at(1) == 1 is not met) the permutation from HNWC to NCHW layouts is not performed.
3). Next Reshape is performed only with NCHW data layout.
4). Data is always permuted from NCHW to NHWC data format at the end if the source of Reshape in original graph is NHWC/ Even if it wasn't permuted from NHWC to NCHW at input.

I don't see any reasons, why there's no permutation at the start. Cause usually when you have graph, where all layers have 1 data layout.
I can investigate it lately, but this is not what this PR is about. Also the changing of naming the original node name to last permutation layer breaks several tests.

@Ichini24
Copy link
Copy Markdown
Author

Ichini24 commented Sep 2, 2022

Done everything like you said. Here's the link to the opencv-extra pr:
opencv/opencv_extra#1004

@fengyuentau
Copy link
Copy Markdown
Member

I checked the model you provided. I think you should add a shape-sensitive layer after realdiv, such as conv in your original model, since realdiv is elementwise and it runs no matter what the input shape is.

@Ichini24
Copy link
Copy Markdown
Author

Ichini24 commented Sep 3, 2022

Updated pb graph with the convolution layer at the end and updated output array. Please, check.

@fengyuentau fengyuentau removed the pr: needs test New functionality requires minimal tests set label Sep 3, 2022
Copy link
Copy Markdown
Member

@fengyuentau fengyuentau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash commits into one. Others look good to me 👍

@Ichini24 Ichini24 force-pushed the reshape-permutations-fix branch 2 times, most recently from 78ca124 to 337452b Compare September 3, 2022 22:22
@Ichini24
Copy link
Copy Markdown
Author

Ichini24 commented Sep 3, 2022

I've squashed opencv and opencv_extra commits, please check.

Copy link
Copy Markdown
Member

@zihaomu zihaomu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! 👍
It's a workaround for me.

@rogday
Copy link
Copy Markdown
Member

rogday commented Sep 6, 2022

Could you check if this fixes #20433? And link it to this PR, if it does?

@rogday rogday linked an issue Sep 12, 2022 that may be closed by this pull request
Copy link
Copy Markdown
Member

@rogday rogday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for now, but we should come back to this layer in the future, this cascade of conditions seems very fragile.

@asmorkalov asmorkalov merged commit c2c8da2 into opencv:4.x Sep 13, 2022
@alalek alalek mentioned this pull request Jan 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lost support of tf nhwc models reshape - pass trough layer in Tensorflow .pb file error

6 participants