Skip to content

Fixed ONNXImporter#16424

Merged
alalek merged 6 commits intoopencv:3.4from
czgdp1807:issue-16370
Feb 15, 2020
Merged

Fixed ONNXImporter#16424
alalek merged 6 commits intoopencv:3.4from
czgdp1807:issue-16370

Conversation

@czgdp1807
Copy link
Copy Markdown
Contributor

@czgdp1807 czgdp1807 commented Jan 24, 2020

Merge with extra: opencv/opencv_extra#701

resolves #16370

This pullrequest changes

This Pull Request aims to fix ONNXImporter. Currently, there are few bugs in processing of various layer_type from .onnx files. The referenced issue reports a bug in Split layer. Though I have observed that there are more bugs. I have listed them below as tasks. I will update them every commit.

  • "Split" layer fixed.
  • Tests added
force_builders=Custom,Custom Win,Custom Mac
build_image:Custom=ubuntu-openvino-2020.1.0:16.04
build_image:Custom Win=openvino-2020.1.0
build_image:Custom Mac=openvino-2020.1.0

test_modules:Custom=dnn,python2,python3,java
test_modules:Custom Win=dnn,python2,python3,java
test_modules:Custom Mac=dnn,python2,python3,java

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
test_opencl:Custom=OFF
test_bigdata:Custom=1
test_filter:Custom=*

@dkurt
Copy link
Copy Markdown
Member

dkurt commented Jan 24, 2020

Thank you!

Can you please for each fixed case provide a simple one-layer test by the following script: https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/onnx/generate_onnx_models.py?

@czgdp1807
Copy link
Copy Markdown
Contributor Author

Hi, @dkurt
I was trying to import the ONNX model after making a fix to split layer. However, other layers were not processed correctly like, when layer_type is Constant, attribute_proto.t() is returning an empty tensor. I suspect there is a problem with .onnx file shared in #16370 (comment) as the visualization of the model doesn't have any Constant layer but the graph_proto is showing to have it and that too with string data type which is not supported by getMatFromTensors.
I am still trying to figure out what's not working correctly. I will add the tests once the model is processed correctly. Thank you.

@czgdp1807 czgdp1807 changed the title [WIP] Fixed ONNXImporter Fixed ONNXImporter Jan 28, 2020
@czgdp1807
Copy link
Copy Markdown
Contributor Author

Update:
Hi,
I will be having my mid semester examinations till 7th February(IST). So the work on this may get slowed down. Please do not close this PR until then. I am on it and I will put efforts to complete it from 7th Feb. Thank you and apologies.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

Hi,
I am unable to figure out how to run the tests for this PR locally.
I tried running, python3 generate_onnx_models.py under /opencv_project/opencv_extra/testdata/dnn/onnx but got,

maxpooling input has sizes torch.Size([1, 3, 10, 9])
maxpooling output has sizes torch.Size([1, 3, 3, 3])

Segmentation fault (core dumped)

Do we have to build the code after generating models or they are generated automatically during the build?
I am unable to see logs at https://pullrequest.opencv.org/buildbot/builders/precommit_linux64/builds/24598/steps/test_dnn

@alalek
Copy link
Copy Markdown
Member

alalek commented Feb 8, 2020

unable to see logs

These logs were expired.
I rescheduled new builds, please check.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

Thanks @alalek
After fixing a probable bug in Constant layer, I observed that ArgMax hasn't been handled. I got the following exception,

 what():  OpenCV(3.4.9-dev) /home/czgdp1807/opencv_project/opencv/modules/dnn/src/dnn.cpp:561: error: (-2:Unspecified error) Can't create layer "LabelId" of type "ArgMax" in function 'getLayerInstance'

Should I add a new file under layers for this and some more uncovered layers(ArgMin)?

@dkurt
Copy link
Copy Markdown
Member

dkurt commented Feb 8, 2020

@czgdp1807, Can you please dump a subgraph from the model which includes ArgMin node? Just to understand which layer it tries to perform.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

I have attached a screenshot below of the last few layers. The ArgMax isn't created as the case for it isn't covered as far as I am able to figure out.
Screenshot from 2020-02-09 00-41-46

@dkurt
Copy link
Copy Markdown
Member

dkurt commented Feb 8, 2020

Got it, thanks! This is training only nodes so author should exclude them from deployment model (you may try to load and save model but before export call .eval() method or specify SoftMax as output.

So let's focus on Split layer but ask author to try to remove training nodes.

image

@czgdp1807
Copy link
Copy Markdown
Contributor Author

czgdp1807 commented Feb 8, 2020

Thanks. IMO, this could be one of the ways as the last few layers(the layers which are extracting the label on the left side.) aren't making any difference to the result, though the interpretation would be different.
I will drop the last commit from here and will add test files for Split layer in my other PR.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

Please restart the tests here as I have made an update to opencv_extra.
Thanks.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

Please restart the tests here as I have made an update to opencv_extra.
Thanks.

+1

@alalek
Copy link
Copy Markdown
Member

alalek commented Feb 13, 2020

Amend commit (no changes are required, just update commit date) and push "new" changes to this PR:

git commit --amend
git push <remote> <branch> --force

@czgdp1807
Copy link
Copy Markdown
Contributor Author

Done. Locally the tests were passing. Let's see how it goes here.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

I have applied the comments and reviews. Let me know if anything else is to be done. Tests should pass here.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

The tests passed. Is it good to go?

@dkurt
Copy link
Copy Markdown
Member

dkurt commented Feb 13, 2020

@czgdp1807, some of test from Inference Engine backend are failed:

[ RUN      ] Test_ONNX_layers.Split/1, where GetParam() = NGRAPH/CPU
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:66: Failure
Expected: (normL1) <= (l1), actual: 0.5 vs 1e-05
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:69: Failure
Expected: (normInf) <= (lInf), actual: 1 vs 0.0001
FALLBACK: Layer [Slice]:[1] is expected to has backend implementation
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:66: Failure
Expected: (normL1) <= (l1), actual: 0.5 vs 1e-05
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:69: Failure
Expected: (normInf) <= (lInf), actual: 1 vs 0.0001
FALLBACK: Layer [Slice]:[1] is expected to has backend implementation
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:66: Failure
Expected: (normL1) <= (l1), actual: 0.5 vs 1e-05
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:69: Failure
Expected: (normInf) <= (lInf), actual: 1 vs 0.0001
FALLBACK: Layer [Slice]:[1] is expected to has backend implementation
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:66: Failure
Expected: (normL1) <= (l1), actual: 0.5 vs 1e-05
/build/precommit_custom_mac/3.4/opencv/modules/dnn/test/test_common.impl.hpp:69: Failure
Expected: (normInf) <= (lInf), actual: 1 vs 0.0001
FALLBACK: Layer [Slice]:[1] is expected to has backend implementation

I can take a look and tell how to fix it

On the other hand, other Split tests are skipped.

So please add these lines at the beginning of test:

    if (backend == DNN_BACKEND_INFERENCE_ENGINE_NN_BUILDER_2019)
        applyTestTag(CV_TEST_TAG_DNN_SKIP_IE_NN_BUILDER);
    if (backend == DNN_BACKEND_INFERENCE_ENGINE_NGRAPH)
        applyTestTag(CV_TEST_TAG_DNN_SKIP_IE_NGRAPH);

@czgdp1807
Copy link
Copy Markdown
Contributor Author

How to run the failing tests locally, as no test was failing on, ./opencv_test_dnn.

@dkurt
Copy link
Copy Markdown
Member

dkurt commented Feb 14, 2020

@czgdp1807, it's optional backend for deep learning module: https://github.com/opencv/opencv/wiki/Intel%27s-Deep-Learning-Inference-Engine-backend (OpenVINO). Anyway there are other tests for split which are failed as well. So let's skip it for now but fix them all separately.

@czgdp1807
Copy link
Copy Markdown
Contributor Author

Sure. Can we open a separate issue for the remaining failures in Split layer? I will work on it in a different PR.

@dkurt
Copy link
Copy Markdown
Member

dkurt commented Feb 14, 2020

Yep, let's open a separate issue to not to miss. Thanks!

@czgdp1807
Copy link
Copy Markdown
Contributor Author

I have updated the code and opened a separate issue. Please let me know, if any more update is to be done in this PR.

Copy link
Copy Markdown
Member

@dkurt dkurt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thanks for your contribution!

@dkurt dkurt self-assigned this Feb 14, 2020
@alalek alalek merged commit a6f3a21 into opencv:3.4 Feb 15, 2020
@czgdp1807 czgdp1807 deleted the issue-16370 branch February 15, 2020 12:01
@alalek alalek mentioned this pull request Feb 20, 2020
jshiwam pushed a commit to jshiwam/opencv that referenced this pull request Feb 22, 2020
* fixed Split layer in ONNXImporter

* added test for fix of split layer

* fixed tests for Split layer

* applied reviews

* updated tests

* fixed paths in tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DNN: Required argument "split" not found into dictionary

4 participants