Skip to content

cuda4dnn(Eltwise, Power): fix eltwise fusion, enable more eltwise fusion, fix power fusion#17939

Merged
mshabunin merged 2 commits intoopencv:masterfrom
YashasSamaga:cuda4dnn-fix-eltwise-fusion
Aug 1, 2020
Merged

cuda4dnn(Eltwise, Power): fix eltwise fusion, enable more eltwise fusion, fix power fusion#17939
mshabunin merged 2 commits intoopencv:masterfrom
YashasSamaga:cuda4dnn-fix-eltwise-fusion

Conversation

@YashasSamaga
Copy link
Copy Markdown
Contributor

@YashasSamaga YashasSamaga commented Jul 24, 2020

Eltwise layer can create two types of operators: ShortcutOp and EltwiseOp. There is a dynamic cast to EltwiseOpBase to retrieve configuration to check fusion compatibility. No check is made to see if the cast was successful (required since some EltwiseLayer use cuda4dnn::ShortcutOp which is not derived from EltwiseOpBase).

fixes #17934
fixes #17946 for master
fixes cuda part of #17964

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under OpenCV (BSD) License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@YashasSamaga
Copy link
Copy Markdown
Contributor Author

YashasSamaga commented Jul 24, 2020

To ensure that this doesn't happen again and to catch future bugs, I was thinking of adding a test for every possible fusion in the CUDA backend.

  1. create test_cuda_layers.hpp
  2. create test templates for:
    • conv + activation
    • conv + eltwise + activation
    • conv + eltwise
    • conv + activation + eltwise
  3. add a test case for conv + shortcut

I would create a test template that would create the above scenarios. It will compare the outputs of the CUDA backend (and maybe others?) against OCV CPU backend without fusions.

@dkurt @alalek Is this ok?

@mshabunin
Copy link
Copy Markdown
Contributor

@YashasSamaga , yes, generic fusion tests would be useful for all backends.

Is this PR ready to be merged?

@YashasSamaga
Copy link
Copy Markdown
Contributor Author

YashasSamaga commented Jul 27, 2020

@mshabunin I was waiting for a reply for the tests. I will finish this PR in maybe in another 6-8 hours.

EDIT: might need more time. Tests with power activation are failing for both OpenCL and CUDA and I am not able to figure out why.

@dkurt
Copy link
Copy Markdown
Member

dkurt commented Jul 27, 2020

I agree that it would be nice to have the generic tests. You may build it in runtime, without test data like here: https://github.com/opencv/opencv/blob/master/modules/dnn/test/test_layers.cpp

@YashasSamaga YashasSamaga force-pushed the cuda4dnn-fix-eltwise-fusion branch from 6a523f1 to 885e240 Compare July 29, 2020 12:47
@YashasSamaga YashasSamaga marked this pull request as ready for review July 29, 2020 13:31
@YashasSamaga
Copy link
Copy Markdown
Contributor Author

YashasSamaga commented Jul 29, 2020

PR is ready. There is an unrelated test from calib3d that is failing in Win64 OpenCL CI build that prevents the PR from turning green. All tests are passing locally (CUDA + OpenCL).

Results of fusion tests #17976 in 4.4.0 with debug printfs enabled: https://gist.github.com/YashasSamaga/354b2ef2cf570dcd5beb8fc91c657b84

@YashasSamaga YashasSamaga changed the title cuda4dnn(Eltwise): fix segfault while checking eltwise fusion compatiblity cuda4dnn(Eltwise, Power): fix eltwise fusion, enable more eltwise fusion, fix power fusion Jul 29, 2020
Copy link
Copy Markdown
Member

@dkurt dkurt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@mshabunin mshabunin self-assigned this Aug 1, 2020
@mshabunin mshabunin merged commit f53f491 into opencv:master Aug 1, 2020
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
…e-fusion

* fix eltwise fusion segfault, more eltwise fusions, fix power fusion

* add assertion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug category: dnn category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib

Projects

None yet

Development

Successfully merging this pull request may close these issues.

dnn(opencl, cuda): missed eltwise fusion opportunity DNN model crashes on detect for Yolo with CUDA backend

4 participants