cuda4dnn: enable fusion tests, update thresholds, fix eltwise fusion regression#18285
Conversation
@YashasSamaga Do you have progress on the patch? Do you need help with testing on different hardware?

@asmorkalov I will finish this in a few hours. I am testing on GTX 1050, 2080 Ti and 1080 Ti. There are probably tests failing on other devices. It's hard to get these tests to pass on all devices since cuDNN chooses different algorithms for convolution on different devices.
force-pushed from fc4f9c6 to e9a717e
force-pushed from 7abe9f3 to 0f8ab05
```diff
 #endif
-if (pinsToKeep.count(lpNext) != 0)
+if (IS_DNN_OPENCL_TARGET(preferableTarget) && pinsToKeep.count(lpNext) != 0)
```
This change fixes a regression (it affects master only -- 3.4 is fine) from #18554 (which landed in master through merge PR #18556). The CUDA backend can optionally fuse an activation before eltwise (for YOLO-like networks) and also optionally fuse an activation after eltwise (for ResNet-like networks). Hence, this check is wrong for the CUDA backend: it prevents valid fusions and causes the fusion tests to fail (yay, the tests are doing their job!).

The check is correct for OpenCL, since OpenCL always fuses an activation after eltwise. However, it is redundant on master: the fusion logic downstream performs the same check anyway. Its only benefit is that it lets the fusion process break out early.

So I have added a target check to ensure that it is applied only for OpenCL targets.
```diff
 break;
+if (IS_DNN_OPENCL_TARGET(preferableTarget))
 if (!nextData->params.has("operation") || toLowerCase(nextData->params.get<String>("operation")) == "sum")
```
I have added an OpenCL target check so that the eltwise operation and coeffs checks do not run for other targets. These checks are strictly not required for the CUDA backend, which performs its own equivalent checks earlier. Leaving the OpenCL checks shared would merely redo those checks in a different (redundant) way, but it couples the two backends. Decoupling the OpenCL and CUDA checks might prevent future surprises, since the two backends already have different fusion capabilities.

I am not sure this change is really required. It also clutters the git diff because of the indentation added across the whole block.
```diff
 testONNXModels("deconvolution_output_shape", npy, 0, 0, false, false);
-testONNXModels("deconv_adjpad_2d", npy, 0, 0, false, false);
+if (target != DNN_TARGET_CUDA_FP16) // bug
+    testONNXModels("deconv_adjpad_2d", npy, 0, 0, false, false);
```
This test case has five tests, and only the last one fails on the CUDA FP16 target. I haven't skipped the whole test case since the other tests are still useful, but not skipping might make the failure harder to track (and to re-enable once the bug is fixed).

Maybe it would be a good idea to split this test case into multiple tests (or at least move the failing one into its own test).
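A possible split could look like the following fragment (hypothetical test name; `applyTestTag` and the skip tag follow the pattern used elsewhere in OpenCV's dnn tests):

```cpp
// Hypothetical split: the failing case gets its own test so it can be
// skipped (and later re-enabled) precisely, without hiding the other four.
TEST_P(Test_ONNX_layers, Deconvolution_AdjPad2D)
{
    if (target == DNN_TARGET_CUDA_FP16)
        applyTestTag(CV_TEST_TAG_DNN_SKIP_CUDA_FP16);  // tracked bug
    testONNXModels("deconv_adjpad_2d", npy, 0, 0, false, false);
}
```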
alalek left a comment
Well done! Thank you for the contribution 👍
All tests pass on GTX 1080 Ti (cuDNN 7.6.5) and RTX 2080 Ti (cuDNN 7.6.5).
Some tests fail on GTX 1050 (cuDNN 8.0.5) because cuDNN does not provide a suitable convolution algorithm there; this cannot be fixed on our side, as it is a cuDNN issue.
This PR also fixes an issue that arose because of a check from #18554 that disabled valid fusions.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.