Skip to content

cuda4dnn: enable fusion tests, update thresholds, fix eltwise fusion regression#18285

Merged
opencv-pushbot merged 1 commit intoopencv:masterfrom
YashasSamaga:cuda4dnn-update-tests
Nov 27, 2020
Merged

cuda4dnn: enable fusion tests, update thresholds, fix eltwise fusion regression#18285
opencv-pushbot merged 1 commit intoopencv:masterfrom
YashasSamaga:cuda4dnn-update-tests

Conversation

@YashasSamaga
Copy link
Copy Markdown
Contributor

@YashasSamaga YashasSamaga commented Sep 7, 2020

All tests pass on GTX 1080 Ti (cuDNN 7.6.5) and RTX 2080 Ti (cuDNN 7.6.5).

Some tests fail due to non-availability of convolution algorithms on GTX 1050 (cuDNN 8.0.5). These cannot be fixed as it's a cuDNN issue.

[  FAILED  ] 9 tests, listed below:
[  FAILED  ] Test_ONNX_layers.Convolution3D/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_ONNX_layers.Convolution3D_bias/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_ONNX_layers.PoolConv3D/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_ONNX_nets.Resnet34_kinetics/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_layers.Convolution3D/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_layers.concat_3d/1, where GetParam() = CUDA/CUDA_FP16
unknown file: Failure
C++ exception with description "OpenCV(4.5.1-pre) /fakepath/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/convolution.hpp:303: error: (-217:Gpu API call) cuDNN did not return a suitable algorithm for convolution. in function 'ConvolutionAlgorithm'

This PR also fixes an issue that arose because of a check from #18554 that disabled valid fusions.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#mking-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu-cuda:18.04

@asmorkalov
Copy link
Copy Markdown
Contributor

@YashasSamaga Do you have progress on the patch? Do you need help with testing on different hardware?

@YashasSamaga
Copy link
Copy Markdown
Contributor Author

YashasSamaga commented Nov 21, 2020

@asmorkalov I will finish this in a few hours. I am testing on GTX 1050, 2080 Ti and 1080 Ti. There are probably tests failing on other devices. It's hard to get these tests to pass on all devices since cuDNN chooses different algorithms for convolution on different devices.

@YashasSamaga YashasSamaga force-pushed the cuda4dnn-update-tests branch 3 times, most recently from fc4f9c6 to e9a717e Compare November 21, 2020 08:54
#endif

if (pinsToKeep.count(lpNext) != 0)
if (IS_DNN_OPENCL_TARGET(preferableTarget) && pinsToKeep.count(lpNext) != 0)
Copy link
Copy Markdown
Contributor Author

@YashasSamaga YashasSamaga Nov 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change fixes a regression (affects master only -- 3.4 is fine) from #18554 (that landed in master through merge PR #18556). The CUDA backend can optionally fuse an activation before eltwise (for YOLO like networks) and also optionally fuse an activation after eltwise (for ResNet like networks). Hence, this check is wrong for CUDA backend. This check prevents valid fusions and causes the fusion tests to fail (Yay! The tests are doing their job!).

It's correct for OpenCL since it always fuses an activation after eltwise. However, this check is redundant on master. The fusion logic downstream will actually do the same check. But this check will help break the fusion process early.

So I have added a target check to ensure that it's only used in OpenCL targets.

break;

if (!nextData->params.has("operation") || toLowerCase(nextData->params.get<String>("operation")) == "sum")
if (IS_DNN_OPENCL_TARGET(preferableTarget))
Copy link
Copy Markdown
Contributor Author

@YashasSamaga YashasSamaga Nov 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added an OpenCL target check to prevent eltwise operation and coeffs checks running in other targets. These checks are strictly not required for the CUDA backend as it has its own checks that appear prior to these checks. This OpenCL checks would still be redo the checks in a different way (although redundant) but it causes coupling of the two backends. I just thought decoupling OpenCL and CUDA checks might prevent future surprises (they already have different fusion capabilities).

I am not sure if this change is really required. It seems to be messing up git diff because of the indentation added across the whole block.

testONNXModels("deconvolution_output_shape", npy, 0, 0, false, false);
testONNXModels("deconv_adjpad_2d", npy, 0, 0, false, false);
if (target != DNN_TARGET_CUDA_FP16) // bug
testONNXModels("deconv_adjpad_2d", npy, 0, 0, false, false);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case has five tests. Only the last test fails for CUDA FP16 target. I haven't skipped since the other tests are still useful but not skipping might make it harder to track (and enable it in the future after the bug is fixed).

Maybe it would be a good idea to split this tests into multiple tests (or at least split the one that is failing into its own test).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split this tests into multiple

Agreed.

/cc @l-bat @dkurt @sl-sergei

@YashasSamaga YashasSamaga marked this pull request as ready for review November 21, 2020 12:45
@YashasSamaga YashasSamaga changed the title cuda4dnn: enable fusion tests, update thresholds cuda4dnn: enable fusion tests, update thresholds, fix eltwise fusion regression Nov 21, 2020
Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done! Thank you for contribution 👍

@opencv-pushbot opencv-pushbot merged commit df18431 into opencv:master Nov 27, 2020
@alalek alalek mentioned this pull request Nov 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: dnn category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants