DNN model crashes on detect for Yolo with CUDA backend #17934

##### System information (version)
- OpenCV => 4.4
- Operating System / Platform => Ubuntu 18.04 64 Bit and Ubuntu 18.04 Arm64
- Compiler => gcc
- Nvidia driver => 440.82
- CUDA => 10.2

##### Detailed description

I get a segmentation fault when calling the `detect` function of the [**DNN detection model**](https://docs.opencv.org/master/d3/df1/classcv_1_1dnn_1_1DetectionModel.html) from **Python** 3.6. This happens only with the **CUDA** backend; running on the CPU works fine. I am running a custom yolov3 config and weights, trained with [Darknet](https://github.com/AlexeyAB/darknet/).
The same code works fine with older versions of OpenCV. The segmentation fault happens on both current master and the 4.4.0 release. As a workaround, I am now using OpenCV at commit 0ccc839397e50f8df27424aef20827f678e96cea and the extra modules at [468345511f94ca54079c739f47e64e2520a6f8e9](https://github.com/opencv/opencv_contrib/commit/468345511f94ca54079c739f47e64e2520a6f8e9). I use those only because I know that master worked for me at that point in time; newer commits may work as well.

I ran my code under GDB, and `backtrace` gave me this output:
```
#0  0x00007f877de6d123 in cv::dnn::dnn4_v20200609::Net::Impl::fuseLayers(std::vector<cv::dnn::dnn4_v20200609::LayerPin, std::allocator<cv::dnn::dnn4_v20200609::LayerPin> > const&) ()
    at /usr/local/lib/libopencv_dnn.so.4.4
#1  0x00007f877de6ecd5 in cv::dnn::dnn4_v20200609::Net::Impl::allocateLayers(std::vector<cv::dnn::dnn4_v20200609::LayerPin, std::allocator<cv::dnn::dnn4_v20200609::LayerPin> > const&) ()
    at /usr/local/lib/libopencv_dnn.so.4.4
#2  0x00007f877de7201f in cv::dnn::dnn4_v20200609::Net::Impl::setUpNet(std::vector<cv::dnn::dnn4_v20200609::LayerPin, std::allocator<cv::dnn::dnn4_v20200609::LayerPin> > const&) ()
    at /usr/local/lib/libopencv_dnn.so.4.4
#3  0x00007f877de73363 in cv::dnn::dnn4_v20200609::Net::forward(cv::_OutputArray const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) () at /usr/local/lib/libopencv_dnn.so.4.4
#4  0x00007f877dfd9442 in cv::dnn::dnn4_v20200609::Model::Impl::predict(cv::dnn::dnn4_v20200609::Net&, cv::Mat const&, cv::_OutputArray const&) () at /usr/local/lib/libopencv_dnn.so.4.4
#5  0x00007f877dfdda84 in cv::dnn::dnn4_v20200609::DetectionModel::detect(cv::_InputArray const&, std::vector<int, std::allocator<int> >&, std::vector<float, std::allocator<float> >&, std::vector<cv::Rect_<int>, std::allocator<cv::Rect_<int> > >&, float, float) () at /usr/local/lib/libopencv_dnn.so.4.4
#6  0x00007f8787999ab5 in pyopencv_cv_dnn_dnn_DetectionModel_detect(_object*, _object*, _object*) () at /usr/local/lib/python3.6/dist-packages/cv2/python-3.6/cv2.cpython-36m-x86_64-linux-gnu.so
```

I usually run the 64-bit code inside Docker, using this base image: [nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04](https://hub.docker.com/layers/nvidia/cuda/10.0-cudnn7-devel-ubuntu18.04/images/sha256-e277b9eef79d6995b10d07e30228daa9e7d42f49bcfc29d512c1534b42d91841?context=explore). The crash also happens outside of Docker, though.

OpenCV is built from source using this CMake command:
`cmake -DBUILD_opencv_python2=OFF -DWITH_OPENGL=ON -DFORCE_VTK=ON -DWITH_TBB=ON -DWITH_GDAL=ON -DWITH_XINE=ON -DENABLE_PRECOMPILED_HEADERS=OFF -DWITH_GSTREAMER=ON -DWITH_FFMPEG=ON -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules -DWITH_CUDA=ON ..`
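
To see how a build produced by a command like the one above was actually configured, the build summary can be inspected from Python. This is only a diagnostic sketch: `build_info_value` and `has_cuda_dnn_support` are hypothetical helper names, and the row labels `NVIDIA CUDA` and `cuDNN` are assumptions about the text returned by `cv2.getBuildInformation()`, so they may need adjusting for other OpenCV versions.

```python
def build_info_value(build_info, key):
    """Return the text after 'key:' in a build-information dump, or None."""
    for line in build_info.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return value.strip()
    return None


def has_cuda_dnn_support(build_info):
    """True if both the CUDA and cuDNN rows report YES.

    Assumption: the rows are labelled 'NVIDIA CUDA' and 'cuDNN'.
    """
    cuda = build_info_value(build_info, "NVIDIA CUDA") or ""
    cudnn = build_info_value(build_info, "cuDNN") or ""
    return cuda.startswith("YES") and cudnn.startswith("YES")


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not installed")
    else:
        # Prints whether this build reports both CUDA and cuDNN as enabled.
        print(has_cuda_dnn_support(cv2.getBuildInformation()))
```

A `False` result would only show that the build summary does not list both components; it does not by itself explain the segmentation fault.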


##### Steps to reproduce
Sample code showing how I use the DNN model:

```
import cv2

# config_file, weights_file, size, frame, score_thresh and nms_thresh
# are defined elsewhere in my code.
net = cv2.dnn.readNetFromDarknet(config_file, weights_file)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
model = cv2.dnn_DetectionModel(net)
model.setInputSize(size)
model.setInputScale(1.0 / 255)
class_names = open('data/config/coco.names').read().strip().split('\n')

# Resize the frame only if it does not already match the network input size.
height, width, channels = frame.shape
if size[0] != width or size[1] != height:
    frame_resized = cv2.resize(frame, size, interpolation=cv2.INTER_LINEAR)
else:
    frame_resized = frame

# crashes here
class_ids, confidences, boxes = model.detect(frame_resized, confThreshold=score_thresh, nmsThreshold=nms_thresh)
```

The yolov3 config used is close to this [template](https://github.com/AlexeyAB/darknet/blob/master/cfg/csresnext50-panet-spp-original-optimal.cfg).
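
Since the crash only shows up with the CUDA backend, one way to compare the two configurations is to make the backend switchable. The following is a minimal sketch, not the code I actually run: `backend_and_target` and `configure` are hypothetical helpers, and `dnn` is assumed to be the `cv2.dnn` module (passed in so the selection logic can be read on its own).

```python
def backend_and_target(dnn, use_cuda):
    """Pick the (backend, target) pair for the requested mode."""
    if use_cuda:
        return dnn.DNN_BACKEND_CUDA, dnn.DNN_TARGET_CUDA
    # Default OpenCV backend on the CPU, the configuration that works for me.
    return dnn.DNN_BACKEND_OPENCV, dnn.DNN_TARGET_CPU


def configure(net, dnn, use_cuda):
    """Apply the selected backend and target to a cv2.dnn network."""
    backend, target = backend_and_target(dnn, use_cuda)
    net.setPreferableBackend(backend)
    net.setPreferableTarget(target)
    return backend, target
```

With a real network this would be called as `configure(net, cv2.dnn, use_cuda=False)` before constructing the detection model, which makes it easy to confirm that the same frame and weights run through on the CPU.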
