Fix cuda11 #17499

Merged
alalek merged 4 commits into opencv:master from cyyever:fix_CUDA11 on Jun 27, 2020

@cyyever
Contributor

@cyyever cyyever commented Jun 8, 2020

force_builders=Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu-cuda11:18.04
Xbuild_image:Custom=ubuntu-cuda11:18.04

@asmorkalov asmorkalov linked an issue Jun 8, 2020 that may be closed by this pull request
@asmorkalov
Contributor

@cyyever The solution for cuDNN is not complete. Please take a look at #17496 and the previous attempt: #17238.

@cyyever
Contributor Author

cyyever commented Jun 16, 2020

> @cyyever The solution for cuDNN is not complete. Please take a look at #17496 and the previous attempt: #17238.

Because OpenCV may use the system FindCUDA.cmake instead of the modified one, we must have some way to get rid of CUDA_nppicom_LIBRARY. Since FindCUDA.cmake is deprecated in favor of FindCUDAToolkit.cmake in newer CMake versions, we could consider using it.
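
One way to read "get rid of CUDA_nppicom_LIBRARY" is to keep only the NPP component variables that actually resolved. The following is a minimal sketch under that assumption, not the patch from this PR: the hard-coded paths stand in for what `find_library()` would cache, and the three-component list and the `npp_libs` variable are hypothetical. It runs in script mode with `cmake -P`.

```cmake
# Sketch only: collect the NPP components that were actually located.
# The values below are stand-ins for cached find_library() results.
set(CUDA_nppc_LIBRARY    "/usr/local/cuda/lib64/libnppc.so")
set(CUDA_nppicom_LIBRARY "CUDA_nppicom_LIBRARY-NOTFOUND")  # not found with CUDA 11.0
set(CUDA_npps_LIBRARY    "/usr/local/cuda/lib64/libnpps.so")

set(npp_libs "")
foreach(component nppc nppicom npps)
  # if(<variable>) is false when the value ends in -NOTFOUND
  if(CUDA_${component}_LIBRARY)
    list(APPEND npp_libs "${CUDA_${component}_LIBRARY}")
  else()
    message(STATUS "Skipping missing NPP component: ${component}")
  endif()
endforeach()

message(STATUS "NPP libraries: ${npp_libs}")
```

Filtering on the `-NOTFOUND` suffix this way does not depend on the CUDA version, so it would behave the same whether the system FindCUDA.cmake or a modified copy populated the variables.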

@alalek
Member

alalek commented Jun 16, 2020

@cyyever Which CMake versions provide the new CUDA scripts? We can do something like this (for CMake 3.9 we use CMake's FindCUDA scripts)

@cyyever
Contributor Author

cyyever commented Jun 17, 2020

> @cyyever Which CMake versions provide the new CUDA scripts? We can do something like this (for CMake 3.9 we use CMake's FindCUDA scripts)

FindCUDAToolkit.cmake has been available since CMake 3.17. It doesn't provide a single npp target, only targets for the individual NPP components, so we need to scan the code to decide which components are used and link them.
See npp
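
As a rough illustration of linking individual components, the `CMakeLists.txt` fragment below uses the `CUDA::<component>` imported targets that FindCUDAToolkit defines. The target name `npp_demo`, the source file `main.cpp`, and the choice of `nppc`, `nppial` and `npps` are hypothetical. Which components OpenCV really needs would have to come from scanning the code, as noted above.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit first shipped in 3.17
project(npp_components_sketch LANGUAGES CXX)

find_package(CUDAToolkit REQUIRED)

# Hypothetical target and source file, used only for illustration.
add_executable(npp_demo main.cpp)

# Hypothetical selection: link only the NPP components the code really uses.
foreach(component nppc nppial npps)
  if(TARGET CUDA::${component})
    target_link_libraries(npp_demo PRIVATE CUDA::${component})
  else()
    message(WARNING "NPP component not available: ${component}")
  endif()
endforeach()
```

The `if(TARGET ...)` guard keeps configuration from failing outright when a toolkit release drops a component.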

@cyyever
Contributor Author

cyyever commented Jun 20, 2020

> @cyyever Which CMake versions provide the new CUDA scripts? We can do something like this (for CMake 3.9 we use CMake's FindCUDA scripts)

I committed a new patch.

Contributor

@tomoaki0705 tomoaki0705 left a comment

I don't think it's a good idea to hard-code the CUDA version.
I hope you will consider my comment.

@PeterBowman

See also upstream workaround targeted for CMake 3.18.x: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4929.

@tomoaki0705
Contributor

It seems reasonable to hard-code the CUDA version in FindCUDA.cmake.
I'll withdraw my comments.

@cyyever
Contributor Author

cyyever commented Jun 24, 2020

> See also upstream workaround targeted for CMake 3.18.x: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4929.

I noticed that, but we can't force our users to upgrade CMake, so it is of little help here.

Member

@alalek alalek left a comment

In general looks good to me 👍

@asmorkalov asmorkalov self-assigned this Jun 26, 2020
@alalek
Member

alalek commented Jun 26, 2020

I enabled a build image with CUDA 11 based on https://gitlab.com/nvidia/container-images/cuda/-/tree/master/dist/11.0/ubuntu18.04-x86_64.

CMake summary:
--   NVIDIA CUDA:                   YES (ver 11.0, CUFFT CUBLAS)
--     NVIDIA GPU arch:             61
--     NVIDIA PTX archs:
-- 
--   cuDNN:                         YES (ver 8.0.0)
CMake CUDA vars:
CUDA_64_BIT_DEVICE_CODE=ON
CUDA_64_BIT_DEVICE_CODE_DEFAULT=ON
CUDA_ARCH_BIN=6.1
CUDA_ARCH_BIN_OR_PTX_10=0
CUDA_ARCH_PTX=
CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE=ON
CUDA_BUILD_CUBIN=OFF
CUDA_BUILD_EMULATION=OFF
CUDA_COMMON_GPU_ARCHITECTURES=3.0;3.5;5.0;5.2;6.0;6.1;7.0;7.0+PTX
CUDA_CUBLAS_LIBRARIES=/usr/local/cuda/lib64/libcublas.so;CUDA_cublas_device_LIBRARY-NOTFOUND
CUDA_CUDART_LIBRARY=/usr/local/cuda/lib64/libcudart.so
CUDA_CUDART_LIBRARY_VAR=CUDA_cudart_static_LIBRARY
CUDA_CUDA_LIBRARY=/usr/local/cuda/lib64/stubs/libcuda.so
CUDA_CUFFT_LIBRARIES=/usr/local/cuda/lib64/libcufft.so
CUDA_FAST_MATH=OFF
CUDA_FOUND=TRUE
CUDA_GENERATED_OUTPUT_DIR=
CUDA_GENERATION=
CUDA_HAS_FP16=TRUE
CUDA_HOST_COMPILATION_CPP=ON
CUDA_HOST_COMPILER=/usr/bin/cc
CUDA_INCLUDE_DIRS=/usr/local/cuda/include
CUDA_KNOWN_GPU_ARCHITECTURES=Fermi;Kepler;Maxwell;Kepler+Tegra;Kepler+Tesla;Maxwell+Tegra;Pascal;Volta
CUDA_LIBRARIES=cudart_static;-lpthread;dl;rt
CUDA_LIBRARIES_ABS=/usr/local/cuda/lib64/libcudart_static.a;-lpthread;dl;/usr/lib/x86_64-linux-gnu/librt.so
CUDA_LIBS_PATH=/usr/local/cuda/lib64;/usr/lib/x86_64-linux-gnu
CUDA_LINK_LIBRARIES_KEYWORD=PRIVATE
CUDA_NVCC_EXECUTABLE=/usr/local/cuda/bin/nvcc
CUDA_NVCC_FLAGS=-ccbin;/usr/bin/cc;-gencode;arch=compute_61,code=sm_61;-D_FORCE_INLINES
CUDA_NVCC_FLAGS_DEBUG=
CUDA_NVCC_FLAGS_MINSIZEREL=
CUDA_NVCC_FLAGS_RELEASE=
CUDA_NVCC_FLAGS_RELWITHDEBINFO=
CUDA_NVCC_INCLUDE_DIRS_USER=
CUDA_PROPAGATE_HOST_FLAGS=ON
CUDA_SDK_ROOT_DIR=CUDA_SDK_ROOT_DIR-NOTFOUND
CUDA_SDK_ROOT_DIR_INTERNAL=CUDA_SDK_ROOT_DIR-NOTFOUND
CUDA_SDK_SEARCH_PATH=CUDA_SDK_ROOT_DIR-NOTFOUND;/usr/local/cuda/local/NVSDK0.2;/usr/local/cuda/NVSDK0.2;/usr/local/cuda/NV_CUDA_SDK;/home/build/NVIDIA_CUDA_SDK;/home/build/NVIDIA_CUDA_SDK_MACOSX;/Developer/CUDA
CUDA_SEPARABLE_COMPILATION=OFF
CUDA_STUB_ENABLED_LINK_WORKAROUND=1
CUDA_STUB_SYMLINK_RESULT=0
CUDA_STUB_TARGET_PATH=/build/precommit_custom_linux/build/CMakeFiles/
CUDA_SUPPORTED_CC=3.0 3.5 3.7 5.0 5.2 6.0 6.1 7.0 7.5 8.0
CUDA_TOOLKIT_INCLUDE=/usr/local/cuda/include
CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
CUDA_TOOLKIT_ROOT_DIR_INTERNAL=/usr/local/cuda
CUDA_TOOLKIT_TARGET_DIR=/usr/local/cuda
CUDA_TOOLKIT_TARGET_DIR_INTERNAL=/usr/local/cuda
CUDA_USE_STATIC_CUDA_RUNTIME=ON
CUDA_VERBOSE_BUILD=OFF
CUDA_VERSION=11.0
CUDA_VERSION_MAJOR=11
CUDA_VERSION_MINOR=0
CUDA_VERSION_STRING=11.0
CUDA_configuration_types=Debug;Release;MinSizeRel;RelWithDebInfo
CUDA_cublas_LIBRARY=cublas
CUDA_cublas_LIBRARY_ABS=/usr/local/cuda/lib64/libcublas.so
CUDA_cublas_device_LIBRARY=CUDA_cublas_device_LIBRARY-NOTFOUND
CUDA_cudadevrt_LIBRARY=/usr/local/cuda/lib64/libcudadevrt.a
CUDA_cudart_static_LIBRARY=/usr/local/cuda/lib64/libcudart_static.a
CUDA_cufft_LIBRARY=cufft
CUDA_cufft_LIBRARY_ABS=/usr/local/cuda/lib64/libcufft.so
CUDA_cupti_LIBRARY=CUDA_cupti_LIBRARY-NOTFOUND
CUDA_curand_LIBRARY=/usr/local/cuda/lib64/libcurand.so
CUDA_cusolver_LIBRARY=/usr/local/cuda/lib64/libcusolver.so
CUDA_cusparse_LIBRARY=/usr/local/cuda/lib64/libcusparse.so
CUDA_make2cmake=/usr/share/cmake-3.10/Modules/FindCUDA/make2cmake.cmake
CUDA_npp_LIBRARY=nppc;nppial;nppicc;nppidei;nppif;nppig;nppim;nppist;nppisu;nppitc;npps
CUDA_npp_LIBRARY_ABS=/usr/local/cuda/lib64/libnppc.so;/usr/local/cuda/lib64/libnppial.so;/usr/local/cuda/lib64/libnppicc.so;/usr/local/cuda/lib64/libnppidei.so;/usr/local/cuda/lib64/libnppif.so;/usr/local/cuda/lib64/libnppig.so;/usr/local/cuda/lib64/libnppim.so;/usr/local/cuda/lib64/libnppist.so;/usr/local/cuda/lib64/libnppisu.so;/usr/local/cuda/lib64/libnppitc.so;/usr/local/cuda/lib64/libnpps.so
CUDA_nppc_LIBRARY=/usr/local/cuda/lib64/libnppc.so
CUDA_nppial_LIBRARY=/usr/local/cuda/lib64/libnppial.so
CUDA_nppicc_LIBRARY=/usr/local/cuda/lib64/libnppicc.so
CUDA_nppicom_LIBRARY=CUDA_nppicom_LIBRARY-NOTFOUND
CUDA_nppidei_LIBRARY=/usr/local/cuda/lib64/libnppidei.so
CUDA_nppif_LIBRARY=/usr/local/cuda/lib64/libnppif.so
CUDA_nppig_LIBRARY=/usr/local/cuda/lib64/libnppig.so
CUDA_nppim_LIBRARY=/usr/local/cuda/lib64/libnppim.so
CUDA_nppist_LIBRARY=/usr/local/cuda/lib64/libnppist.so
CUDA_nppisu_LIBRARY=/usr/local/cuda/lib64/libnppisu.so
CUDA_nppitc_LIBRARY=/usr/local/cuda/lib64/libnppitc.so
CUDA_npps_LIBRARY=/usr/local/cuda/lib64/libnpps.so
CUDA_nvcuvid_LIBRARY=CUDA_nvcuvid_LIBRARY-NOTFOUND
CUDA_parse_cubin=/usr/share/cmake-3.10/Modules/FindCUDA/parse_cubin.cmake
CUDA_rt_LIBRARY=/usr/lib/x86_64-linux-gnu/librt.so
CUDA_run_nvcc=/usr/share/cmake-3.10/Modules/FindCUDA/run_nvcc.cmake
CUDNN_FOUND=TRUE
CUDNN_INCLUDE_DIR=/usr/include
CUDNN_INCLUDE_DIRS=/usr/include
CUDNN_LIBRARIES=cudnn
CUDNN_LIBRARIES_ABS=/usr/lib/x86_64-linux-gnu/libcudnn.so
CUDNN_LIBRARY=/usr/lib/x86_64-linux-gnu/libcudnn.so
CUDNN_MAJOR_VERSION=8
CUDNN_MINOR_VERSION=0
CUDNN_PATCH_VERSION=0
CUDNN_VERSION=8.0.0

Current errors are:

In file included from /build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/csl/cudnn.hpp:8:0,
                 from /build/precommit_custom_linux/opencv/modules/dnn/src/layers/../op_cuda.hpp:11,
                 from /build/precommit_custom_linux/opencv/modules/dnn/src/layers/blank_layer.cpp:43:
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/convolution.hpp: In constructor 'cv::dnn::cuda4dnn::csl::cudnn::ConvolutionAlgorithm<T>::ConvolutionAlgorithm(const cv::dnn::cuda4dnn::csl::cudnn::Handle&, const cv::dnn::cuda4dnn::csl::cudnn::ConvolutionDescriptor<T>&, const cv::dnn::cuda4dnn::csl::cudnn::FilterDescriptor<T>&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<T>&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<T>&)':
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/convolution.hpp:266:21: error: 'CUDNN_CONVOLUTION_FWD_PREFER_FASTEST' was not declared in this scope
                     CUDNN_CONVOLUTION_FWD_PREFER_FASTEST,
                     ^
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/csl/cudnn/cudnn.hpp:22:53: note: in definition of macro 'CUDA4DNN_CHECK_CUDNN'
     ::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, __FILE__, __LINE__)
                                                     ^~~~
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/convolution.hpp:266:21: note: suggested alternative: 'CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3'
                     CUDNN_CONVOLUTION_FWD_PREFER_FASTEST,
                     ^
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/csl/cudnn/cudnn.hpp:22:53: note: in definition of macro 'CUDA4DNN_CHECK_CUDNN'
     ::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, __FILE__, __LINE__)
                                                     ^~~~
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/transpose_convolution.hpp: In constructor 'cv::dnn::cuda4dnn::csl::cudnn::TransposeConvolutionAlgorithm<T>::TransposeConvolutionAlgorithm(const cv::dnn::cuda4dnn::csl::cudnn::Handle&, const cv::dnn::cuda4dnn::csl::cudnn::ConvolutionDescriptor<T>&, const cv::dnn::cuda4dnn::csl::cudnn::FilterDescriptor<T>&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<T>&, const cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<T>&)':
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/transpose_convolution.hpp:42:21: error: 'CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST' was not declared in this scope
                     CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST,
                     ^
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/csl/cudnn/cudnn.hpp:22:53: note: in definition of macro 'CUDA4DNN_CHECK_CUDNN'
     ::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, __FILE__, __LINE__)
                                                     ^~~~
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/transpose_convolution.hpp:42:21: note: suggested alternative: 'CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT'
                     CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST,
                     ^
/build/precommit_custom_linux/opencv/modules/dnn/src/layers/../cuda4dnn/csl/cudnn/cudnn.hpp:22:53: note: in definition of macro 'CUDA4DNN_CHECK_CUDNN'
     ::cv::dnn::cuda4dnn::csl::cudnn::detail::check((call), CV_Func, __FILE__, __LINE__)
                                                     ^~~~

@YashasSamaga Could you please suggest a fix for that?

@YashasSamaga
Contributor

YashasSamaga commented Jun 26, 2020

@alalek

I have a version that compiles without errors here: https://github.com/YashasSamaga/opencv/tree/cuda4dnn-cudnn8-support

Those compile-time errors can be resolved but there are some other non-trivial problems. I gave up eventually as I wasn't able to get everything working. Many tests required increasing thresholds and some tests completely failed. Some tests had OOM errors during initialization (which wasn't happening before). Some tests failed with CUDNN_STATUS_NOT_SUPPORTED or CUDNN_STATUS_EXECUTION_FAILED. I haven't been able to fix some of these (and don't even know why it's happening).

@YashasSamaga
Contributor

YashasSamaga commented Jun 26, 2020

[ RUN      ] Test_ONNX_nets.Resnet34_kinetics/1, where GetParam() = CUDA/CUDA_FP16
Convolution Algorithms Available:
CUDNN_STATUS_NOT_SUPPORTED 1 0 0 -1
CUDNN_STATUS_NOT_SUPPORTED 0 0 0 -1
CUDNN_STATUS_NOT_SUPPORTED 2 0 0 -1
CUDNN_STATUS_NOT_SUPPORTED 5 0 0 -1
CUDNN_STATUS_NOT_SUPPORTED 4 0 0 -1
CUDNN_STATUS_NOT_SUPPORTED 7 0 0 -1
CUDNN_STATUS_NOT_SUPPORTED 6 0 0 -1
CUDNN_STATUS_NOT_SUPPORTED 3 0 0 -1
unknown file: Failure
C++ exception with description "OpenCV(4.4.0-pre) /opencv/modules/dnn/src/layers/../cuda4dnn/primitives/../csl/cudnn/convolution.hpp:303: error: (-217:Gpu API call) cuDNN did not return a suitable algorithm for convolution. in function 'ConvolutionAlgorithm'
" thrown in the test body.
[  FAILED  ] Test_ONNX_nets.Resnet34_kinetics/1, where GetParam() = CUDA/CUDA_FP16 (365 ms)

cudnnGetConvolutionForwardAlgorithm_v7 does not return any algorithm that works. It returns a bunch of algorithms that all have their status set to CUDNN_STATUS_NOT_SUPPORTED. I think this is a bug in cuDNN 8. The behaviour of the API does not conform to what's stated in the documentation. The release notes do not mention anything about algorithms being removed or support for some architectures being removed.


Tests that can be corrected by thresholds:

[  FAILED  ] DNNTestNetwork.MobileNet_SSD_v1_TensorFlow/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] DNNTestNetwork.MobileNet_SSD_v1_TensorFlow_Different_Width_Height/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] DNNTestNetwork.DenseNet_121/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_Darknet_nets.TinyYoloVoc/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_Caffe_layers.Conv_Elu/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_Model.DetectRegion/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_ONNX_nets.TinyYolov2/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_ONNX_nets.LResNet100E_IR/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_layers.Convolution3D/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_nets.MobileNet_v1_SSD/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_nets.MobileNet_v1_SSD_PPN/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_nets.EfficientDet/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_Torch_nets.FastNeuralStyle_accuracy/1, where GetParam() = CUDA/CUDA_FP16

Tests that fail completely:

[  FAILED  ] Test_ONNX_layers.PoolConv3D/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_ONNX_nets.Resnet34_kinetics/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_ONNX_layers.Convolution3D/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_layers.concat_3d/1, where GetParam() = CUDA/CUDA_FP16

cuDNN does not return a suitable convolution algorithm for these tests.

Tests failing due to OOM errors aren't listed. They are taken care of by a memory check in the algorithm selection loop. I don't know if it's a bug in cuDNN or a genuine out-of-memory error but the fix is working.

I don't know if the tests should be updated to reflect the changes, since the cuDNN 8 release is a preview version.


TensorFlow's cuDNN 8 PR tensorflow/tensorflow#39577 skips WINOGRAD_NONFUSED algorithms for both forward and backward pass. I don't know why.

@alalek
Member

alalek commented Jun 26, 2020

@YashasSamaga Thank you for the update!

Could you please copy your comment with progress about CUDNN 8.0 support here: #17496 (issue about CUDNN 8.0 support)?

I will temporarily disable cuDNN 8.0 usage in this PR so that it does not block your PRs (just revert the disabling part).

Member

@alalek alalek left a comment

Looks good to me 👍


ocv_option(OPENCV_DNN_CUDA "Build with CUDA support" HAVE_CUDA AND HAVE_CUBLAS AND HAVE_CUDNN)
if(NOT DEFINED OPENCV_DNN_CUDA AND HAVE_CUDNN AND CUDNN_VERSION VERSION_LESS 8.0)
message(STATUS "DNN: CUDNN 8.0 is not supported yes. Details: https://github.com/opencv/opencv/issues/17496")

"yet", not "yes". There is a typo :)

Member

Hopefully this message will be removed soon =)

Thanks for this, but I am still getting an error:
"OpenCV/modules/cudacodec/src/precomp.hpp:59:14: fatal error: nvcuvid.h: No such file or directory
59 | #include <nvcuvid.h>"

@alalek alalek merged commit 206c843 into opencv:master Jun 27, 2020
@YashasSamaga
Contributor

YashasSamaga commented Jun 29, 2020

cmake .. -DWITH_CUDA=on -DOPENCV_EXTRA_MODULES_PATH=/opencv_contrib/modules -DWITH_CUDNN=ON in a clean build directory.

I ran into this after rebasing onto master:

CMake Error at cmake/OpenCVDetectCUDA.cmake:34 (if):
  if given arguments:

    "CUDA_VERSION" "VERSION_GREATER_EQUAL" "11.0"

  Unknown arguments specified
Call Stack (most recent call first):
  cmake/OpenCVFindLibsPerf.cmake:43 (include)
  CMakeLists.txt:688 (include)


-- Configuring incomplete, errors occurred!

Removing this PR's commit stops this error.

@tomoaki0705
Contributor

Which CMake version?
If I remember correctly, VERSION_GREATER_EQUAL is a relatively new operator (added in a CMake release later than OpenCV's minimum requirement).
I guess we need to use NOT VERSION_GREATER or something similar.

@YashasSamaga
Contributor

cmake version 3.5.1

VERSION_GREATER_EQUAL was added in CMake 3.7.

@tomoaki0705
Contributor

Yep, it doesn't exist up to 3.6.
It was added in 3.7.

@alalek
Member

alalek commented Jun 29, 2020

VERSION_GREATER_EQUAL => NOT CUDA_VERSION VERSION_LESS 11.0 should fix the problem.
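
The suggested rewrite can be tried in isolation with a small script. This is a sketch, assuming `CUDA_VERSION` holds a plain version string. The hard-coded value and the messages are placeholders, not the contents of cmake/OpenCVDetectCUDA.cmake.

```cmake
# Run with: cmake -P version_check.cmake  (the file name is arbitrary)
set(CUDA_VERSION "11.0")  # stand-in for the value FindCUDA would set

# Equivalent to "CUDA_VERSION VERSION_GREATER_EQUAL 11.0", but also accepted
# by CMake releases older than 3.7, which lack VERSION_GREATER_EQUAL.
if(NOT CUDA_VERSION VERSION_LESS "11.0")
  message(STATUS "CUDA ${CUDA_VERSION}: taking the CUDA 11+ branch")
else()
  message(STATUS "CUDA ${CUDA_VERSION}: taking the pre-CUDA 11 branch")
endif()
```

Because `VERSION_LESS` is available in every CMake release OpenCV supports, the negated form avoids the "Unknown arguments specified" error reported above with CMake 3.5.1.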

@asmorkalov
Contributor

#17700

NALLEIN pushed a commit to NALLEIN/opencv that referenced this pull request Sep 20, 2020
author Boubacar <boubacar.diallo1@macaulay.cuny.edu> 1590059814 -0400
committer NALLEIN <nigelts@sjtu.edu.cn> 1600613098 +0800

parent b083c20

group contour articles together

chk divide 0

Generate constructor with smart pointer, if it's expected.

Merge pull request opencv#17403 from wangmengHB:master

Fix Test Case: in latest version, window.cv is a promise instance that makes most test case failed.

* Fix Browser Test Case: In latest version, window.cv is a promise instance

In latest version of opencv.js, window.cv is promise instance.
So that most of the test cases is run failed.
This commit is to fix browser test case.

* Add comment for backward compatible

Add comments for backward compatible

Fixed cascadedetect convert from old cascade to new

dnn(test): file 'dnn/efficientdet-d0.pb' is optional

dnn: add a human parsing cpp sample

Merge pull request opencv#17417 from vpisarev:fix_fitellipse

* improved fitEllipse and fitEllipseDirect accuracy in singular or close-to-singular cases (see issue opencv#9923)

* scale points using double precision

* added normalization to fitEllipseAMS as well; fixed Java test case by raising the tolerance (it's unclear what is the correct result in this case).

* improved point perturbation a bit. make the code a little bit more clear

* trying to fix Java fitEllipseTest by slightly raising the tolerance threshold

* synchronized C++ version of Java's fitEllipse test

* removed trailing whitespaces

core: fix builds with eigen helper header

dnn(ie): fix layers extraction

videoio(ffmpeg): fix handling of AVERROR_EOF

decoder should be properly flushed after that

Add instructions for how to use findEssentialMat() when camera matrices are different

Added countNonZero test for big arrays and disable IPP for some cases

Fixed virtual try on sample

build: winpack_dldt with dldt 2020.3.0

https://github.com/openvinotoolkit/openvino/releases/tag/2020.3.0

Add handling for Android "NDK (Side by side)"

Merge pull request opencv#17431 from mshabunin:support-vtk9

* Added VTK 9 support

Skip some GAPI tests if VideoCapture is not capable to playback video.

Fix '--help' of stitching_detailed.py sample

Fixes the help for `--features`, previously listed all possible values as default value.

Also adds the default value to the help for two other arguments

Add WebGPU-Dawn support for dnn module

Add test

Build with -g4

Add test

Add console for test temporarily

Add test

Fix build issue

Add asyncForwardWrapper by using Asyncify

Delete outputs

Add test for dnn webgpu backend

Remove pre-build files and add README.md to provide compilation reference

Update dawn

Add layer test

videoio: fix plugins build

Merge pull request opencv#17165 from komakai:objc-binding

Objc binding

* Initial work on Objective-C wrapper

* Objective-C generator script; update manually generated wrappers

* Add Mat tests

* Core Tests

* Imgproc wrapper generation and tests

* Fixes for Imgcodecs wrapper

* Miscellaneous fixes. Swift build support

* Objective-C wrapper build/install

* Add Swift wrappers for videoio/objdetect/feature2d

* Framework build;iOS support

* Fix toArray functions;Use enum types whenever possible

* Use enum types where possible;prepare test build

* Update test

* Add test runner scripts for iOS and macOS

* Add test scripts and samples

* Build fixes

* Fix build (cmake 3.17.x compatibility)

* Fix warnings

* Fix enum name conflicting handling

* Add support for document generation with Jazzy

* Swift/Native fast accessor functions

* Add Objective-C wrapper for calib3d, dnn, ml, photo and video modules

* Remove IntOut/FloatOut/DoubleOut classes

* Fix iOS default test platform value

* Fix samples

* Revert default framework name to opencv2

* Add converter util functions

* Fix failing test

* Fix whitespace

* Add handling for deprecated methods;fix warnings;define __OPENCV_BUILD

* Suppress cmake warnings

* Reduce severity of "jazzy not found" log message

* Fix incorrect #include of compatibility header in ios.h

* Use explicit returns in subscript/get implementation

* Reduce minimum required cmake version to 3.15 for Objective-C/Swift binding

3rdparty: update TBB 2020.1 => 2020.2

https://github.com/oneapi-src/oneTBB/releases/tag/v2020.2

pre: OpenCV 4.4.0 (version++)

ffmpeg/4.4: update FFmpeg wrapper

- FFmpeg 4.2.3

dnn/NGraph: added nullptr checks

QRDetectMulti: refactored checkPoints method

use C++11 static variables as memory barrier

allow multiple inputs to resize, fix tests

Merge pull request opencv#16955 from themechanicalcoder:text_recognition

* add text recognition sample

* fix pylint warning

* made changes according to the c++ example

* fix errors

* add text recognition sample

* update text detection sample

Merge pull request opencv#17368 from themightyoarfish:cv2eigen-doc

* Add documentation about usage of cv2eigen functions in eigen.hpp

* Fixed Doxygen syntax.

Co-authored-by: Alexander Smorkalov <smorkalov.a.m@gmail.com>

dnn/NGraph: added nullptr checks

Added information to OpenCV documentation [MacOS]

Added and Edited specific information to the "Installation in MacOS" OpenCV documentation.
Closes opencv#17340

improve the mkl search procedure

dnn: use OpenVINO 2020.3 defines

Removed error listener usage

improve mish performance and accuracy

Removed plugin dispatcher

Merge pull request opencv#16772 from aDanPin:dp/performance_render_tests

Added g-api render performance tests

* Add render performance tests for BGROCV

* Add render NV12 performance tests

* Review response

* Review response

* Review response

* Review response

* Review response

* Review response

* Just a small fix

* Final review response I hope)

* Review response

* Review response

* Review response

* Review response

* Review response

* Review response

platforms/android: fix --no_samples_build flag not working

Fix framework_name option in build script

Fix testFitEllipse test

Use cv::Ptr instead of raw pointers

Cleanup unneeded raw pointer handling code

Merge pull request opencv#17534 from YashasSamaga:cuda4dnn-remove-unused-funcs

cuda4dnn: reduce CUDA version requirements to at least CUDA 9.2

* remove half2 specializations

* do not remove atomicAdd for half in CUDA 10 and below

* remove fp16.hpp

Fix typo

This typo just made me lose my mind on the conan package update. please merge.

core: fix handling of ND-arrays in dumpInputArray() helpers

Removed plugin dispatcher

backport of commit 7411373

Changed StridedSlice to VariadicSplit in Region layer

Merge pull request opencv#17468 from liqi-c:sharedlib_build_problem

TEngine installation rules fix for static build

* Modify cmake config error for -DBUILD_SHARED_LIBS=OFF

* Modify for not install tengine include directory

* Update compile error.

* move install command to tengine/CMakeLists.txt

* rm include dir when make install,only build static lib will install libtengine.a

Merge pull request opencv#17573 from alexcohn:fix/android_windows_build

* fixing opencv#17572

opencv#17572 Build for Android failed: "can't concat str to bytes"

on Windows 10 64bit with python 3.6.6

* similar to changes in platforms/winpack_dldt/build_package.py

Fix the build of imgproc using MinGW (variables with the same name as symbols defined in MinGW headers)

fix VS Windows build with eigen. opencv#17548

core(logger): complete initialization of logger structures

- for using of logging functions from global destructors

Fix the detection of the XIMEA library (since its location may be different when the version of the ximea software is updated)

Conditional compilation for network reader

Remove deprecated Inference Engine CPU extensions

use fp32 mish for fp16 mish

docs: linkfix in bibliography

The [current link](https://arxiv.org/pdf/1808.01752) goes to a
random unrelated paper.

Disabling default NMS in yolo layer

Remove deprecated Inference Engine CPU extensions

Conditional compilation for network reader

original commit: 63e92cc

Optimize Mish for CPU backend

Add implementation in case plaidml isn't found

Enable state initialization params via compile_args

Conditional compilation for IR v7 support

Merge pull request opencv#17020 from dbudniko:dbudniko/serialization_backend

G-API Serialization routines

* Serialization backend in tests, initial version

* S11N/00: A Great Rename

- "Serialization" is too long and too error-prone to type,
  so now it is renamed to "s11n" everywhere;
- Same applies to "SRLZ";
- Tests also renamed to start with 'S11N.*' (easier to run);
- Also updated copyright years in new files to 2020.

* S11N/01: Some basic interface segregation

- Moved some details (low-level functions) out of serialization.hpp;
- Introduced I::IStream and I::OStream interfaces;
- Implemented those via the existing [De]SerializationStream classes;
- Moved all operators to use interfaces instead of classes;
- Moved the htonl/ntohl handling out of operators (to the classes).

The implementation didn't change much, it is a subject to the further
refactoring

* S11N/02: Basic operator reorg, basic tests, vector support

- Reorganized operators on atomic types to follow >>/<< model
  (put them closer in the code for the respective types);
- Introduce more operators for basic (scalar) types;
- Drop all vector s11n overloads -- replace with a generic
  (template-based) one;
- Introduced a new test suite where low-level s11n functionality
  is tested (for the basic types).

* S11N/03: Operators reorganization

- Sorted the Opaque types enum by complexity;
- Reorganized the existing operators for basic types, also ordered by
  complexity;
- Organized operators in three groups (Basics, OpenCV, G-API);
- Added a generic serialization for variant<>;
- Reimplemented some of the existing operators (for OpenCV and G-API
  data structures);
- Introduced new operators for cv::gimpl data types. These operators
  (and so, the data structures) are not yet used in the graph
  dump/reconstruction routine, it will be done as a next step.

* S11N/04: The Great Clean-up

- Drop the duplicates of GModel data structures from the
  serialization, serialize the GModel data structures themselves
  instead (hand-written code replaced with operators).
- Also removed unused code for printing, etc.

* S11N/05: Internal API Clean-up

- Minimize the serialization API to just Streams and Operators;
- Refactor and fix the graph serialization (deconstruction and
  reconstruction) routines, fix data addressing problems there;
- Move the serialization.[ch]pp files to the core G-API library

* S11N/06: Top-level API introduction

- !!!This is likely the most invasive commit in the series!!!
- Introduced a top-level API to serialize and deserialize a GComputation
- Extended the compiler to support both forms of a GComputation:
  an expression based and a deserialized one. This has led to changes in
  the cv::GComputation::Priv and in its dependent components (even the
  transformation tests);
- Had to extend the kernel API (GKernel) with extra information on
  operations (mainly `outMeta`) which was only available for expression
  based graphs. Now the `outMeta` can be taken from kernels too (and for
  the deserialized graphs it is the only way);
- Revisited the internal serialization API, had to expose previously
  hidden entities (like `GSerialized`);
- Extended the serialized graph info with new details (object counter,
  protocol). Added unordered_map generic serialization for that;
- Reworked the very first pipeline test to be "proper"; GREEN now, the rest
  is to be reworked in the next iteration.

* S11N/07: Tests reworked

- Moved the sample pipeline tests w/serialization to
  test the public API (`cv::gapi::serialize`, then
  followed by `cv::gapi::deserialize<>`). All GREEN.
- As a consequence, dropped the "Serialization" test
  backend as no longer necessary.

* S11N/08: Final touches

- Exposed the C++ native data types at Streams level;
- Switched the ByteMemoryIn/OutStreams to store data in `char`
  internally (2x less memory for sample pipelines);
- Fixed and refactored Mat dumping to the stream;
- Renamed S11N pipeline tests to their new meaning.

* linux build fix

* fix RcDesc and int uint warnings

* more Linux build fix

* white space and virtual android error fix (attempt)

* more warnings to be fixed

* android warnings fix attempt

* one more attempt for android build fix

* android warnings one more fix

* return back override

* avoid size_t

* static deserialize

* and how do you like this, elon? anonymous namespace  to fix android warning.

* static inline

* trying to fix standalone build

* mat dims fix

* fix mat r/w for standalone

Co-authored-by: Dmitry Matveev <dmitry.matveev@intel.com>

Dynamic build for Objective-C/Swift wrapper

Merge pull request opencv#17499 from cyyever:fix_CUDA11

Fix cuda11

* use cudnn_version.h to detect version when it is available

* remove nppi from CUDA11

* use ocv_list_filterout

* dnn(cuda): temporary disable CUDNN 8.0

cmake(cuda): update handling of -std=c++11/14 flags

cmake(gapi): fix opencv_world build for winpack

min max fix for standalone

3rdparty: libjpeg 9d

http://www.ijg.org/files/jpegsrc.v9d.tar.gz

3rdparty: libjpeg-turbo 2.0.4 => 2.0.5

https://github.com/libjpeg-turbo/libjpeg-turbo/releases/tag/2.0.5

Fix: error in the dimension used for computeMinMax

Instead of using the current dimension for which we just got a big span,
we were computing Min and Max for the previous dimension stored in cutfeat
(and using 0 instead of the dimension index for the very first dimension
with "span > (1-eps)max_span")

Conditional compilation for IR v7 support

backported commit 8690575

Optim: test that could be done once has been extracted from the loop

Merge pull request opencv#17642 from pemmanuelviel:pev--fixes-and-clean

* Clean: make the use of the indices array length consistent

Either we don't want this method to be used in the future for any other node
than the root node, and so we replace indices_length by size_ and remove it as
argument, or we want to be able to use it potentially for other nodes, and
so using size_ instead of indices_length would have led to a bug.

* Fix: b was not an address

* Fix: transpose the Flann repo commit "Fixes in accum_dist methods" from Adil Ibragimov

Avoids trying to compute log(ratio) with ratio = 0

* Fix: transpose the Flann repo commit "result_set bugfix" from Jack Rae

* Fix Jack Rae commit as the initial i - 1 index was decremented before entering the loop body

* Clean: transpose the Flann repo commit "Updated comments in lsh_index" from Richard McPherson

* Fix: Transpose the Flann repo commit "Fixing unreachable code in lsh_table.h" from hypevr

* Fix warning the same way it was done in flann standalone repo

* Change the return value in case of unsupported type
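
The "Fixes in accum_dist methods" item above mentions avoiding log(ratio) with ratio = 0. A minimal sketch of that kind of guard, assuming a Kullback-Leibler style term a * log(a / b); the names `klTerm` and `klDistance` are hypothetical and this is not the FLANN source:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not the FLANN source): one term of a
// Kullback-Leibler style distance, a * log(a / b).
// The term is skipped when either input is zero, so log() never sees
// a ratio of 0 and the division never sees b == 0.
inline double klTerm(double a, double b)
{
    if (a == 0.0 || b == 0.0)
        return 0.0;
    const double ratio = a / b;
    return ratio > 0.0 ? a * std::log(ratio) : 0.0;
}

// Sums the guarded terms over two equally sized arrays.
inline double klDistance(const double* a, const double* b, std::size_t n)
{
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += klTerm(a[i], b[i]);
    return sum;
}
```

Treating a zero entry as contributing nothing is one possible convention; the actual commit may handle the case differently.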

Increased portability of CV_Func

add if block for indexed color images

Fix the 'cvflann::anyimpl::bad_any_cast' error using Lsh

Add test checking we don't throw when creating GenericIndex with LshIndexParams()

Restored compatibility with CMake older than 3.7.

Type consistency for all xxxIndexParams integer arguments as well as with miniflann's LshIndexParams

Merge pull request opencv#17454 from creinders:master

fix unstable fisheye undistortPoints

* remove artefacts when (un)distorting fisheye images with large distortion coefficient values

* fix fisheye undistortion when theta is close to zero

* add fisheye image undistort and distort test

* Fixed type conversion warnings

* fixed trailing whitespace

Merge pull request opencv#17756 from ilyachur:feature/ichuraev/fix_ngraph_headers

* Fixed header paths for some nGraph ops

* Added dependency on IE version

avoid kernel compile error on Arm SBCs

reduce slice, concat to copy; enable more concat fusions

add cuDNN 8 support

Remove duplicate line

Merge pull request opencv#17708 from shirriff:patch-1

Clarify component statistics documentation

* Change ConnectedComponentsTypes documentation

Change from "algorithm output formats" to "statistics" because it specifies types of statistics, not formats.

* Documentation: clarify component statistics

Explain that ConnectedComponentTypes selects a statistic.

Fix arguments list in loadindex for histogram intersection

Precompute the divisor to ensure that no kind of compiler would process it on the fly at each call.

Merge pull request opencv#17733 from l-bat:tiny_yolov4

* Supported yolov4-tiny

* Added comments

cmake: fix ENABLE_PROFILING

Fix genericity of computeNodeStatistics that couldn't compute stats properly on sub-nodes

Merge pull request opencv#17722 from pemmanuelviel:pev--replace-asserts

* Clean: replace C style asserts by CV_Assert and CV_DbgAssert

* Try fixing warning on Windows compilation

* Another way trying to fix warnings on Win

* Fixing warnings with some compilers:
Some compilers warn when an unconditional exit makes the code that follows unreachable.
This is why assert(0), which exits only in debug, was working, but not CV_Assert or CV_Error,
which exit in both release and debug, even if with different behavior.
In addition, other compilers complain when return 0 is removed from getKey(),
even if a preceding statement always exits.

* Disable "unreachable code" warnings for Win compilers so we can use proper CV_Error

Mix of 32 and 64bits vector types prevents vectorisation for distance computation.
Argument "a" is of type ElementType* that is either int* or float*, while b was double*.
Mixing types prevents the possibility to use SSE or AVX instructions.
On implementation without SIMD instructions, this doesn't show any impact on performance.
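
The point about mixed element types can be illustrated with a distance loop in which both inputs share one type. This is a hypothetical sketch (the name `squaredL2` is made up and it is not the FLANN code); it only shows the shape of a loop with no per-element float/double conversion:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch, not the FLANN source: squared L2 distance where
// both inputs share one element type, so the loop body has no per-element
// float <-> double conversion and is easier for a compiler to vectorise.
template <typename T>
T squaredL2(const T* a, const T* b, std::size_t n)
{
    assert(a != NULL && b != NULL);
    T sum = T(0);
    for (std::size_t i = 0; i < n; ++i) {
        const T diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}
```

Whether a given compiler actually emits SSE or AVX instructions for this loop depends on its flags and target, so the sketch shows the type-consistency idea rather than a guaranteed speed-up.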

forget to look in sub folder of include/openblas

imgcodecs: fix test build with disabled JPEG and PNG libs

dnn(test): add YOLOv4-tiny tests

dnn(slice): disable buggy OCV/OCL implementation

Merge pull request opencv#17699 from alalek:build_core_cuda

* core(cuda): fix build

- MSVS 19.25.28612.0
- CUDA release 11.0, V11.0.167

* cmake(cuda): backport workaround for CUDA 11

* cmake(cuda): call CUDA_BUILD_CLEAN_TARGET() on finalize

* cmake(cuda): use CMAKE_SUPPRESS_REGENERATION with MSVS

Update documentation of imwrite()

build: winpack_dldt with dldt 2020.4.0

imgproc(test): test bitExact cases in OCL/sepFilter2D

use universal SIMD intrinsics for SIFT

Fix trees parsing behavior in hierarchical_clustering_index:
Before, when maxCheck was reached in the first descent of a tree, time was still wasted parsing
the next trees till their best leaves whose points were not used at all.

Merge pull request opencv#17363 from YashasSamaga:cuda4dnn-eltwise-fusion2

cuda4dnn(conv): fuse eltwise with convolutions

* fuse eltwise with convolutions

* manually rebase to avoid bad git merge

Added a script to measure & report privacy masking camera performance in different configurations

G-API: Change the default FD model in the privacy-masking-camera

Fix typo

Merge pull request opencv#17694 from dbudniko:dbudniko/serialization_args2

G-API args serialization

* args serialization

* GRunArgP draft

* UMat added

* bind added

* DmitryM's review addressed. Code clean up required.

* fix android build

* bind test added

* more comments addressed

* try to fix Mac build

* clean up

* header-based generic implementation (GRunArg)

* clang again

* one more attempt for clang

* more clean up

* More Dmitry's comments addressed.

* monostate removed

* Top level functions and some other comments addressed.

* fix warnings

* disable warning

Fixed checkMasks in DescriptorMatcher with train descs in UMats

Added test for checkMasks with UMat train descs

Use "moms" instead of "contourArea"

double area = moms.m00;
is the same as
double area = contourArea(contours[contourIdx]);
Since "moms" is already calculated here, "contourArea" does not need to be called.

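The equivalence claimed in the commit above rests on both quantities being the polygon area of the contour; the OpenCV documentation describes `contourArea` as using the Green formula, similarly to moments. A stand-alone sketch of that area computation, with a hypothetical `Pt` struct standing in for `cv::Point` and a made-up function name `polygonArea`:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };  // stand-in for cv::Point, for illustration only

// Absolute polygon area by the shoelace (Green's theorem) formula.
inline double polygonArea(const std::vector<Pt>& contour)
{
    const std::size_t n = contour.size();
    if (n < 3) return 0.0;
    double twiceArea = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const Pt& p = contour[i];
        const Pt& q = contour[(i + 1) % n];  // wrap around to close the polygon
        twiceArea += p.x * q.y - q.x * p.y;
    }
    return std::fabs(twiceArea) * 0.5;
}
```

This is not the OpenCV implementation; it only shows the quantity that both values are expected to agree on.
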
use bufferarea for allocating buffer

core: use lazy on-demand initialization for param_traceEnable

Merge pull request opencv#17770 from jasonKercher:3.4_triggered

3.4 Allow first capture to return false

* fix first capture timeout

* fix first capture timeout

Merge pull request opencv#17639 from pemmanuelviel:pev--binary-kmeans

Pev binary kmeans

* Ongoing work transposing kmeans clustering method for bitfields: the computeClustering method

Ongoing work transposing kmeans clustering method for bitfields: interface computeBitfieldClustering

Fix genericity of computeNodeStatistics

Ongoing work transposing kmeans clustering method for bitfields: adapt computeNodeStatistics()

Ongoing work transposing kmeans clustering method for bitfields: adapt findNN() method

Ongoing work transposing kmeans clustering method for bitfields: allow kmeans with Hamming distance

Ongoing work transposing kmeans clustering method for bitfields: adapt distances code

Ongoing work transposing kmeans clustering method for bitfields: adapt load/save code

Ongoing work transposing kmeans clustering method for bitfields: adapt kmeans hierarchicalClustring()

PivotType -> CentersType Renaming

Fix type casting for ARM SIMD implementation of Hamming

Fix warnings with Win32 compilation

Fix warnings with Win64 compilation

Fix wrong parenthesis position on rounding

* Ensure proper rounding when CentersType is integral

dnn(ie): enable KEY_CPU_THREADS_NUM for Windows

features2d: v_fma => v_muladd for integers

Merge pull request opencv#17502 from dmatveev:dm/infer2

* G-API: Introduce a new gapi::infer2 overload + gaze estimation sample

* G-API/infer2: Introduced static type checking for infer2

- Also added extra tests on the type check routine

* G-API/infer2: Addressed self-review comments in the sample app

- Also fix build on Linux;

* G-API/infer2: Remove incorrect SetLayout(HWC) + dead code

- Also fixed comments in the backend

* G-API/infer2: Continue with self-review

- Fix warnings/compile errors in gaze estimation
- Dropped the use of RTTI (VectorRef::holds()) from the giebackend
- Replaced it with a trait-based enums for GArray<T> and std::vector<T>
- The enums and traits are temporary and need to be unified with
  the S11N when it comes

* G-API/infer2: Final self-review items

- Refactored ROIList test to cover 70% for infer<> and infer2<>;
- Fixed the model data discovery routine to be compatible with new
  OpenVINO;
- Hopefully fixed the final issues (warnings) with the sample.

* G-API/infer2: address review problems

- Fixed typo in comments;
- Fixed public (Doxygen) comment on GArray<GMat> input case for infer2;
- Made model lookup more flexible to allow new & old OMZ dir layouts.

* G-API/infer2: Change the model paths again

* G-API/infer2: Change the lookup path for test data

* G-API/infer2: use randu instead of imread. CI war is over

GAPI: fix warnings in own::Mat default generated constructors/assign op

error if cuda4dnn depends are not resolved

Merge pull request opencv#17741 from aDanPin:dp/add_dinamic_graph_feature

[G-API] Allow building graphs with a dynamic number of inputs and outputs

* Add dynamic graph feature and tests

* Remove unnecessary file

* Review response

* Add implementation of operator += for GRunArgs
And test for that case

* Tests refactoring

* Add doxygen
Review response

* Fix docs

* A small documentation fix

* Review response

* Add tests for more entities

* Add typed tests

* Another typed tests

* Doc fix

* Documentation fix

* Build fix

* Commit for rebuild

* The last one

Merge pull request opencv#17818 from komakai:documentation-improvements

Documentation fixes/improvements

* Documentation fixes/improvements

* Remove HASH_UTILS defines

GAPI: GAPI_LOG_DEBUG facility

Merge pull request opencv#17668 from OrestChura:oc/giebackend_migration_to_core

GAPI: Migration to IE Core API

* Migration to IE Core API
 - both versions are maintained
 - checked building with all the OpenVINO versions (2019.R1, R2, R3, 2020.4 (newest))

* commit to awake builders

* Addressing comments
 - migrated to Core API in 'gapi_ie_infer_test.cpp'
 - made Core a singleton object
 - dropped redundant steps

* Addressing comments
 - modified Mutex locking

* Update

* Addressing comments
 - remove getInitMutex()
 - reduce amount of #ifdef by abstracting into functions

* return to single IE::Core

* Divide functions readNet and loadNet to avoid warnings on GCC

* Fix deprecated code warnings

* Fix deprecated code warnings on CMake level

* Functions wrapped
 - All the functions that depend on the IE version are wrapped into a cv::gapi::ie::wrap namespace
 - All of this is contained in a new "giebackend/gieapi.hpp" header
 - The header is shared with the G-API infer tests to avoid code duplication

* Addressing comments
 - Renamed `gieapi.hpp` -> `giewrapper.hpp`, `cv::gapi::ie::wrap` -> `cv::gimpl::ie::wrap`
 - Created new `giewrapper.cpp` source file to avoid potential "multiple definition" problems
 - removed unnecessary step SetLayout() in tests

* Enabling two NN infer tests

* Two-NN infer test change for CI
 - deleted additional network
 - inference of two identical NN used instead

* Fix CI fileNotFound

* Disable MYRIAD test not to fail Custom CI runs

Fix TensorFlow->ONNX imports

dnn: fix OpenCL implementation of Slice layer

dnn: use OpenVINO 2020.4 defines

original commit: 2813aa7

winpack_dldt: switch defaults to OpenVINO 2020.4

dnn: eliminate IE deprecation warning

Merge pull request opencv#17841 from vpisarev:fixed_fs_dtor

* fixed issue opencv#17412

* Update test_io.cpp

dnn(test): adjust tests for OpenVINO 2020.4 (4.x branch)

G-API: Try to fix infer2 problem with VS2017

imgproc: add missing check into cvtColorTwoPlane()

core(persistence): fix "use after free" bug

- do not store user-controlled "FileStorage" pointer
- store FileStorage::Impl pointer instead

release: OpenCV 4.4.0

Fixed removing is_parameter, is_constant, is_output

Revert "Fixed removing is_parameter, is_constant, is_output"

Fixed removing is_parameter, is_constant, is_output

Replaced copy_with_new_args to clone_with_new_inputs

Merge pull request opencv#17896 from OrestChura:oc/fix_kw_videotests

* - fix numeric overflow due to incorrect type casting
 - remove unnecessary default constructor

* Drop the cast

Merge pull request opencv#17871 from OrestChura:oc/typed_GArray_GMat

* Added overload for `GArray<GMat>` ProtoParam in `gtyped.hpp`

* Tests+compile_args
 - added tests for GArray<GMat> as an input and an output of GComputationT
 - added possibility to give the compile_args to GComputationT.apply()

* Fix win errors

Support Gather for variable inputs

Added reference to Wu's original article about the SAUF connected components search method.

Update samples

Update train_HOG.cpp

mish_functor_update

Merge pull request opencv#17493 from TolyaTalamanov:at/python-bindings-gapi

* Implement G-API python bindings

* Fix hdr_parser

* Drop initialization with brackets using regexp

* Handle bracket initialization another way

* Add test for core operations

* Declaration and definition of View constructor now in different files

* Refactor tests

* Remove combine decorator from tests

* Fix comment to review

* Fix test

* Fix comments to review

* Remove GCompilerArgs implementation from python

Co-authored-by: Pinaev <danil.pinaev@intel.com>

Merge pull request opencv#17816 from vpisarev:essential_2cameras

* add findEssentialMat for two different cameras

* added smoke test for the newly added variant of findEssentialMatrix

Co-authored-by: tompollok <tom.pollok@gmail.com>

add DetectionOutputOp

Merge pull request opencv#17858 from vpisarev:dnn_depthwise_conv

* added depth-wise convolution; gives ~20-30% performance improvement in MobileSSD networks

* hopefully, eliminated compile warnings, errors, as well as failure in one test

* * fixed a few typos
* decreased buffer size in some cases
* added more optimal im2row branch in the case of 1x1 convolutions
* tuned fastConv to reduce the number of passes over arrays

Add Objective-C/Swift wrappers for opencv_contrib modules

Merge pull request opencv#17735 from pemmanuelviel:pev-fix-trees-descent

* Fix trees parsing behavior in hierarchical_clustering_index:
Before, when maxCheck was reached in the first descent of a tree, time was still wasted parsing
the next trees till their best leaf, just to skip the points stored there.
Now we can choose either to keep this behavior, and so we skip parsing other trees after reaching
maxCheck, or we choose to do one descent in each tree, even if in one tree we reach maxCheck.

* Apply the same change to kdtree.
As each leaf contains only 1 point (unlike hierarchical_clustering), difference is visible if trees > maxCheck

* Add the new explore_all_trees parameters to miniflann

* Adapt the FlannBasedMatcher read_write test to the additional search parameter

* Adapt java tests to the additional parameter in SearchParams

* Fix the ABI dumps failure on SearchParams interface change

* A ctor calling another ctor of the class is only fully supported from C++11
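
The last item refers to delegating constructors, which are a C++11 feature. One common C++03-compatible alternative is to route both constructors through a shared helper. The sketch below assumes a hypothetical `SearchParamsSketch` struct; its fields and the `false` default only mirror the commit messages and are not the actual miniflann definition:

```cpp
#include <cassert>

// Hypothetical stand-in for a parameter class; the field names only mirror
// the commit messages and are not the actual miniflann definition.
struct SearchParamsSketch
{
    int checks;
    float eps;
    bool sorted;
    bool exploreAllTrees;

    // C++03-compatible: both constructors call a shared helper instead of
    // one constructor delegating to the other (a C++11 feature).
    SearchParamsSketch(int checks_, float eps_, bool sorted_)
    {
        init(checks_, eps_, sorted_, false);  // assumed default, for illustration
    }
    SearchParamsSketch(int checks_, float eps_, bool sorted_, bool exploreAllTrees_)
    {
        init(checks_, eps_, sorted_, exploreAllTrees_);
    }

private:
    void init(int checks_, float eps_, bool sorted_, bool exploreAllTrees_)
    {
        checks = checks_;
        eps = eps_;
        sorted = sorted_;
        exploreAllTrees = exploreAllTrees_;
    }
};
```

Keeping the old three-argument constructor alongside the new one is also one way to leave existing callers source-compatible.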

add MVNOp

videoio: fix compilation with Aravis enabled

Added cmake toolchain for RISC-V with clang.

- Added cross compile cmake file for target riscv64-clang
- Extended cmake for RISC-V and added instruction checks
- Created intrin_rvv.hpp with C++ version universal intrinsics

Update README.md

I think there should be something under ### Resources for example:
* Additional OpenCV functionality: <https://github.com/opencv/opencv_contrib>

Updated the OpenCV logo

Merge pull request opencv#17977 from paroj:hervec

* calib3d: calibrateHandEye - allow using Rodrigues vectors for rotation

* calib3d: calibrateHandEye - test rvec representation

Fix build of grfmt_jpeg2000.cpp

libjasper has recently changed `jas_matrix_get` from a macro to an inline function
(389951d071 in https://github.com/jasper-software/jasper), causing the build to fail.

Fix Carotene compilation with XCode

Merge pull request opencv#17907 from Yosshi999:gsoc_asift-py2cpp

* Implement ASIFT in C++

* '>>' should be '> >' within a nested template

* add a sample for asift usage

* bugfix empty keypoints cause crash

* simpler initialization for mask

* suppress the number of lines

* correct tex document

* type casting

* add descriptorsize for asift

* smaller testdata for asift

* more smaller test data

* add OpenCV short license header

Merge pull request opencv#17885 from alalek:dnn_ocl_slice_update

DNN: OpenCL/slice update

* dnn(ocl/slice): make slice kernel VTune friendly

- more unique names
- inline code of copy functions

* dnn(ocl/slice): prefer to spawn more work groups

- even in case with 1D copy
- perf improvement up to 2x of kernel time (due to changed configuration 128x1x1 => 128x32x1)

* dnn(ocl/slice): cache kernel exec info

Adding comment from source code to documentation.

Updated comment.

Corrected Comment as requested by reviewer.

Do not use size_t for nGraph layers

Update warpPerspective_demo.cpp

Cleaner code for hierarchical_clustering

Merge pull request opencv#18019 from pemmanuelviel:pev--multiple-kmeans-trees

* Possibility to set more than one tree for the hierarchical KMeans (default is still 1 tree).

This particularly improves NN retrieval results with binary vectors, allowing better quality
compared to LSH for similar processing time when speed is the criterion.

* Add explanations on the FLANN's hierarchical KMeans for binary data.

Document PatchNANs input type

add relu as activation option in darknet

add relu option

add relu as activation option in darknet

simplify the setParams if-else ladder

add relu as activation option in darknet

correct activation_param type

format

format

add relu as activation option in darknet

spacing

spacing

add relu as activation option in darknet

Fixed removing is_parameter, is_constant, is_output

Replaced copy_with_new_args to clone_with_new_inputs

* added depth-wise convolution; gives ~20-30% performance improvement in MobileSSD networks

* hopefully, eliminated compile warnings, errors, as well as failure in one test

* * fixed a few typos
* decreased buffer size in some cases
* added more optimal im2row branch in the case of 1x1 convolutions
* tuned fastConv to reduce the number of passes over arrays

backport of commit 77b01de

Add support for using new ffmpeg encoding API when writing a video.

Removed get_output_as_single_output_node method

Fix Objective-C declaration of Mat_to_vector_Point2d

Merge the two KMeansIndexParams ctor on master

Obj-C/Swift docs improvements

Fix bug in ONNX Gather op

Use "src" not "*this" for source GpuMat

Merge pull request opencv#17643 from pemmanuelviel:pev--new-flann-demo

* Add a FLANN example showing how to search a query image in a dataset

* Clean: remove warning

* Replace dependency to boost::filesystem by calls to core/utils/filesystem

* Wait for escape key to exit

* Add an example of binary descriptors support

* Add program options for saving and loading the flann structure

* Fix warnings on Win64

* Fix warnings on 3.4 branch still relying on C++03

* Add ctor to img_info structure

* Comments modification

* * Demo file of FLANN moved and renamed

* Fix distances type when using binary vectors in the FLANN example

* Rename FLANN example file

* Remove dependency of the flann example to opencv_contrib's SURF.

* Remove mention of FLANN and other descriptors that aimed at giving hint on the other options

* Cleaner program options management

* Make waitKey usage minimal in FLANN example

* Fix the conditions order

* Use cv::Ptr

Merge pull request opencv#18033 from ieliz:dasiamrpn

Improving DaSiamRPN tracker sample

* changed layerBlobs in dnn.cpp and added DaSiamRPN tracker

* Improving DaSiamRPN tracker sample

* Docs fix

* Removed outdated changes

* Trying to reinitialize tracker without reloading models. Worked with LaSOT-based benchmark with reinit rate=250 frames

* Trying to reverse changes

* Moving the model in the constructor

* Fixing some issues with names

* Variable name changed

* Reverse parser arguments changes

Refactoring to prepare for other vector types while mutualizing some methods

Fix MatMul and Add axes

Merge pull request opencv#18096 from l-bat:update_onnx_importer

* Added ReduceSum to ONNX importer

* Fix comments

* Fix Mul

core(ocl): fix ocl::Image2d::isFormatSupported()

in case of OPENCV_OPENCL_DEVICE=disabled

Merge pull request opencv#18080 from nhlsm:improve-mat-operator-assign-scalar

* improve Mat::operator=(Scalar)

* touch

* remove trailing whitespace

* TEST: check if old code pass test or not

* remove CV_Error

* remove warning

* fix: is -> Scalar

* 1) Mat *mat -> Mat &mat 2) return bool, add output param

* add comment

Merge pull request opencv#17683 from ivashmak:homography

[GSoC] New RANSAC. Homography part

* change enum and squash commits

* add small improvements

* change function to static, update magsac

* remove path from samples, remove license, small updates

* update pnp solver, small improvements

* fix warnings

* add tutorial, comments

* fix markdown warnings

* fix markdown warnings

* fix markdown warnings

highgui: don't terminate if we can't initialize GTK backend

- allow Users to handle such case
- exception will be thrown instead

Merge pull request opencv#18073 from vpisarev:apache2_license

changed OpenCV license from BSD to Apache 2 license

* as discussed and announced earlier, changed OpenCV license from BSD to Apache 2. Many files still contain old-style copyrights though

* changed wording a bit; preserve the original OpenCV BSD license

[IE][VPU]: Refactor vpu configs

Add arm64-build-checks github action

fix CV_Check warnings

fix build error on odroid-n2-plus

Added reference to paper.

Improve initialization performance of Brisk

reformatting

Improve initialization performance of Brisk

fix formatting

Improve initialization performance of Brisk

formatting

Improve initialization performance of Brisk

make a lookup table for ring

use cosine/sine lookup table for theta in brisk and utilize trig identity

fix ring lookup table

use cosine/sine lookup table for theta in brisk and utilize trig identity

formatting

use cosine/sine lookup table for theta in brisk and utilize trig identity

move scale radius product to ring loop to ensure it's not recomputed for each rot

revert change

move scale radius product to ring loop to ensure it's not recomputed for each rot

remove rings lookup table

move scale radius product to ring loop to ensure it's not recomputed for each rot

fix formatting of for loop

move scale radius product to ring loop to ensure it's not recomputed for each rot

use sine/cosine approximations for brisk lookup table.

add documentation for sine/cosine lookup tables

Improve initialization performance of BRISK
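
The BRISK commits above mention a cosine/sine lookup table built with a trig identity. The log does not say which identity is used; the sketch below assumes the angle-addition formulas and a hypothetical function name `buildRotationTables`, and it is not the BRISK source:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the BRISK source: fill cos/sin tables for
// n evenly spaced rotations with only one std::cos and one std::sin call.
// Each entry follows from the previous one via the angle-addition identities
//   cos(t + d) = cos(t)cos(d) - sin(t)sin(d)
//   sin(t + d) = sin(t)cos(d) + cos(t)sin(d)
inline void buildRotationTables(std::size_t n,
                                std::vector<double>& cosTable,
                                std::vector<double>& sinTable)
{
    assert(n > 0);
    const double step = 2.0 * 3.14159265358979323846 / static_cast<double>(n);
    const double cd = std::cos(step), sd = std::sin(step);
    cosTable.assign(n, 1.0);  // entry 0: cos(0) = 1
    sinTable.assign(n, 0.0);  // entry 0: sin(0) = 0
    for (std::size_t k = 1; k < n; ++k) {
        cosTable[k] = cosTable[k - 1] * cd - sinTable[k - 1] * sd;
        sinTable[k] = sinTable[k - 1] * cd + cosTable[k - 1] * sd;
    }
}
```

The recurrence accumulates a small rounding error with each step, which is usually acceptable for a table of a few thousand entries but is worth checking against direct `std::cos`/`std::sin` calls when precision matters.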

Update the stereo sample:
* add the HH4 mode
* option to display disparity with a color map
* display current settings in the title bar
* don't close app when wanting to take screenshots

Add debug assert to check in FLANN the vectors size is multiple of the architecture word size

DNA mode: add the distance computations

Added a note about OpenCV logo

core(trace): lazy querying for OPENCV_TRACE_LOCATION

- fixes proper initialization of non-trivial variable

add OpenCV sample for digit and text recognition, and provide multiple OCR models.

Merge pull request opencv#18148 from OrestChura:oc/fluid_core_perf

[G-API]: Fluid Core kernels performance tests

* Add performance tests for a list of Fluid Core kernels

* Update gapi_core_perf_tests_fluid.cpp

Addressing a comment

Merge pull request opencv#17163 from AsyaPronina:gcompound_kernel_gmatp_coop

* Fixed cooperation of Compound kernel and GMatP type

* Added test for GCompound kernel + GMatP type cooperation

Merge pull request opencv#18083 from IanMaquignaz:fix_gen_pattern_centering

* Fixed centering issue with make_cicle_pattern and make_acircle_pattern()

* Fixed issue where asymmetric circles were not at 45 degree angles. Also fixed support for inch measurement by converting parsing to usage of floating points for page size

* Fixed copy-paste error from experimental workspace

Supported ONNX Pow op

fix: libavcodec version check for AV_CODEC_FLAG_GLOBAL_HEADER

fix: libavcodec version check for AVDISCARD_NONINTRA

 - AVDISCARD_NONINTRA flag is supported only for FFMPEG libraries pack

Fix cubic root computation to be able to handle negative values. Improve doc. Add regression test.
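
For context on the negative-value case: `std::pow(x, 1.0 / 3.0)` yields NaN for negative `x`, so a real cube root needs the sign handled separately. A minimal sketch with a hypothetical helper name `signedCubeRoot`, not the code from the commit:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not the OpenCV source: real cube root valid for
// negative inputs. std::pow(x, 1.0 / 3.0) returns NaN for x < 0, so the
// sign is factored out first.
inline double signedCubeRoot(double x)
{
    return x < 0.0 ? -std::pow(-x, 1.0 / 3.0) : std::pow(x, 1.0 / 3.0);
}
```

Where C++11 is available, `std::cbrt` handles negative arguments directly and may be the simpler choice.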

Use camera intrinsic matrix everywhere. Add cameramatrix, distcoeffs and distcoeffsfisheye macros to avoid copy/paste errors.

Merge pull request opencv#17647 from jinyup100:add-siamrpnpp

[GSoC] Add siamrpnpp.py

* Updated base branch with siamrpnpp.py

* Addition of Parsers

* Merged to using few ONNX files, Changes to Parsers, Links to Repo

* Deleted whitespace

* Adjusting flake8 error

* Fixes according to review

* Fix according to review

* Addition of OpenVINO backends and Computation target devices

* Fix on backend after review

* Fixes after review

* Remove extra white space

* Removed Repeated Variables

Merge pull request opencv#17978 from sl-sergei:fix_17516_17531

* Fix ONNX loading in issues opencv#17516, opencv#17531

* Add tests for Linear and Matmul layers

* Disable tests for IE versions lower than 20.4

* Skip unstable tests with OpenCL FP16 on Intel GPU

* Add correct test filtering for OpenCL FP16 tests

support flownet2 with arbitrary input size

revise default proto to match the filename in documentations

fix a bug

beautify python codes

fix bug

beautify codes

add test samples with larger/smaller size

remove useless code

using bytearray without creating tmp file

remove useless codes

Add Robot-World/Hand-Eye calibration function.

feat: change OpenJPEG encoder to lossy with default parameters

Fix build issue

Fix build issue

Fix CI issue

Modify test

Bypass upstream issue

Add -build_webgpu option

Avoid buildbot failure

Fix mat constructor conflict

Modify test case

Made some changes as requested

Rename some files to avoid conflicts

restore new-line in EOF

Modify cmake and readme

Modify cmake

Update README.md

Update cmake and README
@cyyever cyyever deleted the fix_CUDA11 branch October 9, 2020 12:13
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
Fix cuda11

* use cudnn_version.h to detect version when it is available

* remove nppi from CUDA11

* use ocv_list_filterout

* dnn(cuda): temporary disable CUDNN 8.0