CUDA: choose supported CC automatically #17432

Merged: opencv-pushbot merged 1 commit into opencv:3.4 from tomoaki0705:automaticCC on Jun 4, 2020

@tomoaki0705
Contributor

@tomoaki0705 tomoaki0705 commented May 30, 2020

  • strip out the architecture based on nvcc result
  • DRY

Pull Request Readiness Checklist

First of all, I'm writing this PR out of personal interest.
It is not intended to represent my work or the company I belong to.

NVIDIA announced CUDA 11.
Adding another branch here every time a new CUDA version is released just increases the chaos.

if(CUDA_VERSION VERSION_LESS "9.0")
  set(__cuda_arch_bin "2.0 3.0 3.5 3.7 5.0 5.2 6.0 6.1")
elseif(CUDA_VERSION VERSION_LESS "10.0")
  set(__cuda_arch_bin "3.0 3.5 3.7 5.0 5.2 6.0 6.1 7.0")
else()
  set(__cuda_arch_bin "3.0 3.5 3.7 5.0 5.2 6.0 6.1 7.0 7.5")
endif()

This PR lets the available compute capabilities (CC) be chosen automatically, based on what nvcc accepts.
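A minimal sketch of the idea, not the code in this PR: try each candidate CC with nvcc against a small probe source and keep the ones that compile. The function name `sketch_filter_supported_cc` and the `probe.cu` path are hypothetical; `CUDA_NVCC_EXECUTABLE` is assumed to have been set by CMake's FindCUDA module.

```cmake
# Sketch only: the function name and probe file are hypothetical, not OpenCV's.
# Keeps every candidate compute capability that nvcc can build the probe for.
function(sketch_filter_supported_cc out_var nvcc_exe probe_source)
  set(supported "")
  foreach(cc IN LISTS ARGN)
    # "7.5" -> "75", the form used in compute_XX / sm_XX
    string(REPLACE "." "" cc_short "${cc}")
    execute_process(
      COMMAND "${nvcc_exe}" -gencode "arch=compute_${cc_short},code=sm_${cc_short}" "${probe_source}"
      WORKING_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}"
      RESULT_VARIABLE nvcc_result
      OUTPUT_QUIET ERROR_QUIET)
    # Exit code 0 means this toolkit accepted the architecture.
    if(nvcc_result EQUAL 0)
      list(APPEND supported "${cc}")
    endif()
  endforeach()
  set(${out_var} "${supported}" PARENT_SCOPE)
endfunction()

# Hypothetical usage: the candidate list no longer depends on CUDA_VERSION.
sketch_filter_supported_cc(supported_cc "${CUDA_NVCC_EXECUTABLE}"
  "${CMAKE_CURRENT_SOURCE_DIR}/probe.cu"
  2.0 3.0 3.5 3.7 5.0 5.2 6.0 6.1 7.0 7.5)
message(STATUS "Supported CC: ${supported_cc}")
```

With this approach a new toolkit release only requires extending the candidate list; architectures the installed nvcc rejects drop out on their own.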

  • I agree to contribute to the project under OpenCV (BSD) License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu-cuda:18.04

Member

@alalek alalek left a comment

Thank you for the contribution!
Basically looks good to me 👍

  * cache the result
  * DRY
  * brush up based on review
@tomoaki0705
Contributor Author

Thank you for the review @alalek
I followed your review, and pushed the modification.

@tomoaki0705
Contributor Author

I looked at the log of the build machine.
The CMake log says:

-- CUDA detected: 10.1
-- CUDA NVCC target flags: -gencode;arch=compute_61,code=sm_61;-D_FORCE_INLINES

--   NVIDIA CUDA:                   YES (ver 10.1, CUFFT CUBLAS)
--     NVIDIA GPU arch:             61
--     NVIDIA PTX archs:

This shows that CC has been specified manually.

What I expect is something like this:

-- CUDA detected: 10.1
-- CUDA NVCC target flags: -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-D_FORCE_INLINES

--   NVIDIA CUDA:                   YES (ver 10.1, CUFFT CUBLAS)
--     NVIDIA GPU arch:             30 35 37 50 52 60 61 70 75
--     NVIDIA PTX archs:

My guess is that CUDA_ARCH_BIN was set to 6.1, or CUDA_GENERATION was set to AUTO, on the build machine precommit_custom_linux.
I can't see this in the log, so I want to confirm it.
Are CUDA_ARCH_BIN and CUDA_GENERATION set to specific values?

@alalek
Member

alalek commented Jun 2, 2020

@tomoaki0705 Yes, it is fixed in this builder (to reduce build time and binary size; only basic build checks are performed here).

Without forcing of -DCUDA_ARCH_BIN=6.1:

-- CUDA detected: 10.1
-- CUDA NVCC target flags: -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-D_FORCE_INLINES

...

--   NVIDIA CUDA:                   YES (ver 10.1, CUFFT CUBLAS)
--     NVIDIA GPU arch:             30 35 37 50 52 60 61 70 75
--     NVIDIA PTX archs:

So the results are the same as yours.

Member

@alalek alalek left a comment

Thank you 👍

@opencv-pushbot opencv-pushbot merged commit 476aa44 into opencv:3.4 Jun 4, 2020
@alalek alalek mentioned this pull request Jun 4, 2020
@tomoaki0705 tomoaki0705 deleted the automaticCC branch June 4, 2020 21:24
@tomoaki0705 tomoaki0705 changed the title CUDA: choose supported NVCC automatically CUDA: choose supported CC automatically Jun 4, 2020
@nglee
Contributor

nglee commented Jun 12, 2020

Hi, I get this error with CMake 3.16.2, 3.4 branch. Would you please take a look?

CMake Error at cmake/OpenCVDetectCUDA.cmake:95 (string):
  string sub-command STRIP requires two arguments.
Call Stack (most recent call first):
  cmake/OpenCVDetectCUDA.cmake:147 (ocv_filter_available_architecture)
  cmake/OpenCVFindLibsPerf.cmake:43 (include)
  CMakeLists.txt:694 (include)


CMake Error at cmake/OpenCVDetectCUDA.cmake:108 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  cmake/OpenCVDetectCUDA.cmake:157 (ocv_wipeout_deprecated)
  cmake/OpenCVFindLibsPerf.cmake:43 (include)
  CMakeLists.txt:694 (include)

The error is gone if I manually set CUDA_ARCH_BIN to 7.5 and CUDA_GENERATION to Turing.

@tomoaki0705
Contributor Author

Hmmm, that's interesting, @nglee.
May I ask

  1. CUDA version
  2. OS
  3. compiler, too?

Additionally, what happens if you invoke nvcc manually?

nvcc -gencode arch=compute_75,code=sm_75 /path/to/opencv/cmake/checks/OpenCVDetectCudaArch.cu

I assume that this test command somehow fails and filters out all the available CCs:
https://github.com/tomoaki0705/opencv/blob/156406b56ca38ff7f6440fb16219078b00130007/cmake/OpenCVDetectCUDA.cmake#L87-L90

My suggestion is to run nvcc manually and see what happens.

If my assumption is correct, it would also be useful to know what is shown when you remove ERROR_QUIET from this line.

@nglee
Contributor

nglee commented Jun 12, 2020

Thank you, @tomoaki0705, for taking the time to look into this issue.

  1. 10.2
  2. Win10
  3. cl.exe (19.24.28316)

OpenCVDetectCudaArch.cu compiles to a.exe and it prints 7.5.

E:\repos\opencv>a.exe
7.5

@tomoaki0705
Contributor Author

Thank you, @nglee.
Hmm, what happens if you remove ERROR_QUIET and re-run cmake?
Yes, the built binary will run with no problem.
The issue arises at the compile stage, so removing ERROR_QUIET should show us more information.
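To make the compile-stage failure visible, one option is to capture nvcc's exit code and stderr instead of discarding them. The following is a standalone sketch to run with `cmake -P`; the variable names are illustrative, nvcc is assumed to be on PATH, and the probe path must be adjusted to your checkout.

```cmake
cmake_minimum_required(VERSION 3.10)

# Assumptions: nvcc is on PATH, and the path below points at your checkout.
set(nvcc_exe "nvcc")
set(probe_source "/path/to/opencv/cmake/checks/OpenCVDetectCudaArch.cu")

# No ERROR_QUIET here: stderr is captured so a failure can be inspected.
execute_process(
  COMMAND "${nvcc_exe}" -gencode "arch=compute_75,code=sm_75" "${probe_source}"
  RESULT_VARIABLE nvcc_result
  OUTPUT_VARIABLE nvcc_stdout
  ERROR_VARIABLE nvcc_stderr)

message(STATUS "nvcc exit code: ${nvcc_result}")
message(STATUS "nvcc stdout: ${nvcc_stdout}")
message(STATUS "nvcc stderr: ${nvcc_stderr}")
```

A non-zero exit code for every candidate CC would explain an empty architecture list, and the captured stderr should say why nvcc refused to build the probe.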

@alalek
Member

alalek commented Jun 12, 2020

@nglee

It would be nice to dump "target_arch" in foreach for investigation:

+message(STATUS "CC_LIST='${CC_LIST}'")
 foreach(target_arch ${CC_LIST})
+  message(STATUS "Testing '${target_arch}' ...")

Also could you try this in your environment (add quotes):

-string(REPLACE "." "" target_arch_short ${target_arch})
+string(REPLACE "." "" target_arch_short "${target_arch}")
 ...
-string(STRIP ${${result_list}} ${result_list})
+string(STRIP "${${result_list}}" ${result_list})
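Why the quotes matter can be shown with a small standalone script (run with `cmake -P`). This is only an illustration with made-up variable names: an unquoted reference to an empty variable expands to zero arguments, so the command receives fewer arguments than it requires.

```cmake
cmake_minimum_required(VERSION 3.10)

set(empty_value "")

# Unquoted: ${empty_value} expands to nothing, so the call below would become
# string(STRIP out) and fail with
# "string sub-command STRIP requires two arguments."
# string(STRIP ${empty_value} out)

# Quoted: the expansion is always exactly one argument, even when it is empty.
string(STRIP "${empty_value}" out)
message(STATUS "stripped='${out}'")

# The same applies to REPLACE, which needs match, replacement, output and input.
string(REPLACE "." "" short "${empty_value}")
message(STATUS "short='${short}'")
```

Quoting removes the error message, but it does not explain why the value is empty in the first place.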

@nglee
Contributor

nglee commented Jun 14, 2020

@tomoaki0705 I don't see any different result after removing ERROR_QUIET.

@alalek This is what I get if I add message functions.

CC_LIST='2.0;3.0;3.5;3.7;5.0;5.2;6.0;6.1;7.0;7.5'
Testing '2.0' ...
Testing '3.0' ...
Testing '3.5' ...
Testing '3.7' ...
Testing '5.0' ...
Testing '5.2' ...
Testing '6.0' ...
Testing '6.1' ...
Testing '7.0' ...
Testing '7.5' ...
CMake Error at cmake/OpenCVDetectCUDA.cmake:96 (string):
  string sub-command STRIP requires two arguments.
Call Stack (most recent call first):
  cmake/OpenCVDetectCUDA.cmake:148 (ocv_filter_available_architecture)
  cmake/OpenCVFindLibsPerf.cmake:43 (include)
  CMakeLists.txt:694 (include)


CMake Error at cmake/OpenCVDetectCUDA.cmake:109 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  cmake/OpenCVDetectCUDA.cmake:158 (ocv_wipeout_deprecated)
  cmake/OpenCVFindLibsPerf.cmake:43 (include)
  CMakeLists.txt:694 (include)

Adding quotes as suggested by @alalek seems to remove this error message.

@Bleach665
Contributor

Bleach665 commented Jun 14, 2020

For me, the following suggestion
Also could you try this in your environment (add quotes):

-string(REPLACE "." "" target_arch_short ${target_arch})
+string(REPLACE "." "" target_arch_short "${target_arch}")
 ...
-string(STRIP ${${result_list}} ${result_list})
+string(STRIP "${${result_list}}" ${result_list})

does not resolve this (#17544).

I still get an error:

CMake Error at cmake/OpenCVDetectCUDA.cmake:125 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  cmake/OpenCVDetectCUDA.cmake:174 (ocv_wipeout_deprecated)
  cmake/OpenCVFindLibsPerf.cmake:43 (include)
  CMakeLists.txt:688 (include)

Full log:

Details
Detected processor: AMD64
Found PythonInterp: C:/Program Files/Python37/python.exe (found suitable version "3.7.7", minimum required is "2.7") 
libjpeg-turbo: VERSION = 2.0.4, BUILD = opencv-4.4.0-pre-libjpeg-turbo
Could NOT find OpenJPEG (minimal suitable version: 2.0, recommended version >= 2.3.1)
found Intel IPP (ICV version): 2020.0.0 [2020.0.0 Gold]
at: E:/Lib_prebuild/opencv/prebuild_x64/3rdparty/ippicv/ippicv_win/icv
found Intel IPP Integration Wrappers sources: 2020.0.0
at: E:/Lib_prebuild/opencv/prebuild_x64/3rdparty/ippicv/ippicv_win/iw
Found CUDNN: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.2/lib/x64/cudnn.lib (found suitable version "7.6.5", minimum required is "7.5") 
CUDA detected: 10.2
CMake Error at cmake/OpenCVDetectCUDA.cmake:125 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  cmake/OpenCVDetectCUDA.cmake:174 (ocv_wipeout_deprecated)
  cmake/OpenCVFindLibsPerf.cmake:43 (include)
  CMakeLists.txt:688 (include)


CUDA NVCC target flags: -D_FORCE_INLINES
LAPACK(MKL): LAPACK_LIBRARIES: E:/Lib_prebuild/MKL/lib/mkl_intel_lp64.lib;E:/Lib_prebuild/MKL/lib/mkl_sequential.lib;E:/Lib_prebuild/MKL/lib/mkl_core.lib
LAPACK(MKL): Support is enabled.
The following CMake options are exported from the Inference Engine build

    THREADING: TBB

Detected InferenceEngine: cmake package (2.1.0)
CMake Warning at cmake/OpenCVDetectInferenceEngine.cmake:132 (message):
  InferenceEngine version has not been set, 2020.3 will be used by default.
  Set INF_ENGINE_RELEASE variable if you experience build errors.
Call Stack (most recent call first):
  CMakeLists.txt:750 (include)


Found VTK 8.2.0 (C:/Lib/VTK/build_x64/lib/cmake/vtk-8.2/UseVTK.cmake)
OpenCV Python: during development append to PYTHONPATH: E:/Lib_prebuild/opencv/prebuild_x64/python_loader
Caffe:   NO
Protobuf:   NO
Glog:   NO
freetype2:   NO
harfbuzz:    NO
No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
Found installed version of gflags: C:/Lib/gflags/build_x64/lib/cmake/gflags
Detected gflags version: 2.2.2
Found installed version of Eigen: E:/Lib_prebuild/eigen/build_x64/share/eigen3/cmake
Found required Ceres dependency: Eigen version 3.3.7 in E:/Lib_prebuild/eigen/build_x64/include/eigen3
Found installed version of glog: E:/Lib_prebuild/glog/prebuild_x64
Detected glog version: 0.4.0
Found required Ceres dependency: glog
Found installed version of gflags: C:/Lib/gflags/build_x64/lib/cmake/gflags
Detected gflags version: 2.2.2
Found required Ceres dependency: gflags
Ceres version 1.14.0 detected here: E:/Lib_prebuild/ceres-solver/build_x64 was built with C++11. Ceres target will add C++11 flags to compile options for targets using it.
Found Ceres version: 1.14.0 installed in: E:/Lib_prebuild/ceres-solver/build_x64 with components: [EigenSparse, SparseLinearAlgebraLibrary, LAPACK, SuiteSparse, SchurSpecializations, C++11, OpenMP, Multithreading]
Checking SFM deps... TRUE
Tesseract:   NO
Allocator metrics storage type: 'long long'
Registering hook 'INIT_MODULE_SOURCES_opencv_dnn': E:/Lib_prebuild/opencv/source/opencv/modules/dnn/cmake/hooks/INIT_MODULE_SOURCES_opencv_dnn.cmake
DNN: Enabling Inference Engine NN Builder API support
xfeatures2d/boostdesc: Download: boostdesc_bgm.i
xfeatures2d/boostdesc: Download: boostdesc_bgm_bi.i
xfeatures2d/boostdesc: Download: boostdesc_bgm_hd.i
xfeatures2d/boostdesc: Download: boostdesc_binboost_064.i
xfeatures2d/boostdesc: Download: boostdesc_binboost_128.i
xfeatures2d/boostdesc: Download: boostdesc_binboost_256.i
xfeatures2d/boostdesc: Download: boostdesc_lbgm.i
xfeatures2d/vgg: Download: vgg_generated_48.i
xfeatures2d/vgg: Download: vgg_generated_64.i
Try 1 failed
CMake Warning at E:/Lib_prebuild/opencv/source/opencv/cmake/OpenCVDownload.cmake:202 (message):
  xfeatures2d/vgg: Download failed: 56;"Failure when receiving data from the
  peer"

  For details please refer to the download log file:

  E:/Lib_prebuild/opencv/prebuild_x64/CMakeDownloadLog.txt

Call Stack (most recent call first):
  E:/Lib_prebuild/opencv/source/opencv_contrib/modules/xfeatures2d/cmake/download_vgg.cmake:16 (ocv_download)
  E:/Lib_prebuild/opencv/source/opencv_contrib/modules/xfeatures2d/CMakeLists.txt:12 (download_vgg_descriptors)


xfeatures2d/vgg: Download: vgg_generated_80.i
xfeatures2d/vgg: Download: vgg_generated_120.i
data: Download: face_landmark_model.dat
No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
Found installed version of gflags: C:/Lib/gflags/build_x64/lib/cmake/gflags
Detected gflags version: 2.2.2
Found installed version of Eigen: E:/Lib_prebuild/eigen/build_x64/share/eigen3/cmake
Found required Ceres dependency: Eigen version 3.3.7 in E:/Lib_prebuild/eigen/build_x64/include/eigen3
Found installed version of glog: E:/Lib_prebuild/glog/prebuild_x64
Detected glog version: 0.4.0
Found required Ceres dependency: glog
Found installed version of gflags: C:/Lib/gflags/build_x64/lib/cmake/gflags
Detected gflags version: 2.2.2
Found required Ceres dependency: gflags
Ceres version 1.14.0 detected here: E:/Lib_prebuild/ceres-solver/build_x64 was built with C++11. Ceres target will add C++11 flags to compile options for targets using it.
Found Ceres version: 1.14.0 installed in: E:/Lib_prebuild/ceres-solver/build_x64 with components: [EigenSparse, SparseLinearAlgebraLibrary, LAPACK, SuiteSparse, SchurSpecializations, C++11, OpenMP, Multithreading]
Checking SFM deps... TRUE
NVIDIA_OPTICAL_FLOW: Download: 79c6cee80a2df9a196f20afd6b598a9810964c32.zip
CMake Warning at cmake/OpenCVGenSetupVars.cmake:54 (message):
  CONFIGURATION IS NOT SUPPORTED: validate setupvars script in install
  directory
Call Stack (most recent call first):
  CMakeLists.txt:968 (include)



General configuration for OpenCV 4.4.0-pre =====================================
  Version control:               4.3.0-421-g0cbaaba4b1-dirty

  Extra modules:
    Location (extra):            E:/Lib_prebuild/opencv/source/opencv_contrib/modules
    Version control (extra):     4.3.0-62-g39ced2af

  Platform:
    Timestamp:                   2020-06-14T09:45:22Z
    Host:                        Windows 10.0.17763 AMD64
    CMake:                       3.18.0-rc1
    CMake generator:             Visual Studio 15 2017
    CMake build tool:            C:/Program Files (x86)/Microsoft Visual Studio/2017/Enterprise/MSBuild/15.0/Bin/MSBuild.exe
    MSVC:                        1916

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (15 files):         + SSSE3 SSE4_1
      SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (0 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (4 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (29 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (4 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      YES
    C++ standard:                11
    C++ Compiler:                C:/Program Files (x86)/Microsoft Visual Studio/2017/Enterprise/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe  (ver 19.16.27039.0)
    C++ flags (Release):         /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP  /MD /O2 /Ob2 /DNDEBUG 
    C++ flags (Debug):           /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP  /MDd /Zi /Ob0 /Od /RTC1 
    C Compiler:                  C:/Program Files (x86)/Microsoft Visual Studio/2017/Enterprise/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe
    C flags (Release):           /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP   /MD /O2 /Ob2 /DNDEBUG 
    C flags (Debug):             /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP /MDd /Zi /Ob0 /Od /RTC1 
    Linker flags (Release):      /machine:x64  /INCREMENTAL:NO 
    Linker flags (Debug):        /machine:x64  /debug /INCREMENTAL 
    ccache:                      NO
    Precompiled headers:         YES
    Extra dependencies:          cudart_static.lib nppc.lib nppial.lib nppicc.lib nppicom.lib nppidei.lib nppif.lib nppig.lib nppim.lib nppist.lib nppisu.lib nppitc.lib npps.lib cublas.lib cudnn.lib cufft.lib -LIBPATH:C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.2/lib/x64
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 alphamat aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev cvv datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor ml objdetect optflow ovis phase_unwrapping photo plot python3 quality rapid reg rgbd saliency sfm shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab viz xfeatures2d ximgproc xobjdetect xphoto
    Disabled:                    python_tests world
    Disabled by dependency:      -
    Unavailable:                 cnn_3dobj freetype hdf java js matlab python2 xfeatures2d
    Applications:                examples apps
    Documentation:               NO
    Non-free algorithms:         YES

  Windows RT support:            NO

  GUI: 
    QT:                          YES (ver 5.12.8)
      QT OpenGL support:         NO
    Win32 UI:                    YES
    VTK support:                 YES (ver 8.2.0)

  Media I/O: 
    ZLib:                        build (ver 1.2.11)
    JPEG:                        build-libjpeg-turbo (ver 2.0.4-62)
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         build (ver 1.6.37)
    TIFF:                        build (ver 42 - 4.0.10)
    JPEG 2000:                   build Jasper (ver 1.900.1)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      NO
    FFMPEG:                      YES (prebuilt binaries)
      avcodec:                   YES (58.54.100)
      avformat:                  YES (58.29.100)
      avutil:                    YES (56.31.100)
      swscale:                   YES (5.5.100)
      avresample:                YES (4.0.0)
    GStreamer:                   NO
    DirectShow:                  YES
    Media Foundation:            YES
      DXVA:                      YES

  Parallel framework:            Concurrency

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Intel IPP:                   2020.0.0 Gold [2020.0.0]
           at:                   E:/Lib_prebuild/opencv/prebuild_x64/3rdparty/ippicv/ippicv_win/icv
    Intel IPP IW:                sources (2020.0.0)
              at:                E:/Lib_prebuild/opencv/prebuild_x64/3rdparty/ippicv/ippicv_win/iw
    Lapack:                      YES (E:/Lib_prebuild/MKL/lib/mkl_intel_lp64.lib E:/Lib_prebuild/MKL/lib/mkl_sequential.lib E:/Lib_prebuild/MKL/lib/mkl_core.lib)
    Inference Engine:            YES (2020030000 / 2.1.0)
        * libs:                  E:/Lib_prebuild/DLDT/dldt/bin/intel64/Release/inference_engine_c_api.lib / E:/Lib_prebuild/DLDT/dldt/bin/intel64/Debug/inference_engine_c_apid.lib E:/Lib_prebuild/DLDT/dldt/bin/intel64/Release/inference_engine_c_api.dll / E:/Lib_prebuild/DLDT/dldt/bin/intel64/Debug/inference_engine_c_apid.dll E:/Lib_prebuild/DLDT/dldt/bin/intel64/Release/inference_engine_c_api.dll E:/Lib_prebuild/DLDT/dldt/bin/intel64/Debug/inference_engine_c_apid.dll
        * includes:              E:/Lib_prebuild/DLDT/dldt/inference-engine/ie_bridges/c/include
    Eigen:                       YES (ver 3.3.7)
    Custom HAL:                  NO
    Protobuf:                    build (3.5.1)

  NVIDIA CUDA:                   YES (ver 10.2, CUFFT CUBLAS)
    NVIDIA GPU arch:
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 7.6.5)

  OpenCL:                        YES (NVD3D11)
    Include path:                E:/Lib_prebuild/opencv/source/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 C:/Program Files/Python37/python.exe (ver 3.7.7)
    Libraries:                   optimized C:/Program Files/Python37/libs/python37.lib debug C:/Program Files/Python37/libs/python37_d.lib (ver 3.7.7)
    numpy:                       C:/Program Files/Python37/lib/site-packages/numpy/core/include (ver 1.18.2)
    install path:                C:/Program Files/Python37/Lib/site-packages/cv2/python-3.7

  Python (for build):            C:/Program Files/Python37/python.exe

  Java:                          
    ant:                         NO
    JNI:                         C:/Program Files/Java/jdk-14.0.1/include C:/Program Files/Java/jdk-14.0.1/include/win32 C:/Program Files/Java/jdk-14.0.1/include
    Java wrappers:               NO
    Java tests:                  NO

  Install to:                    C:/Lib/opencv/build_x64
-----------------------------------------------------------------

Configuring incomplete, errors occurred!
See also "E:/Lib_prebuild/opencv/prebuild_x64/CMakeFiles/CMakeOutput.log".
See also "E:/Lib_prebuild/opencv/prebuild_x64/CMakeFiles/CMakeError.log".

Log files:
CMakeFiles.zip

@alalek
Member

alalek commented Jun 14, 2020

@Bleach665 Please try to add quotes in one more place:

-string(REPLACE "2.1" "2.1(2.0)" ${_arch_bin_list} ${${_arch_bin_list}})
+string(REPLACE "2.1" "2.1(2.0)" ${_arch_bin_list} "${${_arch_bin_list}}")

(but probably _arch_bin_list is empty for some reason)
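If the list really is empty, quoting only hides the symptom. A hedged sketch of an explicit guard follows; `arch_list` is a hypothetical stand-in for the detected architecture list, and this is not what the OpenCV script does.

```cmake
cmake_minimum_required(VERSION 3.10)

# Hypothetical stand-in for the detected architectures (empty on failure).
set(arch_list "")

if("${arch_list}" STREQUAL "")
  # Report the real problem instead of continuing silently with no targets.
  message(WARNING "No CUDA architecture was detected; "
                  "set CUDA_ARCH_BIN explicitly, e.g. -DCUDA_ARCH_BIN=7.5")
else()
  string(REPLACE "2.1" "2.1(2.0)" arch_list "${arch_list}")
endif()
message(STATUS "arch_list='${arch_list}'")
```

A warning like this would point users at CUDA_ARCH_BIN instead of leaving them with a configuration that has no GPU targets.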

@tomoaki0705
Contributor Author

tomoaki0705 commented Jun 14, 2020

@nglee, @Bleach665, could you both tell me your CMake version, please?
I have 3.11, and I can see the number "3.18" in Bleach665's attached log, but I want to make sure.
If the version difference is the cause, I might be able to reproduce it on my side.

And @Bleach665, your log looks as if you re-ran cmake in an existing build directory.
Could you confirm whether it was a clean cmake run or not?
My understanding is that re-running cmake may cause unexpected errors.
In other words, quoting the lines that @alalek pointed out and doing a clean cmake run should solve the issue.
I believe so, but I'd like to confirm before pushing a fix that is just a shot in the dark.

@tomoaki0705
Contributor Author

Sorry, nglee mentioned it was 3.16.2:

Hi, I get this error with CMake 3.16.2

Let me get this version and see if it reproduces.

@tomoaki0705
Contributor Author

Sorry, Bleach665 also had his CMake version in his log.

CMake: 3.18.0-rc1

nglee was seeing this issue on CMake 3.16.2 + the 3.4 branch.
Bleach665 was seeing this issue on CMake 3.18.0 + the master branch.
Well, let's see.

@Bleach665
Contributor

Bleach665 commented Jun 14, 2020

@alalek, after these changes

Please try to add quotes in one more place:

-string(REPLACE "2.1" "2.1(2.0)" ${_arch_bin_list} ${${_arch_bin_list}})
+string(REPLACE "2.1" "2.1(2.0)" ${_arch_bin_list} "${${_arch_bin_list}}")

the CMake configuration was successful. CUDA_ARCH_BIN was empty.

@tomoaki0705.
Could you confirm if it was a clean cmake or not ?
I tested on both, clean build and re-run CMake "generate", the result was the same - the error described above.

@tomoaki0705 tomoaki0705 mentioned this pull request Jun 17, 2020
@tomoaki0705
Contributor Author

@Bleach665, I would appreciate it if you could have a look at patch #17571.
Best

@Bleach665
Contributor

@tomoaki0705, with this patch the CMake configuration and build on Win10 with CUDA 10.2 and CMake 3.18 were successful for me. CUDA_ARCH_BIN was empty, so I set it to 3.5.

@tomoaki0705
Contributor Author

tomoaki0705 commented Jun 17, 2020

Sorry @Bleach665, could you describe this point in more detail, please?

CUDA_ARCH_BIN was empty, so I set it to 3.5.

CUDA_ARCH_BIN is supposed to have some value, and it is not supposed to be empty.
Did it end up empty ?

@Bleach665
Contributor

Did it end up empty ?

Yes. I tested with the default configuration, set only WITH_CUDA and OPENCV_EXTRA_MODULES_PATH. After configuring CUDA_ARCH_BIN was empty.

@asmorkalov
Contributor

@Bleach665 could you add more details: OS, CUDA version, compiler, etc.

@Bleach665
Contributor

Bleach665 commented Jun 18, 2020

@asmorkalov. Win10, CUDA 10.2, CMake 3.18, VS 2017 15.9.22.

Also reproduced this on Win10, CUDA 10.1, CMake 3.12, VS 2017 15.9.11.
CMake log files (for PC with CUDA 10.1):
CMakeFiles.zip

@tomoaki0705
Contributor Author

@Bleach665, things are getting a bit crazy.

Did it end up empty ?

Yes.
After configuring CUDA_ARCH_BIN was empty.

No, this is not expected.
CUDA_ARCH_BIN is supposed to be set to the values from 3.0 to 7.5 when using CUDA 10.x.

I don't think I can throw in any more patches blindly.
I need your help even more.

  1. Could you try what alalek's comment suggests?
     CUDA: choose supported CC automatically #17432 (comment)

     It would be nice to dump "target_arch" in foreach for investigation:

     +message(STATUS "CC_LIST='${CC_LIST}'")
      foreach(target_arch ${CC_LIST})
     +  message(STATUS "Testing '${target_arch}' ...")

  2. Also, could you try invoking nvcc manually, as in this comment?

     nvcc -gencode arch=compute_75,code=sm_75 /path/to/opencv/cmake/checks/OpenCVDetectCudaArch.cu

  3. Immediately after running nvcc, please run echo %ERRORLEVEL% in the Windows command prompt, or echo $? in a bash console.

  4. Technically, the patch "CUDA: fix automatic cc #17571" is based on the 3.4 branch, not master. May I double-check how you tested the patch on master?

@Bleach665
Contributor

Bleach665 commented Jun 18, 2020

@tomoaki0705,

  1. Output:
     CC_LIST=''
  2. Output from command prompt:
     nvcc fatal : Cannot find compiler 'cl.exe' in PATH
     Output from VS native tool:
     OpenCVDetectCudaArch.cu
     LINK : fatal error LNK1104: cannot open file 'a.exe'
  3. Output from command prompt:
     1
     Output from VS native tool:
     2
  4. In the first tests I took the OpenCVDetectCUDA.cmake file from the 3.4 branch. In the latest tests I used the 3.4 branch for both the opencv and opencv_contrib repos.

@tomoaki0705
Contributor Author

@Bleach665 unfortunately, there's nothing more I can do.

nvcc fatal : Cannot find compiler 'cl.exe' in PATH

OpenCVDetectCudaArch.cu
LINK : fatal error LNK1104: cannot open file 'a.exe'

This is something critical.
If nvcc can't compile the code, it won't work anyway.
Since I can't reproduce on my side, I can do nothing.

@Bleach665
Contributor

Bleach665 commented Jun 19, 2020

I reproduced this (empty CUDA_ARCH_BIN) on a clean VBox machine (Win7, CMake 3.18, VS 2017 Community, CUDA 10.2).
If anyone needs it, I can share this VBox machine or give access via TeamViewer or AnyDesk.

@tomoaki0705
Contributor Author

Again, nvcc is supposed to compile this sample CUDA code.
It may fail depending on the CC value passed, and that's the trick behind this "automatic CC".
If you run nvcc and it never succeeds in your environment, it's clear that your CUDA_ARCH_BIN will end up empty.
Making nvcc run correctly on your system is the next step.
Good luck.

@Bleach665
Contributor

Bleach665 commented Jun 20, 2020

Make nvcc run correctly on your system.

That's what I meant. On a clean default installation, nvcc fails to compile OpenCVDetectCudaArch.cu. It could be a CUDA installation bug or some more complex problem.
