respect CUDA_HOST_COMPILER when detecting CUDA arch #17526

Merged
opencv-pushbot merged 1 commit into opencv:master from cyyever:fix_cuda_detection
Jun 20, 2020

@cyyever
Contributor

@cyyever cyyever commented Jun 11, 2020

force_builders=Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu-cuda:18.04

@cyyever cyyever force-pushed the fix_cuda_detection branch from 4b83cf1 to c6638d6 Compare June 11, 2020 07:57
Member

@alalek alalek left a comment

Thank you 👍

@opencv-pushbot opencv-pushbot merged commit 5ed6546 into opencv:master Jun 20, 2020
@tomoaki0705
Contributor

This PR breaks the build on Jetson TX1 and TX2.

  • cmake
-- General configuration for OpenCV 4.4.0-pre =====================================
--   Version control:               4.3.0-464-g5ed6546
-- 
--   Extra modules:
--     Location (extra):            /opencv_contrib/modules
--     Version control (extra):     4.3.0-69-g0f56e6d
 :
-- CUDA detected: 8.0
-- Automatic detection of CUDA generation failed. Going to build for all known architectures.
-- CUDA NVCC target flags: -ccbin;/usr/bin/cc;-gencode;arch=compute_53,code=sm_53;-gencode;arch=compute_62,code=sm_62;-gencode;arch=compute_72,code=sm_72;-D_FORCE_INLINES
  • Normally it detects the native CC (i.e. sm53 for TX1, sm62 for TX2)

    • Also, sm72 is not among the architectures that CUDA 8.0 supports (i.e. the fail-safe path is not safe at all)
  • make

nvcc fatal   : Unsupported gpu architecture 'compute_72'
CMake Error at cuda_compile_1_generated_gpu_mat.cu.o.Release.cmake:219 (message):
  Error generating
  /opencv/build/modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/./cuda_compile_1_generated_gpu_mat.cu.o

tomoaki0705 added a commit to tomoaki0705/opencv that referenced this pull request Jun 20, 2020
  * use only supported CC in the list
  * workaround of opencv#17526
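
The idea of restricting the fallback list to supported compute capabilities can be sketched as follows. This is a minimal illustration under assumptions, not the code of the referenced commit: the function name `filter_supported_cc`, the probe file `cc_probe.cu`, and the variable `SAFE_CC` are hypothetical, and `CUDA_NVCC_EXECUTABLE` and `CUDA_HOST_COMPILER` are assumed to have been set by CMake's FindCUDA module. Each candidate is test-compiled with nvcc, and only those that nvcc accepts are kept.

```cmake
# Hypothetical helper: keep only the compute capabilities this nvcc accepts.
# Assumes CUDA_NVCC_EXECUTABLE was set beforehand (for example by FindCUDA).
function(filter_supported_cc out_var)
  # Trivial source file used only for the test compilations.
  set(probe_src "${CMAKE_BINARY_DIR}/cc_probe.cu")
  file(WRITE "${probe_src}" "int main() { return 0; }\n")

  # Pass the user's host compiler along, if it names an existing file,
  # so that an unsupported default compiler does not cause false negatives.
  set(ccbin_args "")
  if(CUDA_HOST_COMPILER AND EXISTS "${CUDA_HOST_COMPILER}")
    list(APPEND ccbin_args -ccbin "${CUDA_HOST_COMPILER}")
  endif()

  set(supported "")
  foreach(cc ${ARGN})
    string(REPLACE "." "" cc_short "${cc}")   # "7.2" -> "72"
    execute_process(
      COMMAND "${CUDA_NVCC_EXECUTABLE}" ${ccbin_args}
              -gencode "arch=compute_${cc_short},code=sm_${cc_short}"
              -c "${probe_src}" -o "${probe_src}.o"
      RESULT_VARIABLE res
      OUTPUT_QUIET ERROR_QUIET)
    if(res EQUAL 0)
      list(APPEND supported "${cc}")
    endif()
  endforeach()
  set(${out_var} "${supported}" PARENT_SCOPE)
endfunction()

# Candidate list taken from the NVCC target flags in the log above.
filter_supported_cc(SAFE_CC 5.3 6.2 7.2)
message(STATUS "Fallback architectures accepted by nvcc: ${SAFE_CC}")
```

Probing nvcc directly avoids hard-coding which CUDA release introduced which architecture, at the cost of one extra compiler invocation per candidate at configure time.
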
@cyyever cyyever deleted the fix_cuda_detection branch June 21, 2020 07:58
@NHarishGit

NHarishGit commented Jun 23, 2020

This change has broken CUDA builds. Please undo it.

@tomoaki0705
Contributor

@NHarishGit, I appreciate your comment.
It would be a great help if you could share which "CUDA builds" you are talking about, like this:

* Jetson TX2 (TX1)
* Ubuntu 16.04 (Aarch64)
* CUDA 8.0
* GCC 5.4.0
* CMake 3.15.1

That is: OS, architecture, CUDA version, compiler version, and CMake version.
That would help the discussion.
Thanks

@NHarishGit

NHarishGit commented Jun 23, 2020

> @NHarishGit , I appreciate your comment.
> It would be a great help if you could share what "CUDA builds" you are talking about. like this
>
> * Jetson TX2 (TX1)
> * Ubuntu 16.04 (Aarch64)
> * CUDA 8.0
> * GCC 5.4.0
> * CMake 3.15.1
>
> OS, architecture, CUDA version, compiler version, cmake version.
> That helps the discussion.
> Thanks

OS: WIN 10 x64 2004 build
VS 2019 community v 16.6.2
CUDA v 10.2
CMAKE v 3.17.3
GSTREAMER v 1.16.2
MKL, TBB 2020 versions
CUDA DNN v 7.6.5

I spent quite a few hours figuring out that the CMake file generated for NVCC sets NVCC_FLAGS incorrectly, without quotes around the path to cl.exe. That breaks on Windows systems, where the path is under C:/Program Files (x86)/... and therefore contains spaces.

@tomoaki0705
Contributor

@cyyever, any comments?
#17598 (comment)

@cyyever, could you describe the situation you were trying to fix?

@cyyever
Contributor Author

cyyever commented Jun 25, 2020

> @cyyever any comments ?
> #17598 (comment)
>
> @cyyever , could you comment on what's the situation you were trying to fix ?

My default g++ is g++-10. Without this patch, nvcc would use g++-10, and because no CUDA release supports g++-10, it would fail to detect my CUDA arch even if I specified CUDA_HOST_COMPILER.
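
To make this scenario concrete, here is a minimal sketch of an architecture probe that hands the chosen host compiler to nvcc. It is an illustration under assumptions, not OpenCV's actual detection code: the probe file name `detect_arch.cu` and the variables `probe`, `detect_flags`, `res`, and `detected` are hypothetical, and `CUDA_NVCC_EXECUTABLE` and `CUDA_HOST_COMPILER` are assumed to have been set by CMake's FindCUDA module.

```cmake
# Hypothetical probe program: prints the compute capability of each visible GPU.
set(probe "${CMAKE_BINARY_DIR}/detect_arch.cu")
file(WRITE "${probe}" [=[
#include <cstdio>
#include <cuda_runtime.h>
int main() {
  int count = 0;
  if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) return 1;
  for (int i = 0; i < count; ++i) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) return 1;
    std::printf("%d.%d ", prop.major, prop.minor);
  }
  return 0;
}
]=])

# Forward the user's host compiler so nvcc does not fall back to the system
# default (g++-10 in the case described above).
set(detect_flags "")
if(CUDA_HOST_COMPILER AND EXISTS "${CUDA_HOST_COMPILER}")
  list(APPEND detect_flags -ccbin "${CUDA_HOST_COMPILER}")
endif()

# nvcc --run compiles the probe and executes the resulting binary.
execute_process(
  COMMAND "${CUDA_NVCC_EXECUTABLE}" ${detect_flags} "${probe}" --run
  WORKING_DIRECTORY "${CMAKE_BINARY_DIR}"
  RESULT_VARIABLE res
  OUTPUT_VARIABLE detected
  ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE)

if(res EQUAL 0)
  message(STATUS "Detected compute capabilities: ${detected}")
else()
  message(STATUS "Detection failed; a fallback architecture list is needed")
endif()
```

Without the `-ccbin` argument, the probe is compiled with whatever host compiler nvcc finds by default, so an unsupported default compiler makes detection fail even on a machine with a working GPU.
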
@NHarishGit
Could it work by changing
LIST(APPEND CUDA_NVCC_FLAGS -ccbin ${CUDA_HOST_COMPILER})
to
LIST(APPEND CUDA_NVCC_FLAGS -ccbin "${CUDA_HOST_COMPILER}") ?
What seems strange to me is that if a space in the path causes your failure, then

  get_filename_component(host_compiler_bindir ${CMAKE_LINKER} DIRECTORY)
  LIST(APPEND CUDA_NVCC_FLAGS -ccbin ${host_compiler_bindir})

would already have failed before my PR. So should we change that too?
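
The quoting question raised here can be checked with a small standalone script. CMake splits an unquoted variable reference on semicolons, not on spaces, so the quoted and unquoted forms of `list(APPEND ...)` produce the same list when the path contains only spaces. The sketch below assumes a hypothetical compiler path and hypothetical variable names; it also shows one plausible place where the argument boundary can be lost, namely when the list is flattened into a single string and parsed again.

```cmake
# Run with: cmake -P quoting_demo.cmake   (file name is hypothetical)
set(HOST_COMPILER "C:/Program Files (x86)/example/cl.exe")  # hypothetical path

# Unquoted and quoted expansion of a value that contains spaces.
set(FLAGS_UNQUOTED "")
list(APPEND FLAGS_UNQUOTED -ccbin ${HOST_COMPILER})
set(FLAGS_QUOTED "")
list(APPEND FLAGS_QUOTED -ccbin "${HOST_COMPILER}")

# Both lists hold 2 elements: spaces do not split an unquoted reference.
list(LENGTH FLAGS_UNQUOTED n_unquoted)
list(LENGTH FLAGS_QUOTED n_quoted)
message(STATUS "unquoted: ${n_unquoted} elements, quoted: ${n_quoted} elements")

# Flattening the list into one space-separated string hides where the path
# begins and ends.
string(REPLACE ";" " " flat "${FLAGS_QUOTED}")
message(STATUS "flattened: ${flat}")

# Parsing the flattened string again, as a shell would, splits at the spaces.
separate_arguments(reparsed UNIX_COMMAND "${flat}")
list(LENGTH reparsed n_reparsed)
message(STATUS "reparsed: ${n_reparsed} elements")
```

If that is what happens in the failing build, quoting inside `list(APPEND ...)` alone would not be expected to change the result; the quoting would have to be applied wherever the flags are written out or passed on as text.
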

@tomoaki0705
Contributor

@cyyever, @NHarishGit
I would appreciate it if you could have a look at #17671.
If my understanding is correct, this patch should satisfy all three of us.

@NHarishGit

> @cyyever, @NHarishGit
> I appreciate if you could have a look on #17671
> If my understanding is correct, this patch should satisfy all three of us.

Looks good to me, @tomoaki0705. Do you know the tentative plan for the 4.4.0 release?

@tomoaki0705
Contributor

@NHarishGit , thank you for checking. I don't know the release schedule of 4.4.0. Best

philippefoubert pushed a commit to philippefoubert/opencv that referenced this pull request Jun 29, 2020
  * use only supported CC in the list
  * workaround of opencv#17526
6 participants