Skip to content

Update OpenCVDetectCUDA.cmake#22675

Merged
asmorkalov merged 1 commit intoopencv:4.xfrom
CSBVision:patch-2
Dec 14, 2022
Merged

Update OpenCVDetectCUDA.cmake#22675
asmorkalov merged 1 commit intoopencv:4.xfrom
CSBVision:patch-2

Conversation

@CSBVision
Copy link
Copy Markdown
Contributor

Adds the option to enable delay loading of CUDA DLLs on Windows. This is particularly useful to use the same binary on systems with and without CUDA support without distributing the CUDA DLLs to systems that cannot use them at all due to missing CUDA-supported hardware. Resolves #13509

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@asmorkalov asmorkalov self-requested a review October 21, 2022 08:08
@asmorkalov asmorkalov self-assigned this Oct 24, 2022
@asmorkalov
Copy link
Copy Markdown
Contributor

Thanks a lot for the great suggestion! The option looks very promising. I'll check several corner cases and return back with feedback.

@asmorkalov asmorkalov requested a review from stopmosk December 2, 2022 07:53
@stopmosk
Copy link
Copy Markdown
Contributor

Hi

I found some issues with this PR:

  1. Build failed if Ninja generator is used. Only build with the MSVC generator succeeded. I googled and found this.
  2. I built OpenCV with and without CUDA_ENABLE_DELAYLOAD and also build test CUDA-apps. I then ran it on two Windows machines (with and without CUDA installed), and found no difference in apps behaviour on both machines. Dependency checking shows that the only one difference is delayed loding of CUDA-dlls by opencv_cudaarithm460.dll (see screenshots).

opencv_cudaarithm460.dll built without CUDA_ENABLE_DELAYLOAD:
image

opencv_cudaarithm460.dll built with CUDA_ENABLE_DELAYLOAD:
image

@asmorkalov
Copy link
Copy Markdown
Contributor

@stopmosk Please show your test app code and OpenCv build options pefore and after the patch.

@CSBVision
Copy link
Copy Markdown
Contributor Author

Hi,

Thanks for your comment!

First, regarding the issue with Ninja: From the link you added, I suspect that Ninja mixes something between the respective import libraries (.lib files) and the delay-loaded DLLs. Unluckily, the /delayload flag requires the DLLs and not the LIBs, as already mentioned above. This can be easily fixed by only allowing MSCV inside the CMake scripts (i.e. adding an 'if not Ninja condition'), but we are open to investigate which additional flags are required for Ninja as well to support the flag there, too. Still, we never used Ninja and a quickly setup of CMake using Ninja with specified C and C++ compilers terminates with the error message

  The C++ compiler

    "C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.34.31933/bin/Hostx64/x64/cl.exe"

  is not able to compile a simple test program.

for unknown reasoning. The compiler is working fine (e.g. in combination with MSVC CMake builds) and it's unclear how to debug this. Is there a reference on how to compile OpenCV using Ninja on Windows? Unluckily, I found none by myself.

Regarding your second comment: This is exactly the expected behavior, only the CUDA DLLs are delay-loaded. Still, the cudaarithm DLL is not a good example for the benefit of this flag because - at least as far as I know - there is nothing inside this DLL that works without CUDA-supported hardware. I think the DNN module is a better example: Here, we can either use the CPU backend (supported on all machines), the OpenVINO backend (supported on machines with an OpenVINO supported CPU or GPU) or the CUDA backend. Obviously, the latter is supported on machines equipped with CUDA-compatible hardware only. Still if CUDA is activated while compiling OpenCV, the CUDA libraries always have to be distributed to all machines at runtime, even though they cannot be used at all. This is where the delay load option shines: Activate it and the whole CUDA functionality is available to devices that can use it, while all non-CUDA functionalities are still available on all machines, without needing a superfluous CUDA installation.

@asmorkalov asmorkalov added this to the 4.7.0 milestone Dec 12, 2022
Comment on lines +345 to +347
if(MSVC)
OCV_OPTION(CUDA_ENABLE_DELAYLOAD "Enable delayed loading of CUDA DLLs" OFF)
endif()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OCV_OPTION has VISIBLE_IF argument for conditions. I propose to replace the "if" with option with single line:

OCV_OPTION(CUDA_ENABLE_DELAYLOAD "Enable delayed loading of CUDA DLLs" OFF VISIBLE_IF MSVC AND (CMAKE_GENERATOR MATCHES "Visual Studio"))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes of course, it's more readable and cleaner.
Maybe it's worth think about defaulting to on? Just though about this option as it does not really change anything if the DLLs are there but avoids runtime errors if they are missing but still unused.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to have it optional. It may affect libraries loading order for guys why use other CUDA-based libraries like TensorRT, CuDNN, python code with CUDA.

Copy link
Copy Markdown
Contributor Author

@CSBVision CSBVision Dec 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, just committed your proposal. So the PR can be merged now?

Copy link
Copy Markdown
Contributor

@asmorkalov asmorkalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Thanks for the contribution.

@CSBVision
Copy link
Copy Markdown
Contributor Author

Great, thanks for approving 👍

@asmorkalov
Copy link
Copy Markdown
Contributor

@CSBVision Could you squash the commits to have clear merge history?

@CSBVision
Copy link
Copy Markdown
Contributor Author

Generally yes; however I don't want to mess up something. The right command should be

git rebase -i HEAD~3 

or isn't it?

@CSBVision
Copy link
Copy Markdown
Contributor Author

After git rebase -i HEAD~3 git opens a VIM window with

pick f5e9dfe64a Update OpenCVDetectCUDA.cmake
pick fdf67a4f5f Update OpenCVDetectCUDA.cmake
pick efc4f66e11 Update OpenCVDetectCUDA.cmake

where exiting shows Successfully rebased and updated refs/heads/patch-2. but git push shows everything is up to date. What am I missing?

@alalek
Copy link
Copy Markdown
Member

alalek commented Dec 13, 2022

  1. Run again:

git rebase -i HEAD~3

  1. Change 2nd and 3rd lines from pick to f (fixup):
pick f5e9dfe64a Update OpenCVDetectCUDA.cmake
f fdf67a4f5f Update OpenCVDetectCUDA.cmake
f efc4f66e11 Update OpenCVDetectCUDA.cmake
  1. save and exit

  2. git push ... with --force flag

  3. PR's patch should be updated (1 commit is visible, +13 -0 lines changed).

Adds the option to enable delay loading of CUDA DLLs on Windows. This is particularly useful to use the same binary on systems with and without CUDA support without distributing the CUDA DLLs to systems that cannot use them at all due to missing CUDA-supported hardware.
Resolves opencv#13509
@CSBVision
Copy link
Copy Markdown
Contributor Author

Thanks @alalek , I hope it's correct now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Delayed loading of CUDA third party DLLs

4 participants