cuda: update default PTX behaviour when CUDA_ARCH_BIN is unset #24131

Merged
asmorkalov merged 1 commit into opencv:4.x from cudawarped:cuda_add_default_ptx
Sep 14, 2023

@cudawarped cudawarped commented Aug 9, 2023

Currently, when CUDA_ARCH_BIN isn't specified, CMake attempts to build .cu files for all the architectures it knows about, generating a list of architectures supported by the current CUDA Toolkit (see #17432). The idea is that if a user doesn't specify the arch then, to be on the safe side, it should generate code for all architectures for maximum compatibility. Additionally, if a user just wants to specify a PTX architecture, they have to pass CUDA_ARCH_BIN= (incurring the overhead of calling ocv_filter_available_architecture on all known architectures) in addition to CUDA_ARCH_PTX=<TARGET ARCH> to CMake, to avoid generating binary code for all supported architectures as well.

This PR proposes that:

  1. When CUDA_ARCH_BIN and CUDA_ARCH_PTX are not specified, in addition to binary code for all compute capabilities, PTX code for the highest supported architecture is also generated. This would ensure that the CUDA device code will also run on GPUs newer than those supported by the installed CUDA Toolkit, increasing the level of compatibility further.
  2. When CUDA_ARCH_PTX is passed in isolation, only PTX code for the desired CC is generated.
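
The two proposed rules can be modelled with a short Python sketch. This is only an illustration of the selection logic described above, not the CMake code in this PR: the function name `select_cuda_targets`, its parameters, and the use of `None` to mean "option not passed" are all hypothetical, and the final branch (an explicit CUDA_ARCH_BIN being used as given) is an assumption that the description does not cover.

```python
from typing import List, Optional, Sequence, Tuple


def select_cuda_targets(
    supported: Sequence[str],
    arch_bin: Optional[Sequence[str]] = None,
    arch_ptx: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[str]]:
    """Return (binary_archs, ptx_archs) for a hypothetical build.

    `supported` stands in for the compute capabilities the installed
    CUDA Toolkit can target; `None` means the option was not passed.
    """
    if arch_bin is None and arch_ptx is None:
        # Rule 1: binary code for every supported CC, plus PTX for the
        # highest one so that newer GPUs can still JIT-compile it.
        highest = max(
            supported, key=lambda cc: tuple(int(part) for part in cc.split("."))
        )
        return list(supported), [highest]
    if arch_bin is None:
        # Rule 2: CUDA_ARCH_PTX passed in isolation -> PTX only.
        return [], list(arch_ptx)
    # Assumption (not stated in the PR description): an explicit
    # CUDA_ARCH_BIN is used as given, with whatever PTX was requested.
    return list(arch_bin), list(arch_ptx or [])


if __name__ == "__main__":
    toolkit_ccs = ["5.0", "6.1", "7.5", "8.6"]  # illustrative values only
    print(select_cuda_targets(toolkit_ccs))
    print(select_cuda_targets(toolkit_ccs, arch_ptx=["8.6"]))
```

Compute capabilities are compared as integer tuples rather than strings so that a value such as "10.0" would sort above "8.6".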

@tomoaki0705

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@asmorkalov asmorkalov added category: build/install category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib labels Aug 9, 2023
@asmorkalov asmorkalov self-requested a review August 9, 2023 06:52
@cudawarped cudawarped changed the title cuda: add default PTX when CUDA_ARCH_BIN is missing cuda: add default PTX when CUDA_ARCH_BIN is missing Aug 9, 2023
@cudawarped cudawarped force-pushed the cuda_add_default_ptx branch from ee154ea to 358e306 Compare August 12, 2023 08:10
@cudawarped

@asmorkalov I have updated the PR.

@cudawarped cudawarped changed the title cuda: add default PTX when CUDA_ARCH_BIN is missing cuda: update default PTX behaviour when CUDA_ARCH_BIN is unset Aug 13, 2023
@asmorkalov asmorkalov self-assigned this Sep 14, 2023
@asmorkalov asmorkalov added this to the 4.9.0 milestone Sep 14, 2023

@asmorkalov asmorkalov left a comment

👍

@asmorkalov asmorkalov merged commit ec1c060 into opencv:4.x Sep 14, 2023
@asmorkalov asmorkalov mentioned this pull request Sep 28, 2023