DNN/CUDA: let ADD operator run on CUDA#23243

Merged
opencv-pushbot merged 1 commit into opencv:4.x from WanliZhong:accelerate_palm_det
Feb 14, 2023

@WanliZhong WanliZhong commented Feb 13, 2023

This PR fixes #23234.
With this fix, the inference time of the palm detection model on CUDA improves from 200+ ms to 60+ ms. It is still a little slower than in version 4.6.0.
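
For anyone who wants to check timings like these locally, here is a minimal sketch of a benchmark harness. It is not the script used for the numbers above. `median_forward_ms` and `make_cuda_forward` are hypothetical helper names, and `model_path` and `input_shape` are placeholders you would fill in for your own model. The sketch assumes an OpenCV build with CUDA support.

```python
import statistics
import time


def median_forward_ms(forward, warmup=5, runs=50):
    """Return the median wall-clock time of forward() in milliseconds."""
    for _ in range(warmup):
        # Warm-up calls are discarded so one-time setup cost is not measured.
        forward()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        forward()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def make_cuda_forward(model_path, input_shape):
    """Build a zero-argument forward() that runs a model on the CUDA backend.

    model_path and input_shape are placeholders (assumptions), not values
    taken from this PR.
    """
    import cv2  # imported lazily so the timing helper works without OpenCV
    import numpy as np

    net = cv2.dnn.readNet(model_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    blob = np.zeros(input_shape, dtype=np.float32)  # dummy input, timing only

    def forward():
        net.setInput(blob)
        return net.forward()

    return forward
```

Usage would look like `median_forward_ms(make_cuda_forward(model_path, input_shape))`. The median is used rather than the mean so that an occasional slow run does not skew the result.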

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@WanliZhong WanliZhong added the optimization, category: gpu/cuda (contrib), and category: dnn (onnx) labels Feb 13, 2023
@WanliZhong WanliZhong added this to the 4.8.0 milestone Feb 13, 2023
@rogday rogday left a comment

👍

@opencv-pushbot opencv-pushbot merged commit 58d8a27 into opencv:4.x Feb 14, 2023
@WanliZhong WanliZhong deleted the accelerate_palm_det branch May 16, 2023 12:33
@asmorkalov asmorkalov mentioned this pull request May 31, 2023

Labels

  • category: dnn (onnx) - ONNX support issues in DNN module
  • category: gpu/cuda (contrib) - OpenCV 4.0+: moved to opencv_contrib
  • optimization

Development

Successfully merging this pull request may close these issues.

MediaPipe palm detection model on CUDA is much slower after 4.7.0 released

3 participants