DNN: fuse conv+naryEletwise on CUDA backend. #23255

Merged
alalek merged 2 commits into opencv:4.x from zihaomu:fused_cuda_naryeltwise
Feb 17, 2023

@zihaomu zihaomu commented Feb 15, 2023

Related issue: #23234.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

zihaomu commented Feb 15, 2023

Hi @WanliZhong, please take a look.

@zihaomu zihaomu force-pushed the fused_cuda_naryeltwise branch 4 times, most recently from 14bdabf to 0fa9d69 Compare February 15, 2023 07:46
@zihaomu zihaomu force-pushed the fused_cuda_naryeltwise branch from 0fa9d69 to 2a5ad05 Compare February 15, 2023 07:58
@zihaomu zihaomu force-pushed the fused_cuda_naryeltwise branch from a618713 to c6f59bd Compare February 15, 2023 08:26
@zihaomu zihaomu requested review from WanliZhong and rogday February 15, 2023 08:35

@WanliZhong WanliZhong left a comment

👍 With this patch, the inference time of palm detection improves from 60 ms to 55.03 ms.
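
For readers unfamiliar with the optimization, here is a minimal pure-Python sketch of what fusing a convolution with a following elementwise add means. It is an illustration only, not the CUDA code from this PR: the function names (`conv1d`, `conv_then_add`, `fused_conv_add`) and the 1-D shapes are assumptions made for the example. The unfused path materializes the whole convolution output and then reads it again for the add, while the fused path applies the add as each output element is produced, which is one plausible source of the saved time.

```python
def conv1d(x, w, b):
    """Valid 1-D convolution (cross-correlation) with a scalar bias."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) + b
            for i in range(len(x) - k + 1)]


def conv_then_add(x, w, b, other):
    """Unfused: the conv output is fully materialized, then re-read by the add."""
    y = conv1d(x, w, b)
    return [yi + oi for yi, oi in zip(y, other)]


def fused_conv_add(x, w, b, other):
    """Fused: the elementwise add runs as each conv output is produced."""
    k = len(w)
    out = []
    for i in range(len(x) - k + 1):
        acc = b
        for j in range(k):
            acc += x[i + j] * w[j]
        # No intermediate conv tensor is stored; the second operand is added here.
        out.append(acc + other[i])
    return out


# Hypothetical toy data, chosen so every value is exactly representable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 0.0, -1.0]
other = [10.0, 20.0, 30.0]

unfused = conv_then_add(x, w, 0.5, other)
fused = fused_conv_add(x, w, 0.5, other)
print(unfused == fused)
```

Both paths must produce the same values; the fusion is only valid when that holds for every operation and input shape it is applied to.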

@alalek alalek merged commit 20dac7e into opencv:4.x Feb 17, 2023
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
DNN: fuse conv+naryEletwise on CUDA backend.
@asmorkalov asmorkalov mentioned this pull request May 31, 2023
geversonsto pushed a commit to stodev-com-br/opencv that referenced this pull request Jun 3, 2023
DNN: fuse conv+naryEletwise on CUDA backend.

Abdurrahheem commented Dec 26, 2023

@zihaomu Issue #24721 is related to this PR, particularly to this line. It causes the yolov8 CUDA backend to produce wrong outputs. Related issues: #24606, #23977, #24635.
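
A regression of this kind can usually be detected by running the same input through a reference path (for example the CPU backend) and comparing the outputs within a tolerance. The sketch below shows only the comparison step in plain Python. The helper names, the tolerances, and the sample values are hypothetical and are not taken from the linked issues; the root cause is not stated in this thread.

```python
import math


def max_abs_diff(reference, candidate):
    """Largest absolute elementwise difference between two flat outputs."""
    if len(reference) != len(candidate):
        raise ValueError("output lengths differ")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, rel_tol=1e-4, abs_tol=1e-5):
    """True if both outputs have the same length and agree within tolerance."""
    return len(reference) == len(candidate) and all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


# Hypothetical flattened outputs, for illustration only.
reference = [0.12, 0.80, 0.08]       # e.g. from a reference backend
good = [0.12000001, 0.80, 0.08]      # tiny numerical noise: acceptable
bad = [0.12, 0.30, 0.58]             # clearly different prediction

print(outputs_match(reference, good))
print(outputs_match(reference, bad))
```

In practice the two lists would come from flattening the network outputs of each backend; the tolerances here are placeholders and would need tuning per model.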

@fengyuentau

@Abdurrahheem I was also looking at this problem. Have you come up with a solution yet? If so, I will re-open #24606 and assign it to you.
