support eltwise sum with different number of input channels in CUDA backend #16063

Merged
alalek merged 8 commits into opencv:master from YashasSamaga:cuda4dnn-shortcut-unequal on Jan 16, 2020
YashasSamaga (Contributor) commented Dec 4, 2019

This pull request changes:

Timings (on GTX 1050):

| Model | Darknet | OpenCV CUDA |
| --- | --- | --- |
| YOLOv3 Tiny PRN | 6.727 ms | 5.94 ms |

Pending:

  • enable FP16 tests (disabled by PR16010)

References:
#15724
#15739

force_builders=Custom
buildworker:Custom=linux-4
docker_image:Custom=ubuntu-cuda:18.04

@YashasSamaga YashasSamaga changed the title [WIP] support eltwise sum with different number of input channels in CUDA backend support eltwise sum with different number of input channels in CUDA backend Dec 5, 2019

alalek (Member) commented Dec 7, 2019

@YashasSamaga Please rebase this patch on latest master (there are conflicts after merging #16087).

@YashasSamaga YashasSamaga force-pushed the cuda4dnn-shortcut-unequal branch from dd5d31e to adb448c Compare December 8, 2019 05:45

alalek (Member) left a comment

@YashasSamaga Please rebase patch on latest master (conflict with merged fix from #16088)

@YashasSamaga YashasSamaga force-pushed the cuda4dnn-shortcut-unequal branch 2 times, most recently from 65810fd to d3677e1 Compare December 13, 2019 07:24

alalek (Member) left a comment

Thank you for the contribution 👍

@YashasSamaga YashasSamaga changed the title support eltwise sum with different number of input channels in CUDA backend [WIP] support eltwise sum with different number of input channels in CUDA backend Dec 14, 2019

YashasSamaga (Contributor, Author) commented Dec 14, 2019

I just noticed something weird. YOLOv3 has started using these shortcut kernels instead of the regular eltwise. Need to check.

YOLOv3's inputs have equal shapes, but the channels mode is not ELTWISE_CHANNNELS_SAME. I can patch this up in initCUDA by rechecking the input shapes, but is this a bug elsewhere?

[SOLVED] initCUDA rechecks the channel dimension and attempts to use the regular eltwise if possible
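
A rough sketch of the recheck described above, for readers who want to see the idea in isolation. This is not the code from the patch: `SumKernel`, `Shape` and `chooseSumKernel` are hypothetical names, and the sketch assumes NCHW shapes where index 1 is the channel dimension. It only illustrates "use the regular eltwise when every input has the same channel count, otherwise fall back to the shortcut path".

```cpp
#include <cassert>
#include <vector>

// Hypothetical names for illustration only; these are not OpenCV types.
enum class SumKernel { RegularEltwise, Shortcut };

// Assumption: shapes are laid out as {N, C, H, W}, so index 1 is the channel count.
using Shape = std::vector<int>;

// Picks the regular eltwise path when all inputs share the channel count of
// the first input, and the shortcut path as soon as one input differs.
SumKernel chooseSumKernel(const std::vector<Shape>& inputs)
{
    assert(!inputs.empty());
    const int channels = inputs[0].at(1);
    for (const Shape& shape : inputs)
    {
        if (shape.at(1) != channels)
            return SumKernel::Shortcut;
    }
    return SumKernel::RegularEltwise;
}
```

With this kind of check, a network whose shortcut inputs all have matching channel counts would stay on the regular eltwise path regardless of the configured channels mode.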

@YashasSamaga YashasSamaga changed the title [WIP] support eltwise sum with different number of input channels in CUDA backend support eltwise sum with different number of input channels in CUDA backend Dec 15, 2019
@YashasSamaga YashasSamaga changed the title support eltwise sum with different number of input channels in CUDA backend [WIP] support eltwise sum with different number of input channels in CUDA backend Dec 20, 2019
@YashasSamaga YashasSamaga force-pushed the cuda4dnn-shortcut-unequal branch from 933c5d9 to 1b88474 Compare December 21, 2019 15:31
@YashasSamaga YashasSamaga changed the title [WIP] support eltwise sum with different number of input channels in CUDA backend support eltwise sum with different number of input channels in CUDA backend Dec 21, 2019

alalek (Member) left a comment

Thank you!

@alalek alalek merged commit d85e67d into opencv:master Jan 16, 2020
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
…nequal

support eltwise sum with different number of input channels in CUDA backend

* add shortcut primitive

* add offsets in shortcut kernel

* skip tests involving more than two inputs

* remove redundant modulus operation

* support multiple inputs

* remove whole file indentation

* skip acc in0 trunc test if weighted

* use shortcut iff channels are unequal
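
For context, a CPU reference of what a sum over inputs with unequal channel counts can look like. This is a minimal sketch, not the CUDA kernel added by these commits. It assumes a single batch item stored as contiguous CHW floats, assumes the output takes the shape of the first input, and assumes every other input contributes only the channels that both tensors have. `Tensor` and `shortcutSum` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tensor type for illustration: one batch item, contiguous CHW layout.
struct Tensor
{
    int channels;
    int plane;               // H * W
    std::vector<float> data; // size == channels * plane
};

// Assumption: the output has the shape of inputs[0]. Each further input is
// added over the channels it shares with the output; extra channels are
// ignored and missing channels contribute nothing.
Tensor shortcutSum(const std::vector<Tensor>& inputs)
{
    assert(!inputs.empty());
    Tensor out = inputs[0];
    for (std::size_t i = 1; i < inputs.size(); ++i)
    {
        const Tensor& in = inputs[i];
        assert(in.plane == out.plane); // spatial sizes must match
        const int shared = std::min(in.channels, out.channels);
        // With CHW layout the first `shared` channels are a contiguous prefix.
        const std::size_t count = static_cast<std::size_t>(shared) * out.plane;
        for (std::size_t j = 0; j < count; ++j)
            out.data[j] += in.data[j];
    }
    return out;
}
```

For example, summing a 2-channel tensor with a 1-channel and a 3-channel tensor of the same spatial size yields a 2-channel result under these assumptions.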