cuda4dnn(eltwise): tensor broadcasting #20782

Merged
opencv-pushbot merged 1 commit into opencv:master from YashasSamaga:cuda4dnn-eltwise-broadcast
Oct 4, 2021
@YashasSamaga (Contributor) commented Sep 30, 2021

EltwiseOp currently requires all input tensors to have the same shape. This PR generalizes it to support element-wise operations on any set of broadcast-compatible input tensors.
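As a rough illustration of what "compatible" can mean here, the sketch below computes the output shape for a set of input shapes. It assumes NumPy-style broadcasting rules (axes are aligned from the last one, and two sizes match when they are equal or one of them is 1); the exact rule implemented by this PR may be stricter. The function name `broadcast_shape` and the example shapes are hypothetical and are not part of the OpenCV code.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the common output shape, or raise ValueError if incompatible.

    Assumes NumPy-style rules; this is not the cuda4dnn implementation.
    """
    result = []
    # Walk the axes from the last one backwards; missing leading axes count as 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"incompatible shapes: {shapes}")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


# Hypothetical shapes: an NCHW feature map combined with a per-channel vector.
feature_map = (1, 8, 32, 32)
per_channel = (1, 8, 1, 1)
out_shape = broadcast_shape(feature_map, per_channel)
```

Under these assumptions, a per-channel tensor is stretched across the spatial axes of the feature map, which is the kind of case the `channel_broadcast` test name suggests.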

resolves #20778

The following tests were failing:

[  FAILED  ] Test_TensorFlow_layers.eltwise_add_vec/0, where GetParam() = CUDA/CUDA
[  FAILED  ] Test_TensorFlow_layers.eltwise_add_vec/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_layers.eltwise_mul_vec/0, where GetParam() = CUDA/CUDA
[  FAILED  ] Test_TensorFlow_layers.eltwise_mul_vec/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_layers.channel_broadcast/0, where GetParam() = CUDA/CUDA
[  FAILED  ] Test_TensorFlow_layers.channel_broadcast/1, where GetParam() = CUDA/CUDA_FP16
[  FAILED  ] Test_TensorFlow_nets.EfficientDet/0, where GetParam() = CUDA/CUDA
[  FAILED  ] Test_TensorFlow_nets.EfficientDet/1, where GetParam() = CUDA/CUDA_FP16

The listed tests now pass.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu-cuda-cc52:18.04
Xbuild_image:Custom=ubuntu-cuda:18.04

@alalek (Member) commented Oct 3, 2021

Draft

@YashasSamaga Thank you for the contribution! Is this ready to be merged?

@YashasSamaga (Contributor, Author)

> Thank you for the contribution! Is this ready to be merged?

No. I will finish in 24 hours.

@YashasSamaga YashasSamaga force-pushed the cuda4dnn-eltwise-broadcast branch from 3200703 to 505dde0 Compare October 4, 2021 07:09
@YashasSamaga YashasSamaga marked this pull request as ready for review October 4, 2021 07:09
@asmorkalov asmorkalov requested a review from JulieBar October 4, 2021 08:05
@alalek (Member) left a comment

Well done! Thank you 👍

@opencv-pushbot opencv-pushbot merged commit 1b70f94 into opencv:master Oct 4, 2021
@alalek alalek mentioned this pull request Oct 15, 2021
Development

Successfully merging this pull request may close these issues.

error: (-217:Gpu API call) invalid device function in function 'make_policy'