
Remove exception for NaryEltwise #24786

Closed

Abdurrahheem wants to merge 3 commits into opencv:4.x from Abdurrahheem:ash/cuda_concat

Conversation

@Abdurrahheem
Contributor

@Abdurrahheem Abdurrahheem commented Dec 27, 2023

This is an experimental PR to check what fails if the exception for NaryEltwise is removed. This PR is based on a suggestion made in #24721.

Data for this PR is located in 1136

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@Abdurrahheem
Contributor Author

Abdurrahheem commented Dec 27, 2023

@dkurt it seems that simply removing this exception works just fine.

@zihaomu zihaomu requested a review from WanliZhong December 27, 2023 23:36
@zihaomu
Member

zihaomu commented Dec 27, 2023

Hi @Abdurrahheem, this solution looks good to me. Concat fusion is used much less frequently than Conv fusion, so turning off NaryEltwise fusion in such cases has little impact on inference speed.

The following is my earlier reply, in which I misunderstood the change; please ignore it.
This change will remove the support of NaryEltwise on all CUDA devices, which will affect the inference speed. A long-term solution to this issue is needed. @WanliZhong please take a look.

Member

@zihaomu zihaomu left a comment


Good short-term solution. 👍

@dkurt
Member

dkurt commented Dec 28, 2023

@Abdurrahheem, we need a test for it.

@dkurt dkurt added the pr: needs test New functionality requires minimal tests set label Dec 28, 2023
@asmorkalov asmorkalov added this to the 4.10.0 milestone Dec 28, 2023
@fengyuentau
Member

This bug was originally introduced in #23255.

This PR can also resolve #23977 and #24606.

ASSERT_FALSE(net_cuda.empty());

net.setPreferableBackend(backend);
net.setPreferableTarget(target);
Member


Here net also runs on different backends. I think we want it to run only on the CPU backend and compare its results against those from the CUDA backend, right? If so, I propose moving it to test_backends.cpp.

Member


Agree. And there is no need for reference input/output data. Let's run on CPU and compare with the backend.


asmorkalov pushed a commit that referenced this pull request Jan 11, 2024
dnn (cuda): support broadcasting if a.rank() != b.rank() #24834

Inspired by #24786. This PR keeps the fusion of `NaryEltwise` and `Concat` while addressing the missing-data problem by supporting broadcasting if a.rank() != b.rank().

Resolves #23977
Resolves #24606
Resolves #24635
Resolves #24721 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
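The broadcasting fix referenced above follows the usual NumPy-style rule: when two operands have different ranks, the lower-rank shape is padded with leading 1s before the dimensions are matched pairwise. A minimal sketch of that rule (illustrative only; `broadcast_shapes` is a hypothetical helper, not OpenCV's actual CUDA implementation):

```python
def broadcast_shapes(a, b):
    """Compute the broadcast output shape of two shapes a and b,
    NumPy-style: align ranks by prepending 1s, then match dims."""
    a, b = tuple(a), tuple(b)
    # Align ranks by prepending 1s to the shorter shape
    if len(a) < len(b):
        a = (1,) * (len(b) - len(a)) + a
    else:
        b = (1,) * (len(a) - len(b)) + b
    out = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError(f"incompatible dimensions: {da} vs {db}")
    return tuple(out)
```

For example, a shape `(2, 3, 4)` broadcast against `(4,)` aligns to `(1, 1, 4)` and yields `(2, 3, 4)`, which is the rank-mismatch case the CPU fallback used to handle.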
@Abdurrahheem
Contributor Author

Abdurrahheem commented Jan 11, 2024

This PR is no longer needed as #24834 has covered the issue.

@asmorkalov asmorkalov reopened this Jan 11, 2024
@asmorkalov
Contributor

@Abdurrahheem please keep it open until the discussion with Dmitry.

@asmorkalov asmorkalov removed the pr: needs test New functionality requires minimal tests set label Jan 12, 2024
@asmorkalov asmorkalov marked this pull request as ready for review January 12, 2024 09:04
inp_i_data->layerInstance->type != "Reorg" &&
inp_i_data->layerInstance->type != "Eltwise" &&
inp_i_data->layerInstance->type != "NaryEltwise" &&
// inp_i_data->layerInstance->type != "NaryEltwise" && // link to the issue: https://github.com/opencv/opencv/issues/24721
Member


@dkurt I guess they want your comment on this change.

In my opinion, we do not need to comment out this line, as #24834 has mostly resolved the issue. Previously, NaryEltwise and Concat were fused regardless of the CPU fallback in NaryEltwise, so the output of Concat had data both on the host and on the device, leading to the missing data reported in the related issues. The CPU fallback happened when broadcasting or an operation was not supported. Broadcast support is now done, leaving only the risk of an unsupported operation. However, that should also be fine, since the common operations (add, sub, mul, div) have been supported and we see no performance regression from disabling this fusion.

Member


@fengyuentau, yes, I understand this, but this PR, before #24834 was merged, showed that all the tests passed even with broadcasting disabled, so an exclusion like this is redundant.

Member


> shows that even with disabled broadcast all the tests passed

All tests passed because:

  • there is no naryeltwise + concat model case in the tests before this PR,
  • this PR added the test case but commented out the naryeltwise + concat fusion.

@dkurt
Member

dkurt commented Jan 18, 2024

As the original issue was resolved by a different PR, I'd like to close this one and suggest refactoring the CUDA part later. @fengyuentau rightly pointed out that benchmarking is required before touching this sensitive part of the code. @Abdurrahheem, thanks!

@dkurt dkurt closed this Jan 18, 2024