Remove exception for NaryEltwise #24786
Conversation
@dkurt it seems that just removing this exception works just fine

Hi @Abdurrahheem, this solution looks good to me. ~~The frequency of use of~~ (The following is the reply that I misunderstood, please ignore it.)

@Abdurrahheem, we need a test for it.
Force-pushed from 669362c to b06782c
    ASSERT_FALSE(net_cuda.empty());
    net.setPreferableBackend(backend);
    net.setPreferableTarget(target);
Here net also runs on different backends. I think we want it to run only on the CPU backend and compare the results against the ones from the CUDA backend, right? If so, I propose moving it to test_backends.cpp.
Agree. And no need for reference input/output data. Let's run on CPU and compare with backend.
dnn (cuda): support broadcasting if a.rank() != b.rank() #24834

Inspired by #24786. This PR keeps the fusion of `NaryEltwise` and `Concat` while addressing the data-missing problem by supporting broadcasting if a.rank() != b.rank().

Resolves #23977
Resolves #24606
Resolves #24635
Resolves #24721

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
This PR is no longer needed, as #24834 has covered the issue.

@Abdurrahheem, please keep it open until the discussion with Dmitry.
      inp_i_data->layerInstance->type != "Reorg" &&
      inp_i_data->layerInstance->type != "Eltwise" &&
    - inp_i_data->layerInstance->type != "NaryEltwise" &&
    + // inp_i_data->layerInstance->type != "NaryEltwise" && // link to the issue: https://github.com/opencv/opencv/issues/24721
@dkurt I guess they want your comment on this change.
In my opinion, we do not need to comment out this line, as #24834 has mostly resolved the issue. Previously, NaryEltwise and Concat were fused regardless of the CPU fallback in NaryEltwise, so the output of Concat ended up with data on both the host and the device, leading to the missing data reported in the related issues. That CPU fallback used to happen when broadcasting or an operation was not supported. Broadcast support is now done, leaving only the risk of unsupported operations. Even that should be fine, since the common operations (add, sub, mul, div) have been supported, and we do not get a performance regression from disabling this fusion.
@fengyuentau, yes, I understand this, but this PR showed, before #24834 was merged, that all the tests passed even with broadcast disabled, so having an exclusion like this is redundant.
> shows that even with disabled broadcast all the tests passed

All tests passed because:

- there was no `naryeltwise + concat` model case in the tests before this PR,
- this PR added the test case but commented out the `naryeltwise + concat` fusion.
As the original issue was resolved by a different PR, I'd like to close this PR and suggest refactoring the CUDA part later. @fengyuentau rightly pointed out that benchmarking is required before touching this sensitive part of the code. @Abdurrahheem, thanks!
This is an experimental PR to check what fails if the exception for NaryEltwise is removed. It is based on a suggestion made in #24721.
Data for this PR is located in 1136
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.