Conversation
|
Please choose 3.4 as a target branch (more details in #19474 (comment) ) |
17f89b3 to
8bbd282
Compare
d2d4c19 to
fe28986
Compare
alalek
left a comment
There was a problem hiding this comment.
Could we add explicit input parameter which changes operating mode?
We need to ensure that "layers fusing" is still properly handled (fused code may not call the ".forward()" method).
| for (size_t x = 0; x < xSize; x++) | ||
| { | ||
| outIdx[2] = x; | ||
| inputs[i].at<float>(outIdx.data()) = tmpInput.at<float>(idx.data()); |
There was a problem hiding this comment.
IMHO, generating duplicated data doesn't look good from performance perspective.
There was a problem hiding this comment.
@alalek, there are two options to resolve the issue:
- support Mat (shape:
[1, m, k, n]) + Vec (shape:[1, 1, 1, n]) operation in multiple implementations (ex.: OCL), which is quite complex and error-prone - expand Vec (shape:
[1, 1, 1, n]) to Mat to support a new case using an existing implementation at the expense of extra memory usage
| if (!allOnes && !isVecFound) | ||
| { | ||
| vecIdx = i; | ||
| isVecFound = true; |
There was a problem hiding this comment.
IMHO, such stuff should be explicitly specified through parameters instead of implicit internal handling.
There was a problem hiding this comment.
@alalek, it seems that ''explicit input parameter'' is an extra option. This PR adds support Mat (shape: [1, m, k, n] ) + Vec (shape: [1, 1, 1, n]). This support was implemented with Vec (shape: [1, 1, 1, n]) to Mat expansion to avoid possible reconstruction of eltwise_layer.cpp summation core. Thus, the current approach should not lead to further corruptions:
We need to ensure that "layers fusing" is still properly handled (fused code may not call the ".forward()" method).
There was a problem hiding this comment.
Main concern here is about correct and accurate implementation of supportBackend() callback.
Only OpenCV/CPU backend supports new mode (CUDA, Halide, Vulkan, even OpenCV/OpenCL doesn't support that).
Unsupported backends should be filtered out in the right way.
There was a problem hiding this comment.
@alalek, could you, please, clarify the following: if we don't change the computation core (forward(...) and run(...) functions were not changed) and modify only one of the inputs' shape, expanding it in getMemoryShapes(...), we should not impact on the chosen backend. Thus, do we really need supportBackend() callback corrections here?
There was a problem hiding this comment.
forward() is modified below, see if (channelsModeInput == ELTWISE_CHANNNELS_SAME && inputs[0].dims > 2)
Tests filter out non-working "OpenCL" cases (as mentioned there this should not be done for OpenCV/OCL code path)
modify only one of the inputs' shape
Probably we can't do that. Input is always a some output of another layer.
Thus, do we really need supportBackend() callback corrections here?
supportBackend() must be revised if we add support for new mode.
There was a problem hiding this comment.
@alalek, thank you for the review, added corrections for supportBackend().
9077b10 to
c496eba
Compare
| CV_Assert(outputs.size() == 1); | ||
| const int nstripes = getNumThreads(); | ||
|
|
||
| if (channelsModeInput == ELTWISE_CHANNNELS_SAME && inputs[0].dims > 2) |
There was a problem hiding this comment.
Please add bailout code in forward_ocl() (line 546) for new functionality:
if ((inputs_.depth() == CV_16S && op != SUM) || (channelsMode != ELTWISE_CHANNNELS_SAME))
return false;
+ if (hasVecInput)
+ return false; // TODO not implemented yet: https://github.com/opencv/opencv/pull/19477Support Mat (shape: [1, m, k, n] ) + Vec (shape: [1, 1, 1, n]) operation by vec to mat expansion
fea5e3d to
5718f8a
Compare
Merge with extra: opencv/opencv_extra#849
Fixes #17364: inappropriate OpenCV processing of specific tensor summation case in LaneNet BiseNetv2 front end.
Added tf.math.add and OCV
eltwise_layerbehaviour alignment.The initial problem problem is incorrect processing of the case, when, for example, there are two tensors:
a(shape: [1,1,1,n] ) andb(shape: [1,<some_val_1>,<some_val_2>,n] ).tf.math.addreturns the resultc(shape: [1,<some_val_1>,<some_val_2>,n] ), whereas OpenCV compares1vs<some_val_1>and1vs<some_val_2>, throws exception ingetMemoryShapes()and that's all.The solution for such cases (for example, for tensor shape
1x10x1x1 (NxCxHxW)+1x10x5x5 (NxCxHxW)) was:1x10x5x5 (NxCxHxW)1x10x1x1 (NxCxHxW)for further correct summation with1x10x5x5 (NxCxHxW).The initial problem problem is incorrect processing of the case, when, for example, there are two tensors:
a(shape: [1,1,1,n] ) andb(shape: [1,<some_val_1>,<some_val_2>,n] ).tf.math.addreturns the resultc(shape: [1,<some_val_1>,<some_val_2>,n] ), whereas OpenCV compares1vs<some_val_1>and1vs<some_val_2>, throws exception ingetMemoryShapes()and that's all.The solution for such cases (for example, for tensor shape
1x10x1x1 (NxCxHxW)+1x10x5x5 (NxCxHxW)) was:1x10x5x5 (NxCxHxW)1x10x1x1 (NxCxHxW)for further correct summation with1x10x5x5 (NxCxHxW).Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.