dnn(opencl, cuda): missed eltwise fusion opportunity #17946
Description
System information (version)
- OpenCV => 3.4.11, 4.4.0
Detailed description
Consider a network whose final layers are, in order:
- Convolution
- Eltwise
- ReLU
Eltwise is not fused with convolution if ReLU is the last layer of the network because of this check in 3.4:
opencv/modules/dnn/src/dnn.cpp, line 2461 (commit 5bfa43f):

```cpp
if( !nextActivLayer.empty() && pinsToKeep.count(lpNext) == 0 &&
```
I think the pinsToKeep check shouldn't be there at all. lpNext is the pin of the activation layer's output blob; it was last set here:
opencv/modules/dnn/src/dnn.cpp, line 2456 (commit 5bfa43f):

```cpp
lpNext = LayerPin(eltwiseData->consumers[0].lid, 0);
```
Why should the fusion not take place if the activation layer's output is required by the user?
Note that the following code sets the output blobs of both the eltwise and the activation layer to those of the convolution layer:
opencv/modules/dnn/src/dnn.cpp, lines 2492 to 2495 (commit 5bfa43f):

```cpp
eltwiseData->outputBlobs = ld.outputBlobs;
nextData->outputBlobs = ld.outputBlobs;
eltwiseData->outputBlobsWrappers = ld.outputBlobsWrappers;
nextData->outputBlobsWrappers = ld.outputBlobsWrappers;
```
Therefore, returning the activation layer's output blob automatically returns the fused convolution's output blob, which is correct. So fusion should be safe even when the user requires the output blob of the activation layer.
Steps to reproduce
Issue submission checklist
- I report the issue, it's not a question
- I checked the problem with documentation, FAQ, open issues, answers.opencv.org, Stack Overflow, etc. and have not found a solution
- I updated to the latest OpenCV version and the issue is still there
- There is reproducer code and related data files: videos, images, onnx, etc