enable concat layer fuse for OCL target #11959

Merged
opencv-pushbot merged 1 commit into opencv:3.4 from pengli:3.4
Jul 17, 2018

Conversation

@pengli pengli commented Jul 13, 2018

force_builders=Docs,linux,ocllinux,windows,ocl,macosx,oclmacosx,linuxNoOpt

@pengli pengli force-pushed the 3.4 branch 6 times, most recently from 6ee14ed to 44f895b Compare July 16, 2018 04:15

pengli commented Jul 16, 2018

I ran the GoogLeNet FP16 test on SKL GT2; the perf time improved from 10.2 ms to 9.7 ms.
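
For readers unfamiliar with the optimization named in the title, here is a minimal sketch of the general idea behind fusing a concat layer: instead of letting each branch produce its own output and then copying everything into one buffer, the branches write directly into slices of the final buffer. This is plain Python for illustration only, not OpenCV's implementation; `concat_with_copy`, `concat_fused`, and `fill` are hypothetical names, and real layers would write tensors rather than bytes.

```python
def concat_with_copy(branch_outputs):
    """Unfused path: every branch owns its output, concat copies each one."""
    out = bytearray()
    for chunk in branch_outputs:
        out += chunk  # one extra copy per branch
    return out


def concat_fused(branch_sizes, producers):
    """Fused path: allocate the concat output once and hand each
    producer a writable view of its own slice, so no copy is needed."""
    out = bytearray(sum(branch_sizes))
    view = memoryview(out)
    offset = 0
    for size, produce in zip(branch_sizes, producers):
        produce(view[offset:offset + size])  # producer writes in place
        offset += size
    return out


def fill(value):
    """Hypothetical stand-in for a layer: fills its destination with one byte value."""
    def produce(dst):
        dst[:] = bytes([value]) * len(dst)
    return produce


unfused = concat_with_copy([bytes([1, 1]), bytes([2, 2, 2])])
fused = concat_fused([2, 3], [fill(1), fill(2)])
print(unfused == fused)
```

Both paths yield the same buffer; the fused one simply skips the intermediate per-branch outputs, which is presumably where the time saving on concat-heavy networks such as GoogLeNet comes from.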

alalek commented Jul 16, 2018

i7-6700 (Fedora 28 + Intel OpenCL NEO 18.26.10987) results:

| Name of Test | origin | patch | x |
|---|---|---|---|
| AlexNet::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 14.354 | 14.363 | 1.00 |
| AlexNet::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 8.592 | 8.642 | 0.99 |
| DenseNet_121::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 55.803 | 62.607 | 0.89 |
| DenseNet_121::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 69.040 | 79.798 | 0.87 |
| EAST_text_detection::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 96.957 | 97.601 | 0.99 |
| EAST_text_detection::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 76.413 | 77.883 | 0.98 |
| ENet::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 22.353 | 37.204 | 0.60 |
| FastNeuralStyle_eccv16::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 99.908 | 100.513 | 0.99 |
| GoogLeNet::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 14.292 | 13.701 | 1.04 |
| GoogLeNet::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 11.287 | 10.734 | 1.05 |
| Inception_5h::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 16.351 | 15.729 | 1.04 |
| Inception_5h::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 13.506 | 12.917 | 1.05 |
| Inception_v2_SSD_TensorFlow::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 49.297 | 49.997 | 0.99 |
| Inception_v2_SSD_TensorFlow::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 53.582 | 54.104 | 0.99 |
| MobileNet_SSD_Caffe::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 20.860 | 20.909 | 1.00 |
| MobileNet_SSD_Caffe::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 17.746 | 17.748 | 1.00 |
| MobileNet_SSD_v1_TensorFlow::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 22.165 | 22.242 | 1.00 |
| MobileNet_SSD_v1_TensorFlow::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 18.711 | 18.483 | 1.01 |
| MobileNet_SSD_v2_TensorFlow::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 27.008 | 27.103 | 1.00 |
| MobileNet_SSD_v2_TensorFlow::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 26.102 | 26.008 | 1.00 |
| OpenFace::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 6.851 | 6.887 | 0.99 |
| OpenFace::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 8.321 | 8.358 | 1.00 |
| ResNet_50::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 30.511 | 30.581 | 1.00 |
| ResNet_50::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 22.046 | 22.229 | 0.99 |
| SSD::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 517.671 | 520.363 | 0.99 |
| SSD::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 344.160 | 343.189 | 1.00 |
| SqueezeNet_v1_1::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 7.318 | 6.683 | 1.09 |
| SqueezeNet_v1_1::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 5.657 | 5.209 | 1.09 |
| YOLOv3::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 376.247 | 378.513 | 0.99 |
| YOLOv3::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 323.300 | 326.254 | 0.99 |
| opencv_face_detector::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL) | 20.171 | 20.165 | 1.00 |
| opencv_face_detector::DNNTestNetwork::(DNN_BACKEND_OPENCV, DNN_TARGET_OPENCL_FP16) | 23.896 | 24.039 | 0.99 |

Could you take a look at the performance regressions for DenseNet (11%) and ENet (40%)?
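
The quoted percentages can be reproduced from the table. The sketch below assumes, based only on the numbers themselves, that the `x` column is `origin / patch` and that the regression figure is `1 - x`; the helper names `x_factor` and `slowdown_pct` are hypothetical, and the timings are copied from the rows above.

```python
def x_factor(origin, patch):
    """Ratio of the original time to the patched time.
    Values below 1.0 mean the patch is slower (assumed definition)."""
    return origin / patch


def slowdown_pct(origin, patch):
    """Regression expressed as (1 - x) * 100, which matches the
    11% and 40% figures quoted in the comment (assumed definition)."""
    return (1.0 - x_factor(origin, patch)) * 100.0


# (origin, patch) timings copied from the table for the affected tests.
rows = {
    "DenseNet_121 (DNN_TARGET_OPENCL)": (55.803, 62.607),
    "DenseNet_121 (DNN_TARGET_OPENCL_FP16)": (69.040, 79.798),
    "ENet (DNN_TARGET_OPENCL)": (22.353, 37.204),
}

for name, (origin, patch) in rows.items():
    x = x_factor(origin, patch)
    pct = slowdown_pct(origin, patch)
    print(f"{name}: x = {x:.2f}, slowdown = {pct:.0f}%")
```

Under this definition, a row with `x` above 1.00 (GoogLeNet, Inception_5h, SqueezeNet_v1_1) is a speedup and yields a negative slowdown.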

pengli commented Jul 17, 2018

@alalek, the ENet and DenseNet perf regressions are fixed in the update. Please give it a try. Thanks.

Signed-off-by: Li Peng <peng.li@intel.com>

@alalek alalek left a comment

Well done! Thank you 👍

@opencv-pushbot opencv-pushbot merged commit f0cadaa into opencv:3.4 Jul 17, 2018
opencv-pushbot pushed a commit that referenced this pull request Jul 17, 2018
@alalek alalek mentioned this pull request Jul 17, 2018