Use Carotene implementation of TEGRA_GaussianBlurBinomial 3x3 and 5x5 on ARM#25799
Jetson TK1 (ARM v7+NEON):
fengyuentau
left a comment
Is Carotene only for ARMv7?
No, it's for aarch64 too. I'm working with Jetson Orin right now to get perf numbers. Mac Mx results are welcome too.
fengyuentau
left a comment
I got many warnings during compilation (Apple clang 15.0.0):
In file included from /Workspace/fytao/opencv/modules/imgproc/src/thresh.cpp:43:
In file included from /Workspace/fytao/opencv/modules/imgproc/src/precomp.hpp:56:
In file included from /Workspace/fytao/opencv/modules/imgproc/src/hal_replacement.hpp:1114:
In file included from /Workspace/fytao/opencv/build/custom_hal.hpp:5:
/Workspace/fytao/opencv/build/carotene/tegra_hal.hpp:1926:9: warning: 'cv_hal_gaussianBlurBinomial' macro redefined [-Wmacro-redefined]
#define cv_hal_gaussianBlurBinomial TEGRA_GaussianBlurBinomial
^
/Workspace/fytao/opencv/modules/imgproc/src/hal_replacement.hpp:974:9: note: previous definition is here
#define cv_hal_gaussianBlurBinomial hal_ni_gaussianBlurBinomial
^
In file included from /Workspace/fytao/opencv/modules/imgproc/src/utils.cpp:42:
In file included from /Workspace/fytao/opencv/modules/imgproc/src/precomp.hpp:56:
In file included from /Workspace/fytao/opencv/modules/imgproc/src/hal_replacement.hpp:1114:
In file included from /Workspace/fytao/opencv/build/custom_hal.hpp:5:
/Workspace/fytao/opencv/build/carotene/tegra_hal.hpp:1926:9: warning: 'cv_hal_gaussianBlurBinomial' macro redefined [-Wmacro-redefined]
#define cv_hal_gaussianBlurBinomial TEGRA_GaussianBlurBinomial
^
/Workspace/fytao/opencv/modules/imgproc/src/hal_replacement.hpp:974:9: note: previous definition is here
#define cv_hal_gaussianBlurBinomial hal_ni_gaussianBlurBinomial
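The warning above comes from the custom HAL header redefining a macro that `hal_replacement.hpp` has already mapped to its default stub. A minimal sketch of the usual remedy follows, assuming the fix is an `#undef` before the new `#define`. The two functions are hypothetical stand-ins with made-up signatures; only the macro names are taken from the log.

```cpp
// Hypothetical stand-ins for the real HAL entry points (signatures simplified).
inline int hal_ni_gaussianBlurBinomial() { return 1; }  // default "not implemented" stub
inline int TEGRA_GaussianBlurBinomial()  { return 0; }  // Carotene-backed replacement

// Default mapping, set up before the custom HAL header is included.
#define cv_hal_gaussianBlurBinomial hal_ni_gaussianBlurBinomial

// Custom HAL header: drop the default mapping first, then remap.
// Without the #undef, clang reports -Wmacro-redefined.
#undef cv_hal_gaussianBlurBinomial
#define cv_hal_gaussianBlurBinomial TEGRA_GaussianBlurBinomial

// Callers only ever use the macro name, so they pick up the replacement.
inline int callGaussianBlurHal() { return cv_hal_gaussianBlurBinomial(); }
```

With the `#undef` in place, the call resolves to the replacement and the compiler has no second definition to complain about.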
fengyuentau
left a comment
Perf on macOS 14.5 (Apple M1):
Geometric mean (ms)
Columns: Name of Test; hal gaussianBlur Carotene (ms); hal gaussianBlur Carotene.patch (ms); hal gaussianBlur Carotene.patch vs hal gaussianBlur Carotene (x-factor)
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 8UC1, BORDER_CONSTANT) 0.006 0.002 2.70
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 8UC1, BORDER_REPLICATE) 0.006 0.002 2.64
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 16UC1, BORDER_CONSTANT) 0.013 0.008 1.53
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 16UC1, BORDER_REPLICATE) 0.011 0.010 1.17
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 16SC1, BORDER_CONSTANT) 0.011 0.011 1.01
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 16SC1, BORDER_REPLICATE) 0.010 0.010 0.99
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 32FC1, BORDER_CONSTANT) 0.010 0.010 1.00
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 32FC1, BORDER_REPLICATE) 0.009 0.009 1.02
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 8UC4, BORDER_CONSTANT) 0.024 0.021 1.15
gaussianBlur3x3::Size_MatType_BorderType3x3::(127x61, 8UC4, BORDER_REPLICATE) 0.014 0.015 0.95
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 8UC1, BORDER_CONSTANT) 0.035 0.013 2.61
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 8UC1, BORDER_REPLICATE) 0.029 0.013 2.17
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 16UC1, BORDER_CONSTANT) 0.040 0.044 0.91
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 16UC1, BORDER_REPLICATE) 0.042 0.044 0.96
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 16SC1, BORDER_CONSTANT) 0.061 0.060 1.01
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 16SC1, BORDER_REPLICATE) 0.060 0.061 0.98
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 32FC1, BORDER_CONSTANT) 0.052 0.052 0.99
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 32FC1, BORDER_REPLICATE) 0.051 0.053 0.96
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 8UC4, BORDER_CONSTANT) 0.064 0.067 0.95
gaussianBlur3x3::Size_MatType_BorderType3x3::(320x240, 8UC4, BORDER_REPLICATE) 0.070 0.064 1.08
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 8UC1, BORDER_CONSTANT) 0.066 0.050 1.32
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 8UC1, BORDER_REPLICATE) 0.068 0.050 1.35
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 16UC1, BORDER_CONSTANT) 0.078 0.088 0.89
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 16UC1, BORDER_REPLICATE) 0.080 0.087 0.93
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 16SC1, BORDER_CONSTANT) 0.218 0.243 0.90
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 16SC1, BORDER_REPLICATE) 0.232 0.241 0.96
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 32FC1, BORDER_CONSTANT) 0.192 0.217 0.88
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 32FC1, BORDER_REPLICATE) 0.183 0.214 0.86
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 8UC4, BORDER_CONSTANT) 0.131 0.148 0.88
gaussianBlur3x3::Size_MatType_BorderType3x3::(640x480, 8UC4, BORDER_REPLICATE) 0.140 0.154 0.91
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_CONSTANT) 0.117 0.161 0.73
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_REPLICATE) 0.097 0.170 0.57
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_CONSTANT) 0.188 0.329 0.57
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_REPLICATE) 0.185 0.262 0.71
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_CONSTANT) 0.691 1.536 0.45
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_REPLICATE) 0.796 1.420 0.56
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_CONSTANT) 0.565 0.575 0.98
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_REPLICATE) 0.560 0.833 0.67
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_CONSTANT) 0.333 0.340 0.98
gaussianBlur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_REPLICATE) 0.338 0.334 1.01
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC1, BORDER_CONSTANT) 0.014 0.008 1.62
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC1, BORDER_REFLECT101) 0.013 0.009 1.47
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC1, BORDER_REFLECT) 0.014 0.009 1.61
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC1, BORDER_REPLICATE) 0.018 0.009 2.04
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16UC1, BORDER_CONSTANT) 0.013 0.012 1.07
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16UC1, BORDER_REFLECT101) 0.014 0.013 1.08
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16UC1, BORDER_REFLECT) 0.018 0.013 1.34
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16UC1, BORDER_REPLICATE) 0.016 0.013 1.20
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16SC1, BORDER_CONSTANT) 0.016 0.016 1.01
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16SC1, BORDER_REFLECT101) 0.016 0.016 1.01
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16SC1, BORDER_REFLECT) 0.016 0.016 1.00
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 16SC1, BORDER_REPLICATE) 0.016 0.016 1.00
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 32FC1, BORDER_CONSTANT) 0.014 0.014 1.01
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 32FC1, BORDER_REFLECT101) 0.013 0.013 1.00
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 32FC1, BORDER_REFLECT) 0.013 0.014 0.99
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 32FC1, BORDER_REPLICATE) 0.013 0.013 1.01
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC4, BORDER_CONSTANT) 0.030 0.031 0.97
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC4, BORDER_REFLECT101) 0.030 0.031 0.96
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC4, BORDER_REFLECT) 0.033 0.031 1.07
gaussianBlur5x5::Size_MatType_BorderType::(127x61, 8UC4, BORDER_REPLICATE) 0.031 0.031 0.99
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC1, BORDER_CONSTANT) 0.050 0.058 0.86
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC1, BORDER_REFLECT101) 0.049 0.062 0.80
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC1, BORDER_REFLECT) 0.050 0.063 0.79
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC1, BORDER_REPLICATE) 0.051 0.061 0.83
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16UC1, BORDER_CONSTANT) 0.064 0.103 0.62
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16UC1, BORDER_REFLECT101) 0.066 0.107 0.62
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16UC1, BORDER_REFLECT) 0.067 0.107 0.63
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16UC1, BORDER_REPLICATE) 0.065 0.107 0.60
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16SC1, BORDER_CONSTANT) 0.133 0.098 1.35
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16SC1, BORDER_REFLECT101) 0.135 0.099 1.36
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16SC1, BORDER_REFLECT) 0.133 0.099 1.34
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 16SC1, BORDER_REPLICATE) 0.135 0.099 1.37
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 32FC1, BORDER_CONSTANT) 0.112 0.082 1.36
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 32FC1, BORDER_REFLECT101) 0.111 0.083 1.33
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 32FC1, BORDER_REFLECT) 0.117 0.083 1.41
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 32FC1, BORDER_REPLICATE) 0.111 0.082 1.35
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC4, BORDER_CONSTANT) 0.103 0.278 0.37
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC4, BORDER_REFLECT101) 0.103 0.274 0.37
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC4, BORDER_REFLECT) 0.113 0.272 0.42
gaussianBlur5x5::Size_MatType_BorderType::(320x240, 8UC4, BORDER_REPLICATE) 0.103 0.274 0.38
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_CONSTANT) 0.130 0.227 0.57
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REFLECT101) 0.174 0.237 0.73
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REFLECT) 0.137 0.231 0.59
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REPLICATE) 0.126 0.229 0.55
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_CONSTANT) 0.195 0.373 0.52
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REFLECT101) 0.155 0.364 0.43
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REFLECT) 0.158 0.372 0.43
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REPLICATE) 0.194 0.370 0.52
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_CONSTANT) 0.486 0.334 1.45
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REFLECT101) 0.386 0.326 1.18
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REFLECT) 0.437 0.328 1.33
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REPLICATE) 0.604 0.325 1.86
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_CONSTANT) 0.314 0.277 1.13
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REFLECT101) 0.314 0.276 1.14
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REFLECT) 0.300 0.264 1.14
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REPLICATE) 0.314 0.273 1.15
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_CONSTANT) 0.551 0.971 0.57
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REFLECT101) 0.338 0.984 0.34
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REFLECT) 0.480 0.984 0.49
gaussianBlur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REPLICATE) 0.524 0.961 0.55
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_CONSTANT) 0.177 0.609 0.29
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REFLECT101) 0.180 0.610 0.30
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REFLECT) 0.164 0.609 0.27
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REPLICATE) 0.179 0.619 0.29
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_CONSTANT) 0.471 1.086 0.43
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REFLECT101) 0.344 1.090 0.32
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REFLECT) 0.313 1.119 0.28
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REPLICATE) 0.331 1.091 0.30
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_CONSTANT) 1.115 0.940 1.19
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REFLECT101) 0.946 0.946 1.00
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REFLECT) 0.951 0.946 1.01
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REPLICATE) 1.458 0.936 1.56
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_CONSTANT) 0.734 0.751 0.98
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REFLECT101) 0.760 0.752 1.01
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REFLECT) 0.742 0.742 1.00
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REPLICATE) 0.748 0.759 0.99
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_CONSTANT) 0.631 2.930 0.22
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REFLECT101) 0.802 2.911 0.28
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REFLECT) 0.773 2.898 0.27
gaussianBlur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REPLICATE) 0.536 2.934 0.18
b7bcf51 to e80d681
e80d681 to 2799c74
I got similar results for Jetson Orin. The patch leads to perf degradation there.
@fengyuentau I fixed the warning and added a NEON version check to activate the branch for old CPUs only. Could you take a look again?
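One way such a gate could look is sketched below. It assumes "old CPUs" means 32-bit ARM builds with NEON, and it uses a hypothetical switch name; the check actually used in the patch may differ.

```cpp
// Assumption: "old CPUs" = 32-bit ARM with NEON; 64-bit ARM targets are excluded.
// USE_CAROTENE_GAUSSIAN_BINOMIAL is a hypothetical name used for illustration only.
#if (defined(__ARM_NEON) || defined(__ARM_NEON__)) && \
    !defined(__aarch64__) && !defined(_M_ARM64)
#  define USE_CAROTENE_GAUSSIAN_BINOMIAL 1
#else
#  define USE_CAROTENE_GAUSSIAN_BINOMIAL 0
#endif

// Compile-time query so callers can branch without repeating the preprocessor test.
constexpr bool useCaroteneGaussianBinomial()
{
    return USE_CAROTENE_GAUSSIAN_BINOMIAL != 0;
}
```

Under this assumption the Carotene branch is compiled out on aarch64 targets such as Apple M1 and Jetson Orin, so those platforms keep the existing code path.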
Perf results are better than before but still volatile. I guess this is unrelated to this patch, because I can confirm that the new code is not enabled.
@fengyuentau Do you approve the patch then?
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.