Update U8 processing for non-bitexact linear resize #1

Closed
terfendail wants to merge 3 commits into pmur:resize from terfendail:resize_u8

@terfendail

relates opencv#15257

This pull request changes

I investigated the resize performance degradation for the SSE2/SSE3 baselines, and it looks like the cause is the gathering of source data into vectors. I've updated U8 processing with branches specific to the number of channels, which gives a speedup of about 1.5x for SSE2/SSE3.
Could you please check whether this change works for VSX?
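
For readers unfamiliar with the idea, here is a minimal scalar sketch of what a channel-count-specific branch can look like. It is not the code from this pull request: the function names `hresizeGeneric`, `hresizeC4` and `branchesAgree`, the data layout, and the weight values are all hypothetical. It assumes that the offsets of the channels of one pixel are consecutive and that all channels of a pixel share one weight pair, so a 4-channel branch can fetch the left and right pixels with a single contiguous 8-byte copy instead of one indexed load per element.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Generic path: one offset lookup and one weight pair per output element.
// xofs[i] is the element index of the left source sample for output element i.
void hresizeGeneric(const uint8_t* src, int* dst, const int* xofs,
                    const int16_t* alpha, int count, int cn)
{
    assert(cn > 0);
    for (int i = 0; i < count; ++i)
    {
        int sx = xofs[i];
        dst[i] = src[sx] * alpha[2 * i] + src[sx + cn] * alpha[2 * i + 1];
    }
}

// Hypothetical 4-channel branch: one offset lookup and one weight pair per
// pixel, with the left and right pixels fetched as 8 contiguous bytes.
void hresizeC4(const uint8_t* src, int* dst, const int* xofs,
               const int16_t* alpha, int pixels)
{
    for (int p = 0; p < pixels; ++p)
    {
        uint8_t px[8];
        std::memcpy(px, src + xofs[4 * p], 8);  // bytes 0-3: left, 4-7: right
        int a0 = alpha[8 * p], a1 = alpha[8 * p + 1];
        for (int c = 0; c < 4; ++c)
            dst[4 * p + c] = px[c] * a0 + px[4 + c] * a1;
    }
}

// Builds a small made-up 4-channel row and checks that both paths agree.
bool branchesAgree()
{
    const int cn = 4, srcPixels = 6, dstPixels = 3;
    std::vector<uint8_t> src(srcPixels * cn);
    for (size_t i = 0; i < src.size(); ++i)
        src[i] = static_cast<uint8_t>(i * 7 + 3);

    const int leftPixel[dstPixels] = {0, 2, 4};
    std::vector<int> xofs(dstPixels * cn);
    std::vector<int16_t> alpha(dstPixels * cn * 2);
    for (int p = 0; p < dstPixels; ++p)
        for (int c = 0; c < cn; ++c)
        {
            int i = p * cn + c;
            xofs[i] = leftPixel[p] * cn + c;
            alpha[2 * i] = static_cast<int16_t>(1024 + 256 * p);
            alpha[2 * i + 1] = static_cast<int16_t>(1024 - 256 * p);
        }

    std::vector<int> a(dstPixels * cn), b(dstPixels * cn);
    hresizeGeneric(src.data(), a.data(), xofs.data(), alpha.data(), dstPixels * cn, cn);
    hresizeC4(src.data(), b.data(), xofs.data(), alpha.data(), dstPixels);
    return a == b;
}
```

Calling `branchesAgree()` compares the two paths on the sample row. A vectorized version would replace the inner per-channel loop with SIMD widening and multiply-add operations, which this sketch does not attempt.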

pmur and others added 3 commits September 20, 2019 08:48
There appears to be a 2x unroll of the HResizeLinear against k,
however the k value is only incremented by 1 during the unroll. This
results in k - 1 duplicate passes when k > 1.

Likewise, the final pass may not respect the work done by the vector
loop. Start it with the offset returned by the vector op if
implemented. Note, no vector ops are implemented today.
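
The duplicate work described in this commit message can be illustrated with a small counting sketch. It is an assumption-laden model, not the OpenCV source: `countRowPasses` is a hypothetical helper that mimics a loop handling rows `k` and `k + 1` together, followed by a remainder loop, and counts how many row passes result when `k` advances by 1 versus by 2.

```cpp
#include <cassert>

// Counts the row passes performed over `count` rows by a 2x-unrolled loop
// whose index advances by `step`, followed by a remainder loop.
int countRowPasses(int count, int step)
{
    assert(step == 1 || step == 2);
    int passes = 0;
    int k = 0;
    for (; k <= count - 2; k += step)
        passes += 2;  // rows k and k + 1 are processed together
    for (; k < count; ++k)
        passes += 1;  // leftover row handled one at a time
    return passes;
}
```

In this model, stepping by 1 over 4 rows performs 7 passes, while stepping by 2 performs 4, one per row. The difference of `count - 1` is the duplicated work the commit message refers to.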

The performance difference is most noticeable on a linear downscale. A set of
performance tests is added to characterize this. The performance
improvement is 10-50% depending on the scaling.
Performance is mostly gated by the gather operations
for x inputs.

Likewise, provide a 2x unroll against k; this halves the
number of alpha gathers for larger k.
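
A scalar sketch of why unrolling against k reduces weight loads follows. The names `hresizeRow`, `hresizeRowPair` and `rowPairHalvesWeightLoads` are hypothetical, the data is made up, and plain `int` arithmetic stands in for the real fixed-point and vector code. The point is only that two rows interpolated in the same pass can share each offset and weight pair.

```cpp
#include <cassert>
#include <vector>

// One row per pass: each row loads the offset and weight pair for every dx.
// Returns the number of weight-pair loads, for illustration only.
int hresizeRow(const int* src, int* dst, const int* xofs,
               const int* alpha, int width)
{
    assert(width >= 0);
    int loads = 0;
    for (int dx = 0; dx < width; ++dx)
    {
        int sx = xofs[dx], a0 = alpha[2 * dx], a1 = alpha[2 * dx + 1];
        ++loads;
        dst[dx] = src[sx] * a0 + src[sx + 1] * a1;
    }
    return loads;
}

// Two rows per pass: the offset and weight pair are loaded once and reused.
int hresizeRowPair(const int* src0, const int* src1, int* dst0, int* dst1,
                   const int* xofs, const int* alpha, int width)
{
    assert(width >= 0);
    int loads = 0;
    for (int dx = 0; dx < width; ++dx)
    {
        int sx = xofs[dx], a0 = alpha[2 * dx], a1 = alpha[2 * dx + 1];
        ++loads;
        dst0[dx] = src0[sx] * a0 + src0[sx + 1] * a1;
        dst1[dx] = src1[sx] * a0 + src1[sx + 1] * a1;
    }
    return loads;
}

// Checks on made-up data that the paired pass gives the same results with
// half as many weight-pair loads.
bool rowPairHalvesWeightLoads()
{
    const int width = 3;
    const std::vector<int> row0 = {10, 20, 30, 40, 50};
    const std::vector<int> row1 = {5, 15, 25, 35, 45};
    const std::vector<int> xofs = {0, 1, 3};
    const std::vector<int> alpha = {3, 1, 2, 2, 1, 3};
    std::vector<int> ref0(width), ref1(width), out0(width), out1(width);

    int separate = hresizeRow(row0.data(), ref0.data(), xofs.data(), alpha.data(), width)
                 + hresizeRow(row1.data(), ref1.data(), xofs.data(), alpha.data(), width);
    int paired = hresizeRowPair(row0.data(), row1.data(), out0.data(), out1.data(),
                                xofs.data(), alpha.data(), width);
    return ref0 == out0 && ref1 == out1 && paired * 2 == separate;
}
```

The saving only applies when at least two rows are processed together, which is consistent with the commit message's "for larger k".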

While not a 4x improvement, it still performs substantially
better on P9, a 1.4x improvement. The P8 baseline gains
1.05-1.10x due to its reduced VSX instruction set.

Likewise, for float types, this results in a more modest
1.2x improvement.
@terfendail
Author

Performance for SSE2 baseline
Performance test Reference time PR time Speedup
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 320x240)) 0.014 0.013 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (960x540, 640x480)) 0.354 0.211 1.68
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 213x120)) 0.042 0.031 1.36
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 320x240)) 0.117 0.081 1.45
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 640x480)) 0.396 0.250 1.59
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 320x240)) 0.029 0.028 1.03
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (960x540, 640x480)) 0.481 0.440 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 213x120)) 0.065 0.059 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 320x240)) 0.190 0.174 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 640x480)) 0.587 0.544 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 320x240)) 0.051 0.054 0.94
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (960x540, 640x480)) 0.317 0.289 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 213x120)) 0.057 0.053 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 320x240)) 0.135 0.135 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 640x480)) 0.361 0.343 1.05
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 320x240)) 0.480 0.482 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (960x540, 640x480)) 0.685 0.357 1.92
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 213x120)) 0.081 0.049 1.64
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 320x240)) 0.240 0.142 1.69
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 640x480)) 0.782 0.432 1.81
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 320x240)) 0.434 0.442 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (960x540, 640x480)) 0.937 0.872 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 213x120)) 0.128 0.118 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 320x240)) 0.372 0.341 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 640x480)) 1.165 1.058 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 320x240)) 0.337 0.341 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (960x540, 640x480)) 0.644 0.582 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 213x120)) 0.114 0.115 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 320x240)) 0.262 0.250 1.05
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 640x480)) 0.791 0.717 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 320x240)) 0.118 0.121 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (960x540, 640x480)) 1.017 0.565 1.80
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 213x120)) 0.117 0.075 1.55
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 320x240)) 0.336 0.208 1.62
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 640x480)) 1.166 0.670 1.74
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 320x240)) 0.121 0.123 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (960x540, 640x480)) 1.392 1.294 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 213x120)) 0.186 0.173 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 320x240)) 0.549 0.505 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 640x480)) 1.730 1.595 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 320x240)) 0.443 0.460 0.96
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (960x540, 640x480)) 1.005 0.902 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 213x120)) 0.170 0.168 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 320x240)) 0.406 0.399 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 640x480)) 1.258 1.181 1.07
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 320x240)) 0.087 0.089 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (960x540, 640x480)) 1.354 0.536 2.53
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 213x120)) 0.152 0.073 2.08
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 320x240)) 0.448 0.189 2.37
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 640x480)) 1.538 0.608 2.53
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 320x240)) 0.131 0.132 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (960x540, 640x480)) 1.843 1.697 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 213x120)) 0.246 0.224 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 320x240)) 0.729 0.660 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 640x480)) 2.297 2.078 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 320x240)) 0.231 0.234 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (960x540, 640x480)) 1.368 1.231 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 213x120)) 0.244 0.227 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 320x240)) 0.639 0.612 1.04
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 640x480)) 1.794 1.603 1.12
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 960x540)) 0.504 0.289 1.74
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 1280x720)) 0.680 0.411 1.66
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 960x540)) 0.679 0.608 1.12
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 1280x720)) 0.936 0.863 1.08
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 960x540)) 0.456 0.421 1.08
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 1280x720)) 0.650 0.596 1.09
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 960x540)) 0.946 0.489 1.94
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 1280x720)) 1.335 0.754 1.77
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 960x540)) 1.315 1.197 1.10
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 1280x720)) 1.850 1.697 1.09
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 960x540)) 0.934 0.854 1.09
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 1280x720)) 1.306 1.236 1.06
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 960x540)) 1.424 0.790 1.80
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 1280x720)) 2.101 1.205 1.74
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 960x540)) 1.955 1.808 1.08
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 1280x720)) 2.832 2.612 1.08
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 960x540)) 1.362 1.296 1.05
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 1280x720)) 2.043 1.925 1.06
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 960x540)) 1.982 0.783 2.53
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 1280x720)) 2.884 1.234 2.34
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 960x540)) 2.681 2.449 1.09
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 1280x720)) 3.788 3.543 1.07
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 960x540)) 1.943 1.755 1.11
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 1280x720)) 2.926 2.652 1.10
Performance for SSE3 baseline
Performance test Reference time PR time Speedup
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 320x240)) 0.014 0.013 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (960x540, 640x480)) 0.349 0.206 1.69
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 213x120)) 0.042 0.031 1.37
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 320x240)) 0.117 0.082 1.43
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 640x480)) 0.395 0.243 1.63
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 320x240)) 0.029 0.027 1.04
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (960x540, 640x480)) 0.480 0.411 1.17
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 213x120)) 0.064 0.056 1.16
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 320x240)) 0.190 0.171 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 640x480)) 0.589 0.513 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 320x240)) 0.052 0.049 1.07
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (960x540, 640x480)) 0.318 0.282 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 213x120)) 0.055 0.048 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 320x240)) 0.135 0.130 1.04
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 640x480)) 0.362 0.344 1.05
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 320x240)) 0.473 0.473 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (960x540, 640x480)) 0.686 0.347 1.98
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 213x120)) 0.081 0.049 1.66
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 320x240)) 0.244 0.137 1.79
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 640x480)) 0.778 0.419 1.86
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 320x240)) 0.436 0.411 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (960x540, 640x480)) 0.944 0.815 1.16
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 213x120)) 0.129 0.112 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 320x240)) 0.372 0.332 1.12
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 640x480)) 1.171 1.028 1.14
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 320x240)) 0.332 0.339 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (960x540, 640x480)) 0.629 0.573 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 213x120)) 0.115 0.110 1.05
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 320x240)) 0.256 0.242 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 640x480)) 0.783 0.715 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 320x240)) 0.121 0.118 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (960x540, 640x480)) 1.039 0.595 1.75
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 213x120)) 0.121 0.073 1.66
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 320x240)) 0.336 0.195 1.73
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 640x480)) 1.165 0.646 1.80
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 320x240)) 0.121 0.119 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (960x540, 640x480)) 1.394 1.296 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 213x120)) 0.189 0.173 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 320x240)) 0.550 0.508 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 640x480)) 1.731 1.606 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 320x240)) 0.448 0.456 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (960x540, 640x480)) 0.974 0.880 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 213x120)) 0.170 0.160 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 320x240)) 0.402 0.382 1.05
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 640x480)) 1.241 1.174 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 320x240)) 0.087 0.083 1.05
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (960x540, 640x480)) 1.351 0.502 2.69
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 213x120)) 0.154 0.067 2.29
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 320x240)) 0.445 0.176 2.53
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 640x480)) 1.536 0.573 2.68
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 320x240)) 0.129 0.124 1.04
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (960x540, 640x480)) 1.850 1.713 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 213x120)) 0.247 0.225 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 320x240)) 0.730 0.660 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 640x480)) 2.308 2.084 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 320x240)) 0.228 0.229 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (960x540, 640x480)) 1.357 1.229 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 213x120)) 0.241 0.210 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 320x240)) 0.635 0.567 1.12
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 640x480)) 1.782 1.595 1.12
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 960x540)) 0.484 0.300 1.62
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 1280x720)) 0.688 0.428 1.61
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 960x540)) 0.678 0.611 1.11
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 1280x720)) 0.931 0.863 1.08
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 960x540)) 0.459 0.405 1.13
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 1280x720)) 0.655 0.569 1.15
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 960x540)) 0.948 0.501 1.89
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 1280x720)) 1.340 0.751 1.78
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 960x540)) 1.319 1.201 1.10
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 1280x720)) 1.865 1.708 1.09
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 960x540)) 0.949 0.822 1.15
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 1280x720)) 1.350 1.184 1.14
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 960x540)) 1.425 0.781 1.82
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 1280x720)) 2.019 1.148 1.76
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 960x540)) 1.945 1.829 1.06
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 1280x720)) 2.873 2.616 1.10
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 960x540)) 1.443 1.255 1.15
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 1280x720)) 2.147 1.837 1.17
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 960x540)) 1.931 0.779 2.48
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 1280x720)) 2.930 1.345 2.18
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 960x540)) 2.724 2.408 1.13
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 1280x720)) 3.878 3.488 1.11
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 960x540)) 1.969 1.687 1.17
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 1280x720)) 2.955 2.591 1.14
Performance for SSE4_2 baseline
Performance test Reference time PR time Speedup
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 320x240)) 0.014 0.014 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (960x540, 640x480)) 0.347 0.197 1.76
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 213x120)) 0.042 0.029 1.46
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 320x240)) 0.118 0.076 1.56
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 640x480)) 0.406 0.231 1.76
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 320x240)) 0.028 0.029 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (960x540, 640x480)) 0.453 0.413 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 213x120)) 0.064 0.057 1.12
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 320x240)) 0.188 0.169 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 640x480)) 0.570 0.516 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 320x240)) 0.053 0.049 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (960x540, 640x480)) 0.318 0.271 1.17
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 213x120)) 0.055 0.049 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 320x240)) 0.134 0.126 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 640x480)) 0.359 0.314 1.14
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 320x240)) 0.477 0.472 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (960x540, 640x480)) 0.681 0.282 2.42
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 213x120)) 0.081 0.046 1.78
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 320x240)) 0.242 0.125 1.93
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 640x480)) 0.780 0.362 2.16
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 320x240)) 0.434 0.443 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (960x540, 640x480)) 0.887 0.815 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 213x120)) 0.127 0.113 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 320x240)) 0.366 0.327 1.12
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 640x480)) 1.130 1.014 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 320x240)) 0.335 0.335 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (960x540, 640x480)) 0.630 0.536 1.18
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 213x120)) 0.113 0.107 1.05
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 320x240)) 0.256 0.236 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 640x480)) 0.788 0.659 1.20
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 320x240)) 0.110 0.112 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (960x540, 640x480)) 1.016 0.504 2.01
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 213x120)) 0.117 0.063 1.86
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 320x240)) 0.337 0.171 1.97
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 640x480)) 1.146 0.572 2.00
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 320x240)) 0.131 0.134 0.97
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (960x540, 640x480)) 1.322 1.214 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 213x120)) 0.186 0.166 1.12
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 320x240)) 0.545 0.485 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 640x480)) 1.683 1.517 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 320x240)) 0.450 0.444 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (960x540, 640x480)) 0.981 0.849 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 213x120)) 0.170 0.158 1.08
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 320x240)) 0.407 0.373 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 640x480)) 1.240 1.100 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 320x240)) 0.087 0.087 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (960x540, 640x480)) 1.348 0.536 2.51
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 213x120)) 0.153 0.070 2.19
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 320x240)) 0.469 0.187 2.51
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 640x480)) 1.534 0.608 2.52
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 320x240)) 0.130 0.128 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (960x540, 640x480)) 1.756 1.616 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 213x120)) 0.245 0.217 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 320x240)) 0.723 0.627 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 640x480)) 2.243 2.005 1.12
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 320x240)) 0.229 0.227 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (960x540, 640x480)) 1.354 1.178 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 213x120)) 0.244 0.206 1.19
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 320x240)) 0.633 0.577 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 640x480)) 1.769 1.538 1.15
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 960x540)) 0.501 0.280 1.79
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 1280x720)) 0.705 0.394 1.79
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 960x540)) 0.621 0.563 1.10
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 1280x720)) 0.867 0.782 1.11
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 960x540)) 0.456 0.376 1.21
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 1280x720)) 0.645 0.527 1.23
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 960x540)) 0.987 0.375 2.63
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 1280x720)) 1.394 0.598 2.33
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 960x540)) 1.229 1.134 1.08
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 1280x720)) 1.692 1.579 1.07
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 960x540)) 0.934 0.761 1.23
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 1280x720)) 1.329 1.090 1.22
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 960x540)) 1.481 0.724 2.05
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 1280x720)) 2.093 1.115 1.88
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 960x540)) 1.830 1.657 1.10
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 1280x720)) 2.562 2.358 1.09
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 960x540)) 1.410 1.175 1.20
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 1280x720)) 2.104 1.750 1.20
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 960x540)) 1.975 0.803 2.46
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 1280x720)) 2.896 1.318 2.20
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 960x540)) 2.458 2.231 1.10
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 1280x720)) 3.497 3.151 1.11
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 960x540)) 1.940 1.702 1.14
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 1280x720)) 2.931 2.618 1.12
Performance for AVX2 baseline
Performance test Reference time PR time Speedup
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 320x240)) 0.010 0.010 0.97
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (960x540, 640x480)) 0.329 0.186 1.77
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 213x120)) 0.043 0.029 1.47
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 320x240)) 0.117 0.073 1.60
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 640x480)) 0.375 0.219 1.71
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 320x240)) 0.017 0.019 0.86
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (960x540, 640x480)) 0.454 0.393 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 213x120)) 0.055 0.055 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 320x240)) 0.162 0.159 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 640x480)) 0.521 0.488 1.07
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 320x240)) 0.055 0.050 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (960x540, 640x480)) 0.342 0.270 1.27
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 213x120)) 0.055 0.048 1.14
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 320x240)) 0.134 0.123 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 640x480)) 0.371 0.304 1.22
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 320x240)) 0.453 0.480 0.94
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (960x540, 640x480)) 0.674 0.250 2.70
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 213x120)) 0.082 0.044 1.87
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 320x240)) 0.255 0.118 2.16
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 640x480)) 0.797 0.330 2.41
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 320x240)) 0.429 0.434 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (960x540, 640x480)) 0.922 0.772 1.19
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 213x120)) 0.110 0.109 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 320x240)) 0.314 0.314 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 640x480)) 1.068 0.960 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 320x240)) 0.365 0.364 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (960x540, 640x480)) 0.692 0.530 1.31
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 213x120)) 0.115 0.105 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 320x240)) 0.260 0.234 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 640x480)) 0.801 0.645 1.24
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 320x240)) 0.079 0.078 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (960x540, 640x480)) 0.983 0.471 2.09
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 213x120)) 0.122 0.061 1.99
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 320x240)) 0.337 0.154 2.18
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 640x480)) 1.138 0.525 2.17
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 320x240)) 0.124 0.121 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (960x540, 640x480)) 1.347 1.178 1.14
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 213x120)) 0.158 0.159 0.99
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 320x240)) 0.467 0.459 1.02
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 640x480)) 1.559 1.465 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 320x240)) 0.495 0.492 1.01
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (960x540, 640x480)) 1.073 0.844 1.27
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 213x120)) 0.173 0.157 1.10
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 320x240)) 0.411 0.371 1.11
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 640x480)) 1.262 1.117 1.13
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 320x240)) 0.073 0.068 1.07
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (960x540, 640x480)) 1.316 0.455 2.89
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 213x120)) 0.155 0.067 2.30
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 320x240)) 0.441 0.180 2.45
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 640x480)) 1.520 0.523 2.91
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 320x240)) 0.122 0.115 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (960x540, 640x480)) 1.746 1.524 1.15
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 213x120)) 0.205 0.206 1.00
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 320x240)) 0.612 0.626 0.98
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 640x480)) 2.020 1.942 1.04
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 320x240)) 0.237 0.224 1.06
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (960x540, 640x480)) 1.459 1.189 1.23
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 213x120)) 0.242 0.203 1.19
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 320x240)) 0.630 0.576 1.09
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 640x480)) 1.803 1.538 1.17
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 960x540)) 0.488 0.274 1.78
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 1280x720)) 0.652 0.379 1.72
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 960x540)) 0.647 0.545 1.19
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 1280x720)) 0.917 0.742 1.24
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 960x540)) 0.508 0.391 1.30
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 1280x720)) 0.710 0.547 1.30
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 960x540)) 0.917 0.358 2.56
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 1280x720)) 1.277 0.475 2.69
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 960x540)) 1.322 1.066 1.24
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 1280x720)) 1.767 1.454 1.21
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 960x540)) 1.019 0.782 1.30
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 1280x720)) 1.457 1.140 1.28
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 960x540)) 1.431 0.689 2.08
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 1280x720)) 1.935 0.948 2.04
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 960x540)) 1.952 1.584 1.23
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 1280x720)) 2.826 2.215 1.28
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 960x540)) 1.555 1.236 1.26
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 1280x720)) 2.298 1.881 1.22
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 960x540)) 1.947 0.745 2.61
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 1280x720)) 2.647 1.213 2.18
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 960x540)) 2.644 2.067 1.28
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 1280x720)) 3.720 2.966 1.25
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 960x540)) 2.126 1.724 1.23
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 1280x720)) 3.160 2.588 1.22
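
The Speedup column in the tables above appears to be the reference time divided by the PR time. The sketch below shows that arithmetic, plus a geometric mean as one possible way to summarize several ratios. The helper names `speedup` and `geometricMean` are hypothetical and are not part of the OpenCV performance tooling.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Speedup as the ratio of reference time to PR time (values above 1 mean
// the PR is faster).
double speedup(double referenceTime, double prTime)
{
    assert(referenceTime > 0.0 && prTime > 0.0);
    return referenceTime / prTime;
}

// Geometric mean of positive values, computed through logarithms.
double geometricMean(const std::vector<double>& values)
{
    assert(!values.empty());
    double logSum = 0.0;
    for (double v : values)
    {
        assert(v > 0.0);
        logSum += std::log(v);
    }
    return std::exp(logSum / static_cast<double>(values.size()));
}
```

For example, `speedup(0.354, 0.211)` is roughly 1.68, matching the SSE2 row for 8UC1 (960x540, 640x480). Because the listed times are rounded to three decimals, recomputing from them will not always reproduce the listed speedup, especially for very short runs such as 0.014 versus 0.013.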

Owner

@pmur left a comment

The 4-channel numbers look good. I see a minor regression on 8UC3 against the parent commit, though that may still be faster than the code without either change.

Geometric mean (ms)

Name of Test                                                             resize resize resize vs resize (x-factor)
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 320x240))   0.081  0.084     0.97   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (960x540, 640x480))   1.799  1.372     1.31   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 213x120))  0.215  0.183     1.17   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 320x240))  0.597  0.495     1.20   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC1, (1280x720, 640x480))  2.026  1.592     1.27   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 320x240))  0.149  0.151     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (960x540, 640x480))  2.180  2.230     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 213x120)) 0.262  0.267     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 320x240)) 0.755  0.769     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC1, (1280x720, 640x480)) 2.505  2.556     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 320x240))  0.179  0.180     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (960x540, 640x480))  2.140  2.087     1.03   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 213x120)) 0.258  0.255     1.01   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 320x240)) 0.736  0.702     1.05   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC1, (1280x720, 640x480)) 2.450  2.379     1.03   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 320x240))   2.368  2.493     0.95   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (960x540, 640x480))   3.497  2.039     1.71   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 213x120))  0.404  0.296     1.37   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 320x240))  1.170  0.850     1.38   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC2, (1280x720, 640x480))  3.952  2.560     1.54   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 320x240))  2.233  2.284     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (960x540, 640x480))  4.287  4.303     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 213x120)) 0.513  0.512     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 320x240)) 1.469  1.495     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC2, (1280x720, 640x480)) 5.024  5.031     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 320x240))  2.175  2.154     1.01   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (960x540, 640x480))  4.116  4.100     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 213x120)) 0.482  0.484     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 320x240)) 1.381  1.395     0.99   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC2, (1280x720, 640x480)) 4.743  4.711     1.01   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 320x240))   0.394  0.394     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (960x540, 640x480))   5.204  5.379     0.97   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 213x120))  0.594  0.691     0.86   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 320x240))  1.726  2.019     0.86   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC3, (1280x720, 640x480))  5.839  6.475     0.90   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 320x240))  0.923  0.923     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (960x540, 640x480))  6.432  6.459     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 213x120)) 0.753  0.749     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 320x240)) 2.229  2.225     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC3, (1280x720, 640x480)) 7.483  7.412     1.01   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 320x240))  3.149  3.264     0.96   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (960x540, 640x480))  6.221  6.042     1.03   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 213x120)) 0.715  0.715     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 320x240)) 2.116  2.069     1.02   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC3, (1280x720, 640x480)) 7.092  7.045     1.01   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 320x240))   0.304  0.304     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (960x540, 640x480))   6.821  3.707     1.84   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 213x120))  0.767  0.420     1.83   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 320x240))  2.295  1.217     1.89   
resizeDownLinearNonExact::MatInfo_SizePair::(8UC4, (1280x720, 640x480))  7.738  4.081     1.90   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 320x240))  0.561  0.561     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (960x540, 640x480))  8.522  8.668     0.98   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 213x120)) 1.001  0.999     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 320x240)) 2.913  2.882     1.01   
resizeDownLinearNonExact::MatInfo_SizePair::(16UC4, (1280x720, 640x480)) 9.792  9.931     0.99   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 320x240))  0.399  0.391     1.02   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (960x540, 640x480))  8.231  7.931     1.04   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 213x120)) 0.941  0.941     1.00   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 320x240)) 2.850  2.769     1.03   
resizeDownLinearNonExact::MatInfo_SizePair::(32FC4, (1280x720, 640x480)) 9.594  9.336     1.03   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 960x540))     2.658  2.028     1.31   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC1, (640x480, 1280x720))    3.853  2.944     1.31   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 960x540))    3.204  3.226     0.99   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC1, (640x480, 1280x720))   4.495  4.549     0.99   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 960x540))    3.036  2.955     1.03   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC1, (640x480, 1280x720))   4.242  4.193     1.01   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 960x540))     5.109  3.003     1.70   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC2, (640x480, 1280x720))    7.289  4.574     1.59   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 960x540))    6.172  6.320     0.98   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC2, (640x480, 1280x720))   8.790  8.962     0.98   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 960x540))    5.985  5.899     1.01   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC2, (640x480, 1280x720))   8.385  8.210     1.02   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 960x540))     7.625  7.836     0.97   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC3, (640x480, 1280x720))    11.091 11.137    1.00   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 960x540))    9.243  9.463     0.98   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC3, (640x480, 1280x720))   13.014 13.376    0.97   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 960x540))    8.932  8.717     1.02   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC3, (640x480, 1280x720))   12.762 12.150    1.05   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 960x540))     10.124 5.679     1.78   
resizeUpLinearNonExact::MatInfo_SizePair::(8UC4, (640x480, 1280x720))    14.647 8.679     1.69   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 960x540))    12.291 12.495    0.98   
resizeUpLinearNonExact::MatInfo_SizePair::(16UC4, (640x480, 1280x720))   17.474 17.744    0.98   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 960x540))    11.949 11.492    1.04   
resizeUpLinearNonExact::MatInfo_SizePair::(32FC4, (640x480, 1280x720))   16.816 16.245    1.04  
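
The x-factor column above appears to be the first timing divided by the second, so values above 1.00 mean the second run is faster. A minimal sketch of that calculation follows; the helper names `xFactor` and `round2` are hypothetical and not part of the OpenCV perf tooling. Because the table shows timings already rounded to three decimals, recomputing from the printed values can differ from the printed x-factor by 0.01 on some rows.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: ratio of the reference time to the candidate time.
// A result above 1.0 means the candidate (second column) is faster.
inline double xFactor(double referenceMs, double candidateMs)
{
    assert(candidateMs > 0.0);
    return referenceMs / candidateMs;
}

// Round to two decimals, matching the precision shown in the table.
inline double round2(double value)
{
    return std::round(value * 100.0) / 100.0;
}
```

For example, `round2(xFactor(6.821, 3.707))` reproduces the 8UC4 960x540 to 640x480 speedup, and `round2(xFactor(0.594, 0.691))` reproduces the 8UC3 1280x720 to 213x120 regression.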

@pmur
Owner

pmur commented Nov 26, 2019

The regression is an artifact of the suboptimal v_load_expand_q on PPC. I pulled in my copymask PR, which rewrites it, and the regression is gone. Thanks! @terfendail what is the preferred path for getting this patch into the PR? I hold no strong opinions.

I can keep the HAL improvement in PR 15596 or move it to a separate PR.
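
For context, `v_load_expand_q` in OpenCV's universal intrinsics loads four 8-bit values and widens each to 32 bits. Below is a scalar sketch of that behaviour for the unsigned case. The name `loadExpandQScalar` is hypothetical, and this is only a model of the semantics, not the PPC/VSX implementation discussed here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of the unsigned v_load_expand_q: read four consecutive
// 8-bit values and zero-extend each one into its own 32-bit lane.
inline std::array<std::uint32_t, 4> loadExpandQScalar(const std::uint8_t* ptr)
{
    assert(ptr != nullptr);
    std::array<std::uint32_t, 4> lanes{};
    for (std::size_t i = 0; i < lanes.size(); ++i)
        lanes[i] = static_cast<std::uint32_t>(ptr[i]);
    return lanes;
}
```

A vector implementation does the same widening in registers, which is why a slow version of this one operation can dominate the 8-bit resize path.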

@terfendail
Author

It looks like there is a reasonable architecture/API discussion related to universal intrinsics in PR#15596, while improving the suboptimal intrinsic is certainly a must-have. I would prefer to extract the improvement into a separate PR that can be merged easily and quickly, so that all PPC users benefit ASAP.

@pmur pmur force-pushed the resize branch 2 times, most recently from 8706162 to 3e14ba5 Compare December 5, 2019 14:31
@terfendail terfendail closed this Dec 9, 2019
@terfendail terfendail deleted the resize_u8 branch December 9, 2019 13:04