Resize reworked using wide universal intrinsics#13781

Merged
alalek merged 3 commits into opencv:3.4 from terfendail:warp_wintr
Feb 20, 2019

@terfendail
Contributor

@terfendail terfendail commented Feb 8, 2019

This pull request changes

Reworks bit-exact linear resize using the new wide LUT intrinsics and adds support for CV_8UC3.
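
To illustrate what "bit-exact" linear resize means, here is a minimal scalar sketch that interpolates one row of interleaved 8-bit pixels using integer arithmetic only, so every platform produces the same bytes. It is not the code from this pull request and uses no intrinsics: the function name `resizeRowLinearFixed`, the 8 fractional bits for the weights, and the rounding rule are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): horizontally resizes one row of
// interleaved 8-bit pixels with `cn` channels using integer-only arithmetic.
std::vector<std::uint8_t> resizeRowLinearFixed(const std::vector<std::uint8_t>& src,
                                               int srcWidth, int dstWidth, int cn)
{
    assert(srcWidth > 0 && dstWidth > 0 && cn > 0);
    assert(src.size() == static_cast<std::size_t>(srcWidth) * cn);

    const int FRAC_BITS = 8;  // assumed weight precision for this sketch
    const std::int64_t ONE = std::int64_t(1) << FRAC_BITS;
    const std::int64_t maxPos = std::int64_t(srcWidth - 1) * ONE;

    std::vector<std::uint8_t> dst(static_cast<std::size_t>(dstWidth) * cn);
    for (int dx = 0; dx < dstWidth; ++dx)
    {
        // Pixel-centre mapping sx = (dx + 0.5) * srcWidth / dstWidth - 0.5,
        // kept as the exact fraction num / den to avoid floating point.
        const std::int64_t num = (2LL * dx + 1) * srcWidth - dstWidth;
        const std::int64_t den = 2LL * dstWidth;

        // Fixed-point source position, clamped to the valid range [0, srcWidth - 1].
        std::int64_t pos = num > 0 ? (num * ONE) / den : 0;
        pos = std::min(pos, maxPos);

        const int x0 = static_cast<int>(pos >> FRAC_BITS);      // left neighbour
        const int x1 = std::min(x0 + 1, srcWidth - 1);          // right neighbour
        const int w1 = static_cast<int>(pos & (ONE - 1));       // weight of x1
        const int w0 = static_cast<int>(ONE) - w1;              // weight of x0

        // The same weights apply to every channel of the pixel (e.g. 3 for CV_8UC3).
        for (int c = 0; c < cn; ++c)
        {
            const int a = src[x0 * cn + c];
            const int b = src[x1 * cn + c];
            const int rounded = (a * w0 + b * w1 + (1 << (FRAC_BITS - 1))) >> FRAC_BITS;
            dst[dx * cn + c] = static_cast<std::uint8_t>(rounded);
        }
    }
    return dst;
}
```

Calling `resizeRowLinearFixed(row, 2, 4, 3)` upsamples a two-pixel, three-channel row to four pixels. Because the per-pixel weights are shared by all channels, a vectorized version can look them up once per destination pixel, which is the kind of work a LUT-style intrinsic is suited to.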

force_builders=Custom
buildworker:Custom=linux-1,linux-2
docker_image:Custom=powerpc64le

@terfendail terfendail force-pushed the warp_wintr branch 12 times, most recently from 7715075 to e56f8b0 Compare February 18, 2019 13:11
@terfendail terfendail force-pushed the warp_wintr branch 2 times, most recently from 0f64726 to f447cbb Compare February 19, 2019 09:22
@terfendail terfendail changed the title Resize and remap reworked using wide universal intrinsics Resize reworked using wide universal intrinsics Feb 19, 2019
@terfendail
Contributor Author

Performance for SSE2 baseline

| Performance test | Reference time | PR time | Speedup |
|---|---|---|---|
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 640x480, 320x240) | 0.013 | 0.013 | 1.01 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 960x540, 640x480) | 0.300 | 0.300 | 1.00 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 213x120) | 0.055 | 0.055 | 1.01 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 320x240) | 0.127 | 0.129 | 0.99 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 640x480) | 0.363 | 0.365 | 0.99 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 640x480, 320x240) | 0.238 | 0.193 | 1.24 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 960x540, 640x480) | 0.580 | 0.471 | 1.23 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 213x120) | 0.093 | 0.079 | 1.18 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 320x240) | 0.242 | 0.196 | 1.23 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 640x480) | 0.709 | 0.564 | 1.26 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 640x480, 320x240) | 0.117 | 0.118 | 0.99 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 960x540, 640x480) | 1.693 | 1.035 | 1.64 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 213x120) | 0.256 | 0.164 | 1.56 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 320x240) | 0.729 | 0.453 | 1.61 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 640x480) | 2.184 | 1.317 | 1.66 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 640x480, 320x240) | 0.086 | 0.087 | 1.00 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 960x540, 640x480) | 0.860 | 0.798 | 1.08 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 213x120) | 0.132 | 0.124 | 1.06 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 320x240) | 0.354 | 0.330 | 1.07 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 640x480) | 1.052 | 0.977 | 1.08 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 960x540) | 0.424 | 0.406 | 1.04 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 1280x720) | 0.596 | 0.597 | 1.00 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 960x540) | 0.809 | 0.635 | 1.27 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 1280x720) | 1.144 | 0.951 | 1.20 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 960x540) | 2.296 | 1.421 | 1.62 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 1280x720) | 3.169 | 2.049 | 1.55 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 960x540) | 1.204 | 1.135 | 1.06 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 1280x720) | 1.749 | 1.662 | 1.05 |

Performance for SSE3 baseline

| Performance test | Reference time | PR time | Speedup |
|---|---|---|---|
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 640x480, 320x240) | 0.014 | 0.013 | 1.02 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 960x540, 640x480) | 0.299 | 0.301 | 0.99 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 213x120) | 0.055 | 0.055 | 1.00 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 320x240) | 0.130 | 0.130 | 1.00 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 640x480) | 0.362 | 0.363 | 1.00 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 640x480, 320x240) | 0.239 | 0.196 | 1.22 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 960x540, 640x480) | 0.579 | 0.471 | 1.23 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 213x120) | 0.093 | 0.077 | 1.20 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 320x240) | 0.247 | 0.202 | 1.22 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 640x480) | 0.708 | 0.565 | 1.25 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 640x480, 320x240) | 0.117 | 0.118 | 1.00 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 960x540, 640x480) | 1.692 | 1.045 | 1.62 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 213x120) | 0.257 | 0.164 | 1.57 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 320x240) | 0.748 | 0.463 | 1.62 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 640x480) | 2.177 | 1.325 | 1.64 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 640x480, 320x240) | 0.086 | 0.087 | 0.99 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 960x540, 640x480) | 0.860 | 0.795 | 1.08 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 213x120) | 0.132 | 0.124 | 1.07 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 320x240) | 0.362 | 0.337 | 1.08 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 640x480) | 1.053 | 0.974 | 1.08 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 960x540) | 0.416 | 0.434 | 0.96 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 1280x720) | 0.574 | 0.608 | 0.94 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 960x540) | 0.809 | 0.676 | 1.20 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 1280x720) | 1.085 | 0.972 | 1.12 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 960x540) | 2.234 | 1.471 | 1.52 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 1280x720) | 3.028 | 2.035 | 1.49 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 960x540) | 1.148 | 1.158 | 0.99 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 1280x720) | 1.715 | 1.648 | 1.04 |

Performance for SSE4_2 baseline

| Performance test | Reference time | PR time | Speedup |
|---|---|---|---|
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 640x480, 320x240) | 0.014 | 0.013 | 1.03 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 960x540, 640x480) | 0.253 | 0.250 | 1.01 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 213x120) | 0.048 | 0.047 | 1.02 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 320x240) | 0.109 | 0.106 | 1.04 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 640x480) | 0.307 | 0.296 | 1.04 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 640x480, 320x240) | 0.215 | 0.144 | 1.49 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 960x540, 640x480) | 0.510 | 0.356 | 1.43 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 213x120) | 0.082 | 0.060 | 1.36 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 320x240) | 0.217 | 0.146 | 1.49 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 640x480) | 0.638 | 0.422 | 1.51 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 640x480, 320x240) | 0.112 | 0.110 | 1.02 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 960x540, 640x480) | 1.692 | 0.773 | 2.19 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 213x120) | 0.262 | 0.125 | 2.10 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 320x240) | 0.747 | 0.329 | 2.27 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 640x480) | 2.182 | 0.969 | 2.25 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 640x480, 320x240) | 0.087 | 0.086 | 1.02 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 960x540, 640x480) | 0.772 | 0.605 | 1.28 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 213x120) | 0.120 | 0.094 | 1.28 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 320x240) | 0.320 | 0.239 | 1.34 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 640x480) | 0.966 | 0.718 | 1.34 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 960x540) | 0.363 | 0.354 | 1.03 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 1280x720) | 0.511 | 0.507 | 1.01 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 960x540) | 0.730 | 0.508 | 1.44 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 1280x720) | 1.044 | 0.749 | 1.39 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 960x540) | 2.303 | 1.071 | 2.15 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 1280x720) | 3.187 | 1.535 | 2.08 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 960x540) | 1.095 | 0.870 | 1.26 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 1280x720) | 1.603 | 1.300 | 1.23 |

Performance for AVX2 baseline

| Performance test | Reference time | PR time | Speedup |
|---|---|---|---|
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 640x480, 320x240) | 0.010 | 0.010 | 1.02 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 960x540, 640x480) | 0.304 | 0.313 | 0.97 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 213x120) | 0.056 | 0.059 | 0.95 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 320x240) | 0.131 | 0.136 | 0.96 |
| resizeDownLinear::MatInfo_Size_Size::(8UC1, 1280x720, 640x480) | 0.370 | 0.381 | 0.97 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 640x480, 320x240) | 0.204 | 0.151 | 1.35 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 960x540, 640x480) | 0.500 | 0.372 | 1.34 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 213x120) | 0.086 | 0.067 | 1.28 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 320x240) | 0.212 | 0.155 | 1.37 |
| resizeDownLinear::MatInfo_Size_Size::(8UC2, 1280x720, 640x480) | 0.615 | 0.444 | 1.39 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 640x480, 320x240) | 0.077 | 0.079 | 0.97 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 960x540, 640x480) | 1.673 | 0.620 | 2.70 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 213x120) | 0.256 | 0.109 | 2.35 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 320x240) | 0.734 | 0.271 | 2.70 |
| resizeDownLinear::MatInfo_Size_Size::(8UC3, 1280x720, 640x480) | 2.164 | 0.757 | 2.86 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 640x480, 320x240) | 0.069 | 0.071 | 0.97 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 960x540, 640x480) | 0.705 | 0.555 | 1.27 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 213x120) | 0.119 | 0.094 | 1.27 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 320x240) | 0.300 | 0.217 | 1.38 |
| resizeDownLinear::MatInfo_Size_Size::(8UC4, 1280x720, 640x480) | 0.901 | 0.660 | 1.37 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 960x540) | 0.429 | 0.439 | 0.98 |
| resizeUpLinear::MatInfo_Size_Size::(8UC1, 640x480, 1280x720) | 0.570 | 0.616 | 0.93 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 960x540) | 0.707 | 0.534 | 1.32 |
| resizeUpLinear::MatInfo_Size_Size::(8UC2, 640x480, 1280x720) | 0.956 | 0.781 | 1.22 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 960x540) | 2.270 | 0.858 | 2.65 |
| resizeUpLinear::MatInfo_Size_Size::(8UC3, 640x480, 1280x720) | 3.144 | 1.281 | 2.45 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 960x540) | 1.019 | 0.805 | 1.27 |
| resizeUpLinear::MatInfo_Size_Size::(8UC4, 640x480, 1280x720) | 1.557 | 1.231 | 1.26 |
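
The tables do not define the Speedup column, but the values are consistent with reference time divided by PR time, so a value above 1 means the PR is faster. A small sketch of that arithmetic follows. The helper names `speedup` and `roundTo2` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: speedup as reference time divided by PR time.
// This definition is inferred from the table values, not stated in the PR.
double speedup(double referenceTime, double prTime)
{
    assert(prTime > 0.0);
    return referenceTime / prTime;
}

// Rounds to two decimals, matching how the tables display the value.
double roundTo2(double value)
{
    return std::round(value * 100.0) / 100.0;
}
```

For the SSE2 row `8UC3, 1280x720, 640x480`, `roundTo2(speedup(2.184, 1.317))` gives 1.66, matching the table. Recomputing from the displayed times can differ in the last digit for other rows (for example 0.238 / 0.193 rounds to 1.23 while the table lists 1.24), presumably because the displayed times are themselves rounded.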

@alalek
Member

alalek commented Feb 19, 2019

@terfendail There are build issues with the powerpc64le build.

@terfendail
Contributor Author

powerpc64le build fixed

Member

@alalek alalek left a comment

Well done! Thank you 👍

@alalek alalek merged commit 334c4d6 into opencv:3.4 Feb 20, 2019
@terfendail terfendail deleted the warp_wintr branch February 20, 2019 11:38
@alalek alalek mentioned this pull request Feb 26, 2019