Use generic SIMD in warpAffineBlocklineNN#26203

Merged
asmorkalov merged 1 commit into opencv:4.x from FantasqueX:generic-simd-warpAffineBlocklineNN
Nov 1, 2024

Conversation

@FantasqueX
Contributor

Closes: #26185

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@FantasqueX
Contributor Author

I'll add perf results for SSE4_1 and AVX in two weeks, since I'm going on vacation.

@asmorkalov asmorkalov changed the title Use generic SIMD in warpAffineBlocklineNN WIP: Use generic SIMD in warpAffineBlocklineNN Oct 10, 2024
@FantasqueX FantasqueX force-pushed the generic-simd-warpAffineBlocklineNN branch from a4c968b to 45b9398 Compare October 13, 2024 17:28
@FantasqueX
Contributor Author

Here are the perf results on my 8845HS:

CPU_BASELINE=SSE3

Geometric mean (ms)

                                 Name of Test                                  before after   after   
                                                                                                vs    
                                                                                              before  
                                                                                            (x-factor)
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_CONSTANT)     0.121  0.120    1.01   
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_REPLICATE)    0.121  0.121    1.00   
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_CONSTANT)    0.069  0.061    1.13   
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_REPLICATE)   0.069  0.058    1.19   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_CONSTANT)    0.257  0.257    1.00   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_REPLICATE)   0.377  0.363    1.04   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_CONSTANT)   0.141  0.120    1.17   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_REPLICATE)  0.159  0.138    1.15   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)   0.528  0.523    1.01   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)  0.920  0.904    1.02   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)  0.296  0.235    1.26   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_REPLICATE) 0.370  0.329    1.12   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_CONSTANT)     0.157  0.164    0.96   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_REPLICATE)    0.156  0.156    1.00   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_CONSTANT)    0.130  0.127    1.02   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_REPLICATE)   0.098  0.088    1.12   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_CONSTANT)    0.324  0.321    1.01   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_REPLICATE)   0.664  0.666    1.00   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_CONSTANT)   0.229  0.209    1.10   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_REPLICATE)  0.262  0.325    0.81   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)   0.654  0.658    0.99   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)  1.836  1.843    1.00   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)  0.527  0.477    1.10   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_REPLICATE) 0.624  0.574    1.09
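
For readers unfamiliar with the x-factor column, here is a minimal sketch of how it appears to be derived. It assumes the column is simply the "before" time divided by the "after" time, so values above 1.00 mean the patched build is faster. The helper name `x_factor` is hypothetical, and the three rows are copied from the SSE3 table above. Because the table prints rounded times, a ratio recomputed this way can differ from the reported one in the last digit for some rows.

```python
def x_factor(before_ms: float, after_ms: float) -> float:
    """Assumed definition: baseline time divided by patched time.

    A result above 1.0 means the patched build is faster;
    below 1.0 means it is slower.
    """
    if after_ms <= 0:
        raise ValueError("after_ms must be positive")
    return before_ms / after_ms


# (before, after, x-factor as reported) taken from the SSE3 table above.
rows = [
    (0.069, 0.061, 1.13),  # 8UC1, 640x480, INTER_NEAREST, BORDER_CONSTANT
    (0.296, 0.235, 1.26),  # 8UC1, 1920x1080, INTER_NEAREST, BORDER_CONSTANT
    (0.262, 0.325, 0.81),  # 8UC4, 1280x720, INTER_NEAREST, BORDER_REPLICATE
]

for before, after, reported in rows:
    ratio = x_factor(before, after)
    print(f"{before:.3f} / {after:.3f} = {ratio:.2f} (table: {reported:.2f})")
```

Read this way, the third row (0.81) is a slowdown rather than a speedup, which is why it stands out among the INTER_NEAREST results.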

CPU_BASELINE=AVX2

Geometric mean (ms)

                                 Name of Test                                   avx2  avx2     avx2   
                                                                               before after   after   
                                                                                                vs    
                                                                                               avx2   
                                                                                              before  
                                                                                            (x-factor)
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_CONSTANT)     0.111  0.117    0.94   
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_REPLICATE)    0.111  0.123    0.90   
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_CONSTANT)    0.078  0.056    1.38   
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_REPLICATE)   0.071  0.056    1.27   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_CONSTANT)    0.232  0.233    0.99   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_REPLICATE)   0.360  0.359    1.00   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_CONSTANT)   0.257  0.115    2.24   
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_REPLICATE)  0.155  0.136    1.14   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)   0.458  0.531    0.86   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)  0.877  0.913    0.96   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)  0.282  0.239    1.18   
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_REPLICATE) 0.405  0.360    1.13   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_CONSTANT)     0.142  0.146    0.97   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_REPLICATE)    0.144  0.144    1.00   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_CONSTANT)    0.100  0.151    0.66   
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_REPLICATE)   0.109  0.087    1.25   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_CONSTANT)    0.306  0.312    0.98   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_REPLICATE)   0.666  0.649    1.03   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_CONSTANT)   0.238  0.213    1.12   
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_REPLICATE)  0.259  0.240    1.08   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)   0.634  0.615    1.03   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)  1.807  1.822    0.99   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)  0.594  0.517    1.15   
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_REPLICATE) 0.647  0.603    1.07

@FantasqueX FantasqueX changed the title WIP: Use generic SIMD in warpAffineBlocklineNN Use generic SIMD in warpAffineBlocklineNN Oct 13, 2024
@FantasqueX
Contributor Author

@asmorkalov Hi, could you please review this PR?

@asmorkalov
Contributor

I tried the patch on several devices and it looks useful.
Perf numbers for Core i5-2700k (no AVX2):

Geometric mean (ms)

                                           Name of Test                                            4.x-no-ocl-3 patch-no-ocl-3 patch-no-ocl-3
                                                                                                                                     vs      
                                                                                                                                4.x-no-ocl-3 
                                                                                                                                 (x-factor)  
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC1, INTER_CUBIC)                                       1.596         1.616           0.99     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC1, INTER_LINEAR)                                      0.652         0.652           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC1, INTER_NEAREST)                                     0.258         0.213           1.21     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC1, INTER_CUBIC)                                      1.577         1.582           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC1, INTER_LINEAR)                                     0.753         0.750           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC1, INTER_NEAREST)                                    0.263         0.222           1.19     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC3, INTER_CUBIC)                                       2.637         2.648           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC3, INTER_LINEAR)                                      0.877         0.881           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC3, INTER_NEAREST)                                     0.421         0.377           1.12     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC3, INTER_CUBIC)                                      3.081         3.066           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC3, INTER_LINEAR)                                     1.808         1.837           0.98     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC3, INTER_NEAREST)                                    0.715         0.675           1.06     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC4, INTER_CUBIC)                                       3.122         3.123           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC4, INTER_LINEAR)                                      0.780         0.791           0.99     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC4, INTER_NEAREST)                                     0.456         0.415           1.10     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC4, INTER_CUBIC)                                      3.614         3.605           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC4, INTER_LINEAR)                                     2.136         2.126           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC4, INTER_NEAREST)                                    0.986         0.939           1.05     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC1, INTER_CUBIC)                                      3.418         3.430           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC1, INTER_LINEAR)                                     1.377         1.379           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC1, INTER_NEAREST)                                    0.556         0.455           1.22     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC1, INTER_CUBIC)                                     3.569         3.535           1.01     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC1, INTER_LINEAR)                                    1.774         1.769           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC1, INTER_NEAREST)                                   0.882         0.790           1.12     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC3, INTER_CUBIC)                                      5.718         5.696           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC3, INTER_LINEAR)                                     1.948         1.957           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC3, INTER_NEAREST)                                    1.057         0.960           1.10     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC3, INTER_CUBIC)                                     6.761         6.777           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC3, INTER_LINEAR)                                    4.388         4.413           0.99     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC3, INTER_NEAREST)                                   2.192         2.123           1.03     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC4, INTER_CUBIC)                                      6.893         6.852           1.01     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC4, INTER_LINEAR)                                     1.845         1.872           0.99     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC4, INTER_NEAREST)                                    1.263         1.169           1.08     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC4, INTER_CUBIC)                                     8.085         8.090           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC4, INTER_LINEAR)                                    3.955         3.957           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC4, INTER_NEAREST)                                   2.600         2.549           1.02     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC1, INTER_CUBIC)                                     6.674         6.680           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC1, INTER_LINEAR)                                    2.786         2.784           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC1, INTER_NEAREST)                                   1.150         0.954           1.21     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC1, INTER_CUBIC)                                    7.218         7.169           1.01     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC1, INTER_LINEAR)                                   3.947         3.936           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC1, INTER_NEAREST)                                  2.516         2.343           1.07     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC3, INTER_CUBIC)                                     11.235        11.176          1.01     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC3, INTER_LINEAR)                                    4.224         4.211           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC3, INTER_NEAREST)                                   2.606         2.422           1.08     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC3, INTER_CUBIC)                                    13.355        13.266          1.01     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC3, INTER_LINEAR)                                   8.910         8.931           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC3, INTER_NEAREST)                                  4.548         4.433           1.03     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC4, INTER_CUBIC)                                     13.346        13.374          1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC4, INTER_LINEAR)                                    4.114         4.103           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC4, INTER_NEAREST)                                   3.063         2.859           1.07     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC4, INTER_CUBIC)                                    15.638        15.661          1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC4, INTER_LINEAR)                                   9.832         9.807           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC4, INTER_NEAREST)                                  5.730         5.621           1.02     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC1, INTER_CUBIC)                                     25.342        25.154          1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC1, INTER_LINEAR)                                    11.168        11.220          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC1, INTER_NEAREST)                                   4.903         4.169           1.18     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC1, INTER_CUBIC)                                    26.172        26.175          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC1, INTER_LINEAR)                                   14.531        14.548          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC1, INTER_NEAREST)                                  9.178         8.511           1.08     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC3, INTER_CUBIC)                                     40.752        40.557          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC3, INTER_LINEAR)                                    16.467        16.426          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC3, INTER_NEAREST)                                   10.384        9.670           1.07     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC3, INTER_CUBIC)                                    51.827        51.538          1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC3, INTER_LINEAR)                                   35.744        35.478          1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC3, INTER_NEAREST)                                  17.344        16.941          1.02     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC4, INTER_CUBIC)                                     48.283        47.940          1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC4, INTER_LINEAR)                                    15.000        15.034          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC4, INTER_NEAREST)                                   11.707        10.975          1.07     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC4, INTER_CUBIC)                                    60.384        60.277          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC4, INTER_LINEAR)                                   33.871        33.642          1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC4, INTER_NEAREST)                                  21.165        20.908          1.01     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_CONSTANT)                            0.626         0.628           1.00     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_REPLICATE)                           0.625         0.626           1.00     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_CONSTANT)                           0.262         0.219           1.20     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_REPLICATE)                          0.256         0.213           1.20     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_CONSTANT)                           1.359         1.364           1.00     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_REPLICATE)                          1.778         1.786           1.00     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_CONSTANT)                          0.549         0.454           1.21     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_REPLICATE)                         0.643         0.539           1.19     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)                          2.724         2.721           1.00     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)                         4.380         4.389           1.00     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)                         1.072         0.885           1.21     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_REPLICATE)                        1.461         1.273           1.15     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_CONSTANT)                            0.731         0.740           0.99     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_REPLICATE)                           0.731         0.737           0.99     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_CONSTANT)                           0.389         0.345           1.13     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_REPLICATE)                          0.389         0.346           1.12     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_CONSTANT)                           1.730         1.740           0.99     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_REPLICATE)                          2.638         2.653           0.99     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_CONSTANT)                          1.164         1.062           1.10     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_REPLICATE)                         1.277         1.187           1.08     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)                          3.721         3.746           0.99     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)                         7.233         7.225           1.00     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)                         3.008         2.815           1.07     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_REPLICATE)                        3.420         3.212           1.06     

The story is almost the same for AMD Ryzen 7 2700:

Geometric mean (ms)

                                           Name of Test                                            4.x-no-ocl-3 patch-no-ocl-3 patch-no-ocl-3
                                                                                                                                     vs      
                                                                                                                                4.x-no-ocl-3 
                                                                                                                                 (x-factor)  
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC1, INTER_CUBIC)                                       0.722         0.720           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC1, INTER_LINEAR)                                      0.273         0.285           0.95     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC1, INTER_NEAREST)                                     0.222         0.173           1.29     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC1, INTER_CUBIC)                                      0.670         0.672           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC1, INTER_LINEAR)                                     0.315         0.314           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC1, INTER_NEAREST)                                    0.116         0.094           1.23     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC3, INTER_CUBIC)                                       1.335         1.336           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC3, INTER_LINEAR)                                      0.329         0.329           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC3, INTER_NEAREST)                                     0.165         0.147           1.12     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC3, INTER_CUBIC)                                      1.204         1.218           0.99     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC3, INTER_LINEAR)                                     0.528         0.526           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC3, INTER_NEAREST)                                    0.195         0.169           1.15     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC4, INTER_CUBIC)                                       1.646         1.651           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC4, INTER_LINEAR)                                      0.313         0.312           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 8UC4, INTER_NEAREST)                                     0.189         0.167           1.13     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC4, INTER_CUBIC)                                      1.489         1.489           1.00     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC4, INTER_LINEAR)                                     0.654         0.647           1.01     
WarpAffine::OCL_WarpAffineFixture::(640x480, 32FC4, INTER_NEAREST)                                    0.244         0.217           1.13     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC1, INTER_CUBIC)                                      1.126         1.208           0.93     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC1, INTER_LINEAR)                                     0.484         0.876           0.55     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC1, INTER_NEAREST)                                    0.321         0.327           0.98     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC1, INTER_CUBIC)                                     1.055         1.072           0.98     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC1, INTER_LINEAR)                                    0.548         0.552           0.99     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC1, INTER_NEAREST)                                   0.224         0.193           1.16     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC3, INTER_CUBIC)                                      2.331         2.370           0.98     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC3, INTER_LINEAR)                                     0.565         0.584           0.97     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC3, INTER_NEAREST)                                    0.333         0.303           1.10     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC3, INTER_CUBIC)                                     2.215         2.196           1.01     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC3, INTER_LINEAR)                                    1.094         1.069           1.02     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC3, INTER_NEAREST)                                   0.804         0.792           1.01     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC4, INTER_CUBIC)                                      2.852         2.782           1.03     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC4, INTER_LINEAR)                                     0.541         0.557           0.97     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 8UC4, INTER_NEAREST)                                    0.384         0.355           1.08     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC4, INTER_CUBIC)                                     2.804         2.804           1.00     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC4, INTER_LINEAR)                                    1.419         1.439           0.99     
WarpAffine::OCL_WarpAffineFixture::(1280x720, 32FC4, INTER_NEAREST)                                   1.196         1.189           1.01     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC1, INTER_CUBIC)                                     2.042         2.097           0.97     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC1, INTER_LINEAR)                                    1.177         1.136           1.04     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC1, INTER_NEAREST)                                   0.469         0.390           1.20     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC1, INTER_CUBIC)                                    1.954         1.972           0.99     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC1, INTER_LINEAR)                                   1.157         1.153           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC1, INTER_NEAREST)                                  0.581         0.524           1.11     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC3, INTER_CUBIC)                                     3.977         3.989           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC3, INTER_LINEAR)                                    1.132         1.163           0.97     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC3, INTER_NEAREST)                                   0.692         0.631           1.10     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC3, INTER_CUBIC)                                    4.166         4.178           1.00     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC3, INTER_LINEAR)                                   3.217         2.849           1.13     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC3, INTER_NEAREST)                                  2.366         2.331           1.01     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC4, INTER_CUBIC)                                     4.869         4.905           0.99     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC4, INTER_LINEAR)                                    1.146         1.172           0.98     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 8UC4, INTER_NEAREST)                                   0.888         0.803           1.11     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC4, INTER_CUBIC)                                    5.106         5.019           1.02     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC4, INTER_LINEAR)                                   3.723         3.665           1.02     
WarpAffine::OCL_WarpAffineFixture::(1920x1080, 32FC4, INTER_NEAREST)                                  2.988         2.953           1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC1, INTER_CUBIC)                                     7.663         7.706           0.99     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC1, INTER_LINEAR)                                    3.648         3.792           0.96     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC1, INTER_NEAREST)                                   1.775         1.465           1.21     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC1, INTER_CUBIC)                                    7.669         7.663           1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC1, INTER_LINEAR)                                   5.078         4.955           1.02     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC1, INTER_NEAREST)                                  3.468         3.314           1.05     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC3, INTER_CUBIC)                                     14.168        14.216          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC3, INTER_LINEAR)                                    4.959         5.071           0.98     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC3, INTER_NEAREST)                                   3.751         3.590           1.04     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC3, INTER_CUBIC)                                    14.536        14.552          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC3, INTER_LINEAR)                                   10.115        10.125          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC3, INTER_NEAREST)                                  8.639         8.587           1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC4, INTER_CUBIC)                                     16.873        17.069          0.99     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC4, INTER_LINEAR)                                    5.120         5.174           0.99     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 8UC4, INTER_NEAREST)                                   4.408         4.227           1.04     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC4, INTER_CUBIC)                                    17.450        17.402          1.00     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC4, INTER_LINEAR)                                   11.938        11.824          1.01     
WarpAffine::OCL_WarpAffineFixture::(3840x2160, 32FC4, INTER_NEAREST)                                  10.935        10.953          1.00     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_CONSTANT)                            0.255         0.252           1.01     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_LINEAR, BORDER_REPLICATE)                           0.255         0.254           1.00     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_CONSTANT)                           0.108         0.143           0.76     
WarpAffine::TestWarpAffine::(8UC1, 640x480, INTER_NEAREST, BORDER_REPLICATE)                          0.108         0.113           0.96     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_CONSTANT)                           0.430         0.458           0.94     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_LINEAR, BORDER_REPLICATE)                          0.576         0.595           0.97     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_CONSTANT)                          0.211         0.170           1.24     
WarpAffine::TestWarpAffine::(8UC1, 1280x720, INTER_NEAREST, BORDER_REPLICATE)                         0.274         0.227           1.20     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)                          0.876         0.939           0.93     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)                         1.360         1.400           0.97     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)                         0.403         0.329           1.23     
WarpAffine::TestWarpAffine::(8UC1, 1920x1080, INTER_NEAREST, BORDER_REPLICATE)                        0.665         0.517           1.29     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_CONSTANT)                            0.300         0.297           1.01     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_LINEAR, BORDER_REPLICATE)                           0.303         0.296           1.02     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_CONSTANT)                           0.174         0.155           1.12     
WarpAffine::TestWarpAffine::(8UC4, 640x480, INTER_NEAREST, BORDER_REPLICATE)                          0.173         0.152           1.14     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_CONSTANT)                           0.517         0.538           0.96     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_LINEAR, BORDER_REPLICATE)                          0.923         0.990           0.93     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_CONSTANT)                          0.354         0.323           1.10     
WarpAffine::TestWarpAffine::(8UC4, 1280x720, INTER_NEAREST, BORDER_REPLICATE)                         0.419         0.397           1.05     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_CONSTANT)                          1.046         1.096           0.95     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_LINEAR, BORDER_REPLICATE)                         2.626         2.638           1.00     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_CONSTANT)                         0.743         0.657           1.13     
WarpAffine::TestWarpAffine::(8UC4, 1920x1080, INTER_NEAREST, BORDER_REPLICATE)                        0.965         0.925           1.04 
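
For reading the last column: the x-factor values above are consistent with `before / after` rounded to two decimals, so values above 1.00 mean the patched build is faster. A minimal sketch under that assumption (the helper name `x_factor` is hypothetical and is not part of the OpenCV perf tooling):

```python
def x_factor(before_ms: float, after_ms: float) -> float:
    """Speedup ratio, assumed to be before / after; > 1.00 means 'after' is faster."""
    return round(before_ms / after_ms, 2)


# (before, after) geometric means in ms, copied from two rows of the table above.
rows = {
    "TestWarpAffine (8UC1, 1920x1080, INTER_NEAREST, BORDER_REPLICATE)": (0.665, 0.517),
    "TestWarpAffine (8UC1, 640x480, INTER_NEAREST, BORDER_CONSTANT)": (0.108, 0.143),
}

for name, (before, after) in rows.items():
    print(f"{name}: {x_factor(before, after):.2f}")
```

The two sample rows are the largest gain and the largest regression among the TestWarpAffine INTER_NEAREST cases listed here.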

@asmorkalov asmorkalov self-assigned this Oct 31, 2024
@asmorkalov asmorkalov merged commit ee95bfe into opencv:4.x Nov 1, 2024
@asmorkalov asmorkalov mentioned this pull request Nov 2, 2024

Development

Successfully merging this pull request may close these issues.

Enable SIMD256 for warpAffineBlocklineNN
