hal/riscv_rvv: implemented flip_inplace to boost cv::rotate #27263

Merged
asmorkalov merged 1 commit into opencv:4.x from fengyuentau:4x/hal_rvv/rotate
Apr 28, 2025

@fengyuentau
Member

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV.
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@fengyuentau
Member Author

Performance results (K1, K1 vs RK3568):

K1 GCC

Name of Test                              base-gcc  patch-gcc  patch-gcc vs base-gcc (x-factor)
rotate::RotateTest::(640x480, 0, 8UC1)     1.473     0.827      1.78   
rotate::RotateTest::(640x480, 0, 8SC1)     1.476     0.834      1.77   
rotate::RotateTest::(640x480, 0, 16SC1)    2.457     1.615      1.52   
rotate::RotateTest::(640x480, 0, 32SC1)    7.469     4.864      1.54   
rotate::RotateTest::(640x480, 0, 32FC1)    7.608     4.877      1.56   
rotate::RotateTest::(640x480, 0, 8UC2)     2.447     1.619      1.51   
rotate::RotateTest::(640x480, 0, 16SC2)    7.479     4.874      1.53   
rotate::RotateTest::(640x480, 0, 8UC3)     8.338     4.246      1.96   
rotate::RotateTest::(640x480, 0, 16SC3)    19.447   14.469      1.34   
rotate::RotateTest::(640x480, 0, 8UC4)     7.664     4.869      1.57   
rotate::RotateTest::(640x480, 0, 16SC4)    15.993    9.252      1.73   
rotate::RotateTest::(640x480, 1, 8UC1)     0.267     0.271      0.98   
rotate::RotateTest::(640x480, 1, 8SC1)     0.267     0.271      0.99   
rotate::RotateTest::(640x480, 1, 16SC1)    0.445     0.450      0.99   
rotate::RotateTest::(640x480, 1, 32SC1)    0.889     0.892      1.00   
rotate::RotateTest::(640x480, 1, 32FC1)    0.890     0.894      1.00   
rotate::RotateTest::(640x480, 1, 8UC2)     0.444     0.450      0.99   
rotate::RotateTest::(640x480, 1, 16SC2)    0.889     0.892      1.00   
rotate::RotateTest::(640x480, 1, 8UC3)     0.354     0.333      1.06   
rotate::RotateTest::(640x480, 1, 16SC3)    0.713     0.686      1.04   
rotate::RotateTest::(640x480, 1, 8UC4)     0.889     0.892      1.00   
rotate::RotateTest::(640x480, 1, 16SC4)    1.801     1.806      1.00   
rotate::RotateTest::(640x480, 2, 8UC1)     0.679     0.636      1.07   
rotate::RotateTest::(640x480, 2, 8SC1)     0.681     0.648      1.05   
rotate::RotateTest::(640x480, 2, 16SC1)    1.185     1.198      0.99   
rotate::RotateTest::(640x480, 2, 32SC1)    3.943     3.761      1.05   
rotate::RotateTest::(640x480, 2, 32FC1)    3.942     3.768      1.05   
rotate::RotateTest::(640x480, 2, 8UC2)     1.173     1.201      0.98   
rotate::RotateTest::(640x480, 2, 16SC2)    3.938     3.760      1.05   
rotate::RotateTest::(640x480, 2, 8UC3)     3.568     3.562      1.00   
rotate::RotateTest::(640x480, 2, 16SC3)    11.558   10.434      1.11   
rotate::RotateTest::(640x480, 2, 8UC4)     3.935     3.760      1.05   
rotate::RotateTest::(640x480, 2, 16SC4)    7.430     6.412      1.16   
rotate::RotateTest::(1280x720, 0, 8UC1)    5.303     3.788      1.40   
rotate::RotateTest::(1280x720, 0, 8SC1)    5.277     3.775      1.40   
rotate::RotateTest::(1280x720, 0, 16SC1)   17.767   10.605      1.68   
rotate::RotateTest::(1280x720, 0, 32SC1)   27.081   16.958      1.60   
rotate::RotateTest::(1280x720, 0, 32FC1)   27.062   17.002      1.59   
rotate::RotateTest::(1280x720, 0, 8UC2)    18.169   10.578      1.72   
rotate::RotateTest::(1280x720, 0, 16SC2)   26.974   16.978      1.59   
rotate::RotateTest::(1280x720, 0, 8UC3)    35.864   24.277      1.48   
rotate::RotateTest::(1280x720, 0, 16SC3)   66.634   44.261      1.51   
rotate::RotateTest::(1280x720, 0, 8UC4)    27.056   17.000      1.59   
rotate::RotateTest::(1280x720, 0, 16SC4)   33.622   18.788      1.79   
rotate::RotateTest::(1280x720, 1, 8UC1)    0.663     0.671      0.99   
rotate::RotateTest::(1280x720, 1, 8SC1)    0.664     0.670      0.99   
rotate::RotateTest::(1280x720, 1, 16SC1)   1.339     1.349      0.99   
rotate::RotateTest::(1280x720, 1, 32SC1)   2.722     2.724      1.00   
rotate::RotateTest::(1280x720, 1, 32FC1)   2.722     2.725      1.00   
rotate::RotateTest::(1280x720, 1, 8UC2)    1.340     1.349      0.99   
rotate::RotateTest::(1280x720, 1, 16SC2)   2.723     2.725      1.00   
rotate::RotateTest::(1280x720, 1, 8UC3)    1.096     1.053      1.04   
rotate::RotateTest::(1280x720, 1, 16SC3)   2.240     2.158      1.04   
rotate::RotateTest::(1280x720, 1, 8UC4)    2.720     2.728      1.00   
rotate::RotateTest::(1280x720, 1, 16SC4)   5.455     5.440      1.00   
rotate::RotateTest::(1280x720, 2, 8UC1)    3.132     2.993      1.05   
rotate::RotateTest::(1280x720, 2, 8SC1)    3.141     2.988      1.05   
rotate::RotateTest::(1280x720, 2, 16SC1)   9.147     8.583      1.07   
rotate::RotateTest::(1280x720, 2, 32SC1)   13.867   11.944      1.16   
rotate::RotateTest::(1280x720, 2, 32FC1)   13.980   11.958      1.17   
rotate::RotateTest::(1280x720, 2, 8UC2)    9.116     8.523      1.07   
rotate::RotateTest::(1280x720, 2, 16SC2)   13.915   11.953      1.16   
rotate::RotateTest::(1280x720, 2, 8UC3)    22.692   20.632      1.10   
rotate::RotateTest::(1280x720, 2, 16SC3)   42.330   35.427      1.19   
rotate::RotateTest::(1280x720, 2, 8UC4)    13.896   11.940      1.16   
rotate::RotateTest::(1280x720, 2, 16SC4)   9.661     8.446      1.14   
rotate::RotateTest::(1920x1080, 0, 8UC1)   14.889   10.823      1.38   
rotate::RotateTest::(1920x1080, 0, 8SC1)   14.781   10.881      1.36   
rotate::RotateTest::(1920x1080, 0, 16SC1)  47.082   29.307      1.61   
rotate::RotateTest::(1920x1080, 0, 32SC1)  87.530   37.351      2.34   
rotate::RotateTest::(1920x1080, 0, 32FC1)  87.036   37.346      2.33   
rotate::RotateTest::(1920x1080, 0, 8UC2)   46.982   29.349      1.60   
rotate::RotateTest::(1920x1080, 0, 16SC2)  87.593   37.728      2.32   
rotate::RotateTest::(1920x1080, 0, 8UC3)   85.585   62.332      1.37   
rotate::RotateTest::(1920x1080, 0, 16SC3) 140.956   100.544     1.40   
rotate::RotateTest::(1920x1080, 0, 8UC4)   87.216   37.726      2.31   
rotate::RotateTest::(1920x1080, 0, 16SC4)  71.931   46.696      1.54   
rotate::RotateTest::(1920x1080, 1, 8UC1)   1.605     1.620      0.99   
rotate::RotateTest::(1920x1080, 1, 8SC1)   1.606     1.619      0.99   
rotate::RotateTest::(1920x1080, 1, 16SC1)  3.062     3.075      1.00   
rotate::RotateTest::(1920x1080, 1, 32SC1)  6.096     6.123      1.00   
rotate::RotateTest::(1920x1080, 1, 32FC1)  6.097     6.125      1.00   
rotate::RotateTest::(1920x1080, 1, 8UC2)   3.064     3.073      1.00   
rotate::RotateTest::(1920x1080, 1, 16SC2)  6.146     6.117      1.00   
rotate::RotateTest::(1920x1080, 1, 8UC3)   2.525     2.434      1.04   
rotate::RotateTest::(1920x1080, 1, 16SC3)  5.032     4.811      1.05   
rotate::RotateTest::(1920x1080, 1, 8UC4)   6.135     6.115      1.00   
rotate::RotateTest::(1920x1080, 1, 16SC4)  12.213   12.198      1.00   
rotate::RotateTest::(1920x1080, 2, 8UC1)   10.490    8.663      1.21   
rotate::RotateTest::(1920x1080, 2, 8SC1)   10.478    8.669      1.21   
rotate::RotateTest::(1920x1080, 2, 16SC1)  27.543   24.120      1.14   
rotate::RotateTest::(1920x1080, 2, 32SC1)  50.447   25.030      2.02   
rotate::RotateTest::(1920x1080, 2, 32FC1)  50.791   25.051      2.03   
rotate::RotateTest::(1920x1080, 2, 8UC2)   27.531   24.143      1.14   
rotate::RotateTest::(1920x1080, 2, 16SC2)  50.799   24.887      2.04   
rotate::RotateTest::(1920x1080, 2, 8UC3)   57.720   50.694      1.14   
rotate::RotateTest::(1920x1080, 2, 16SC3)  96.726   83.326      1.16   
rotate::RotateTest::(1920x1080, 2, 8UC4)   51.464   24.879      2.07   
rotate::RotateTest::(1920x1080, 2, 16SC4)  23.573   21.244      1.11 
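
In the table above, the gains are concentrated in the rows whose second test parameter is 0 or 2, while the rows with 1 stay near 1.00. One plausible reading, assuming that parameter follows cv::RotateFlags (0 = ROTATE_90_CLOCKWISE, 1 = ROTATE_180, 2 = ROTATE_90_COUNTERCLOCKWISE) and that cv::rotate performs a 90-degree rotation as a transpose followed by a flip applied in place to the destination, is that only the 90-degree cases reach the in-place flip path. The sketch below illustrates that decomposition in plain C++ without OpenCV. The helper names `flip_horizontal_inplace`, `transpose` and `rotate90_clockwise` are hypothetical and are not the HAL code from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the RVV HAL implementation): reverse every row of a
// row-major, single-channel 8-bit image in place, i.e. a horizontal flip.
void flip_horizontal_inplace(std::vector<std::uint8_t>& img,
                             std::size_t rows, std::size_t cols)
{
    assert(img.size() == rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        std::uint8_t* row = img.data() + r * cols;
        std::reverse(row, row + cols);
    }
}

// Hypothetical helper: out-of-place transpose. The result has `cols` rows
// and `rows` columns.
std::vector<std::uint8_t> transpose(const std::vector<std::uint8_t>& src,
                                    std::size_t rows, std::size_t cols)
{
    assert(src.size() == rows * cols);
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            dst[c * rows + r] = src[r * cols + c];
    return dst;
}

// 90 degrees clockwise = transpose, then flip the destination horizontally
// in place. The destination is cols x rows.
std::vector<std::uint8_t> rotate90_clockwise(const std::vector<std::uint8_t>& src,
                                             std::size_t rows, std::size_t cols)
{
    std::vector<std::uint8_t> dst = transpose(src, rows, cols);
    flip_horizontal_inplace(dst, cols, rows);
    return dst;
}
```

For a 2x3 input `{1, 2, 3, 4, 5, 6}`, `rotate90_clockwise(src, 2, 3)` yields the 3x2 image `{4, 1, 5, 2, 6, 3}`. The counterclockwise case would use a vertical flip after the transpose instead.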

K1 vs RK3568

Name of Test                               rk   patch-gcc patch-clang  patch-gcc vs rk (x-factor)  patch-clang vs rk (x-factor)
rotate::RotateTest::(640x480, 0, 8UC1)    0.828    0.827      0.891       1.00       0.93    
rotate::RotateTest::(640x480, 0, 8SC1)    0.831    0.834      0.879       1.00       0.94    
rotate::RotateTest::(640x480, 0, 16SC1)   2.598    1.615      1.582       1.61       1.64    
rotate::RotateTest::(640x480, 0, 32SC1)   5.223    4.864      4.576       1.07       1.14    
rotate::RotateTest::(640x480, 0, 32FC1)   5.174    4.877      4.560       1.06       1.13    
rotate::RotateTest::(640x480, 0, 8UC2)    2.599    1.619      1.595       1.61       1.63    
rotate::RotateTest::(640x480, 0, 16SC2)   5.206    4.874      4.567       1.07       1.14    
rotate::RotateTest::(640x480, 0, 8UC3)    6.390    4.246      4.465       1.51       1.43    
rotate::RotateTest::(640x480, 0, 16SC3)   10.642  14.469     12.632       0.74       0.84    
rotate::RotateTest::(640x480, 0, 8UC4)    5.254    4.869      4.567       1.08       1.15    
rotate::RotateTest::(640x480, 0, 16SC4)   10.541   9.252      8.709       1.14       1.21    
rotate::RotateTest::(640x480, 1, 8UC1)    0.301    0.271      0.264       1.11       1.14    
rotate::RotateTest::(640x480, 1, 8SC1)    0.291    0.271      0.264       1.08       1.10    
rotate::RotateTest::(640x480, 1, 16SC1)   0.964    0.450      0.442       2.14       2.18    
rotate::RotateTest::(640x480, 1, 32SC1)   2.256    0.892      0.886       2.53       2.55    
rotate::RotateTest::(640x480, 1, 32FC1)   2.264    0.894      0.887       2.53       2.55    
rotate::RotateTest::(640x480, 1, 8UC2)    1.003    0.450      0.442       2.23       2.27    
rotate::RotateTest::(640x480, 1, 16SC2)   2.196    0.892      0.887       2.46       2.48    
rotate::RotateTest::(640x480, 1, 8UC3)    2.667    0.333      0.333       8.01       8.01    
rotate::RotateTest::(640x480, 1, 16SC3)   4.746    0.686      0.683       6.91       6.94    
rotate::RotateTest::(640x480, 1, 8UC4)    2.229    0.892      0.887       2.50       2.51    
rotate::RotateTest::(640x480, 1, 16SC4)   4.535    1.806      1.804       2.51       2.51    
rotate::RotateTest::(640x480, 2, 8UC1)    0.761    0.636      1.010       1.20       0.75    
rotate::RotateTest::(640x480, 2, 8SC1)    0.751    0.648      1.007       1.16       0.75    
rotate::RotateTest::(640x480, 2, 16SC1)   2.460    1.198      1.871       2.05       1.31    
rotate::RotateTest::(640x480, 2, 32SC1)   4.671    3.761      4.619       1.24       1.01    
rotate::RotateTest::(640x480, 2, 32FC1)   4.685    3.768      4.608       1.24       1.02    
rotate::RotateTest::(640x480, 2, 8UC2)    2.449    1.201      1.872       2.04       1.31    
rotate::RotateTest::(640x480, 2, 16SC2)   4.695    3.760      4.620       1.25       1.02    
rotate::RotateTest::(640x480, 2, 8UC3)    4.480    3.562      4.903       1.26       0.91    
rotate::RotateTest::(640x480, 2, 16SC3)   8.284   10.434     11.903       0.79       0.70    
rotate::RotateTest::(640x480, 2, 8UC4)    4.655    3.760      4.614       1.24       1.01    
rotate::RotateTest::(640x480, 2, 16SC4)   9.100    6.412      8.424       1.42       1.08    
rotate::RotateTest::(1280x720, 0, 8UC1)   6.443    3.788      3.550       1.70       1.81    
rotate::RotateTest::(1280x720, 0, 8SC1)   6.500    3.775      3.549       1.72       1.83    
rotate::RotateTest::(1280x720, 0, 16SC1)  12.011  10.605      9.422       1.13       1.27    
rotate::RotateTest::(1280x720, 0, 32SC1)  28.098  16.958     16.473       1.66       1.71    
rotate::RotateTest::(1280x720, 0, 32FC1)  28.099  17.002     16.513       1.65       1.70    
rotate::RotateTest::(1280x720, 0, 8UC2)   12.000  10.578      9.450       1.13       1.27    
rotate::RotateTest::(1280x720, 0, 16SC2)  27.971  16.978     16.517       1.65       1.69    
rotate::RotateTest::(1280x720, 0, 8UC3)   22.391  24.277     22.781       0.92       0.98    
rotate::RotateTest::(1280x720, 0, 16SC3)  37.854  44.261     43.622       0.86       0.87    
rotate::RotateTest::(1280x720, 0, 8UC4)   27.965  17.000     16.500       1.64       1.69    
rotate::RotateTest::(1280x720, 0, 16SC4)  38.146  18.788     18.578       2.03       2.05    
rotate::RotateTest::(1280x720, 1, 8UC1)   1.600    0.671      0.659       2.39       2.43    
rotate::RotateTest::(1280x720, 1, 8SC1)   1.611    0.670      0.660       2.40       2.44    
rotate::RotateTest::(1280x720, 1, 16SC1)  3.386    1.349      1.339       2.51       2.53    
rotate::RotateTest::(1280x720, 1, 32SC1)  6.678    2.724      2.722       2.45       2.45    
rotate::RotateTest::(1280x720, 1, 32FC1)  6.666    2.725      2.723       2.45       2.45    
rotate::RotateTest::(1280x720, 1, 8UC2)   3.389    1.349      1.339       2.51       2.53    
rotate::RotateTest::(1280x720, 1, 16SC2)  6.869    2.725      2.723       2.52       2.52    
rotate::RotateTest::(1280x720, 1, 8UC3)   8.663    1.053      1.052       8.22       8.24    
rotate::RotateTest::(1280x720, 1, 16SC3)  14.379   2.158      2.131       6.66       6.75    
rotate::RotateTest::(1280x720, 1, 8UC4)   6.726    2.728      2.722       2.47       2.47    
rotate::RotateTest::(1280x720, 1, 16SC4)  13.201   5.440      5.450       2.43       2.42    
rotate::RotateTest::(1280x720, 2, 8UC1)   5.733    2.993      3.782       1.92       1.52    
rotate::RotateTest::(1280x720, 2, 8SC1)   5.716    2.988      3.781       1.91       1.51    
rotate::RotateTest::(1280x720, 2, 16SC1)  9.806    8.583      9.319       1.14       1.05    
rotate::RotateTest::(1280x720, 2, 32SC1)  25.903  11.944     15.091       2.17       1.72    
rotate::RotateTest::(1280x720, 2, 32FC1)  25.882  11.958     15.138       2.16       1.71    
rotate::RotateTest::(1280x720, 2, 8UC2)   9.818    8.523      9.232       1.15       1.06    
rotate::RotateTest::(1280x720, 2, 16SC2)  25.885  11.953     15.126       2.17       1.71    
rotate::RotateTest::(1280x720, 2, 8UC3)   16.579  20.632     22.945       0.80       0.72    
rotate::RotateTest::(1280x720, 2, 16SC3)  28.902  35.427     39.818       0.82       0.73    
rotate::RotateTest::(1280x720, 2, 8UC4)   25.954  11.940     15.083       2.17       1.72    
rotate::RotateTest::(1280x720, 2, 16SC4)  33.502   8.446     15.421       3.97       2.17    
rotate::RotateTest::(1920x1080, 0, 8UC1)  15.608  10.823      9.851       1.44       1.58    
rotate::RotateTest::(1920x1080, 0, 8SC1)  15.583  10.881      9.879       1.43       1.58    
rotate::RotateTest::(1920x1080, 0, 16SC1) 28.304  29.307     26.996       0.97       1.05    
rotate::RotateTest::(1920x1080, 0, 32SC1) 58.190  37.351     35.347       1.56       1.65    
rotate::RotateTest::(1920x1080, 0, 32FC1) 58.225  37.346     35.349       1.56       1.65    
rotate::RotateTest::(1920x1080, 0, 8UC2)  28.338  29.349     26.803       0.97       1.06    
rotate::RotateTest::(1920x1080, 0, 16SC2) 57.923  37.728     35.293       1.54       1.64    
rotate::RotateTest::(1920x1080, 0, 8UC3)  52.298  62.332     60.117       0.84       0.87    
rotate::RotateTest::(1920x1080, 0, 16SC3) 90.399  100.544    98.567       0.90       0.92    
rotate::RotateTest::(1920x1080, 0, 8UC4)  57.942  37.726     35.275       1.54       1.64    
rotate::RotateTest::(1920x1080, 0, 16SC4) 86.510  46.696     46.517       1.85       1.86    
rotate::RotateTest::(1920x1080, 1, 8UC1)  3.714    1.620      1.602       2.29       2.32    
rotate::RotateTest::(1920x1080, 1, 8SC1)  3.666    1.619      1.602       2.26       2.29    
rotate::RotateTest::(1920x1080, 1, 16SC1) 7.514    3.075      3.065       2.44       2.45    
rotate::RotateTest::(1920x1080, 1, 32SC1) 15.067   6.123      6.130       2.46       2.46    
rotate::RotateTest::(1920x1080, 1, 32FC1) 15.042   6.125      6.128       2.46       2.45    
rotate::RotateTest::(1920x1080, 1, 8UC2)  7.511    3.073      3.063       2.44       2.45    
rotate::RotateTest::(1920x1080, 1, 16SC2) 15.013   6.117      6.128       2.45       2.45    
rotate::RotateTest::(1920x1080, 1, 8UC3)  18.756   2.434      2.371       7.71       7.91    
rotate::RotateTest::(1920x1080, 1, 16SC3) 31.672   4.811      4.778       6.58       6.63    
rotate::RotateTest::(1920x1080, 1, 8UC4)  15.019   6.115      6.129       2.46       2.45    
rotate::RotateTest::(1920x1080, 1, 16SC4) 29.099  12.198     12.232       2.39       2.38    
rotate::RotateTest::(1920x1080, 2, 8UC1)  13.198   8.663      9.856       1.52       1.34    
rotate::RotateTest::(1920x1080, 2, 8SC1)  13.201   8.669      9.767       1.52       1.35    
rotate::RotateTest::(1920x1080, 2, 16SC1) 22.716  24.120     26.190       0.94       0.87    
rotate::RotateTest::(1920x1080, 2, 32SC1) 46.976  25.030     30.923       1.88       1.52    
rotate::RotateTest::(1920x1080, 2, 32FC1) 46.888  25.051     30.919       1.87       1.52    
rotate::RotateTest::(1920x1080, 2, 8UC2)  22.694  24.143     26.107       0.94       0.87    
rotate::RotateTest::(1920x1080, 2, 16SC2) 46.994  24.887     30.904       1.89       1.52    
rotate::RotateTest::(1920x1080, 2, 8UC3)  38.701  50.694     55.418       0.76       0.70    
rotate::RotateTest::(1920x1080, 2, 16SC3) 71.227  83.326     92.992       0.85       0.77    
rotate::RotateTest::(1920x1080, 2, 8UC4)  46.917  24.879     30.901       1.89       1.52    
rotate::RotateTest::(1920x1080, 2, 16SC4) 76.545  21.244     37.281       3.60       2.05
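
The x-factor columns in both tables are consistent with the reference time divided by the patched time, rounded to two decimals, so values above 1.00 mean the patched build is faster. A minimal sketch of that arithmetic, checked against a few rows above, follows. The helper names `x_factor` and `round2` are hypothetical and are not part of OpenCV's perf tooling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: speed-up factor as reference time over patched time.
// Both arguments must use the same time unit as the table columns.
double x_factor(double reference_time, double patched_time)
{
    assert(patched_time > 0.0);
    return reference_time / patched_time;
}

// Round to two decimals, matching how the tables print the factor.
double round2(double value)
{
    return std::round(value * 100.0) / 100.0;
}
```

For example, the `(640x480, 0, 8UC1)` row of the K1 GCC table gives `round2(x_factor(1.473, 0.827))`, which is 1.78, and the `(1280x720, 2, 16SC4)` row of the K1 vs RK3568 table gives `round2(x_factor(33.502, 8.446))`, which is 3.97.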

perf-flipi.zip

@fengyuentau fengyuentau changed the title from "hal_rvv: implemented flip_inplace in hal_rvv to boost cv::rotate" to "hal/riscv_rvv: implemented flip_inplace in hal_rvv to boost cv::rotate" on Apr 28, 2025
@fengyuentau fengyuentau changed the title from "hal/riscv_rvv: implemented flip_inplace in hal_rvv to boost cv::rotate" to "hal/riscv_rvv: implemented flip_inplace to boost cv::rotate" on Apr 28, 2025
@asmorkalov asmorkalov merged commit 956f583 into opencv:4.x Apr 28, 2025
27 of 28 checks passed
@fengyuentau fengyuentau deleted the 4x/hal_rvv/rotate branch April 28, 2025 08:04
@asmorkalov asmorkalov mentioned this pull request Apr 29, 2025
Projects

Status: Done

2 participants