Skip to content

[HAL RVV] reuse atan | impl cart_to_polar | add perf test#27000

Merged
asmorkalov merged 2 commits intoopencv:4.xfrom
GenshinImpactStarts:cart_to_polar
Mar 13, 2025
Merged

[HAL RVV] reuse atan | impl cart_to_polar | add perf test#27000
asmorkalov merged 2 commits intoopencv:4.xfrom
GenshinImpactStarts:cart_to_polar

Conversation

@GenshinImpactStarts
Copy link
Copy Markdown
Contributor

@GenshinImpactStarts GenshinImpactStarts commented Mar 3, 2025

Implement through the existing cv_hal_cartToPolar32f and cv_hal_cartToPolar64f interfaces.

Add cartToPolar performance tests.

cv_hal_rvv::fast_atan is modified to make it more reusable because it's needed in cartToPolar.

UPDATE: UI enabled. Since the vec type of RVV can't be stored in struct. UI implementation of v_atan_f32 is modified. Both fastAtan and cartToPolar are affected so the test result for atan is also appended. I have tested the modified UI on RVV and AVX2 and no regressions appears.

Perf test done on MUSE-PI. AVX2 test done on Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz.

$ opencv_test_core --gtest_filter="*CartToPolar*:*Core_CartPolar_reverse*:*Phase*" 
$ opencv_perf_core --gtest_filter="*CartToPolar*:*phase*" --perf_min_samples=300 --perf_force_samples=300

Test result between enabled UI and HAL:

                   Name of Test                       ui    rvv      rvv    
                                                                      vs    
                                                                      ui    
                                                                  (x-factor)
CartToPolar::CartToPolarFixture::(127x61, 32FC1)    0.106  0.059     1.80   
CartToPolar::CartToPolarFixture::(127x61, 64FC1)    0.155  0.070     2.20   
CartToPolar::CartToPolarFixture::(640x480, 32FC1)   4.188  2.317     1.81   
CartToPolar::CartToPolarFixture::(640x480, 64FC1)   6.593  2.889     2.28   
CartToPolar::CartToPolarFixture::(1280x720, 32FC1)  12.600 7.057     1.79   
CartToPolar::CartToPolarFixture::(1280x720, 64FC1)  19.860 8.797     2.26   
CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 28.295 15.809    1.79   
CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 44.573 19.398    2.30   
phase32f::VectorLength::128                         0.002  0.002     1.20   
phase32f::VectorLength::1000                        0.008  0.006     1.32   
phase32f::VectorLength::131072                      1.061  0.731     1.45   
phase32f::VectorLength::524288                      3.997  2.976     1.34   
phase32f::VectorLength::1048576                     8.001  5.959     1.34   
phase64f::VectorLength::128                         0.002  0.002     1.33   
phase64f::VectorLength::1000                        0.012  0.008     1.58   
phase64f::VectorLength::131072                      1.648  0.931     1.77   
phase64f::VectorLength::524288                      6.836  3.837     1.78   
phase64f::VectorLength::1048576                     14.060 7.540     1.86   

Test result before and after enabling UI on RVV:

                   Name of Test                      perf   perf     perf   
                                                      ui     ui       ui    
                                                     orig    pr       pr    
                                                                      vs    
                                                                     perf   
                                                                      ui    
                                                                     orig   
                                                                  (x-factor)
CartToPolar::CartToPolarFixture::(127x61, 32FC1)    0.141  0.106     1.33   
CartToPolar::CartToPolarFixture::(127x61, 64FC1)    0.187  0.155     1.20   
CartToPolar::CartToPolarFixture::(640x480, 32FC1)   5.990  4.188     1.43   
CartToPolar::CartToPolarFixture::(640x480, 64FC1)   8.370  6.593     1.27   
CartToPolar::CartToPolarFixture::(1280x720, 32FC1)  18.214 12.600    1.45   
CartToPolar::CartToPolarFixture::(1280x720, 64FC1)  25.365 19.860    1.28   
CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 40.437 28.295    1.43   
CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 56.699 44.573    1.27   
phase32f::VectorLength::128                         0.003  0.002     1.54   
phase32f::VectorLength::1000                        0.016  0.008     1.90   
phase32f::VectorLength::131072                      2.048  1.061     1.93   
phase32f::VectorLength::524288                      8.219  3.997     2.06   
phase32f::VectorLength::1048576                     16.426 8.001     2.05   
phase64f::VectorLength::128                         0.003  0.002     1.44   
phase64f::VectorLength::1000                        0.020  0.012     1.60   
phase64f::VectorLength::131072                      2.621  1.648     1.59   
phase64f::VectorLength::524288                      10.780 6.836     1.58   
phase64f::VectorLength::1048576                     22.723 14.060    1.62   

Test result before and after modifying UI on AVX2:

                   Name of Test                     perf  perf     perf   
                                                    avx2  avx2     avx2   
                                                    orig   pr       pr    
                                                                    vs    
                                                                   perf   
                                                                   avx2   
                                                                   orig   
                                                                (x-factor)
CartToPolar::CartToPolarFixture::(127x61, 32FC1)    0.006 0.005    1.14   
CartToPolar::CartToPolarFixture::(127x61, 64FC1)    0.010 0.009    1.08   
CartToPolar::CartToPolarFixture::(640x480, 32FC1)   0.273 0.264    1.03   
CartToPolar::CartToPolarFixture::(640x480, 64FC1)   0.511 0.487    1.05   
CartToPolar::CartToPolarFixture::(1280x720, 32FC1)  0.760 0.723    1.05   
CartToPolar::CartToPolarFixture::(1280x720, 64FC1)  2.009 1.937    1.04   
CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 1.996 1.923    1.04   
CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 5.721 5.509    1.04   
phase32f::VectorLength::128                         0.000 0.000    0.98   
phase32f::VectorLength::1000                        0.001 0.001    0.97   
phase32f::VectorLength::131072                      0.105 0.111    0.95   
phase32f::VectorLength::524288                      0.402 0.402    1.00   
phase32f::VectorLength::1048576                     0.775 0.767    1.01   
phase64f::VectorLength::128                         0.000 0.000    1.00   
phase64f::VectorLength::1000                        0.001 0.001    1.01   
phase64f::VectorLength::131072                      0.163 0.162    1.01   
phase64f::VectorLength::524288                      0.669 0.653    1.02   
phase64f::VectorLength::1048576                     1.660 1.634    1.02   

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@GenshinImpactStarts
Copy link
Copy Markdown
Contributor Author

This patch also can be benefit from PR #27015. Wait #27015 to be merged first.

@fengyuentau fengyuentau self-requested a review March 10, 2025 07:16
@fengyuentau
Copy link
Copy Markdown
Member

My performance results (K1 vs RK3568):

                   Name of Test                       rk   patch-gcc patch-clang patch-gcc  patch-clang
                                                                                     vs         vs     
                                                                                     rk         rk     
                                                                                 (x-factor) (x-factor) 
CartToPolar::CartToPolarFixture::(127x61, 32FC1)    0.059    0.070      0.070       0.84       0.84    
CartToPolar::CartToPolarFixture::(127x61, 64FC1)    0.108    0.082      0.081       1.31       1.34    
CartToPolar::CartToPolarFixture::(640x480, 32FC1)   2.783    2.721      2.724       1.02       1.02    
CartToPolar::CartToPolarFixture::(640x480, 64FC1)   5.061    3.297      3.202       1.53       1.58    
CartToPolar::CartToPolarFixture::(1280x720, 32FC1)  8.077    8.188      8.224       0.99       0.98    
CartToPolar::CartToPolarFixture::(1280x720, 64FC1)  15.240   9.932      9.647       1.53       1.58    
CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 16.341  18.392     18.484       0.89       0.88    
CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 34.559  22.289     21.517       1.55       1.61 

@GenshinImpactStarts
Copy link
Copy Markdown
Contributor Author

This patch can be benefit from #27015. I think we should do perf test after using sqrt in that PR

@fengyuentau
Copy link
Copy Markdown
Member

This patch can be benefit from #27015. I think we should do perf test after using sqrt in that PR

Thank you for the hint. I will redo performance testing once updated.

@fengyuentau
Copy link
Copy Markdown
Member

This patch can be benefit from #27015. I think we should do perf test after using sqrt in that PR

#27015 is just merged.

@GenshinImpactStarts
Copy link
Copy Markdown
Contributor Author

UI enabled. Some explanation and the perf result is updated.

UPDATE: UI enabled. Since the vec type of RVV can't be stored in struct. UI implementation of v_atan_f32 is modified. Both fastAtan and cartToPolar are affected so the test result for atan is also appended. I have tested the modified UI on RVV and AVX2 and no regressions appears.

@fengyuentau
Copy link
Copy Markdown
Member

Updated performance results (K1 vs RK3568):

                   Name of Test                       rk   k1-patch-gcc k1-patch-clang k1-patch-gcc k1-patch-clang
                                                                                            vs            vs
                                                                                            rk            rk
                                                                                        (x-factor)    (x-factor)
CartToPolar::CartToPolarFixture::(127x61, 32FC1)    0.060     0.060         0.060          1.00          1.01
CartToPolar::CartToPolarFixture::(127x61, 64FC1)    0.117     0.072         0.073          1.63          1.61
CartToPolar::CartToPolarFixture::(640x480, 32FC1)   2.485     2.337         2.329          1.06          1.07
CartToPolar::CartToPolarFixture::(640x480, 64FC1)   5.246     2.838         2.813          1.85          1.87
CartToPolar::CartToPolarFixture::(1280x720, 32FC1)  7.857     6.973         6.964          1.13          1.13
CartToPolar::CartToPolarFixture::(1280x720, 64FC1)  15.673    8.522         8.429          1.84          1.86
CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 16.160    15.675        15.598         1.03          1.04
CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 35.563    19.060        18.825         1.87          1.89

Nice 👍

@asmorkalov asmorkalov added this to the 4.12.0 milestone Mar 13, 2025
@fengyuentau
Copy link
Copy Markdown
Member

@GenshinImpactStarts Please resolve conflicts.

GenshinImpactStarts and others added 2 commits March 13, 2025 13:27
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
@asmorkalov asmorkalov merged commit 2a8d4b8 into opencv:4.x Mar 13, 2025
23 of 28 checks passed
@GenshinImpactStarts GenshinImpactStarts deleted the cart_to_polar branch March 13, 2025 14:20
@asmorkalov asmorkalov mentioned this pull request Apr 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants