Skip to content

Impl RISC-V HAL for norm_hamming#26918

Merged
asmorkalov merged 1 commit intoopencv:4.xfrom
GenshinImpactStarts:norm_hamming
Feb 26, 2025
Merged

Impl RISC-V HAL for norm_hamming#26918
asmorkalov merged 1 commit intoopencv:4.xfrom
GenshinImpactStarts:norm_hamming

Conversation

@GenshinImpactStarts
Copy link
Copy Markdown
Contributor

Implement through the existing cv_hal_normHamming8u and cv_hal_normHammingDiff8u interfaces.

Modified modules/core/src/norm.cpp:680 to ensure HAL calls are not bypassed.

Tested on

  • MUSE-PI
  • Compiler: gcc 14.2 (riscv-collab/riscv-gnu-toolchain Nightly: December 16, 2024)
$ opencv_test_core --gtest_filter="Core_Norm/ElemWiseTest*"
$ opencv_perf_core --gtest_filter="PerfHamming*" --perf_min_samples=300 --perf_force_samples=300

Compare with scalar:

Geometric mean (ms)

                    Name of Test                     scalar  hal      hal    
                                                                       vs    
                                                                     scalar  
                                                                   (x-factor)
norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480)   0.962  0.128     7.52   
norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) 6.506  0.803     8.10   
norm2::PerfHamming::(NORM_HAMMING, 8UC1, 640x480)    0.964  0.137     7.05   
norm2::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080)  6.426  0.660     9.74   
norm::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480)    0.606  0.067     9.04   
norm::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080)  4.122  0.427     9.65   
norm::PerfHamming::(NORM_HAMMING, 8UC1, 640x480)     0.610  0.049    12.55   
norm::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080)   4.135  0.333    12.43   

Compare with ui:

Geometric mean (ms)

                    Name of Test                       ui    hal      hal    
                                                                       vs    
                                                                       ui    
                                                                   (x-factor)
norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480)   0.734  0.128     5.74   
norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) 4.915  0.803     6.12   
norm2::PerfHamming::(NORM_HAMMING, 8UC1, 640x480)    0.716  0.137     5.24   
norm2::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080)  4.771  0.660     7.23   
norm::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480)    0.690  0.067    10.29   
norm::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080)  4.670  0.427    10.93   
norm::PerfHamming::(NORM_HAMMING, 8UC1, 640x480)     0.652  0.049    13.40   
norm::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080)   4.433  0.333    13.33  

While working on this, I noticed that the optimization logic in cv::norm could be improved:

  1. The recently introduced cv_hal_norm is too restrictive. I think optimization also can be performed after unfolding using map iteration.
  2. cv_hal_norm overlaps with cv_hal_normHamming8u, and due to its strictness, I implemented the optimization through the latter.
  3. cv::hal::normHamming has four overloads—two in norm.cpp and two in stat.dispatch.cpp. The latter should only be used by the former, as CALL_HAL_RET is present only in the two in norm.cpp. Since they have similar parameters, to avoid confusion, I suggest renaming the two in stat.dispatch.cpp.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@asmorkalov
Copy link
Copy Markdown
Contributor

Please rebase and fix conflicts.

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
@asmorkalov
Copy link
Copy Markdown
Contributor

Perf results for Spacemit Muse Pi v30 (GCC 14.2):

norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480) 	0.740 	0.128 	5.80
norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) 	4.925 	0.805 	6.11
norm2::PerfHamming::(NORM_HAMMING, 8UC1, 640x480) 	0.724 	0.125 	5.80
norm2::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080) 	4.842 	0.656 	7.38
norm::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480) 	0.698 	0.063 	11.01
norm::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) 	4.706 	0.429 	10.98
norm::PerfHamming::(NORM_HAMMING, 8UC1, 640x480) 	0.656 	0.047 	13.87
norm::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080) 	4.506 	0.343 	13.16 

@asmorkalov asmorkalov self-assigned this Feb 25, 2025
@asmorkalov asmorkalov merged commit 3db1247 into opencv:4.x Feb 26, 2025
27 of 28 checks passed
@asmorkalov asmorkalov mentioned this pull request Mar 4, 2025
@GenshinImpactStarts GenshinImpactStarts deleted the norm_hamming branch March 12, 2025 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants