Impl RISC-V HAL for norm_hamming #26918
Merged
asmorkalov merged 1 commit into opencv:4.x on Feb 26, 2025
asmorkalov (Contributor) reviewed on Feb 15, 2025:
Please rebase and fix conflicts.
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Force-pushed from 768327d to 33d632f
Contributor:
Perf results for Spacemit Muse Pi v30 (GCC 14.2):
asmorkalov approved these changes on Feb 25, 2025
Implement through the existing `cv_hal_normHamming8u` and `cv_hal_normHammingDiff8u` interfaces.

Modified `modules/core/src/norm.cpp:680` to ensure HAL calls are not bypassed.

Tested on

Compare with scalar:

Geometric mean (ms)

| Name of Test | scalar | hal | hal vs scalar (x-factor) |
|---|---|---|---|
| norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480) | 0.962 | 0.128 | 7.52 |
| norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) | 6.506 | 0.803 | 8.10 |
| norm2::PerfHamming::(NORM_HAMMING, 8UC1, 640x480) | 0.964 | 0.137 | 7.05 |
| norm2::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080) | 6.426 | 0.660 | 9.74 |
| norm::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480) | 0.606 | 0.067 | 9.04 |
| norm::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) | 4.122 | 0.427 | 9.65 |
| norm::PerfHamming::(NORM_HAMMING, 8UC1, 640x480) | 0.610 | 0.049 | 12.55 |
| norm::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080) | 4.135 | 0.333 | 12.43 |

Compare with ui:

Geometric mean (ms)

| Name of Test | ui | hal | hal vs ui (x-factor) |
|---|---|---|---|
| norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480) | 0.734 | 0.128 | 5.74 |
| norm2::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) | 4.915 | 0.803 | 6.12 |
| norm2::PerfHamming::(NORM_HAMMING, 8UC1, 640x480) | 0.716 | 0.137 | 5.24 |
| norm2::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080) | 4.771 | 0.660 | 7.23 |
| norm::PerfHamming::(NORM_HAMMING2, 8UC1, 640x480) | 0.690 | 0.067 | 10.29 |
| norm::PerfHamming::(NORM_HAMMING2, 8UC1, 1920x1080) | 4.670 | 0.427 | 10.93 |
| norm::PerfHamming::(NORM_HAMMING, 8UC1, 640x480) | 0.652 | 0.049 | 13.40 |
| norm::PerfHamming::(NORM_HAMMING, 8UC1, 1920x1080) | 4.433 | 0.333 | 13.33 |

While working on this, I noticed that the optimization logic in `cv::norm` could be improved:

- `cv_hal_norm` is too restrictive. I think the optimization could also be performed after unfolding, using map iteration.
- `cv_hal_norm` overlaps with `cv_hal_normHamming8u`. Because of the former's strictness, I implemented the optimization through the latter.
- `cv::hal::normHamming` has four overloads: two in `norm.cpp` and two in `stat.dispatch.cpp`. The latter two should only be called by the former two, since `CALL_HAL_RET` is present only in the two in `norm.cpp`. Because their parameters are similar, I suggest renaming the two in `stat.dispatch.cpp` to avoid confusion.

Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.
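
For reference, the quantity that the `cv_hal_normHamming8u` and `cv_hal_normHammingDiff8u` hooks are expected to return can be sketched in portable scalar C++. This is not the RVV code from this patch. The names `countNonZeroCells`, `normHammingRef` and `normHammingDiffRef` are hypothetical, and the sketch assumes `cellSize` is 1 for `NORM_HAMMING` and 2 (or 4) for `NORM_HAMMING2`-style counting, where each group of `cellSize` bits counts as one if any bit in it is set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of non-zero cellSize-bit groups in one byte.
// cellSize == 1 is a plain population count (NORM_HAMMING);
// cellSize == 2 counts non-zero bit pairs (NORM_HAMMING2).
int countNonZeroCells(std::uint8_t v, int cellSize)
{
    assert(cellSize == 1 || cellSize == 2 || cellSize == 4);
    const unsigned mask = (1u << cellSize) - 1u;
    int count = 0;
    for (int shift = 0; shift < 8; shift += cellSize)
        count += ((v >> shift) & mask) != 0;
    return count;
}

// Scalar reference for the Hamming norm of a single byte buffer.
int normHammingRef(const std::uint8_t* a, std::size_t n, int cellSize)
{
    int result = 0;
    for (std::size_t i = 0; i < n; ++i)
        result += countNonZeroCells(a[i], cellSize);
    return result;
}

// Scalar reference for the Hamming distance between two byte buffers:
// XOR the inputs, then count the non-zero cells of the difference.
int normHammingDiffRef(const std::uint8_t* a, const std::uint8_t* b,
                       std::size_t n, int cellSize)
{
    int result = 0;
    for (std::size_t i = 0; i < n; ++i)
        result += countNonZeroCells(static_cast<std::uint8_t>(a[i] ^ b[i]), cellSize);
    return result;
}
```

A vectorized HAL implementation can be validated against a reference of this kind on random buffers, including lengths that are not a multiple of the vector width.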