3rdparty: NDSRVP - Part 1.5: New Interfaces #25786
modules/imgproc/src/imgwarp.cpp
opt_SSE4_1::WarpAffineInvoker_Blockline_SSE41(adelta + x, bdelta + x, xy, X0, Y0, bw);
else
#endif
if( cv_hal_warpAffineBlocklineNN(adelta + x, bdelta + x, xy, X0, Y0, bw) != CV_HAL_ERROR_OK )
Please extract the whole block into a cv::hal:: function. Declare it here: https://github.com/opencv/opencv/blob/4.x/modules/imgproc/include/opencv2/imgproc/hal/hal.hpp. The implementation can reside somewhere in this file (imgwarp.cpp).
This new function should first try the external HAL function (using the CALL_HAL macro), then AVX, SSE, LASX and universal intrinsics, and finally the fallback implementation.
Then cv::warpAffine should call this new cv::hal:: function for CPU processing.
There are some functions implemented this way, e.g. cv::hal::normHamming (opencv/modules/core/src/norm.cpp, lines 53 to 100 in 8d935e2).
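For illustration, the dispatch order requested above could be sketched roughly as follows. This is a standalone mock, not OpenCV code: every name in it (sketchWarpAffineBlocklineNN, externalHalBlocklineNN, simdBlocklineNN, HalStatus) is hypothetical, the external hook and the SIMD path are stubs that always decline, and the scalar fallback assumes 10 fixed-point fraction bits and saturation to short.

```cpp
#include <cassert>
#include <climits>

// Hypothetical status codes standing in for the HAL return values.
enum HalStatus { HAL_OK = 0, HAL_NOT_IMPLEMENTED = 1 };

// Assumption: coordinates carry 10 fractional bits of fixed-point precision.
const int AB_BITS = 10;

// Clamp an int into the range of short.
static short saturateToShort(int v)
{
    if (v < SHRT_MIN) return SHRT_MIN;
    if (v > SHRT_MAX) return SHRT_MAX;
    return static_cast<short>(v);
}

// Hypothetical external HAL hook. This stub always reports "not implemented",
// so the caller falls through to the next stage.
static HalStatus externalHalBlocklineNN(const int*, const int*, short*, int, int, int)
{
    return HAL_NOT_IMPLEMENTED;
}

// Hypothetical SIMD stage (AVX, SSE, LASX, universal intrinsics would go here).
// Returns false when no vector path handled the request.
static bool simdBlocklineNN(const int*, const int*, short*, int, int, int)
{
    return false;
}

// Scalar fallback: nearest-neighbour coordinates for one block line.
static void scalarBlocklineNN(const int* adelta, const int* bdelta, short* xy,
                              int X0, int Y0, int bw)
{
    for (int x = 0; x < bw; ++x)
    {
        xy[2 * x]     = saturateToShort((X0 + adelta[x]) >> AB_BITS);
        xy[2 * x + 1] = saturateToShort((Y0 + bdelta[x]) >> AB_BITS);
    }
}

// Single entry point: external HAL first, then SIMD, then the scalar fallback.
void sketchWarpAffineBlocklineNN(const int* adelta, const int* bdelta, short* xy,
                                 int X0, int Y0, int bw)
{
    assert(bw >= 0);
    if (externalHalBlocklineNN(adelta, bdelta, xy, X0, Y0, bw) == HAL_OK)
        return;
    if (simdBlocklineNN(adelta, bdelta, xy, X0, Y0, bw))
        return;
    scalarBlocklineNN(adelta, bdelta, xy, X0, Y0, bw);
}
```

With this shape, a caller only ever sees the single entry point, and each stage can decline without the caller needing its own #if / else chain.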
Four new functions have been extracted. Further accuracy checks might be needed on the other affected platforms (AVX2, LASX, etc.).
@Junyan721113 friendly reminder.
cc @vpisarev
22d10fe to f3729de
e4d8dd2 to 7a0336d
@fengyuentau Please pay attention to the changes.
Okay, it seems the changes do not modify the core method of warpAffine and warpPerspective. The new kernel is written entirely with universal intrinsics (some parts use NEON intrinsics for the best performance). Could this be merged soon? Otherwise it may lead to merge conflicts.
7a0336d to 35463e0
@asmorkalov are there any other changes that need to be made?
Summary
Previous context
From PR #25167:
Part 1.5: New Interfaces (Ready for PR)
cv::ndsrvp::warpAffine & cv::ndsrvp::warpPerspective into ...Blockline & ...BlocklineNN
cv::ndsrvp::remap via new cv_hal_remap32f interface
What is worth noting is that the remap function called by warpAffine and warpPerspective does not use the HAL interface cv_hal_remap32f.
Performance tests
Remap
Geometric mean (ms)
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.