bug fixes for universal intrinsics of RISC-V back-end #20412
alalek merged 3 commits into opencv:master
Set all bits to one for return value of int and fp comparators.
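As a rough illustration of why the all-ones convention matters, the sketch below models a single 8-bit lane in plain C++. The helper names (`lane_lt_mask`, `lane_select`, `lane_min`) are hypothetical and are not part of OpenCV; they only mimic the pattern in which a comparison result is consumed as a bit mask by a bitwise select.

```cpp
#include <cassert>   // only needed for the usage checks described below
#include <cstdint>

// Hypothetical scalar stand-ins for one lane of a vector register.
// Comparator that yields an all-ones mask (0xFF) when true, 0x00 when false.
inline uint8_t lane_lt_mask(uint8_t a, uint8_t b)
{
    return a < b ? uint8_t(0xFF) : uint8_t(0x00);
}

// Bitwise select: takes bits of a where mask bits are set, bits of b elsewhere.
inline uint8_t lane_select(uint8_t mask, uint8_t a, uint8_t b)
{
    return uint8_t((mask & a) | (~mask & b));
}

// Lane-wise minimum built from the two helpers above.
inline uint8_t lane_min(uint8_t a, uint8_t b)
{
    return lane_select(lane_lt_mask(a, b), a, b);
}
```

With the all-ones mask, `assert(lane_min(3, 9) == 3)` holds. If a comparator returned `1` for true instead, the same select would mix bits from both operands: `lane_select(1, 200, 7)` evaluates to `6`, which is neither input.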
cc @asmorkalov
Related: #20393
    uint64 CV_DECL_ALIGNED(32) ptr[2] = {0x0908060504020100, 0xFFFFFFFF0E0D0C0A};
    return v_int8x16((vint8m1_t)vrgather_vv_u8m1((vuint8m1_t)vint8m1_t(vec), (vuint8m1_t)vle64_v_u64m1(ptr, 2), 16));
}
inline v_uint8x16 v_pack_triplets(const v_uint8x16& vec) { return v_reinterpret_as_u8(v_pack_triplets(v_reinterpret_as_s8(vec))); }
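To make the index table in the snippet easier to read, here is a scalar sketch of what the gather computes. It is not the RVV implementation: `gather16` and `pack_triplets_ref` are hypothetical names, and the only facts taken from the diff are the two 64-bit constants. The sketch assumes the constants are laid out in little-endian byte order (as on RISC-V) and that an out-of-range gather index yields 0, which is how `vrgather` is specified.

```cpp
#include <array>
#include <cassert>   // only needed for the usage checks described below
#include <cstdint>

// Scalar emulation of a 16-lane byte gather.
// Indices >= 16 produce 0, mirroring vrgather's out-of-range rule.
inline std::array<int8_t, 16> gather16(const std::array<int8_t, 16>& src,
                                       const std::array<uint8_t, 16>& idx)
{
    std::array<int8_t, 16> dst{};
    for (int i = 0; i < 16; ++i)
        dst[i] = idx[i] < 16 ? src[idx[i]] : int8_t(0);
    return dst;
}

// Reference for the triplet packing: keep lanes 0-2, 4-6, 8-10, 12-14.
inline std::array<int8_t, 16> pack_triplets_ref(const std::array<int8_t, 16>& src)
{
    // Same two constants as in the snippet above.
    const uint64_t table[2] = {0x0908060504020100ULL, 0xFFFFFFFF0E0D0C0AULL};
    std::array<uint8_t, 16> idx{};
    for (int i = 0; i < 16; ++i)
        // Extract byte i in little-endian order, independent of the host.
        idx[i] = uint8_t((table[i / 8] >> (8 * (i % 8))) & 0xFF);
    return gather16(src, idx);
}
```

Decoded this way, the table reads 0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14 followed by four 0xFF entries, so every fourth input lane is dropped and the last four output lanes are zero under the gather rule assumed here.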
Please move the implementation to a new line, as in the function above.
@joy2myself Friendly reminder.
I have been working on optimizing memory usage across the whole implementation over the past few days. This includes memory usage in the packing operations mentioned above. The code in this commit can be viewed as a reference for the current optimization. However, most of these optimizations rely on newly added native intrinsics. So my suggestion is that we merge the current implementation for now and optimize it later, once rvv-intrinsics is updated to a new stable version and the new native intrinsics are supported in the compiler. @asmorkalov What do you think about it?
asmorkalov left a comment
👍 Let's go ahead with the current solution. Tested with QEMU, all tests are green. Let's do another optimization iteration when set/get instructions are available in QEMU and the compiler.
bug fixes for universal intrinsics of RISC-V back-end

* Align universal intrinsic comparator behaviour with other platforms

Set all bits to one for return value of int and fp comparators.

* fix v_pack_triplets, v_pack_store and v_pack_u_store

* Remove redundant CV_DECL_ALIGNED statements

Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>
Fixed comparison operations and some packing operations.
This addresses all of the failures in the core module for RISC-V with RVV.
See #20278 for previous state
Currently, all the core tests have passed:
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.