bug fixes for universal intrinsics of RISC-V back-end #20412

Merged: alalek merged 3 commits into opencv:master from joy2myself:rvv-0.10 on Jul 23, 2021

@joy2myself
Contributor

@joy2myself joy2myself commented Jul 15, 2021

Fixed comparison operations and some packing operations.

This resolves all of the failures in the core module for RISC-V with RVV.
See #20278 for the previous state.

All core tests now pass:

[----------] Global test environment tear-down
[ SKIPSTAT ] 14 tests skipped
[ SKIPSTAT ] TAG='mem_6gb' skip 1 tests
[ SKIPSTAT ] TAG='skip_other' skip 13 tests
[==========] 11542 tests from 249 test cases ran. (3737861 ms total)
[  PASSED  ] 11542 tests.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Set all bits to one for return value of int and fp comparators.
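
To illustrate why an all-ones result matters for comparators, here is a minimal scalar sketch in C++. It is not the RVV code from this PR; `Lanes`, `lanes_lt` and `lanes_select` are hypothetical names that model a 4-lane vector, on the assumption that callers combine the comparison result with bitwise operations.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 4-lane, 32-bit integer vector.
using Lanes = std::array<int32_t, 4>;

// Lane-wise "less than": a true lane has every bit set (-1), a false lane is 0.
inline Lanes lanes_lt(const Lanes& a, const Lanes& b)
{
    Lanes mask{};
    for (std::size_t i = 0; i < mask.size(); ++i)
        mask[i] = (a[i] < b[i]) ? int32_t(-1) : int32_t(0);
    return mask;
}

// Bitwise select: takes x where the mask lane is all ones, y where it is 0.
// This only works if true lanes are all ones; a plain 1 would keep one bit of x.
inline Lanes lanes_select(const Lanes& mask, const Lanes& x, const Lanes& y)
{
    Lanes out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = (mask[i] & x[i]) | (~mask[i] & y[i]);
    return out;
}
```

With this convention, `lanes_select(lanes_lt(a, b), a, b)` yields the lane-wise minimum of `a` and `b`.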
@joy2myself
Contributor Author

cc @asmorkalov

@asmorkalov
Contributor

Related: #20393

uint64 CV_DECL_ALIGNED(32) ptr[2] = {0x0908060504020100, 0xFFFFFFFF0E0D0C0A};
return v_int8x16((vint8m1_t)vrgather_vv_u8m1((vuint8m1_t)vint8m1_t(vec), (vuint8m1_t)vle64_v_u64m1(ptr, 2), 16));
}
inline v_uint8x16 v_pack_triplets(const v_uint8x16& vec) { return v_reinterpret_as_u8(v_pack_triplets(v_reinterpret_as_s8(vec))); }
Contributor

Please move the implementation to a new line, as in the function above.

Contributor Author

resolved.
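
For readers following the `v_pack_triplets` snippet quoted in this review thread, the two 64-bit constants can be read as a table of 16 byte indices for a gather. The sketch below is a scalar model in C++, not the RVV implementation. `Bytes16`, `triplet_indices` and `gather_bytes` are hypothetical names, and it assumes a 16-byte vector, little-endian byte order, and that an out-of-range index produces 0.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<uint8_t, 16>;

// Expand the two 64-bit constants from the quoted snippet into 16 byte indices.
// The shifts model little-endian memory order (lowest byte first).
inline Bytes16 triplet_indices()
{
    const uint64_t words[2] = {0x0908060504020100ULL, 0xFFFFFFFF0E0D0C0AULL};
    Bytes16 idx{};
    for (std::size_t w = 0; w < 2; ++w)
        for (std::size_t b = 0; b < 8; ++b)
            idx[w * 8 + b] = static_cast<uint8_t>(words[w] >> (8 * b));
    return idx;
}

// Scalar model of a byte gather: out[i] = src[idx[i]].
// Assumption: an index past the end of the 16-byte vector (here 0xFF) gives 0.
inline Bytes16 gather_bytes(const Bytes16& src, const Bytes16& idx)
{
    Bytes16 out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = (idx[i] < src.size()) ? src[idx[i]] : uint8_t(0);
    return out;
}
```

Under these assumptions the table selects bytes 0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14, so every fourth byte is dropped and the last four output bytes are zero.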

@asmorkalov
Contributor

@joy2myself Friendly reminder.

@joy2myself
Contributor Author

Over the past few days I have been trying to reduce memory usage across the whole implementation, including in the packing operations mentioned above. The code in this commit can serve as a reference for the current optimization.

However, most of these optimizations rely on the newly added vset/vget family of native intrinsics. They were added only recently by a commit in rvv-intrinsic-doc and are not yet supported by the compiler (riscv-gnu-toolchain).

So my suggestion is to merge the current implementation for now and optimize it later, once rvv-intrinsics is updated to a new stable version and the compiler supports the new native intrinsics.

@asmorkalov What do you think about it?

@asmorkalov asmorkalov self-requested a review July 23, 2021 08:11
Contributor

@asmorkalov asmorkalov left a comment

👍 Let's go ahead with the current solution. Tested with QEMU; all tests are green. Let's do another optimization iteration when the set/get instructions are available in QEMU and the compiler.

@alalek alalek merged commit acc5766 into opencv:master Jul 23, 2021
@alalek alalek mentioned this pull request Oct 15, 2021
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
bug fixes for universal intrinsics of RISC-V back-end

* Align universal intrinsic comparator behaviour with other platforms

Set all bits to one for return value of int and fp comparators.

* fix v_pack_triplets, v_pack_store and v_pack_u_store

* Remove redundant CV_DECL_ALIGNED statements

Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>
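
The commit message above names `v_pack_store` and `v_pack_u_store` among the fixed operations. As a rough scalar sketch of what an unsigned saturating pack-and-store is generally expected to do (narrow signed 16-bit lanes to unsigned 8-bit, clamping to 0..255), consider the C++ below. `pack_u_store_model` is a hypothetical name for illustration, not OpenCV API, and the 8-lane width is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model: narrow eight signed 16-bit lanes to unsigned
// 8-bit values with saturation, then write them to dst (8 bytes).
inline void pack_u_store_model(uint8_t* dst, const std::array<int16_t, 8>& a)
{
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        // Clamp to the unsigned 8-bit range before narrowing.
        const int clamped = std::min(std::max<int>(a[i], 0), 255);
        dst[i] = static_cast<uint8_t>(clamped);
    }
}
```

Negative inputs saturate to 0 and inputs above 255 saturate to 255; values already in range pass through unchanged.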