Update RVV backend for using Clang. #21012
Merged
alalek merged 10 commits into opencv:4.x on Dec 3, 2021
Contributor
Author
/cc @joy2myself
Contributor
LGTM. Great update. 👍
asmorkalov reviewed Nov 10, 2021
#endif

#define OPENCV_HAL_IMPL_RVV_ONE_TIME_REINTERPRET(_Tpvec1, _Tpvec2, _nTpvec1, _nTpvec2, suffix1, suffix2, nsuffix1, nsuffix2, width1, width2, vl1, vl2) \
#define OPENCV_HAL_IMPL_RVV_NATIVE_REINTERPRET(_Tpvec1, _Tp1, _Tpvec2, _Tp2, _nTpvec1, _nTpvec2, suffix1, suffix2, nsuffix1, nsuffix2, width1, width2, vl1, vl2) \
Contributor
width1, width2, vl1, vl2 are no longer used. I propose simplifying the macro and its usage.
Contributor
Author
Agree.
_Tp1, _Tpvec2, _Tp2, _nTpvec1, width1, width2, vl1, vl2 in NATIVE_REINTERPRET and
_Tp1, _Tpvec2, _Tp2, _nTpvec1, vl1, vl2 in TWO_TIMES_REINTERPRET are not used.
Removed.
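The simplification agreed on above can be illustrated with a toolchain-independent sketch. Everything named `demo_*` or `DEMO_*` here is hypothetical and is not the OpenCV macro or its types; `std::memcpy` stands in for the RVV reinterpret intrinsics so that the example compiles without `<riscv_vector.h>`. The point is only that the generator macro takes just the parameters its body uses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for vector register types (NOT the OpenCV v_* types).
struct demo_u32x4 { std::uint32_t val[4]; };
struct demo_f32x4 { float val[4]; };

// Simplified generator: only the two types and the two name suffixes remain,
// because nothing else is referenced in the body.
#define DEMO_IMPL_REINTERPRET(_Tpvec1, _Tpvec2, suffix1, suffix2)            \
inline _Tpvec2 demo_reinterpret_as_##suffix2(const _Tpvec1& v)               \
{                                                                            \
    static_assert(sizeof(_Tpvec1) == sizeof(_Tpvec2), "size mismatch");      \
    _Tpvec2 r;                                                               \
    std::memcpy(&r, &v, sizeof(r)); /* copies bits, converts no values */    \
    return r;                                                                \
}                                                                            \
inline _Tpvec1 demo_reinterpret_as_##suffix1(const _Tpvec2& v)               \
{                                                                            \
    static_assert(sizeof(_Tpvec1) == sizeof(_Tpvec2), "size mismatch");      \
    _Tpvec1 r;                                                               \
    std::memcpy(&r, &v, sizeof(r)); /* copies bits, converts no values */    \
    return r;                                                                \
}

DEMO_IMPL_REINTERPRET(demo_u32x4, demo_f32x4, u32, f32)

// Usage: 0x3f800000 is the IEEE-754 single-precision bit pattern of 1.0f
// (assumes the platform uses IEEE-754 floats).
inline void demo_usage()
{
    demo_u32x4 u = {{0x3f800000u, 0u, 0u, 0u}};
    demo_f32x4 f = demo_reinterpret_as_f32(u);
    assert(f.val[0] == 1.0f);
    assert(demo_reinterpret_as_u32(f).val[0] == 0x3f800000u);
}
```

Each instantiation then needs four arguments instead of twelve or more, which is the kind of call-site cleanup the review comment asks for.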
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
Update RVV backend for using Clang.
* Update cmake file of clang.
* Modify the RVV optimization on DNN to adapt to clang.
* Modify intrin_rvv: Disable some existing types.
* Modify intrin_rvv: Reinterpret instead of load&cast.
* Modify intrin_rvv: Update load&store without cast.
* Modify intrin_rvv: Rename vfredsum to fredosum.
* Modify intrin_rvv: Rewrite Check all/any by using vpopc.
* Modify intrin_rvv: Use reinterpret instead of c-style casting.
* Remove all macros which is not used in v_reinterpret
* Rename vpopc to vcpop according to spec.
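The "Check all/any by using vpopc" item can be pictured with a scalar sketch: count the lanes that are set, then compare the count with the lane count (all) or with zero (any). The `demo_*` names are hypothetical, the loop stands in for the mask population-count instruction, and treating a lane as set when its value is negative is an assumption about the intended mask semantics, not code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar stand-in for a mask population count: number of lanes whose sign bit is set.
inline std::size_t demo_count_set_lanes(const std::int32_t* lanes, std::size_t n)
{
    assert(lanes != nullptr || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (lanes[i] < 0)
            ++count;
    return count;
}

// "All" holds when every lane is counted.
inline bool demo_check_all(const std::int32_t* lanes, std::size_t n)
{
    return demo_count_set_lanes(lanes, n) == n;
}

// "Any" holds when at least one lane is counted.
inline bool demo_check_any(const std::int32_t* lanes, std::size_t n)
{
    return demo_count_set_lanes(lanes, n) != 0;
}
```

Both checks reduce to a single count, so one population-count result can serve either predicate.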
This patch fixes and updates RVV cross-compilation with Clang.
Clang/LLVM is more active in its RVV support. As the spec and intrinsics have been updated, LLVM has gained support for many new functions that GNU does not provide yet. Therefore, for RVV-related development in OpenCV, I propose using Clang/LLVM as the compiler.
There are a few differences between the GNU tool-chain and LLVM:
For that, the following files are updated:
- platforms/linux/riscv64-clang.toolchain.cmake: the CMake file for RVV cross-compilation. Modify the RVV version in the compile flag (0p9 -> 0p10). In addition, add -O2 to the release flags.
- modules/dnn/src/layers/layers_common.simd.hpp: the optimized implementation file for DNN using RVV native intrinsics. Some types defined in GNU but not in LLVM are modified (float32_t -> float). Rename vfredsum to vfredosum according to the spec.
- modules/core/include/opencv2/core/hal/intrin_rvv.hpp: WUI implementation file for RVV.
  - Implement reinterpret by using RVV native intrinsics (instead of load and cast).
  - Use reinterpret, or rewrite the function, to replace C-style conversion; some of these use overloaded intrinsics.
  - Rename vfredsum to vfredosum.

In order to be compatible with the existing GNU toolchain, most of the changes are wrapped in macros:
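As a rough illustration of wrapping a toolchain difference in a macro, the sketch below selects between the two spellings of the reduction name at compile time. The `demo_*` functions are scalar stand-ins, not the RVV intrinsics, and `DEMO_REDUCE_SUM` is a hypothetical name; guarding on the predefined `__clang__` macro is an assumption about how such a switch could be written, not necessarily the guard this patch uses.

```cpp
#include <cassert>
#include <cstddef>

// Scalar stand-in for the older spelling of the float sum reduction.
inline float demo_vfredsum(const float* p, std::size_t n)
{
    assert(p != nullptr || n == 0);
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        acc += p[i];  // accumulate in element order
    return acc;
}

// Scalar stand-in for the newer spelling; same ordered sum in this sketch.
inline float demo_vfredosum(const float* p, std::size_t n)
{
    return demo_vfredsum(p, n);
}

// Pick the spelling the active compiler is assumed to provide.
#if defined(__clang__)
#define DEMO_REDUCE_SUM demo_vfredosum
#else
#define DEMO_REDUCE_SUM demo_vfredsum
#endif

// Call sites use the macro and stay identical for both toolchains.
inline float demo_sum4(const float (&v)[4])
{
    return DEMO_REDUCE_SUM(v, 4);
}
```

Once both toolchains accept the same name, the conditional block can be deleted and call sites can use that name directly.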
The RVV draft is now frozen. Once both GNU and LLVM are updated to 1.0 and support the same intrinsics (LLVM is already very close), these macros will no longer be needed.
This patch is tested on QEMU; both the GNU and LLVM toolchains work.
qemu-riscv64 -cpu rv64,x-v=true ./bin/opencv_test_core --gtest_filter="hal*"
Note: Steps for cross-compiling using clang/LLVM
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.