Skip to content

Update RVV backend for using Clang.#21012

Merged
alalek merged 10 commits intoopencv:4.xfrom
hanliutong:rvv_clang
Dec 3, 2021
Merged

Update RVV backend for using Clang.#21012
alalek merged 10 commits intoopencv:4.xfrom
hanliutong:rvv_clang

Conversation

@hanliutong
Copy link
Copy Markdown
Contributor

@hanliutong hanliutong commented Nov 5, 2021

This patch is going to fix and update the RVV cross-compile using clang.

In terms of support for RVV, Clang/LLVM is more active. With the update of spec and Intrinsic, many new functions have been supported by LLVM, but not in GNU yet. Therefore, for RVV related development in OpenCV, I propose to use clang/LLVM as the compiler.

There are a few differences between the GNU tool-chain and LLVM:

1. C-style casting on RVV type is invalid. Clang has checked on it, not in GCC yet.
2. Some intrinsic is renamed.
3. New instruction/intrinsic. E.g. `vpopc`
4. Vector type vxxxmf<n> is implemented.
5. Overloaded intrinsic.

For that, the following file is updated:

  • platforms/linux/riscv64-clang.toolchain.cmake : the CMake file for RVV cross-compile. Modify RVV version in compile flag(0p9 -> 0p10). Addition, add -O2 to release flag.
  • modules/dnn/src/layers/layers_common.simd.hpp: the optimized implementation file for DNN using RVV Native intrinsics. Some types defined in GNU but not with LLVM are modified (float32_t -> float). Rename vfredsum to vfredosum according to spec.
  • modules/core/include/opencv2/core/hal/intrin_rvv.hpp: WUI implementation file for RVV.
    1. Modify reinterpret by using RVV native intrinsic.
    2. Use reinterpret or rewrite the function to replace C-style conversion, some of them using overloaded intrinsics.
    3. Rename vfredsum to vfredosum
    4. Disable vxxxmf when using clang.

In order to be compatible with the existing GNU toolchain, most of the changes are wrapped in macros:

#ifndef __clang__
// current code work with GNU
#else
// new code work with LLVM
#endif

Currently, the RVV draft has been frozen. When both GNU and LLVM are updated to 1.0 and support the same intrinsic (LLVM is already very close), we will no longer need these macros.

This patch is tested on QEMU, both GNU and LLVM toolchain is work.

qemu-riscv64 -cpu rv64,x-v=true ./bin/opencv_test_core --gtest_filter="hal*"

Note: Steps for cross-compiling using clang/LLVM

git clone https://github.com/llvm/llvm-project.git
mkdir llvm-project/build && cd llvm-project/build
cmake -DLLVM_TARGETS_TO_BUILD="X86;RISCV" -DLLVM_ENABLE_PROJECTS="clang" -DCMAKE_BUILD_TYPE=RELEASE -G Ninja ../llvm
ninja
ln -s ~/llvm-project/build /opt/rvv-llvm #optional
cd opencv
mkdir opencv/.clang_build && cd opencv/.clang_build
cmake -DWITH_GTK=OFF -DCMAKE_TOOLCHAIN_FILE=../platforms/linux/riscv64-clang.toolchain.cmake ../
make

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@hanliutong
Copy link
Copy Markdown
Contributor Author

/cc @joy2myself

@joy2myself
Copy link
Copy Markdown
Contributor

LGTM. Great update. 👍

#endif

#define OPENCV_HAL_IMPL_RVV_ONE_TIME_REINTERPRET(_Tpvec1, _Tpvec2, _nTpvec1, _nTpvec2, suffix1, suffix2, nsuffix1, nsuffix2, width1, width2, vl1, vl2) \
#define OPENCV_HAL_IMPL_RVV_NATIVE_REINTERPRET(_Tpvec1, _Tp1, _Tpvec2, _Tp2, _nTpvec1, _nTpvec2, suffix1, suffix2, nsuffix1, nsuffix2, width1, width2, vl1, vl2) \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

width1, width2, vl1, vl2 are not used any more. I propose to simplify the macro and it's usage.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.
_Tp1, _Tpvec2, _Tp2, _nTpvec1, width1, width2, vl1, vl2 in NATIVE_REINTERPRET and
_Tp1, _Tpvec2, _Tp2, _nTpvec1, vl1, vl2 in TWO_TIMES_REINTERPRET are not used.
Removed.

Copy link
Copy Markdown
Contributor

@asmorkalov asmorkalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alalek alalek merged commit 4935b14 into opencv:4.x Dec 3, 2021
@hanliutong hanliutong deleted the rvv_clang branch December 22, 2021 05:36
@alalek alalek mentioned this pull request Dec 30, 2021
@alalek alalek mentioned this pull request Feb 22, 2022
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
Update RVV backend for using Clang.

* Update cmake file of clang.

* Modify the RVV optimization on DNN to adapt to clang.

* Modify intrin_rvv: Disable some existing types.

* Modify intrin_rvv: Reinterpret instead of load&cast.

* Modify intrin_rvv: Update load&store without cast.

* Modify intrin_rvv: Rename vfredsum to fredosum.

* Modify intrin_rvv: Rewrite Check all/any by using vpopc.

* Modify intrin_rvv: Use reinterpret instead of c-style casting.

* Remove all macros which is not used in v_reinterpret

* Rename vpopc to vcpop according to spec.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants