Skip to content

Upgrade RISC-V Vector intrinsic and cleanup the obsolete RVV backend.#25883

Merged
asmorkalov merged 6 commits intoopencv:4.xfrom
hanliutong:rvv-intrin-upgrade
Jul 19, 2024
Merged

Upgrade RISC-V Vector intrinsic and cleanup the obsolete RVV backend.#25883
asmorkalov merged 6 commits intoopencv:4.xfrom
hanliutong:rvv-intrin-upgrade

Conversation

@hanliutong
Copy link
Copy Markdown
Contributor

This patch upgrade RISC-V Vector intrinsic from v0.10 to v0.12/v1.0:

  • Update cmake check and options;
  • Upgrade RVV implement for Universal Intrinsic;
  • Upgrade RVV optimized DNN kernel.
  • Cleanup the obsolete RVV backend (intrin_rvv.hpp) and compatable header file.

With this patch, RVV backend require Clang 17+ or GCC 14+ (which means __riscv_v_intrinsic >= 12000, see https://godbolt.org/z/es7ncETE3)

This patch is test with Clang 17.0.6 (require extra -DWITH_PNG=OFF due to ICE), Clang 18.1.8 and GCC 14.1.0 on QEMU and k230 (with --gtest_filter="*hal_*").

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@hanliutong hanliutong mentioned this pull request Jul 8, 2024
6 tasks
@asmorkalov asmorkalov requested a review from mshabunin July 8, 2024 12:17
@asmorkalov asmorkalov added platform: riscv cleanup Code cleanup (e.g, drop legacy C-API, legacy unmaintained code) labels Jul 8, 2024
@asmorkalov asmorkalov added this to the 4.11.0 milestone Jul 8, 2024
@asmorkalov
Copy link
Copy Markdown
Contributor

RISC-V config in CI failed:

-- Performing Test HAVE_CXX_MARCH_RV64GCV (check file: cmake/checks/cpu_rvv.cpp)
CMake Error at cmake/OpenCVCompilerOptimizations.cmake:602 (message):
-- Performing Test HAVE_CXX_MARCH_RV64GCV - Failed
  Required baseline optimization is not supported: RVV
  (CPU_BASELINE_REQUIRE=RVV)
Call Stack (most recent call first):
  cmake/OpenCVCompilerOptions.cmake:335 (include)
-- RVV is not supported by C++ compiler
-- Optimization RVV is not available, skipped
-- Performing Test HAVE_CPU_BASELINE_FLAGS
  CMakeLists.txt:687 (include)


-- Performing Test HAVE_CPU_BASELINE_FLAGS - Success

@hanliutong
Copy link
Copy Markdown
Contributor Author

hanliutong commented Jul 9, 2024

--   C/C++:
--     Built as dynamic libs?:      YES
--     C++ standard:                11
--     C++ Compiler:                /opt/rvv-llvm/bin/clang++  (ver 16.0.6)

clang 16 work with RVV intrinsic v0.11, but we need v0.12 or higher.

With this patch, RVV backend require Clang 17+ or GCC 14+ (which means __riscv_v_intrinsic >= 12000, see https://godbolt.org/z/es7ncETE3)

Maybe we need upgrade LLVM version on the CI server

@asmorkalov
Copy link
Copy Markdown
Contributor

@mshabunin could you take a look?

@mshabunin
Copy link
Copy Markdown
Contributor

Updated workflow (use clang-17 in precommit): opencv/ci-gha-workflow#176

@asmorkalov
Copy link
Copy Markdown
Contributor

@mshabunin I merged CI patch and re-triggered RISC-V jobs.

@mshabunin
Copy link
Copy Markdown
Contributor

Restarting pipelines didn't help. Perhaps we need to update PR itself.

@hanliutong
Copy link
Copy Markdown
Contributor Author

Should I push an empty commit or do something else?

@mshabunin
Copy link
Copy Markdown
Contributor

@hanliutong , yes, please push a new commit or rebase the whole branch.

@hanliutong hanliutong force-pushed the rvv-intrin-upgrade branch from c0974f5 to c750f57 Compare July 15, 2024 09:23
@hanliutong
Copy link
Copy Markdown
Contributor Author

@mshabunin rebased, please re-trigger the CI

@mshabunin
Copy link
Copy Markdown
Contributor

There is another issue with T-Head compiler - now it can not compile dnn inline kernels, because they use new intrinsics version:

/work/opencv/modules/dnn/src/layers/cpu_kernels/conv_depthwise.simd.hpp:267:26: error: '__riscv_vsetvl_e32m8' was not declared in this scope
  267 |                     vl = __riscv_vsetvl_e32m8(avl);
      |                          ^~~~~~~~~~~~~~~~~~~~

It seems that now both CV_RVV and CV_RVV071 are defined in cv_cpu_dispatch.h:

Let's try to combine both pieces into one, so that inline snippets in dnn wouldn't be compiled. Something like this:

#ifdef CV_CPU_COMPILE_RVV
#  ifdef __riscv_vector_071
#    define CV_RVV071 1
#  else
#    define CV_RVV 1
#  endif
#include <riscv_vector.h>
#endif

@hanliutong hanliutong requested a review from mshabunin July 17, 2024 02:13
@mshabunin
Copy link
Copy Markdown
Contributor

There are some issues left with XuanTie toolchain (https://github.com/XUANTIE-RV/xuantie-gnu-toolchain/, I used prebuilt 2.10.0 - https://www.xrvm.cn/community/download?id=4333581795569242112)

Command line:

$ PATH=/work/chains/xuantie-toolchain-2.10.0/bin:${PATH} \
  cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=/opencv/platforms/linux/riscv64-071-gcc.toolchain.cmake -DBUILD_SHARED_LIBS=OFF -DWITH_OPENCL=OFF -DCORE=C908V -DCMAKE_BUILD_TYPE=Release ../opencv
$ ninja

Linker errors:

/work/chains/xuantie-toolchain-2.10.0/bin/../lib/gcc/riscv64-unknown-linux-gnu/10.4.0/../../../../riscv64-unknown-linux-gnu/bin/ld: lib/libopencv_dnn.a(convolution_layer.cpp.o): in function `.L62':
convolution_layer.cpp:(.text._ZNK2cv3dnn22DeConvolutionLayerImpl13MatMulInvokerclERKNS_5RangeE[_ZNK2cv3dnn22DeConvolutionLayerImpl13MatMulInvokerclERKNS_5RangeE]+0x44e): undefined reference to `cv::dnn::opt_RVV::fastGEMM(float const*, unsigned long, float const*, unsigned long, float*, unsigned long, int, int, int)'
/work/chains/xuantie-toolchain-2.10.0/bin/../lib/gcc/riscv64-unknown-linux-gnu/10.4.0/../../../../riscv64-unknown-linux-gnu/bin/ld: lib/libopencv_dnn.a(fully_connected_layer.cpp.o): in function `.L51':
fully_connected_layer.cpp:(.text._ZNK2cv3dnn23FullyConnectedLayerImpl14FullyConnectedclERKNS_5RangeE[_ZNK2cv3dnn23FullyConnectedLayerImpl14FullyConnectedclERKNS_5RangeE]+0x134): undefined reference to `cv::dnn::opt_RVV::fastGEMM1T(float const*, float const*, unsigned long, float const*, float*, int, int)'
/work/chains/xuantie-toolchain-2.10.0/bin/../lib/gcc/riscv64-unknown-linux-gnu/10.4.0/../../../../riscv64-unknown-linux-gnu/bin/ld: lib/libopencv_dnn.a(conv_depthwise.cpp.o): in function `.L0 ':
conv_depthwise.cpp:(.text._ZNSt17_Function_handlerIFvRKN2cv5RangeEEZNS0_3dnn12runDepthwiseERKNS0_11_InputArrayERKNS0_12_OutputArrayERKNS0_3PtrINS5_8FastConvEEEPNS5_14dnn4_v2024052115ActivationLayerERKSt6vectorIfSaIfEEbEUlS3_E_E9_M_invokeERKSt9_Any_dataS3_+0x1b2): undefined reference to `cv::dnn::opt_RVV::fastDepthwiseConv(float const*, int, int, int, int, int, int, int, int, float const*, float const*, float const*, int, int, float*, int, int, int)'
collect2: error: ld returned 1 exit status

It seems that optimized functions are declared/called but not defined with current preprocessor macros state.

@hanliutong
Copy link
Copy Markdown
Contributor Author

Fixed by using macro CV_TRY_RVV and CV_RVV together when calling in-place optimizing functions:

--- #if CV_TRY_RVV
+++ #if CV_TRY_RVV && CV_RVV

then the in-place optimizing function only called with the upsteam toolchain which using new RVV intrinsic.

#if CV_TRY_RVV && CV_RVV
    if(useRVV)
        opt_RVV::fastConv(wptr, wstep, biasptr, rowbuf0, data_out0 + ofs0,
                        outShape, bsz, vsz, vsz_a, outZp, multptr, cn0 == 0, cn1 == inpCn);
    else
#endif

Note: macro state:

Macro state with upstream state with XuanTie
CV_CPU_COMPILE_RVV defined defined
CV_RVV 1 0
CV_RVV071 0 1
CV_TRY_RVV 1 1

@asmorkalov
Copy link
Copy Markdown
Contributor

@hanliutong @mshabunin DO we need the table above somewhere in code as comment?

@hanliutong
Copy link
Copy Markdown
Contributor Author

added on cv_cpu_dispatch.h

@asmorkalov asmorkalov merged commit b5ea321 into opencv:4.x Jul 19, 2024
@asmorkalov asmorkalov mentioned this pull request Jul 25, 2024
fengyuentau pushed a commit to fengyuentau/opencv that referenced this pull request Aug 15, 2024
Upgrade RISC-V Vector intrinsic and cleanup the obsolete RVV backend. opencv#25883

This patch upgrade RISC-V Vector intrinsic from `v0.10` to `v0.12`/`v1.0`:
- Update cmake check and options;
- Upgrade RVV implement for Universal Intrinsic;
- Upgrade RVV optimized DNN kernel.
- Cleanup the obsolete RVV backend (`intrin_rvv.hpp`) and compatable header file.

With this patch, RVV backend require Clang 17+ or GCC 14+ (which means `__riscv_v_intrinsic >= 12000`, see https://godbolt.org/z/es7ncETE3)

This patch is test with Clang 17.0.6 (require extra `-DWITH_PNG=OFF` due to ICE), Clang 18.1.8 and GCC 14.1.0 on QEMU and k230 (with `--gtest_filter="*hal_*"`).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cleanup Code cleanup (e.g, drop legacy C-API, legacy unmaintained code) platform: riscv

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants