RISC-V RVV 0.7: v_add/v_sub saturation and avoiding 64-bit VLEN #23198

Merged: opencv-pushbot merged 1 commit into opencv:4.x from mshabunin:fix-rvv-07 on Jan 31, 2023

mshabunin (Contributor) commented Jan 30, 2023

This PR includes two fixes for RISC-V RVV 0.7 intrinsics:

  • Use non-saturating add/sub instructions for 32- and 64-bit integers.
    Apparently the 32- and 64-bit types behave differently from the 8- and 16-bit ones: the narrow types saturate, while the wide types wrap on overflow. The NEON intrinsics follow the same pattern (vqadd saturates, vadd wraps on overflow):
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_uint8x16, vqaddq_u8)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_uint8x16, vqsubq_u8)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_int8x16, vqaddq_s8)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_int8x16, vqsubq_s8)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_uint16x8, vqaddq_u16)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_uint16x8, vqsubq_u16)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_int16x8, vqaddq_s16)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_int16x8, vqsubq_s16)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_int32x4, vaddq_s32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_int32x4, vsubq_s32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(*, v_int32x4, vmulq_s32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_uint32x4, vaddq_u32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_uint32x4, vsubq_u32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(*, v_uint32x4, vmulq_u32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_float32x4, vaddq_f32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_float32x4, vsubq_f32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(*, v_float32x4, vmulq_f32)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_int64x2, vaddq_s64)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_int64x2, vsubq_s64)
    OPENCV_HAL_IMPL_NEON_BIN_OP(+, v_uint64x2, vaddq_u64)
    OPENCV_HAL_IMPL_NEON_BIN_OP(-, v_uint64x2, vsubq_u64)
  • The v_check_ intrinsics now use a 32-bit element size to avoid 64-bit operations, which are not supported by the HW
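
For reference, the difference between the two add flavours mentioned in the first item can be sketched in scalar C++. This is only an illustration of the semantics, not code from this PR; the helper names `add_saturate_u8` and `add_wrap_s32` are hypothetical and do not exist in OpenCV.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: saturating add for an 8-bit unsigned lane
// (the behaviour expected from vqadd-style instructions).
static std::uint8_t add_saturate_u8(std::uint8_t a, std::uint8_t b)
{
    const unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    const unsigned max = std::numeric_limits<std::uint8_t>::max();
    return static_cast<std::uint8_t>(sum > max ? max : sum);  // clamp to 255
}

// Hypothetical helper: wrapping add for a 32-bit signed lane
// (the behaviour expected from plain vadd-style instructions).
// The sum is computed in unsigned arithmetic to avoid signed-overflow UB,
// then the bit pattern is copied back into a signed value.
static std::int32_t add_wrap_s32(std::int32_t a, std::int32_t b)
{
    const std::uint32_t sum =
        static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    std::int32_t result;
    std::memcpy(&result, &sum, sizeof(result));
    return result;
}
```

With these definitions, `add_saturate_u8(200, 100)` clamps to 255, while `add_wrap_s32` applied to the largest `int32_t` and 1 wraps around to the smallest `int32_t` on a two's-complement target.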
force_builders=Custom
Xbuild_image:Custom=riscv-gcc
Xbuild_image:Custom=riscv-gcc-rvv
build_image:Custom=riscv-gcc-rvv-128
Xbuild_image:Custom=riscv-clang
Xbuild_image:Custom=riscv-clang-rvv
Xbuild_image:Custom=riscv-clang-rvv-128
test_modules:Custom=core,imgproc,dnn
buildworker:Custom=linux-1,linux-4
test_timeout:Custom=1200
build_contrib:Custom=OFF

@opencv-pushbot opencv-pushbot merged commit a6b178a into opencv:4.x Jan 31, 2023
@mshabunin mshabunin deleted the fix-rvv-07 branch February 1, 2023 12:33
@asmorkalov asmorkalov mentioned this pull request May 31, 2023