Optimize the v_lut* functions for RISC-V Vector(RVV). by hanliutong · Pull Request #24582 · opencv/opencv

hanliutong · 2023-11-23T12:16:17Z

This patch optimizes the implementation of the Universal Intrinsic functions v_lut_pairs and v_lut_quads on the RVV backend: index generation now uses vector instructions instead of the loops and std::vector used in the existing implementation.
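
The index-generation step described above can be modelled lane by lane. The following is a minimal scalar sketch in portable C++, not the code from this patch. The names `expand_lut_indices` and `lut_gather` are hypothetical and introduced only for illustration. It assumes that each entry of idx is the start of a group of 2 (pairs) or 4 (quads) consecutive table elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV API): expand per-group start indices
// into per-lane indices. Lane k belongs to group k / group and reads the
// element at offset k % group inside that group.
std::vector<int> expand_lut_indices(const int* idx, std::size_t lanes, std::size_t group)
{
    assert(group != 0 && lanes % group == 0);
    std::vector<int> out(lanes);
    for (std::size_t k = 0; k < lanes; ++k)
        out[k] = idx[k / group] + static_cast<int>(k % group);
    return out;
}

// Scalar model of an indexed gather: out[k] = tab[lane_idx[k]].
std::vector<float> lut_gather(const float* tab, const std::vector<int>& lane_idx)
{
    std::vector<float> out(lane_idx.size());
    for (std::size_t k = 0; k < lane_idx.size(); ++k)
        out[k] = tab[lane_idx[k]];
    return out;
}
```

For idx = {4, 0}, the pair expansion over 4 lanes gives lane indices 4, 5, 0, 1, and the quad expansion over 8 lanes gives 4, 5, 6, 7, 0, 1, 2, 3. Each lane's index depends only on its lane number, so the loop body can in principle be expressed with vector operations (lane-id generation, shift, gather, mask, add) rather than a scalar loop.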

In the core module, v_lut_quads is used in transform_32f in matmul.simd.hpp. According to the experimental results on the K230, this patch improves the performance of the vectorized path by nearly 10x, although it is still slower than the scalar version (this may improve with a longer VLEN).

Name of Test	scalar	vector	vector_opt	vector vs scalar	vector_opt vs scalar
Mat_Transform::Size_MatType::(127x61,_32FC3)	0.268	6.146	0.633	0.04	0.42
Mat_Transform::Size_MatType::(640x480,_32FC3)	11.642	246.761	25.622	0.05	0.45
Mat_Transform::Size_MatType::(1920x1080,_32FC3)	76.516	1654.286	172.701	0.05	0.44
Mat_Transform::Size_MatType::(1280x720,_32FC3)	35.173	735.625	76.856	0.05	0.46
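
The ratio columns and the "nearly 10x" figure can be re-derived from the timing columns. This is a minimal sketch, assuming the ratio columns are the scalar time divided by the vector time, which matches the rounded values in the table. The name `time_ratio` is a hypothetical helper.

```cpp
// Hypothetical helper: ratio of a baseline time to a candidate time.
// A value above 1 means the candidate is faster than the baseline.
double time_ratio(double baseline_time, double candidate_time)
{
    return baseline_time / candidate_time;
}
```

For the 127x61 row, time_ratio(0.268, 0.633) is about 0.42 (vector_opt vs scalar) and time_ratio(6.146, 0.633) is about 9.7 (vector_opt vs the previous vector implementation).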

Full results are here: core.zip

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

asmorkalov · 2023-11-29T14:48:06Z

@mshabunin Friendly reminder.

mshabunin

Looks good to me.

Perhaps this optimization (transform_32f) should be reimplemented in a different way or disabled for RISC-V in the future. As far as I can see, the vectorized part is currently disabled for ARM platforms.

Optimize the v_lut for RVV.

ce05162

asmorkalov added the platform: riscv label Nov 23, 2023

asmorkalov added this to the 4.9.0 milestone Nov 23, 2023

asmorkalov added the optimization label Nov 23, 2023

asmorkalov requested a review from mshabunin November 23, 2023 12:32

asmorkalov assigned mshabunin Nov 29, 2023

mshabunin approved these changes Nov 29, 2023

View reviewed changes

asmorkalov merged commit e202501 into opencv:4.x Nov 30, 2023

asmorkalov mentioned this pull request Jan 19, 2024

5.x merge 4.x #24862

Merged

asmorkalov merged 1 commit into opencv:4.x from hanliutong:rvv-lut