[Refactor](HAL RVV): Consolidate Helpers for Code Reusability#26977
[Refactor](HAL RVV): Consolidate Helpers for Code Reusability#26977asmorkalov merged 5 commits intoopencv:4.xfrom
Conversation
|
@asmorkalov I believe this PR is simple and doesn’t introduce any issues. Could you review and merge it quickly? I have several other PRs waiting on this one. Thanks a lot! |
|
Did you consider using overloaded intrinsics variants where possible? Sections 6.4 - 6.6 and Appendices C and D of the spec. |
|
I think I have use overloaded intrinsics in all possible places. The helper class in this PR is designed to solve the macro problem, which provides unified interfaces. The remaining parts in the helper class that use macros instead of overloaded intrinsics are mainly due to the following two reasons:
For the two examples you mentioned, they do exactly the same thing as this PR, and parts of this patch are also inspired by them. They use macros to define the helper class and use the helper class for templates. And this PR is to introduce a unified helper class. |
Those PRs don't use macros, only templates: Other PRs use macros, yes. |
|
Sorry, I didn’t look closely. I thought you were referring to the If we fully expand those macros, the file would end up being several thousand lines long. Another option is to use inheritance, but even then, it would add nearly a thousand lines. If you think that’s better, I can use inheritance to remove all the macros. Alternatively, if we go with inheritance, perhaps it would be better to split the signed integers, unsigned integers, and floating-point numbers into three separate files? |
|
No, let's leave macros then. |
|
Since each unified function corresponds to a single intrinsic, I believe it’s acceptable to use macros to define these "function aliases." Developers can then use them just like the original intrinsics. |
ac55d31 to
19f297e
Compare
There was a problem hiding this comment.
I'm also looking for a proper way to collect rvv intrinsics into C++ functions.
For the functions in HAL_RVV_SIZE_UNRELATED, my current solution is to extract the scalar type from a rvv type first, then dispatch them according to the result of std::is_floating_point_v, std::is_unsigned_v statically, e.g.
template <typename Vtype, typename Enabled = void> struct rvv_scalar
{
using type = decltype(__riscv_vmv_x(std::declval<Vtype>()));
};
template <typename Vtype>
struct rvv_scalar<Vtype,
std::void_t<decltype(__riscv_vfmv_f(std::declval<Vtype>()))>>
{
using type = decltype(__riscv_vfmv_f(std::declval<Vtype>()));
};
template <typename Vtype> using rvv_scalar_t = typename rvv_scalar<Vtype>::type;
static_assert(std::is_same_v<rvv_scalar_t<vint32m1_t>, int>);template<typename V, typename T>
auto rvv_min(V a, T b, size_t vl) {
using scalar = rvv_scalar_t<V>;
if constexpr (std::is_floating_point_v<scalar>)
return __riscv_vfmin(a, b, vl);
else if constexpr (std::is_unsigned_v<scalar>)
return __riscv_vminu(a, b, vl);
else
return __riscv_vmin(a, b, vl);
}
...So that there's no need to define vmin in every struct RVV.
And users may write rvv_min(v1, v1, vl) directly instead of figuring out what VEC_T is then do VEC_T::vmin(....
🤔
The code above requires C++17, but I guess there is also C++11 equivalence.
mshabunin
left a comment
There was a problem hiding this comment.
There is potential conflict with #26865
cc @amane-ame
@horror-proton , yes it might be useful to implement switching between i, u and f variants. Perhaps regular if might work too or we would need to use std::enable_if maybe. On 4.x branch we are limited by C++11.
|
@horror-proton Indeed, in C++11, template specialization can be used for dispatching, but this would result in each function generating an additional three functions each with more than three lines of code. Also, if we don’t use macros, I believe using inheritance for categorization would look better. If macros are used, I think this is the shortest approach for now. As for the functions you mentioned that don't need to worry about @mshabunin You can merge that PR first. I will resolve conflict and try to make that PR use this PR's helper class. |
19f297e to
749d239
Compare
|
The related patch has been merged. Please rebase and fix conflicts. |
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
749d239 to
9d95704
Compare
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
9d95704 to
8ff280b
Compare
|
Since the last commit make a great change, I have tested testcases all mentioned in corresponding PR (#26865 #26892 #26958). The list is too long so a file is attached. solve_dxt_pyramids_report.txt Through some factor <0.9 and >1.1 appears, when rerun these testcases they have similar performance again. So it looks like this commit doesn't break anything. |
This PR introduces a new helper file with utility types and templates to standardize function interfaces. This refactor allows us to avoid duplicate code when types differ but logic remains the same.
The
flipandminmaximplementations have been updated to use the new generic helpers, replacing the previously defined, redundant classes.Due to the large number of functions, not all interfaces are unified yet. Future development can extend the types as needed. While the usage of function templates is currently limited, this will ease future development.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.