Conversation
|
Hello @vpisarev , I have already modified the SIMD macro as we discussed, and I also added the license information in the header of the file. Please review it if you have time. |
|
@hanliutong, do we need CV_VECTOR_SIMD_64F? Is there possibility that 32-bit floating-point arithmetics is available, but 64-bit one is not available? |
Yes, according to RISC-V V SPEC Chap 18. Standard Vector Extensions, 64-bit float is not available with Zvl32b, Zve32x and Zve32f. However, I think it will mostly happen on processors for embedded devices, which is not considered in our current implementation, then we always define Maybe we can keep it for future expansion. |
SIMD is already vectorized data. I see no logic behind adding VECTOR word here. May be we want something like VARIABLE_SIMD / SIMD_VARIABLE_VECTOR / SIMD_VVL (variable vector length)? |
|
@alalek, I would insist to use the proposed name. RVV stands for RISC-V Vector extension, ARM SVE stands for Scalable Vector Extension. In general, those instructions should be classified as vector extension. Just using CV_VECTOR would be confusing, because vector has different meanings. CV_VECTOR_SIMD is optimal, because it explicitly mentions vectors and explicitly mentions SIMD, i.e. our SIMD-based HAL. |
It says nothing about variable or fixed size of used SIMD vectors. |
|
Hello @vpisarev, @alalek and @asmorkalov , thank you for the suggestion about the designing of SIMD-related macros. Since there are a lot of different opinions, I did some work to find the best solution, please see below, and sorry for the long message. SIMD and VectorI would like to give more information firstly about SIMD, Vector, RISC-V Vector, and ARM SVE. SIMD is the term coming from Flynn's taxonomy in 1972 as a set of data-level parallelism computer architectures. Vector is also a set of data level parallelism (DLP) computer architectures but is not included in Flynn's taxonomy because Vector is released in 1977, a few years later than Flynn's paper. And as Vadim says, vector has different meanings, people usually think of this as a name of data structure, not a name set of ISA. Duncan's taxonomy extends Flynn's taxonomy in order to include vector processes in 1990, it believes that both Vector and SIMD belong to synchronous architectures. However, I think Duncan's taxonomy is not famous as Flynn's taxonomy with rarely used. John Hennessy and David Patterson give a broader definition of SIMD in their book: Computer Architecture A Quantitative Approach, they define Vector architecture (like RVV), multimedia SIMD instruction set extensions (Neon, SSE, AVX), and graphics processing units (GPUs) as 3 variations of SIMD (this SIMD is from Flynn's taxonomy, I think). ARM SVE is a next-generation SIMD extension to AArch64. Although there is a "vector" in the name, ARM SVE is also a SIMD extension, rather than a vector architecture. More precisely, ARM SVE is a SIMD extension that supports scalable vector length. (I just learned about this part, anyone more familiar with SVE may give more clearer information) RISC-V Vector is a real vector architecture with scalable vector length. By the way, the main different between vector architecture and multimedia SIMD instruction set extensions is how the instructions control is implemented (control register like vl and control instruction like setvl is unique in vector) and communication between execution unit and memory unit (elements are processed sequentially in the pipelined unit on the vector architecture). And whether the register is variable length is not one of the distinguishing features. SIMD macrosThe above information is given from the architectural perspective, but I think when we discuss SIMD-related macros in OpenCV, we should look at the programming model view. In my personal opinion, maybe we don't care if the architecture is traditional SIMD or SIMD with predication or pipelined Vector, from the perspective of the Universal Intrinsic designing and programming model, we only care if the length of the vector registers is fixed or variable, in another word, if the instructions are designed for fixed-size vector length only or for variable vector length. and then let's go back to the starting point and think about the significance of our use of SIMD macros
Accordingly, I find that we need a total of 3 macros. one for scalable vector length, one for fixed vector length, one for both vector length. ProposalSeems the following proposal is the most logical:
However, we need to modify the current CV_SIMD to CV_SIMD_FIXED first, which may cause confusion. And then we need to change it back from CV_SIMD_FIXED to CV_SIMD when some contributor modifies the loop with the new API. But it is reasonable because as I said in the initial comment in this PR, the macro for scalable vector length and for fixed vector length is temporary, when we modify all the SIMD loops with the new API, we only need one macro named CV_SIMD. Then here is another proposal, which will not require massive modification of the existing CV_SIMD
Maybe CV_ANY_SIMD is more fluent. |
|
Macro usage: We should keep "Scalable" code requires more limitations, so optimized "scalable" code is a subset of all optimized SIMD/vector code. Naming:
Details |
|
Hello @alalek , I agree with you that we should avoid use "vector" in the name. Not only to avoid confuses, but also because “scalable” is a more appropriate statement. About the macro usage, I have some different opinions:
I think all the code wrapped by
Yes. But the reason of using CV_SIMD_SCALABLE is we propose the new interface (API) of Universal Intrinsic (and the reason for proposing a new API is that the existing API is not compatible with variable-length architectures). In this case,
#if CV_SIMD //origin SIMD loop
for (int i = 0; i < N; i += v_float32::nlanes )
vx_load([i]...)
vc = va + vb; // not work in rvv back-end, need modify
v_store([i]...)
#endif
#if CV_SIMD || CV_SIMD_SCALABLE //modified SIMD loop
for (int i = 0; i < N; i += Vtraits<v_float32>::vlanes() )
vx_load([i]...)
vc = v_add(va, vb); //Compatible with CV_SIMD macros because of the compatibility layer
v_store([i]...)
#endifAnd there is a full example in #22520 For more detiles about rvv backed and the compatibility layer, please refer to intrin_rvv_scalable.hpp#L18 and intrin.hpp#L714 |
If |
does it means do some thing like this? #if CV_SIMD
#define CV_SIMD_SCALABLE 1
#endifThis does avoid complex macros, but I propose not to do this becasue of two reasons
|
|
Logic behind this is simple:
(the second case is tested regularly on CI in different ways) |
Yes, I totally agree with that! In fact, that is why I introduced I would like to clarify the state of these two macros on different backend:
The optimized code (written by new API) has aready support both variable backend and fixed-size backend, and then with I think we're very close to a consensus, the only difference is that I think a single macro that defined as |
|
Hello @alalek and @hanliutong @vpisarev . I want to refresh the discussion and finally drive to some conclusion. I vote for
Or even do not introduce hew shorter macro al all. Pros for the solution:
|
|
Just a note, that |
|
Hello @asmorkalov @alalek @vpisarev , I have updated the code according to the previous discussion. Thanks you all for your suggestions! I would like to clarify the original idea of introducing a shorter macro: the existing wide universal intrinsic code (guarded with |
|
Hello @hanliutong I apologize for large delay. We discussed the naming on core team meeting couple of times and finally have driven to consensus. The team decided not to introduce new macro, because of very low adoption rate for today and no consensus on naming. So,
|
|
Hello @asmorkalov, I have already update this PR and #22520 (HSV) according to your comment. |
In this patch, we rename the SIMD macro for more clearer, we also introduce a new macro named
CV_SIMD_OR_VECTORinstead ofCV_SIMD || CV_SIMD_SCALABLEto make it more conciseAnd I also think the new macro
CV_SIMD_OR_VECTORis also a temporary solution (please correct me if i'm wrong). SinceCV_SIMDis inherently designed for multiple register widths with size-agnostic code, all the code blocks wrapped inCV_SIMDshould be compatible with the RVV backend. If we can (and also should) translate/modify all the current CV_SIMD code blocks right now, we do not need theCV_SIMD_OR_VECTORmacro anymore. The reasonCV_SIMD_OR_VECTORis needed for now is that there are 442 use cases ofCV_SIMDin 53 files, modifying them would takes a long time.Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.