Change the lsx to baseline features. #24565
This patch changes LSX to a baseline feature and LASX to a dispatch feature. Additionally, the runtime detection methods for LASX and LSX have been modified.
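For readers unfamiliar with the distinction: a baseline feature is assumed at compile time, while a dispatch feature has to be confirmed at run time before the optimized path is taken. The following is a minimal sketch of one way such a check could look on Linux, not the code from this patch. It assumes the LSX and LASX bits from the kernel's LoongArch `<asm/hwcap.h>` (bit 4 and bit 5), redefined locally so the sketch builds on any host; `SimdSupport`, `decode_hwcap`, and `detect_simd` are hypothetical names.

```cpp
#if defined(__linux__) && defined(__loongarch__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP
#endif

// Assumption: bit positions mirror HWCAP_LOONGARCH_LSX / HWCAP_LOONGARCH_LASX
// from the Linux LoongArch uapi header; redefined here for portability.
constexpr unsigned long kHwcapLsx  = 1UL << 4;
constexpr unsigned long kHwcapLasx = 1UL << 5;

// Hypothetical result type for this sketch.
struct SimdSupport {
    bool lsx;
    bool lasx;
};

// Pure decoding step, kept separate so it can be exercised on any machine.
inline SimdSupport decode_hwcap(unsigned long hwcap) {
    return { (hwcap & kHwcapLsx) != 0, (hwcap & kHwcapLasx) != 0 };
}

// Queries the kernel on LoongArch Linux; reports "no support" elsewhere.
inline SimdSupport detect_simd() {
#if defined(__linux__) && defined(__loongarch__)
    return decode_hwcap(getauxval(AT_HWCAP));
#else
    return { false, false };
#endif
}
```

With LSX in the baseline, only the `lasx` flag would need to be consulted before selecting a LASX-specific code path.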
@fengyuentau Please run the accuracy and performance tests. The previous PR was closed for some reason, so I opened a new one.
Hello @CNClareChen, the accuracy tests all pass; the only thing that confuses me is the performance regressions. Could you elaborate on them?
@fengyuentau This patch does not make many changes to SIMD optimization, so there should be no widespread performance degradation. After applying this patch, the build no longer needs parameters such as -DCPU_BASELINE to be specified. I did not see such a large performance gap in my local tests. I recommend running the tests a few more times.
{
    __m256i res = __lasx_xvsrarni_w_d(a.val, a.val, n);
    __lasx_xvstelm_d(res, ptr, 0, 0);
    __lasx_xvstelm_d(res, ptr, 8, 2);
> This patch does not make many changes to SIMD optimization
So these changes are unintended, right?
No, I modified these intentionally. I only changed the implementation to avoid mixing in LSX instructions, and the number of instructions did not increase.
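To make the intent of the quoted lines easier to check, here is a scalar reference for what they appear to compute. This is a sketch based on my reading of the snippet, not the patch's code: it assumes `__lasx_xvsrarni_w_d` performs a rounding arithmetic right shift by `n` with 64-to-32-bit narrowing (no saturation), and that the two 64-bit element stores (elements 0 and 2, at byte offsets 0 and 8) write the narrowed low half of each 128-bit lane contiguously. `rounding_shr` and `rshr_pack_store_ref` are hypothetical names.

```cpp
#include <cstdint>

// Rounding arithmetic right shift: shift, then add back the last bit that
// was shifted out (round half up). Relies on >> being arithmetic for
// negative values, which holds on mainstream compilers.
inline std::int64_t rounding_shr(std::int64_t x, int n) {
    if (n == 0) return x;  // nothing is shifted out, so nothing to round
    return (x >> n) + ((x >> (n - 1)) & 1);
}

// Hypothetical scalar model: shift each of the four 64-bit inputs with
// rounding, truncate to 32 bits, and store the results contiguously.
template <int n>
void rshr_pack_store_ref(std::int32_t* dst, const std::int64_t (&src)[4]) {
    for (int i = 0; i < 4; ++i) {
        dst[i] = static_cast<std::int32_t>(rounding_shr(src[i], n));
    }
}
```

A model like this could be compared against the vector path on a LoongArch machine to confirm that switching to the pure-LASX stores leaves the results unchanged.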
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.