Add support for v_exp (exponential)#24941
Conversation
|
The accuracy of the AVX512 instruction set will be worse when the result is too large. |
The difference comes from the errors of Remes algorithm. Different implements will return different results. e^83.4375 = All of them are different but the 6 significant digitals are the same. By definition, the error is always less than 1 ulp (unit in the last place). The larger the number, the larger the ulp will be. So I think it's acceptable and I propose to relax the tests. I use |
|
|
@WanliZhong, I think, we really need to hurry up. Please, finalize this PR as soon as possible. Then, submit another PR with v_log, v_sin and v_cos implemented. |
opencv-alalek
left a comment
There was a problem hiding this comment.
Take a look how to run SIMD emulator configuration (intrin_cpp.hpp):
https://pullrequest.opencv.org/buildbot/builders/4_x_etc-simd-emulator-lin64/builds/100378
|
@WanliZhong friendly reminder. |
|
Possible to merge it soon? This is needed by v_erf, which is needed by gelu acceleration. |
60b02a0 to
f9d3e58
Compare
|
@asmorkalov Hi Alexander, I think we can review this PR again. I have finalized the code as the comments that you, Vadim and Alekin left before. |
Add support for v_log (Natural Logarithm) #25781 This PR aims to implement `v_log(v_float16 x)`, `v_log(v_float32 x)` and `v_log(v_float64 x)`. Merged after #24941 TODO: - [x] double and half float precision - [x] tests for them - [x] doc to explain the implementation ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
|
There is regression detected by weekly build: http://pullrequest.opencv.org/buildbot/builders/4_x_etc-simd-emulator-lin64/builds/100410 |
|
I was able to reproduce the regression locally on Linux host with |
|
I have reproduced this error too, I am trying to fix it |
| #endif | ||
|
|
||
| inline v_float32 v_exp(const v_float32 &x) { | ||
| const v_float32 _vexp_lo_f32 = vx_setall_f32(-88.3762626647949f); |
There was a problem hiding this comment.
In general OpenCV SIMD provides for each compilation unit (.cpp file) all available of these:
- SIMD128 types and
v_functions - SIMD256 types and
v256_functions - SIMD512 types and
v512_functions - aliases for SIMDMAX types and necessary
vx_functions
Here we have just one definition.
And plus one SIMD128 implementation in intrin_cpp.hpp (which conflicts in SIMD128_CPP case)
See simd_utils.impl.hpp which provides 4 implementations inside. If you put your header near that then you should follow the similar scheme (or have the same result).
Note we should have 4 #include statements of that file (to avoid code duplication).
There was a problem hiding this comment.
So should I provide v_, v256_, v512_ and vx_ versions implementation in intrin_math.hpp like simd_utils.impl.hpp? That's really will make many duplicated code.
There was a problem hiding this comment.
It must be a better way of doing it without duplication.
This PR aims to implement
v_exp(v_float16 x),v_exp(v_float32 x)andv_exp(v_float64 x).Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.