core: add 64F intrinsic in HAL NEON #7175
Here are the performance measurements.
For the Windows measurements, I removed the AVX implementation and switched off IPP.
Please note that this change is not aimed at improving performance, so most of the measured numbers don't change. This is a follow-up commit to #7110.
Here is an extra measurement result
@tomoaki0705, it looks like gcc from the Android NDK doesn't support several intrinsics. You should probably implement them with a workaround similar to the one made here: https://github.com/opencv/opencv/pull/6942/files#diff-b6faf5330d7cd50cbadecd85ae1bec5a
* use universal intrinsic for accumulate series using float/double
* accumulate, accumulateSquare, accumulateProduct and accumulateWeighted
* add v_cvt_f64_high in both SSE/NEON
* add test for conversion v_cvt_f64_high in test_intrin.cpp
* improve some existing universal intrinsic by using new instructions in Aarch64
* add workaround for Android build in intrin_neon.hpp
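To illustrate the idea behind a high-half float-to-double conversion in an accumulate loop, here is a minimal portable sketch. It is not the OpenCV implementation: `F32x4`, `F64x2`, `cvt_f64_low`, `cvt_f64_high` and `accumulate_model` are hypothetical scalar stand-ins chosen for this example, and the assumption is that a 128-bit register holds four floats or two doubles, so widening four floats takes one conversion for the low pair and one for the high pair.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar stand-ins for 128-bit SIMD registers (not OpenCV types).
using F32x4 = std::array<float, 4>;
using F64x2 = std::array<double, 2>;

// Model of a low-half conversion: lanes 0 and 1 are widened to double.
inline F64x2 cvt_f64_low(const F32x4& v)
{
    return { static_cast<double>(v[0]), static_cast<double>(v[1]) };
}

// Model of a high-half conversion: lanes 2 and 3 are widened to double.
inline F64x2 cvt_f64_high(const F32x4& v)
{
    return { static_cast<double>(v[2]), static_cast<double>(v[3]) };
}

// Adds n float samples from src into the double accumulator dst.
// Processes four elements per step, then finishes with a scalar tail.
inline void accumulate_model(const float* src, double* dst, std::size_t n)
{
    assert(src != nullptr && dst != nullptr);

    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
    {
        const F32x4 v = { src[i], src[i + 1], src[i + 2], src[i + 3] };
        const F64x2 lo = cvt_f64_low(v);   // elements i, i + 1
        const F64x2 hi = cvt_f64_high(v);  // elements i + 2, i + 3

        dst[i]     += lo[0];
        dst[i + 1] += lo[1];
        dst[i + 2] += hi[0];
        dst[i + 3] += hi[1];
    }

    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < n; ++i)
        dst[i] += static_cast<double>(src[i]);
}
```

With a dedicated high-half conversion, the loop can consume all four float lanes of one load without first shuffling the upper pair into the lower lanes, which is presumably why the conversion is added for both SSE and NEON.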
9867582 to 7fef96b
Sorry, I accidentally pushed different commits and restarted the tests. It seems that the Android build needs to be triggered manually by someone else, but this test run looks promising: the workaround proposed by @mshabunin appears to work well. How does it look?
Looks good to me! 👍
This pullrequest changes