Fix misaligned-pointer-use in intrin_sse.hpp#23131
Conversation
| /* Copy misaligned ptr to aligned &tmp */ \ | ||
| /* (memcpy() should be optimized out) */ \ | ||
| memcpy(&tmp, ptr, 4); \ | ||
| __m128i a = _mm_cvtsi32_si128(tmp); \ |
There was a problem hiding this comment.
Slowdown code execution to eliminate compiler warning is not an acceptable way.
memcpy() should be optimized out
Please provide evidences, e.g. on https://godbolt.org/
There was a problem hiding this comment.
https://godbolt.org/z/PxPzd8h11
Are there specific compiler flags for OpenCV? I tried default GCC and Clang configurations: PrintMisalignedPtr() and PrintAlignedPtr() have the same assembly body.
ARM GCC even noticed @ unaligned.
|
It is also Undefined Behavior according to C11:
|
|
Thanks @vrabaud for pointing out that However the presubmit |
| inline _Tpvec v_load_expand_q(const _Tp* ptr) \ | ||
| { \ | ||
| __m128i a = _mm_cvtsi32_si128(*(const int*)ptr); \ | ||
| __m128i a = _mm_loadu_si32(ptr); \ |
There was a problem hiding this comment.
_mm_loadu_si32
This won't compile (GCC 5 (4.8 on 3.4 branch) / MSVS 2015 are in the support list).
Check approach from this PR with used sanitizer: #14022
There was a problem hiding this comment.
Done. Thanks for the example.
I was not able to reproduce the sanitizer error internally anymore (I guess some config changed) so I cannot check if this solution works for me.
Fix misaligned-pointer-use in intrin_sse.hpp * Fix misaligned-pointer-use in intrin_sse.hpp * Use _mm_loadu_si32() instead of memcpy() * Use CV_DECL_ALIGNED instead of _mm_loadu_si32()
ptrmay generate amisaligned-pointer-usewarning with some sanitizers when cast toint*.memcpy()it to the alignedint tmp.summary.pyofrun.py -t coreat 3.4 head and with patch gave this report: opencv_summary_report.txtThe geometric mean and average of the right-most column is 0.98. The median is 0.99.
I guess this "speed loss" comes from performance testing imprecision.
Thanks to @vrabaud for this fix.
Pull Request Readiness Checklist
Patch to opencv_extra has the same branch name.