Fixed several cases of unaligned pointer cast #26547
  {
-     result += (int)CV_POPCNT_U64(*(uint64*)(a + i));
+     uint64_t val;
+     std::memcpy(&val, a + i, sizeof(val));
Do we need an extra code path behind an alignment check to avoid possible perf regressions?
On x86_64 and AArch64 the machine code will be the same (https://godbolt.org/z/4sMoE8cWG); on ARMv7 and RISC-V it could be less efficient. IMO we can keep the simple code for now and optimize it later if necessary.
It might be better to add a separate scalar function, template <typename T> int popcnt64(T* ptr), instead of this macro, and maybe sum bits for each byte instead of switching between aligned/unaligned loads for specific platforms.
After discussion, we decided to leave it without extra alignment checks, because this is tail processing and shouldn't cause perf degradation on popular platforms.
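For reference, the two options discussed in this thread could look roughly like the sketch below. The helper names popcnt64_memcpy and popcnt64_bytes are hypothetical and not part of OpenCV, and a portable bit-counting loop stands in for CV_POPCNT_U64 so the snippet compiles on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not OpenCV API): count set bits in the 8 bytes at ptr.
// std::memcpy makes the load well-defined for any alignment; per the
// discussion above, x86_64 and AArch64 compilers emit a single load for it.
static inline int popcnt64_memcpy(const unsigned char* ptr)
{
    std::uint64_t val;
    std::memcpy(&val, ptr, sizeof(val));
    int bits = 0;
    while (val) { val &= val - 1; ++bits; }  // clear the lowest set bit each step
    return bits;
}

// Hypothetical per-byte alternative: no wide load at all, so alignment
// never matters, at the cost of eight narrow reads.
static inline int popcnt64_bytes(const unsigned char* ptr)
{
    int bits = 0;
    for (std::size_t k = 0; k < 8; ++k)
    {
        unsigned v = ptr[k];
        while (v) { v &= v - 1; ++bits; }
    }
    return bits;
}
```

Both variants return the same count for the same 8 bytes; which one is faster on ARMv7 or RISC-V would need to be measured.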
modules/imgcodecs/src/utils.hpp
Outdated
  {
-     return (((const int*)"\0\x1\x2\x3\x4\x5\x6\x7")[0] & 255) != 0;
+     const uint32_t val = 1;
+     return (*(uint8_t *)&val != 1);
AFAIK, we have a compiler definition from the CMake scripts. See WORDS_BIGENDIAN.
Updated this function to use the preprocessor macro. The same macro is used in the libtiff, libwebp and softfloat libraries.
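A minimal sketch of the macro-based check next to a runtime probe, for comparison. The function names are hypothetical, and the sketch assumes the build system defines WORDS_BIGENDIAN only on big-endian targets, as suggested above; the actual updated function is not shown in this thread.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical name. Compile-time variant: relies on the build system
// (assumed here) defining WORDS_BIGENDIAN for big-endian targets.
static inline bool isBigEndianMacro()
{
#ifdef WORDS_BIGENDIAN
    return true;
#else
    return false;
#endif
}

// Hypothetical name. Runtime variant: inspects the first byte of a known
// 32-bit value through memcpy, avoiding any pointer cast.
static inline bool isBigEndianRuntime()
{
    const std::uint32_t val = 1;
    unsigned char first;
    std::memcpy(&first, &val, 1);
    return first != 1;  // little-endian stores the low byte (1) first
}
```

The macro variant needs no memory access at all, which is presumably why it was preferred here.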
- v_int16 v_mul01 = v_reinterpret_as_s16(vx_setall_u32(*((uint32_t*)m)));
+ uint32_t val01;
+ std::memcpy(&val01, m, sizeof(val01));
+ v_int16 v_mul01 = v_reinterpret_as_s16(vx_setall_u32(val01));
The code was already wrong with respect to little/big-endian correctness.
AFAIK, OpenCV has not been tested on big-endian architectures, especially with SIMD optimizations. Perhaps we should leave it as-is for now.
BTW, this compiler automated check should also enable
c17a4ee to c58b6bf
Fixed several cases of casting from uchar* to a wider pointer type (e.g. uint64_t*), which can cause unaligned access.
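The general pattern behind these fixes can be sketched as a small generic helper. loadUnaligned is a hypothetical name used only for illustration; the PR itself inlines std::memcpy at each call site, as the diffs show.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not OpenCV API): read a T from a byte buffer at any
// alignment. Replaces the pattern *(const T*)ptr, which is undefined
// behavior when ptr is not suitably aligned for T.
template <typename T>
static inline T loadUnaligned(const unsigned char* ptr)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based load requires a trivially copyable type");
    T val;
    std::memcpy(&val, ptr, sizeof(T));
    return val;
}
```

For example, loadUnaligned<std::uint64_t>(a + i) would take the place of *(uint64*)(a + i).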