Fixed invalid cast and unaligned memory access#23734
Conversation
|
Here you can see that code generation is as good as before: https://godbolt.org/z/bcW3qxd5K |
|
The one bot warning: Doesn't seem like it could have been caused by my change here... |
|
@seanm, sorry, this patch cannot be accepted. We use platform-agnostic little-endian data representation. On big-endian platforms the proposed code will provide wrong results. |
@vpisarev thanks for your review. It did look to me like there was supposed to be byte-swapping here, but if that's the intention, I think the code is already broken. Before my change, it had: #if defined __i386__ || defined(_M_IX86) || defined __x86_64__ || defined(_M_X64)
#define CV_UNALIGNED_LITTLE_ENDIAN_MEM_ACCESS 1
#else
#define CV_UNALIGNED_LITTLE_ENDIAN_MEM_ACCESS 0
#endifBut Intel is not the only little endian processor. As you know, almost everything is little endian now. Apple Silicon for example is little endian, but goes into the What is I actually have OpenBSD working on an old big endian PowerPC Mac, I could try to build OpenCV there... |
It is an internal optimization flag for some platforms which are LE and they support unaligned access by default (even x86 has "align" flag, but it is not the default - https://stackoverflow.com/questions/548164/mis-aligned-pointers-on-x86). If you have issue with sanitizers please update condition to filter out enabling of that flag. This patch introduces a bug - on different platforms data is interpreted / stored in a different way (LE / BE). This is critical for storing / transferring of data which should be platform-independent. See also: |
|
The patch was discussed on OpenCV core team meeting. The team decided to reject it, because it breaks some platforms. |
|
Please reopen. I can update the patch to fix the endian issue. But discussion is required first.
I think you misunderstood my question. I know about unaligned access on x86. I'm saying that, before my patch even, the code was wrong. This below test is not determining if the system is little endian vs big endian. It is determining if the system is x86 or not. x86 is not the only little endian system. #if defined __i386__ || defined(_M_IX86) || defined __x86_64__ || defined(_M_X64)
#define CV_UNALIGNED_LITTLE_ENDIAN_MEM_ACCESS 1
#else
#define CV_UNALIGNED_LITTLE_ENDIAN_MEM_ACCESS 0
#endifThen this code: static inline int readInt(const uchar* p)
{
#if CV_UNALIGNED_LITTLE_ENDIAN_MEM_ACCESS
return *(const int*)p;
#else
int val = (int)(p[0] | (p[1] << 8) | (p[2] << 16) | (p[3] << 24));
return val;
#endif
}Is using Little endian systems like Apple Silicon are going into the #else branch and doing the byte swap. Is that what you want? I though no, which I why I removed the byte swapping code. Which big endian systems does OpenCV actually support? I tried building on PowerPC OpenBSD, but OpenCV does not even compile. Does your CI have any big endian? |
There is no byte-swap in that code for other LE systems.
Please read the previous comment again. Your assumption is wrong.
There is no official support for *BSD. It is community driven activity. |
Although acceptible to Intel CPUs, it's still undefined behaviour according to the C++ standard. It can be replaced with memcpy, which makes the code simpler, and it generates the same assembly code with gcc and clang with -O2 (verified with godbolt). Also expanded the test to include other little endian CPUs by testing for __LITTLE_ENDIAN__.
You are correct! I got that very mixed up! Please see my updated commit (and commit message). |
|
@opencv-alalek does this seem correct and mergeable now? |
|
@vpisarev @opencv-alalek could you take a look? |
1 similar comment
|
@vpisarev @opencv-alalek could you take a look? |
opencv-alalek
left a comment
There was a problem hiding this comment.
godbolt shows that compilers are smart enough to optimize this properly, so no changes are expected for modern platforms.
Although acceptable to Intel CPUs, it's still undefined behaviour according to the C++ standard.
It can be replaced with memcpy, which makes the code much simpler, and it generates the same assembly code with gcc and clang with -O2 (verified with godbolt).