ARMv8 CPU features management is broken for some cases #24588
Closed
Labels
bug, category: build/install, platform: arm (ARM boards related issues: RPi, NVIDIA TK/TX, etc)
Description
System Information
Platform: ARMv8 + dotprod extension or FP16 NEON extension
OS: Linux
Compiler: GCC.
Detailed description
OpenCV handles ARMv8 extensions incorrectly. On other architectures each extension is enabled by its own compiler flag, e.g. -msse3, -mavx. On ARMv8, extensions are instead appended to the architecture name inside a single -march option, e.g. -march=armv8.2-a+fp16+bf16.
cmake -DCPU_BASELINE="NEON;NEON_FP16;NEON_DOTPROD" ../opencv
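This command passes the wrong flags to the compiler: -march=armv8.2-a+dotprod -march=armv8.2-a+fp16. When GCC receives -march more than once, the last occurrence takes effect, so only +fp16 is enabled and the dotprod intrinsics are unavailable.
This leads to a build error: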
[ 57%] Building CXX object modules/core/CMakeFiles/opencv_core.dir/src/matrix.cpp.o
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/cv_cpu_dispatch.h:83,
from /home/aromanov/opencv/modules/core/include/opencv2/core/cvdef.h:361,
from /home/aromanov/opencv/modules/core/include/opencv2/core.hpp:52,
from /home/aromanov/opencv/modules/core/include/opencv2/core/utility.hpp:56,
from /home/aromanov/opencv/modules/core/src/precomp.hpp:53,
from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h: In function ‘cv::hal_baseline::v_uint32x4 cv::hal_baseline::v_dotprod_expand(const cv::hal_baseline::v_uint8x16&, const cv::hal_baseline::v_uint8x16&, const cv::hal_baseline::v_uint32x4&)’:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:33666:1: error: inlining failed in call to always_inline ‘uint32x4_t vdotq_u32(uint32x4_t, uint8x16_t, uint8x16_t)’: target specific option mismatch
33666 | vdotq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
| ^~~~~~~
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin.hpp:221,
from /home/aromanov/opencv/modules/core/src/precomp.hpp:88,
from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:708:37: note: called from here
708 | OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP(v_uint32x4, v_uint8x16, u32)
| ^
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:705:12: note: in definition of macro ‘OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP’
705 | return _Tpvec1(vdotq_##suffix(c.val, a.val, b.val)); \
| ^~~
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/cv_cpu_dispatch.h:83,
from /home/aromanov/opencv/modules/core/include/opencv2/core/cvdef.h:361,
from /home/aromanov/opencv/modules/core/include/opencv2/core.hpp:52,
from /home/aromanov/opencv/modules/core/include/opencv2/core/utility.hpp:56,
from /home/aromanov/opencv/modules/core/src/precomp.hpp:53,
from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:33666:1: error: inlining failed in call to always_inline ‘uint32x4_t vdotq_u32(uint32x4_t, uint8x16_t, uint8x16_t)’: target specific option mismatch
33666 | vdotq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
| ^~~~~
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin.hpp:221,
from /home/aromanov/opencv/modules/core/src/precomp.hpp:88,
from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:708:37: note: called from here
708 | OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP(v_uint32x4, v_uint8x16, u32)
| ^
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:705:12: note: in definition of macro ‘OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP’
705 | return _Tpvec1(vdotq_##suffix(c.val, a.val, b.val)); \
| ^~~~~
make[3]: *** [modules/core/CMakeFiles/opencv_core.dir/build.make:597: modules/core/CMakeFiles/opencv_core.dir/src/matmul.dispatch.cpp.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [CMakeFiles/Makefile2:2297: modules/core/CMakeFiles/opencv_core.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:2703: modules/dnn/CMakeFiles/opencv_test_dnn.dir/rule] Error 2
make: *** [Makefile:626: opencv_test_dnn] Error 2
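The "target specific option mismatch" error above is consistent with the two separate -march options described earlier. As an illustration only, the following Python sketch shows the kind of merging that would produce a single -march option carrying all requested extensions. The function name `merge_march_flags` is hypothetical and is not part of OpenCV's build scripts. The sketch assumes every -march flag names the same base architecture and refuses to guess otherwise.

```python
def merge_march_flags(flags):
    """Merge several -march=<arch>+<ext> flags into one (hypothetical helper).

    Flags that are not -march options are passed through unchanged,
    ahead of the single merged -march flag.
    """
    base = None
    extensions = []
    passthrough = []
    for flag in flags:
        if not flag.startswith("-march="):
            passthrough.append(flag)
            continue
        # "armv8.2-a+fp16+bf16" -> arch "armv8.2-a", exts ["fp16", "bf16"]
        arch, *exts = flag[len("-march="):].split("+")
        if base is None:
            base = arch
        elif arch != base:
            # Assumption: mixed base architectures are rejected, not resolved.
            raise ValueError(f"conflicting base architectures: {base} vs {arch}")
        for ext in exts:
            if ext not in extensions:
                extensions.append(ext)
    if base is None:
        return passthrough
    return passthrough + ["-march=" + "+".join([base] + extensions)]


# The two flags from this report collapse into one option.
print(merge_march_flags(["-march=armv8.2-a+dotprod", "-march=armv8.2-a+fp16"]))
```

With the flags from this report, the sketch yields a single `-march=armv8.2-a+dotprod+fp16` option, which follows the same form as the `-march=armv8.2-a+fp16+bf16` example above.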
Steps to reproduce
cmake -DCPU_BASELINE="NEON;NEON_FP16;NEON_DOTPROD" ../opencv
make -j4
Issue submission checklist
- I report the issue, it's not a question
- I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
- I updated to the latest OpenCV version and the issue is still there
- There is reproducer code and related data files (videos, images, onnx, etc)