
ARMv8 CPU features management is broken for some cases #24588

@asmorkalov

System Information

Platform: ARMv8 with the dotprod extension or the FP16 NEON extension
OS: Linux
Compiler: GCC

Detailed description

OpenCV handles ARMv8 extensions incorrectly. Other architectures enable each extension with its own flag, e.g. -msse3, -mavx. ARMv8 instead expects the extensions to be appended to the architecture name in a single -march option, e.g. -march=armv8.2-a+fp16+bf16.

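cmake -DCPU_BASELINE="NEON;NEON_FP16;NEON_DOTPROD" ../opencv
passes the wrong flags to the compiler, one -march option per extension: -march=armv8.2-a+dotprod -march=armv8.2-a+fp16
A later -march option overrides an earlier one, so only +fp16 takes effect and dotprod is dropped. The expected flag is a single combined option, -march=armv8.2-a+dotprod+fp16.

This leads to a build error: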

[ 57%] Building CXX object modules/core/CMakeFiles/opencv_core.dir/src/matrix.cpp.o
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/cv_cpu_dispatch.h:83,
                 from /home/aromanov/opencv/modules/core/include/opencv2/core/cvdef.h:361,
                 from /home/aromanov/opencv/modules/core/include/opencv2/core.hpp:52,
                 from /home/aromanov/opencv/modules/core/include/opencv2/core/utility.hpp:56,
                 from /home/aromanov/opencv/modules/core/src/precomp.hpp:53,
                 from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h: In function ‘cv::hal_baseline::v_uint32x4 cv::hal_baseline::v_dotprod_expand(const cv::hal_baseline::v_uint8x16&, const cv::hal_baseline::v_uint8x16&, const cv::hal_baseline::v_uint32x4&)’:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:33666:1: error: inlining failed in call to always_inline ‘uint32x4_t vdotq_u32(uint32x4_t, uint8x16_t, uint8x16_t)’: target specific option mismatch
33666 | vdotq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
      | ^~~~~~~
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin.hpp:221,
                 from /home/aromanov/opencv/modules/core/src/precomp.hpp:88,
                 from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:708:37: note: called from here
  708 | OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP(v_uint32x4, v_uint8x16, u32)
      |                                     ^
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:705:12: note: in definition of macro ‘OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP’
  705 |     return _Tpvec1(vdotq_##suffix(c.val, a.val, b.val)); \
      |            ^~~
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/cv_cpu_dispatch.h:83,
                 from /home/aromanov/opencv/modules/core/include/opencv2/core/cvdef.h:361,
                 from /home/aromanov/opencv/modules/core/include/opencv2/core.hpp:52,
                 from /home/aromanov/opencv/modules/core/include/opencv2/core/utility.hpp:56,
                 from /home/aromanov/opencv/modules/core/src/precomp.hpp:53,
                 from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:33666:1: error: inlining failed in call to always_inline ‘uint32x4_t vdotq_u32(uint32x4_t, uint8x16_t, uint8x16_t)’: target specific option mismatch
33666 | vdotq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
      | ^~~~~
In file included from /home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin.hpp:221,
                 from /home/aromanov/opencv/modules/core/src/precomp.hpp:88,
                 from /home/aromanov/opencv/modules/core/src/matmul.dispatch.cpp:44:
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:708:37: note: called from here
  708 | OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP(v_uint32x4, v_uint8x16, u32)
      |                                     ^
/home/aromanov/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:705:12: note: in definition of macro ‘OPENCV_HAL_IMPL_NEON_DOT_PRODUCT_OP’
  705 |     return _Tpvec1(vdotq_##suffix(c.val, a.val, b.val)); \
      |            ^~~~~
make[3]: *** [modules/core/CMakeFiles/opencv_core.dir/build.make:597: modules/core/CMakeFiles/opencv_core.dir/src/matmul.dispatch.cpp.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [CMakeFiles/Makefile2:2297: modules/core/CMakeFiles/opencv_core.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:2703: modules/dnn/CMakeFiles/opencv_test_dnn.dir/rule] Error 2
make: *** [Makefile:626: opencv_test_dnn] Error 2
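
The log above is consistent with the dotprod extension being lost when the second -march option overrides the first. As a rough illustration of the merging that seems to be needed, here is a minimal Python sketch that folds repeated -march options into one. The helper name `merge_march_flags` is hypothetical and this is not OpenCV's CMake logic; it only shows the intended result for the flags quoted in this report.

```python
def merge_march_flags(flags):
    """Fold repeated -march=<base>+<ext> options into one -march option.

    Hypothetical helper for illustration only; not part of OpenCV.
    """
    base = None
    extensions = []  # first-seen order, no duplicates
    others = []      # every flag that is not a -march option
    for flag in flags:
        if not flag.startswith("-march="):
            others.append(flag)
            continue
        arch, *exts = flag[len("-march="):].split("+")
        if base is None:
            base = arch
        elif arch != base:
            # Assumption: mixing base architectures is treated as an error here.
            raise ValueError(f"conflicting base architectures: {base} vs {arch}")
        for ext in exts:
            if ext not in extensions:
                extensions.append(ext)
    if base is None:
        return others
    return others + ["-march=" + "+".join([base] + extensions)]


print(merge_march_flags(["-O2", "-march=armv8.2-a+dotprod", "-march=armv8.2-a+fp16"]))
# prints ['-O2', '-march=armv8.2-a+dotprod+fp16']
```

Raising on a base-architecture mismatch is a design choice of this sketch; a real fix might instead pick the highest architecture level required by the requested features.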

Steps to reproduce

cmake -DCPU_BASELINE="NEON;NEON_FP16;NEON_DOTPROD" ../opencv
make -j4
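
To confirm the duplicated option without waiting for the build to fail, one option is to configure with CMake's `CMAKE_EXPORT_COMPILE_COMMANDS=ON` and scan the generated `compile_commands.json`. The Python sketch below is an assumption about how that check could look; the function names and the sample entries are hypothetical and only mirror the flags quoted in this report.

```python
import json
import shlex


def load_entries(path):
    """Load a compile_commands.json file (a JSON list of entries)."""
    with open(path) as fh:
        return json.load(fh)


def find_repeated_march(entries):
    """Return (file, [-march flags]) for entries with more than one -march."""
    hits = []
    for entry in entries:
        # An entry carries either an "arguments" list or a "command" string.
        args = entry.get("arguments") or shlex.split(entry.get("command", ""))
        march = [a for a in args if a.startswith("-march=")]
        if len(march) > 1:
            hits.append((entry["file"], march))
    return hits


# Hypothetical sample entries; a real run would use
# load_entries("compile_commands.json") from the build directory.
sample = [
    {"file": "matmul.dispatch.cpp",
     "command": "c++ -O3 -march=armv8.2-a+dotprod -march=armv8.2-a+fp16 -c matmul.dispatch.cpp"},
    {"file": "ok.cpp",
     "command": "c++ -O3 -march=armv8.2-a+dotprod+fp16 -c ok.cpp"},
]

for path, flags in find_repeated_march(sample):
    print(path, flags)
# prints matmul.dispatch.cpp ['-march=armv8.2-a+dotprod', '-march=armv8.2-a+fp16']
```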

Issue submission checklist

  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
  • I updated to the latest OpenCV version and the issue is still there
  • There is reproducer code and related data files (videos, images, onnx, etc)
