AVX does not necessarily imply FP16c (even though OpenCV's CMake does) #18779
Description
System information (version)
- OpenCV => 4.5
- Operating System / Platform => Linux
- Compiler => custom/bazel-based (not CMake)
Detailed description
I could not reproduce this with CMake, but when compiling opencv_core with -mavx -mavx2 -mfma (and without -mf16c) I get the following error:
include/opencv2/core/cvdef.h:849:39: error: '__builtin_ia32_vcvtps2ph' needs target feature f16c
w = (ushort)_mm_cvtsi128_si32(_mm_cvtps_ph(v, 0));
^
include/f16cintrin.h:96:12: note: expanded from macro '_mm_cvtps_ph'
(__m128i)__builtin_ia32_vcvtps2ph((__v4sf)(__m128)(a), (imm))
Indeed, according to the Intel Intrinsics Guide (https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_cvtph_ps&expand=1717), F16C must be enabled for _mm_cvtps_ph and _mm256_cvtph_ps to work.
Apparently, all processors supporting AVX2 also support F16C, but you may still choose not to enable F16C at compile time. CMake enables it by default, which is why the build farm does not see the error.
I believe code using _mm_cvtps_ph should be guarded for the presence of f16c.
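To illustrate the kind of guard I mean, here is a minimal sketch. It assumes a GCC/Clang-style compiler, which defines `__F16C__` only when F16C is enabled (for example via -mf16c). The helper name `float_to_half_bits` is hypothetical and is not OpenCV's actual code; the portable branch is just one possible fallback (round to nearest even, matching the rounding mode 0 passed to `_mm_cvtps_ph` above).

```cpp
#include <cstdint>
#include <cstring>
#if defined(__F16C__)
#include <immintrin.h>
#endif

// Hypothetical helper (not OpenCV API): convert a float to IEEE half-precision bits.
inline std::uint16_t float_to_half_bits(float x)
{
#if defined(__F16C__)
    // Hardware path: only compiled when the target feature f16c is enabled.
    __m128 v = _mm_set_ss(x);
    return static_cast<std::uint16_t>(_mm_cvtsi128_si32(_mm_cvtps_ph(v, 0)));
#else
    // Portable path: software conversion with round-to-nearest-even.
    std::uint32_t f;
    std::memcpy(&f, &x, sizeof f);
    const std::uint32_t sign = (f >> 16) & 0x8000u;
    const std::uint32_t a = f & 0x7fffffffu;            // magnitude bits

    if (a > 0x7f800000u)                                // NaN -> quiet NaN
        return static_cast<std::uint16_t>(sign | 0x7e00u);
    if (a >= 0x47800000u)                               // >= 2^16 (or Inf) -> Inf
        return static_cast<std::uint16_t>(sign | 0x7c00u);
    if (a < 0x33000000u)                                // < 2^-25 -> signed zero
        return static_cast<std::uint16_t>(sign);

    std::uint32_t h, rem, halfway;
    if (a >= 0x38800000u) {
        // Normal half: rebias the exponent (127 -> 15) and drop 13 mantissa bits.
        const std::uint32_t r = a - 0x38000000u;
        h = r >> 13;
        rem = r & 0x1fffu;
        halfway = 0x1000u;
    } else {
        // Subnormal half: shift the 24-bit significand into place.
        const std::uint32_t shift = 126u - (a >> 23);   // in [14, 24]
        const std::uint32_t mant = (a & 0x7fffffu) | 0x800000u;
        h = mant >> shift;
        rem = mant & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    }
    // Round to nearest, ties to even; a carry may correctly bump the exponent.
    if (rem > halfway || (rem == halfway && (h & 1u)))
        ++h;
    return static_cast<std::uint16_t>(sign | h);
#endif
}
```

With a guard like this, the translation unit compiles with -mavx -mavx2 -mfma alone and only uses the intrinsic when -mf16c is also passed. MSVC does not define `__F16C__`, so a real fix would need a different condition there.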
Issue submission checklist
- I report the issue, it's not a question
- I checked the problem with documentation, FAQ, open issues, answers.opencv.org, Stack Overflow, etc and have not found solution
- I updated to latest OpenCV version and the issue is still there