Added NEON support in builds for Windows on ARM #21630

Merged: alalek merged 3 commits into opencv:4.x from shibayan:arm64-msvc-neon on Feb 26, 2022

@shibayan
Contributor

@shibayan shibayan commented Feb 17, 2022

Since Visual Studio 2022 (17.0), the data types (e.g. int32x4_t) used by ARM NEON intrinsics are defined, so the issue that blocked NEON support in builds for Windows on ARM has been resolved.
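
For illustration, here is a minimal sketch of the kind of code that depends on those types. It is not taken from this PR or from cmake/checks/cpu_neon.cpp; the helper name `add4_and_sum` and the macro `SKETCH_HAVE_NEON` are hypothetical. It assumes `arm_neon.h` is available whenever `__ARM_NEON` or `_M_ARM64` is defined, and falls back to scalar code elsewhere so the same file builds on any host.

```cpp
#include <cstdint>

// Assumption: NEON headers are usable when either macro is defined.
#if defined(__ARM_NEON) || defined(_M_ARM64)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

// Hypothetical helper: adds two 4-element int32 arrays lane by lane
// and returns the sum of the four resulting lanes.
inline std::int32_t add4_and_sum(const std::int32_t* a, const std::int32_t* b)
{
#if SKETCH_HAVE_NEON
    int32x4_t va = vld1q_s32(a);       // load 4 x int32 from a
    int32x4_t vb = vld1q_s32(b);       // load 4 x int32 from b
    int32x4_t vc = vaddq_s32(va, vb);  // lane-wise add
    std::int32_t out[4];
    vst1q_s32(out, vc);                // store the 4 lanes back to memory
    return out[0] + out[1] + out[2] + out[3];
#else
    // Scalar fallback for targets without NEON.
    std::int32_t sum = 0;
    for (int i = 0; i < 4; ++i)
        sum += a[i] + b[i];
    return sum;
#endif
}
```

If the compiler does not define `int32x4_t`, the NEON branch fails to compile, which is roughly what a configure-time NEON check is meant to detect.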

TODO:

  • (alalek) Fix BUILD_SAMPLES=ON mode
    • find_package(OpenCL) finds OpenCL from the CUDA SDK, which can't work with ARM64 cross-compilation
    • added -DWITH_OPENCL=OFF -DCMAKE_DISABLE_FIND_PACKAGE_OpenCL=ON
  • (alalek) Configure MSVS 2022 build image

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom Win
Xbuild_image:Custom Win=msvs2019-arm64
build_image:Custom Win=msvs2022-arm64
buildworker:Custom Win=windows-1
build_contrib:Custom Win=OFF
Xbuild_examples:Custom Win=OFF

@shibayan
Contributor Author

Fixed the HAVE_CPU_NEON_SUPPORT test name, which was displayed incorrectly during the compiler test.

--- Performing Test HAVE_CXX_D__ARM64_DISTINCT_NEON_TYPES (check file: cmake/checks/cpu_neon.cpp)
--- Performing Test HAVE_CXX_D__ARM64_DISTINCT_NEON_TYPES - Success
+-- Performing Test HAVE_CPU_NEON_SUPPORT (check file: cmake/checks/cpu_neon.cpp)
+-- Performing Test HAVE_CPU_NEON_SUPPORT - Success

@asmorkalov
Contributor

Potentially related: #8429

@shibayan
Contributor Author

shibayan commented Feb 21, 2022

Apparently, the ARM64 compiler in VC++ 2019 still has a bug, so it is failing to build.

VC++ 2022 fixes all of the problems, so I was able to build successfully, and the built binaries were confirmed to work on a Windows on ARM machine.
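
As a rough sketch of how code could gate NEON on the compiler version, the snippet below uses a hypothetical predicate, `neon_usable_on_msvc_arm64`, with a threshold of `_MSC_VER >= 1930`. That threshold is an assumption based on Visual Studio 2022 (17.0) reporting `_MSC_VER` 1930. It is not the condition this PR actually uses. The function names are illustrative only.

```cpp
// Hypothetical predicate: true when an MSVC version is assumed to build
// NEON code for ARM64 correctly. 1930 corresponds to Visual Studio 2022 (17.0);
// 1929 and below are Visual Studio 2019 or older.
constexpr bool neon_usable_on_msvc_arm64(int msc_ver)
{
    return msc_ver >= 1930;
}

// Hypothetical compile-time switch built on the predicate above.
constexpr bool sketch_neon_enabled()
{
#if defined(_MSC_VER) && defined(_M_ARM64)
    return neon_usable_on_msvc_arm64(_MSC_VER);  // MSVC targeting ARM64
#elif defined(__ARM_NEON)
    return true;                                 // other compilers with NEON
#else
    return false;                                // no NEON on this target
#endif
}
```

With this sketch, a compiler reporting MSVC 1931 would enable NEON, while a VC++ 2019 toolset (1929 or lower) would not.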

Build CMake command

cmake -G "Visual Studio 17 2022" -A ARM64 -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_SYSTEM_VERSION=10.0 -DCMAKE_SYSTEM_PROCESSOR=ARM64 -DWITH_OPENCL=OFF -DWITH_FFMPEG=OFF -DWITH_CUDA=OFF -DBUILD_EXAMPLES=ON -DBUILD_TESTS=ON ..\
cmake --build . --config Release

CMake stdout log (build log is trimmed)
-- 'Release' build type is used by default. Use CMAKE_BUILD_TYPE to specify build type (Release or Debug)
-- Selecting Windows SDK version 10.0.20348.0 to target Windows 10.0.
-- The CXX compiler identification is MSVC 19.31.31104.0
-- The C compiler identification is MSVC 19.31.31104.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.31.31103/bin/Hostx64/arm64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.31.31103/bin/Hostx64/arm64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detected processor: ARM64
-- Could NOT find PythonInterp (missing: PYTHON_EXECUTABLE) (Required is at least version "2.7")
-- Could NOT find PythonInterp (missing: PYTHON_EXECUTABLE) (Required is at least version "3.2")
-- Performing Test HAVE_CXX_FP:PRECISE
-- Performing Test HAVE_CXX_FP:PRECISE - Success
-- Performing Test HAVE_C_FP:PRECISE
-- Performing Test HAVE_C_FP:PRECISE - Success
-- Performing Test HAVE_CPU_NEON_SUPPORT (check file: cmake/checks/cpu_neon.cpp)
-- Performing Test HAVE_CPU_NEON_SUPPORT - Success
-- Performing Test HAVE_CPU_FP16_SUPPORT (check file: cmake/checks/cpu_fp16.cpp)
-- Performing Test HAVE_CPU_FP16_SUPPORT - Failed
-- FP16 is not supported by C++ compiler
-- Optimization FP16 is not available, skipped
-- Performing Test HAVE_CPU_BASELINE_FLAGS
-- Performing Test HAVE_CPU_BASELINE_FLAGS - Success
-- Performing Test HAVE_CXX_W15240
-- Performing Test HAVE_CXX_W15240 - Success
-- Performing Test HAVE_C_W15240
-- Performing Test HAVE_C_W15240 - Success
-- Looking for malloc.h
-- Looking for malloc.h - found
-- Looking for _aligned_malloc
-- Looking for _aligned_malloc - found
-- Looking for fseeko
-- Looking for fseeko - not found
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of off64_t
-- Check size of off64_t - failed
-- libjpeg-turbo: VERSION = 2.1.2, BUILD = opencv-4.5.5-dev-libjpeg-turbo
-- Check size of size_t
-- Check size of size_t - done
-- Check size of unsigned long
-- Check size of unsigned long - done
-- Looking for include file intrin.h
-- Looking for include file intrin.h - found
-- Looking for assert.h
-- Looking for assert.h - found
-- Looking for fcntl.h
-- Looking for fcntl.h - found
-- Looking for inttypes.h
-- Looking for inttypes.h - found
-- Looking for io.h
-- Looking for io.h - found
-- Looking for limits.h
-- Looking for limits.h - found
-- Looking for memory.h
-- Looking for memory.h - found
-- Looking for search.h
-- Looking for search.h - found
-- Looking for string.h
-- Looking for string.h - found
-- Performing Test C_HAS_inline
-- Performing Test C_HAS_inline - Success
-- Check size of signed short
-- Check size of signed short - done
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Check size of signed int
-- Check size of signed int - done
-- Check size of unsigned int
-- Check size of unsigned int - done
-- Check size of signed long
-- Check size of signed long - done
-- Check size of signed long long
-- Check size of signed long long - done
-- Check size of unsigned long long
-- Check size of unsigned long long - done
-- Check size of unsigned char *
-- Check size of unsigned char * - done
-- Check size of ptrdiff_t
-- Check size of ptrdiff_t - done
-- Looking for memmove
-- Looking for memmove - not found
-- Looking for setmode
-- Looking for setmode - found
-- Looking for strcasecmp
-- Looking for strcasecmp - not found
-- Looking for strchr
-- Looking for strchr - found
-- Looking for strrchr
-- Looking for strrchr - found
-- Looking for strstr
-- Looking for strstr - found
-- Looking for strtol
-- Looking for strtol - found
-- Looking for strtol
-- Looking for strtol - found
-- Looking for strtoull
-- Looking for strtoull - found
-- Looking for lfind
-- Looking for lfind - found
-- Performing Test HAVE_SNPRINTF
-- Performing Test HAVE_SNPRINTF - Success
-- Performing Test HAVE_C_STD_C99
-- Performing Test HAVE_C_STD_C99 - Failed
-- Could NOT find OpenJPEG (minimal suitable version: 2.0, recommended version >= 2.3.1). OpenJPEG will be built from sources
-- OpenJPEG: VERSION = 2.4.0, BUILD = opencv-4.5.5-dev-openjp2-2.4.0
-- Looking for stdlib.h
-- Looking for stdlib.h - found
-- Looking for stdio.h
-- Looking for stdio.h - found
-- Looking for math.h
-- Looking for math.h - found
-- Looking for float.h
-- Looking for float.h - found
-- Looking for time.h
-- Looking for time.h - found
-- Looking for stdarg.h
-- Looking for stdarg.h - found
-- Looking for ctype.h
-- Looking for ctype.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for inttypes.h
-- Looking for inttypes.h - found
-- Looking for strings.h
-- Looking for strings.h - not found
-- Looking for sys/stat.h
-- Looking for sys/stat.h - found
-- Looking for unistd.h
-- Looking for unistd.h - not found
-- Looking for include file malloc.h
-- Looking for include file malloc.h - found
-- Looking for _aligned_malloc
-- Looking for _aligned_malloc - found
-- Looking for posix_memalign
-- Looking for posix_memalign - not found
-- Looking for memalign
-- Looking for memalign - not found
-- OpenJPEG libraries will be built from sources: libopenjp2 (version "2.4.0")
-- Could not find OpenBLAS include. Turning OpenBLAS_FOUND off
-- Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
-- Looking for sgemm_
-- Looking for sgemm_ - not found
-- Found Threads: TRUE
-- Found OpenMP_C: -openmp (found version "2.0")
-- Found OpenMP: TRUE (found version "2.0") found components: C
-- Could NOT find BLAS (missing: BLAS_LIBRARIES)
-- Could NOT find LAPACK (missing: LAPACK_LIBRARIES)
    Reason given by package: LAPACK could not be found because dependency BLAS could not be found.

CMake Deprecation Warning at 3rdparty/carotene/hal/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


CMake Deprecation Warning at 3rdparty/carotene/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- ADE: Download: v0.1.1f.zip
-- Looking for mfapi.h
-- Looking for mfapi.h - found
-- Looking for d3d11_4.h
-- Looking for d3d11_4.h - found
-- Allocator metrics storage type: 'int'
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.sse2.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.sse3.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.ssse3.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.sse4_1.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.sse4_2.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.avx.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.fp16.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.avx2.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin128.avx512_skx.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin256.avx2.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin256.avx512_skx.cpp
-- Excluding from source files list: <BUILD>/modules/core/test/test_intrin512.avx512_skx.cpp
-- Excluding from source files list: modules/imgproc/src/corner.avx.cpp
-- Excluding from source files list: modules/imgproc/src/imgwarp.avx2.cpp
-- Excluding from source files list: modules/imgproc/src/imgwarp.sse4_1.cpp
-- Excluding from source files list: modules/imgproc/src/resize.avx2.cpp
-- Excluding from source files list: modules/imgproc/src/resize.sse4_1.cpp
-- Registering hook 'INIT_MODULE_SOURCES_opencv_dnn': C:/Users/shibayan/Documents/GitHub/opencv/modules/dnn/cmake/hooks/INIT_MODULE_SOURCES_opencv_dnn.cmake
-- opencv_dnn: filter out ocl4dnn source code
-- opencv_dnn: filter out cuda4dnn source code
-- Excluding from source files list: <BUILD>/modules/dnn/layers/layers_common.avx.cpp
-- Excluding from source files list: <BUILD>/modules/dnn/layers/layers_common.avx2.cpp
-- Excluding from source files list: <BUILD>/modules/dnn/layers/layers_common.avx512_skx.cpp
-- Excluding from source files list: <BUILD>/modules/dnn/layers/layers_common.rvv.cpp
-- Excluding from source files list: <BUILD>/modules/dnn/int8layers/layers_common.avx2.cpp
-- Excluding from source files list: <BUILD>/modules/dnn/int8layers/layers_common.avx512_skx.cpp
-- Excluding from source files list: modules/features2d/src/fast.avx2.cpp
-- imgcodecs: OpenEXR codec is disabled in runtime. Details: https://github.com/opencv/opencv/issues/21326
-- highgui: using builtin backend: WIN32UI
-- Found 'misc' Python modules from C:/Users/shibayan/Documents/GitHub/opencv/modules/python/package/extra_modules
-- Found 'mat_wrapper;utils' Python modules from C:/Users/shibayan/Documents/GitHub/opencv/modules/core/misc/python/package
-- Found 'gapi' Python modules from C:/Users/shibayan/Documents/GitHub/opencv/modules/gapi/misc/python/package
-- OpenCL samples are skipped: OpenCL SDK is required
-- SYCL/OpenCL samples are skipped: SYCL SDK is required
--    - check configuration of SYCL_DIR/SYCL_ROOT/CMAKE_MODULE_PATH
--    - ensure that right compiler is selected from SYCL SDK (e.g, clang++): CMAKE_CXX_COMPILER=C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.31.31103/bin/Hostx64/arm64/cl.exe
--
-- General configuration for OpenCV 4.5.5-dev =====================================
--   Version control:               4.1.2-4257-g92b401326d
--
--   Platform:
--     Timestamp:                   2022-02-21T15:28:27Z
--     Host:                        Windows 10.0.22000 AMD64
--     Target:                      Windows 10.0 ARM64
--     CMake:                       3.22.22011901-MSVC_2
--     CMake generator:             Visual Studio 17 2022
--     CMake build tool:            C:/Program Files/Microsoft Visual Studio/2022/Enterprise/MSBuild/Current/Bin/amd64/MSBuild.exe
--     MSVC:                        1931
--     Configuration:               Debug Release
--
--   CPU/HW features:
--     Baseline:                    NEON
--       requested:                 NEON FP16
--
--   C/C++:
--     Built as dynamic libs?:      YES
--     C++ standard:                11
--     C++ Compiler:                C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.31.31103/bin/Hostx64/arm64/cl.exe  (ver 19.31.31104.0)
--     C++ flags (Release):         /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /D _ARM64_DISTINCT_NEON_TYPES /Oi  /fp:precise   /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP  /MD /O2 /Ob2 /DNDEBUG
--     C++ flags (Debug):           /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /D _ARM64_DISTINCT_NEON_TYPES /Oi  /fp:precise   /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP  /MDd /Zi /Ob0 /Od /RTC1
--     C Compiler:                  C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.31.31103/bin/Hostx64/arm64/cl.exe
--     C flags (Release):           /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /D _ARM64_DISTINCT_NEON_TYPES /Oi  /fp:precise   /MP   /MD /O2 /Ob2 /DNDEBUG
--     C flags (Debug):             /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /D _ARM64_DISTINCT_NEON_TYPES /Oi  /fp:precise   /MP /MDd /Zi /Ob0 /Od /RTC1
--     Linker flags (Release):      /machine:ARM64  /INCREMENTAL:NO
--     Linker flags (Debug):        /machine:ARM64  /debug /INCREMENTAL
--     ccache:                      NO
--     Precompiled headers:         YES
--     Extra dependencies:
--     3rdparty dependencies:
--
--   OpenCV modules:
--     To be built:                 calib3d core dnn features2d flann gapi highgui imgcodecs imgproc ml objdetect photo stitching ts video videoio
--     Disabled:                    world
--     Disabled by dependency:      -
--     Unavailable:                 java python2 python3
--     Applications:                tests perf_tests examples apps
--     Documentation:               NO
--     Non-free algorithms:         NO
--
--   Windows RT support:            NO
--
--   GUI:                           WIN32UI
--     Win32 UI:                    YES
--
--   Media I/O:
--     ZLib:                        build (ver 1.2.11)
--     JPEG:                        build-libjpeg-turbo (ver 2.1.2-62)
--     WEBP:                        build (ver encoder: 0x020f)
--     PNG:                         build (ver 1.6.37)
--     TIFF:                        build (ver 42 - 4.2.0)
--     JPEG 2000:                   build (ver 2.4.0)
--     OpenEXR:                     build (ver 2.3.0)
--     HDR:                         YES
--     SUNRASTER:                   YES
--     PXM:                         YES
--     PFM:                         YES
--
--   Video I/O:
--     DC1394:                      NO
--     GStreamer:                   NO
--     DirectShow:                  YES
--     Media Foundation:            YES
--       DXVA:                      YES
--
--   Parallel framework:            Concurrency
--
--   Trace:                         YES (with Intel ITT)
--
--   Other third-party libraries:
--     Lapack:                      NO
--     Custom HAL:                  YES (carotene (ver 0.0.1))
--     Protobuf:                    build (3.19.1)
--
--   Python (for build):            NO
--
--   Install to:                    C:/Users/shibayan/Documents/GitHub/opencv/build/install
-- -----------------------------------------------------------------
--
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/shibayan/Documents/GitHub/opencv/build

Member

@alalek alalek left a comment

Well done! Thank you for the contribution 👍

@alalek alalek merged commit d354ad1 into opencv:4.x Feb 26, 2022
@shibayan shibayan deleted the arm64-msvc-neon branch February 26, 2022 17:59
@opencv-pushbot opencv-pushbot mentioned this pull request Apr 23, 2022
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
* Added NEON support in builds for Windows on ARM

* Fixed `HAVE_CPU_NEON_SUPPORT` display broken during compiler test

* Fixed a build error prior to Visual Studio 2022

Labels

category: build/install, feature, platform: arm (ARM boards related issues: RPi, NVIDIA TK/TX, etc), platform: win32
