simdutf ARM64 constexpr + NEON compilation issue on clang
Summary
First, thank you for maintaining simdutf and making it easy to embed in other projects.
While integrating simdutf into a C++ benchmark on an ARM64 macOS machine, we hit a
compilation error in the ARM64 implementation when building against a recent
Apple-clang toolchain with a C++2x standard mode.
The core issue is that a simdutf_constexpr variable in arm64/simd.h is initialized
using NEON intrinsics, which this toolchain does not consider a constant expression.
Environment
- Platform: aarch64-apple-darwin
- OS: macOS (Apple Silicon; exact version is not critical to reproduce)
- CPU: Apple M-series (ARM64)
- Compiler: Apple clang (from an LLVM-based cross toolchain)
- Language mode: -std=c++2b / -std=c++2c
In our specific setup the compiler is invoked as (simplified):

    clang++ \
      -std=c++2c -O3 -ffast-math \
      -fno-rtti -fno-unwind-tables -fno-asynchronous-unwind-tables \
      -march=native \
      -I <simdutf-dir>/include \
      -I <simdutf-dir>/src \
      <simdutf-dir>/src/simdutf.cpp \
      -c

The same error can be reproduced by compiling src/simdutf.cpp directly in a fresh
simdutf checkout with similar flags.
Reproduction steps
- Clone simdutf (current main as of 2025-12; the exact revision is whichever
  git clone https://github.com/simdutf/simdutf.git currently produces):

      git clone https://github.com/simdutf/simdutf.git
      cd simdutf

- On an ARM64 macOS system (Apple Silicon), build src/simdutf.cpp with a recent
  Apple-clang and C++2x standard mode, including both include/ and src/:

      clang++ \
        -std=c++2c -O3 -ffast-math \
        -fno-rtti -fno-unwind-tables -fno-asynchronous-unwind-tables \
        -march=native \
        -I include \
        -I src \
        src/simdutf.cpp -c

- The build fails in the ARM64 SIMD implementation with a constexpr-related error
  (see below).
Observed error
The compiler reports:

    src/simdutf/arm64/simd.h:313:35: error: constexpr variable 'pair' must be initialized by a constant expression
        simdutf_constexpr int8x16x2_t pair =
                                      ^
    ...
    note: use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
    ...
    note: non-constexpr function 'vmovq_n_s8' cannot be used in a constant expression

The offending code looks roughly like (simplified for illustration):

    simdutf_constexpr int8x16x2_t pair =
        match_system(big_endian)
            ? int8x16x2_t{{this->value, vmovq_n_s8(0)}}
            : int8x16x2_t{{vmovq_n_s8(0), this->value}};

Here simdutf_constexpr is mapped to constexpr when SIMDUTF_CPLUSPLUS17 is set,
and vmovq_n_s8 is a NEON intrinsic declared as a regular (non-constexpr) function
in arm_neon.h. The compiler therefore rejects this as a non-constant expression.
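
The rule involved can be illustrated without NEON. The sketch below is a hypothetical stand-in, not simdutf code: Vec16, Pair, Wrapper, and make_splat are invented names, with make_splat playing the role of the non-constexpr vmovq_n_s8. Under those assumptions, it shows that the same initializer is accepted once the local is declared const instead of constexpr.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a NEON register type such as int8x16_t.
struct Vec16 {
  std::array<std::int8_t, 16> lanes;
};

// Stand-in for vmovq_n_s8: an ordinary function, deliberately NOT constexpr.
inline Vec16 make_splat(std::int8_t v) {
  Vec16 r{};
  r.lanes.fill(v);
  return r;
}

// Stand-in for int8x16x2_t.
struct Pair {
  Vec16 val[2];
};

struct Wrapper {
  Vec16 value;

  // A non-constexpr member function, mirroring the shape of the reported code.
  template <bool BigEndian>
  Pair widen() const {
    // Rejected if uncommented: a constexpr variable needs a constant-expression
    // initializer, but this one reads this->value and calls make_splat.
    //   constexpr Pair pair = BigEndian ? Pair{{this->value, make_splat(0)}}
    //                                   : Pair{{make_splat(0), this->value}};

    // Accepted: a plain const local is simply initialized at run time.
    const Pair pair = BigEndian ? Pair{{this->value, make_splat(0)}}
                                : Pair{{make_splat(0), this->value}};
    return pair;
  }
};
```

In this sketch, BigEndian is a compile-time constant, so the only obstacles to constexpr are the read of this->value and the call to the non-constexpr function, matching the two notes in the diagnostic above.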
Expected behavior
Ideally, the ARM64 implementation should compile cleanly under:
- Apple-clang on ARM64 macOS,
- with C++17 or newer (including C++2x modes such as -std=c++2b/-std=c++2c),
- using the default simdutf_constexpr configuration.

In particular, code paths guarded by SIMDUTF_IMPLEMENTATION_ARM64 should not
rely on constexpr initialization that depends on non-constexpr NEON intrinsics.
Current workaround in our integration
To keep our own benchmark build portable while still using simdutf for comparison,
we temporarily disable the ARM64 implementation and rely on the fallback kernel:

    clang++ ... -DSIMDUTF_IMPLEMENTATION_ARM64=0 ...

This avoids compiling the affected ARM64 SIMD code (arm64/simd.h and friends)
and sidesteps the constexpr + NEON combination that triggers the error. Functionally,
simdutf::validate_utf8 still works correctly, but it no longer reflects the best
possible performance on ARM64.
Possible directions (for maintainers’ consideration)
We completely understand that constexpr usage in combination with intrinsics and
different standard modes is subtle and compiler-dependent. From an external
embedder’s perspective, the following options would make integration smoother:
- Guard the problematic simdutf_constexpr initializations so that they are only
  enabled when the toolchain actually supports constexpr evaluation of the
  underlying intrinsics (or when the code does not depend on this in a way that
  violates constexpr rules).
- Alternatively, relax simdutf_constexpr to non-constexpr for specific ARM64
  code paths that rely on intrinsics like vmovq_n_s8, at least when compiling
  under known-affected configurations (Apple-clang + C++2x on ARM64).
- Provide a documented macro (similar to SIMDUTF_IMPLEMENTATION_ARM64) that
  can be used to selectively disable constexpr-heavy SIMD paths while still
  keeping other ARM64 optimizations enabled.

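
To make the macro-based options concrete, here is a minimal sketch of what such a switch could look like. Every name in it is hypothetical: EXAMPLE_SIMD_CONSTEXPR_LOCALS and example_simd_constexpr do not exist in simdutf, and splat is only a stand-in for a non-constexpr intrinsic. It is meant to show the shape of the idea, not to propose an exact patch.

```cpp
#include <cstdint>

// Hypothetical opt-in switch (not a real simdutf macro). It defaults to 0 so
// that locals initialized from intrinsics are plain const unless a build
// explicitly states that the toolchain can constant-evaluate them.
#ifndef EXAMPLE_SIMD_CONSTEXPR_LOCALS
#define EXAMPLE_SIMD_CONSTEXPR_LOCALS 0
#endif

#if EXAMPLE_SIMD_CONSTEXPR_LOCALS
#define example_simd_constexpr constexpr
#else
#define example_simd_constexpr const
#endif

// Stand-in for an intrinsic: an ordinary, non-constexpr function.
inline std::int32_t splat(std::int32_t v) { return v; }

inline std::int32_t kernel(std::int32_t x) {
  // With the switch at 0 this is a run-time const local and compiles.
  // With the switch at 1 it would be rejected, because splat is not constexpr.
  example_simd_constexpr std::int32_t zero = splat(0);
  return x + zero;
}
```

A dedicated macro of this kind would leave simdutf_constexpr untouched for code that really is constant-evaluable, and affect only the locals that depend on intrinsics.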
We are happy to test any proposed fixes or patches on our setup, and can provide
more detailed compiler version / clang++ --version output if that would help.
Thank you again for your work on simdutf, and please let us know if we can assist
with additional diagnostics or testing.