
simdutf ARM64 constexpr + NEON compilation issue on clang #874

@MacroModel

Summary

First, thank you for maintaining simdutf and making it easy to embed in other projects.
While integrating simdutf into a C++ benchmark on an ARM64 macOS machine, we hit a
compilation error in the ARM64 implementation when building against a recent
Apple-clang toolchain with a C++2x standard mode.

The core issue is that a simdutf_constexpr variable in arm64/simd.h is initialized
using NEON intrinsics, which this toolchain does not consider a constant expression.

Environment

  • Platform: aarch64-apple-darwin
  • OS: macOS (Apple Silicon; exact version is not critical to reproduce)
  • CPU: Apple M-series (ARM64)
  • Compiler: Apple clang (from an LLVM-based cross toolchain)
  • Language mode: -std=c++2b / -std=c++2c

In our specific setup the compiler is invoked as (simplified):

clang++ \
  -std=c++2c -O3 -ffast-math \
  -fno-rtti -fno-unwind-tables -fno-asynchronous-unwind-tables \
  -march=native \
  -I <simdutf-dir>/include \
  -I <simdutf-dir>/src \
  <simdutf-dir>/src/simdutf.cpp \
  -c

The same error can be reproduced by compiling src/simdutf.cpp directly in a fresh
simdutf checkout with similar flags.

Reproduction steps

  1. Clone simdutf. We used the main branch as of 2025-12 and did not pin a specific
    revision, so a fresh clone should be sufficient:

    git clone https://github.com/simdutf/simdutf.git
    cd simdutf
  2. On an ARM64 macOS system (Apple Silicon), build src/simdutf.cpp with a recent
    Apple-clang and C++2x standard mode, including both include/ and src/:

    clang++ \
      -std=c++2c -O3 -ffast-math \
      -fno-rtti -fno-unwind-tables -fno-asynchronous-unwind-tables \
      -march=native \
      -I include \
      -I src \
      src/simdutf.cpp -c
  3. The build fails in the ARM64 SIMD implementation with a constexpr-related error
    (see below).

Observed error

The compiler reports:

src/simdutf/arm64/simd.h:313:35: error: constexpr variable 'pair' must be initialized by a constant expression
  simdutf_constexpr int8x16x2_t pair =
                                  ^
...
note: use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
...
note: non-constexpr function 'vmovq_n_s8' cannot be used in a constant expression

The offending code looks roughly like (simplified for illustration):

simdutf_constexpr int8x16x2_t pair =
    match_system(big_endian)
        ? int8x16x2_t{{this->value, vmovq_n_s8(0)}}
        : int8x16x2_t{{vmovq_n_s8(0), this->value}};

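Here simdutf_constexpr is mapped to constexpr when SIMDUTF_CPLUSPLUS17 is set. The
initializer is not a constant expression for two reasons, matching the two notes in
the error output: it reads this->value, a runtime member accessed outside a constexpr
member function, and it calls vmovq_n_s8, a NEON intrinsic that arm_neon.h declares
as a regular (non-constexpr) function. The compiler therefore rejects the declaration.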
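
For readers without an ARM64 machine, the same pattern can be sketched in portable C++. This is only an illustration, not simdutf code: Vec16, Pair, splat8 and Wrapper are hypothetical stand-ins for int8x16_t, int8x16x2_t, vmovq_n_s8 and the simdutf wrapper type, and the match_system(big_endian) check is reduced to a plain template parameter. The commented-out line marks where constexpr would be rejected; declaring the local as const compiles.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-ins for int8x16_t and int8x16x2_t (not the real NEON types).
struct Vec16 { std::array<std::int8_t, 16> lanes; };
struct Pair { Vec16 val[2]; };

// Stand-in for an intrinsic such as vmovq_n_s8: a regular, non-constexpr function.
inline Vec16 splat8(std::int8_t x) {
  Vec16 v{};
  v.lanes.fill(x);
  return v;
}

// Hypothetical wrapper holding one runtime vector, like the type in arm64/simd.h.
struct Wrapper {
  Vec16 value;

  template <bool big_endian>
  Pair widen() const {
    // constexpr Pair pair = ...;  // rejected: reads this->value and calls splat8
    const Pair pair = big_endian ? Pair{{this->value, splat8(0)}}
                                 : Pair{{splat8(0), this->value}};
    return pair;
  }
};
```

Calling Wrapper{splat8(7)}.widen<false>() places the zero vector in val[0] and the original value in val[1]; widen<true>() swaps them. Since the initializer depends on runtime data either way, dropping constexpr here should not change the generated code.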

Expected behavior

Ideally, the ARM64 implementation should compile cleanly under:

  • Apple-clang on ARM64 macOS,
  • with C++17 or newer (including C++2x modes such as -std=c++2b / -std=c++2c),
  • using the default simdutf_constexpr configuration.

In particular, code paths guarded by SIMDUTF_IMPLEMENTATION_ARM64 should not
rely on constexpr initialization that depends on non-constexpr NEON intrinsics.

Current workaround in our integration

To keep our own benchmark build portable while still using simdutf for comparison,
we temporarily disable the ARM64 implementation and rely on the fallback kernel:

clang++ ... -DSIMDUTF_IMPLEMENTATION_ARM64=0 ...

This avoids compiling the affected ARM64 SIMD code (arm64/simd.h and friends)
and sidesteps the constexpr + NEON combination that triggers the error. Functionally,
simdutf::validate_utf8 still works correctly, but it no longer reflects the best
possible performance on ARM64.

Possible directions (for maintainers’ consideration)

We completely understand that constexpr usage in combination with intrinsics and
different standard modes is subtle and compiler-dependent. From an external
embedder’s perspective, the following options would make integration smoother:

  1. Guard the problematic simdutf_constexpr initializations so that they are only
    enabled when the toolchain actually supports constexpr evaluation of the
    underlying intrinsics (or when the code does not depend on this in a way that
    violates constexpr rules).
  2. Alternatively, relax simdutf_constexpr to non-constexpr for specific ARM64
    code paths that rely on intrinsics like vmovq_n_s8, at least when compiling
    under known-affected configurations (Apple-clang + C++2x on ARM64).
  3. Provide a documented macro (similar to SIMDUTF_IMPLEMENTATION_ARM64) that
    can be used to selectively disable constexpr-heavy SIMD paths while still
    keeping other ARM64 optimizations enabled.
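
As a rough sketch of what options 1 to 3 could look like, the snippet below uses a dedicated qualifier macro for SIMD code paths. All names here (EXAMPLE_INTRINSICS_ARE_CONSTEXPR, EXAMPLE_SIMD_CONSTEXPR, load_lane, twice) are hypothetical and are not existing simdutf macros or functions; load_lane stands in for a non-constexpr intrinsic.

```cpp
#include <cstdint>

// Hypothetical switch: set to 1 only on toolchains where the intrinsics
// used below are known to be usable in constant expressions.
#ifndef EXAMPLE_INTRINSICS_ARE_CONSTEXPR
#define EXAMPLE_INTRINSICS_ARE_CONSTEXPR 0
#endif

// Hypothetical qualifier for locals initialized from intrinsics.
#if EXAMPLE_INTRINSICS_ARE_CONSTEXPR
#define EXAMPLE_SIMD_CONSTEXPR constexpr
#else
#define EXAMPLE_SIMD_CONSTEXPR const
#endif

// Stand-in for a non-constexpr intrinsic that reads runtime data.
inline std::int32_t load_lane(const std::int32_t* p) { return *p; }

inline std::int32_t twice(const std::int32_t* p) {
  // Expands to 'const' by default, so this compiles on every toolchain.
  EXAMPLE_SIMD_CONSTEXPR std::int32_t v = load_lane(p);
  return v * 2;
}
```

With the default setting the local is merely const, and the rest of the library could keep using its general constexpr macro unchanged. Whether such a switch is best set automatically or passed by the embedder with -D is a decision for the maintainers.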

We are happy to test any proposed fixes or patches on our setup, and can provide
more detailed compiler version / clang++ --version output if that would help.

Thank you again for your work on simdutf, and please let us know if we can assist
with additional diagnostics or testing.
