Describe the enhancement requested
Currently, the BYTE_STREAM_SPLIT optimizations are written with hand-written x86 intrinsics (for SSE4.2, AVX2 and AVX512) and selected at compile time.
We should rewrite them using the xsimd library to support non-x86 ISA extensions such as Arm Neon (most importantly) and SVE.
More precisely:
- rewrite the SSE4.2 acceleration for generic 128-bit SIMD
- rewrite the AVX2 acceleration for generic 256-bit SIMD
- either rewrite the AVX512 acceleration, leave it alone, or remove it (the benefits are probably minor)
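For reference, here is a minimal scalar sketch of the transformation that these SIMD kernels accelerate: byte k of each value is written to the k-th output stream. The function names are hypothetical and are not Arrow's actual API. A generic 128-bit or 256-bit xsimd version would need to produce the same output, so a sketch like this could serve as a baseline when testing the rewritten kernels.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper names, for illustration only (not Arrow's API).

// Encode: byte k of value i is stored in stream k at position i, so the
// output is kWidth contiguous streams of n bytes each.
template <typename T>
std::vector<uint8_t> ByteStreamSplitEncodeScalar(const std::vector<T>& values) {
  constexpr std::size_t kWidth = sizeof(T);
  const std::size_t n = values.size();
  std::vector<uint8_t> out(n * kWidth);
  const auto* raw = reinterpret_cast<const uint8_t*>(values.data());
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t k = 0; k < kWidth; ++k) {
      out[k * n + i] = raw[i * kWidth + k];
    }
  }
  return out;
}

// Decode: gather byte k of value i back from stream k, position i.
// Assumes encoded.size() is a multiple of sizeof(T).
template <typename T>
std::vector<T> ByteStreamSplitDecodeScalar(const std::vector<uint8_t>& encoded) {
  constexpr std::size_t kWidth = sizeof(T);
  const std::size_t n = encoded.size() / kWidth;
  std::vector<T> values(n);
  auto* raw = reinterpret_cast<uint8_t*>(values.data());
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t k = 0; k < kWidth; ++k) {
      raw[i * kWidth + k] = encoded[k * n + i];
    }
  }
  return values;
}
```

The SIMD variants perform the same byte transposition on whole registers at a time and would typically fall back to a scalar loop like this one for the trailing values that do not fill a register.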
Component(s)
C++, Parquet