Add Linux-based cross-compilation pipeline for libvec native binaries #144845
ldematte merged 26 commits into elastic:main
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
I assume that the actual built artefacts (the 3 native libraries) are identical when built with this new mechanism? We should confirm this. |
Confirmed with objdump |
|
This looks cool, but I don't think we can bundle the MacOS SDK into Docker images and redistribute it like this. |
|
I was worried about licensing issues too.. what if we make this internal? (can we?) |
|
I think that just using the SDK outside of real Mac hardware is a violation of the license, so making it internal doesn't really matter |
|
(I could be wrong though) |
…ocker-cross-compile
…libc and there is no non-Apple libc implementation
I think you might be right, so I changed approach. No MacOS SDK. |
|
@brianseeders, can you check build_cross_toolchain_image.sh and tell me which names/accounts/endpoints I should be using? Of course the push of docker.elastic.co/es-dev/es-native-cross-toolchain:1 to |
|
Walkthrough
The change replaces the Gradle-based native build with a Makefile-driven cross-compilation workflow. Per-architecture Dockerfiles for amd64/aarch64 were removed and a new cross-toolchain Dockerfile and image build script were added. A Makefile now builds libvec for darwin-aarch64, linux-aarch64, and linux-x64 with tiered object sets. publish_vec_binaries.sh was rewritten to use the cross-toolchain Docker build and gained --local and --force-upload flags. Compiler pragmas forcing AVX-512/SVE per-translation-unit were removed from several sources. Gradle native build files and the Gradle wrapper for libs/simdvec/native were deleted; vec artifact version was updated in libs/native/libraries/build.gradle.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@libs/simdvec/native/build_cross_toolchain_image.sh`:
- Around line 24-26: The script currently only recognizes "--local" by checking
"$1" and leaves typos to proceed to the docker push logic; change argument
parsing so any non-empty, non-"--local" argument causes an immediate error and
exit (fail-closed). Specifically, validate "$1" at the top where LOCAL is set:
if "$1" is empty allow normal behavior, if "$1" == "--local" set LOCAL=true,
else print a clear usage/error and exit 1; also apply the same strict check to
the later argument-handling region that controls the docker push (the block
referencing LOCAL and the docker push commands) so unknown args cannot fall
through to the push. Ensure the error message references the accepted flag
("--local") and that the exit code is non-zero.
In `@libs/simdvec/native/Dockerfile.cross-toolchain`:
- Around line 21-44: The Dockerfile currently leaves the container running as
root after package installation; to fix, after the package-installing RUN that
ends with "rm -rf /var/lib/apt/lists/*" add steps to create a non-root user and
group (e.g., "simdvec" or "builder"), create a home directory, set appropriate
ownership of relevant workspace/build directories, set HOME and any required
environment variables, and switch to that user with the USER instruction before
the image is finalized; locate these changes around the package installation RUN
block in Dockerfile.cross-toolchain and ensure subsequent build steps run as the
non-root user.
In `@libs/simdvec/native/publish_vec_binaries.sh`:
- Line 72: The curl upload command that pipes the zip archive into curl (the
line containing: (cd "$TEMP" && zip -rq - .) | curl -sS -X PUT -H
"X-JFrog-Art-Api: ${ARTIFACTORY_API_KEY}" --data-binary `@-` --location
"${ARTIFACTORY_REPOSITORY}/org/elasticsearch/vec/${VERSION}/vec-${VERSION}.zip")
should include the --fail flag so HTTP errors cause a non‑zero exit; update that
curl invocation to add --fail (e.g., curl -sS --fail -X PUT ...) so failures are
detected and the script exits appropriately.
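The fail-closed argument check suggested above for `build_cross_toolchain_image.sh` could be sketched roughly as follows. The function name `parse_local_flag` is hypothetical and not taken from the PR; only the `--local` flag comes from the review comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): the real script's structure may differ.
# Prints "false" when no argument is given, "true" for --local,
# and fails with a usage message on anything else.
parse_local_flag() {
  case "${1:-}" in
    "")      echo "false" ;;
    --local) echo "true" ;;
    *)
      echo "error: unknown argument '$1' (accepted: --local)" >&2
      return 1
      ;;
  esac
}
```

A script would then call something like `LOCAL="$(parse_local_flag "$@")" || exit 1` before any `docker push` logic, so a typo such as `--lcoal` stops the run instead of falling through to the push.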
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 0013c400-d07b-447b-b4cf-16fb837bcf3c
📒 Files selected for processing (24)
- libs/native/libraries/build.gradle
- libs/simdvec/native/.gitignore
- libs/simdvec/native/Dockerfile.aarch64
- libs/simdvec/native/Dockerfile.amd64
- libs/simdvec/native/Dockerfile.cross-toolchain
- libs/simdvec/native/Makefile
- libs/simdvec/native/build_cross_toolchain_image.sh
- libs/simdvec/native/publish_vec_binaries.sh
- libs/simdvec/native/src/vec/c/aarch64/caps.cpp
- libs/simdvec/native/src/vec/c/aarch64/score_1.cpp
- libs/simdvec/native/src/vec/c/aarch64/vec_1.cpp
- libs/simdvec/native/src/vec/c/aarch64/vec_2.cpp
- libs/simdvec/native/src/vec/c/amd64/caps.cpp
- libs/simdvec/native/src/vec/c/amd64/score_1.cpp
- libs/simdvec/native/src/vec/c/amd64/score_2.cpp
- libs/simdvec/native/src/vec/c/amd64/vec_1.cpp
- libs/simdvec/native/src/vec/c/amd64/vec_2.cpp
- libs/simdvec/native/src/vec/c/amd64/vec_i4_2.cpp
- libs/simdvec/native/src/vec/headers/darwin/tinystd/algorithm
- libs/simdvec/native/src/vec/headers/darwin/tinystd/limits
- libs/simdvec/native/src/vec/headers/darwin/tinystd/type_traits
- libs/simdvec/native/src/vec/headers/darwin/tinystd/utility
- libs/simdvec/native/src/vec/headers/score_common.h
- libs/simdvec/native/src/vec/headers/vec_common.h
💤 Files with no reviewable changes (7)
- libs/simdvec/native/src/vec/c/amd64/caps.cpp
- libs/simdvec/native/src/vec/headers/vec_common.h
- libs/simdvec/native/Dockerfile.amd64
- libs/simdvec/native/src/vec/c/amd64/vec_2.cpp
- libs/simdvec/native/Dockerfile.aarch64
- libs/simdvec/native/src/vec/c/amd64/vec_i4_2.cpp
- libs/simdvec/native/src/vec/c/aarch64/vec_2.cpp
… local/install Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ocker-cross-compile
|
Can we add the |
Already done. Is it not working for you? |
|
Argh, sorry, messing around with branches. Nevermind. |
…sticsearch into native/docker-cross-compile
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@libs/simdvec/native/publish_vec_binaries.sh`:
- Around line 42-45: The publish script sets TOOLCHAIN_IMAGE to
docker.elastic.co/es-dev/es-native-cross-toolchain:1 which mismatches the image
published by build_cross_toolchain_image.sh
(docker.elastic.co/elasticsearch-infra/es-native-cross-toolchain:1) and can
break pulls; update the TOOLCHAIN_IMAGE assignment in publish_vec_binaries.sh to
use the exact published repository name
docker.elastic.co/elasticsearch-infra/es-native-cross-toolchain:1 (keeping the
existing LOCAL override behavior), or centralize the image name into a shared
constant so both publish_vec_binaries.sh (TOOLCHAIN_IMAGE) and
build_cross_toolchain_image.sh reference the same canonical image name.
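One way to centralize the image name, as the comment above suggests, is a small shared snippet that both scripts source. This is only a sketch: the function `toolchain_image` and the local tag `es-native-cross-toolchain:local` are hypothetical, and the registry name is simply the one quoted in the review comment.

```shell
#!/usr/bin/env bash
# Registry image name as quoted in the review comment above.
REGISTRY_IMAGE="docker.elastic.co/elasticsearch-infra/es-native-cross-toolchain:1"
# Hypothetical tag for a locally built image; the real local name may differ.
LOCAL_IMAGE="es-native-cross-toolchain:local"

# Prints the image to use: the local one when LOCAL=true, otherwise the registry one.
toolchain_image() {
  if [ "${LOCAL:-false}" = "true" ]; then
    echo "$LOCAL_IMAGE"
  else
    echo "$REGISTRY_IMAGE"
  fi
}
```

With this in one place, `publish_vec_binaries.sh` could set `TOOLCHAIN_IMAGE="$(toolchain_image)"` and the build script could push `$REGISTRY_IMAGE`, so the two names cannot drift apart.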
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: b3a08cd1-7ce8-45a3-8918-00e0f735e648
📒 Files selected for processing (4)
- libs/native/libraries/build.gradle
- libs/simdvec/native/build_cross_toolchain_image.sh
- libs/simdvec/native/publish_vec_binaries.sh
- libs/simdvec/native/src/vec/c/amd64/vec_2.cpp
💤 Files with no reviewable changes (1)
- libs/simdvec/native/src/vec/c/amd64/vec_2.cpp
✅ Files skipped from review due to trivial changes (1)
- libs/native/libraries/build.gradle
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
This PR adds a Docker-based cross-compilation pipeline for building `libvec` for all three platforms from a single Linux/amd64 container, replacing the previous Mac-dependent workflow. It builds on top of #145066 (merged), which introduced the tinystd headers and builtin replacements.

The three output binaries are built using:

| Platform | Output | Compiler |
|---|---|---|
| darwin-aarch64 | `libvec.dylib` | `clang++-18` → Mach-O via lld |
| linux-aarch64 | `libvec.so` | `aarch64-linux-gnu-g++-14` |
| linux-x64 | `libvec.so` | `g++-14` (native) |

Docker image
The Docker image is a single public layer — no macOS SDK is included. This avoids any licensing concerns around redistributing Apple's proprietary SDK in a Docker image.
How does Darwin cross-compilation work without the SDK?
The Makefile uses `-nostdinc` to block host (glibc) headers from leaking into the Darwin build, then adds the tinystd headers (from #145066) and clang's freestanding builtins (`arm_neon.h`, `stdint.h`, etc.) via explicit `-isystem` paths.

At link time, `-nostdlib -Wl,-undefined,dynamic_lookup` avoids needing SDK stub libraries. The only remaining runtime symbols (`_bzero`, `dyld_stub_binder`) are resolved by macOS's dyld at load time.

Linux builds are unaffected; they use the real libstdc++ from GCC as before.
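As a rough illustration of the flags described in this section, the dry-run sketch below assembles a compile line and a link line and prints them instead of running them. The target triple, the clang builtin include path, the `-fuse-ld=lld -shared` link options and the function names are assumptions, not values taken from the PR's Makefile; only `-nostdinc`, `-isystem`, `-nostdlib` and `-Wl,-undefined,dynamic_lookup` come from the description.

```shell
#!/usr/bin/env bash
# Assumed values; the real Makefile's variables and paths may differ.
TINYSTD_DIR="src/vec/headers/darwin/tinystd"            # tinystd headers from #145066
CLANG_BUILTINS="/usr/lib/llvm-18/lib/clang/18/include"  # assumed location of arm_neon.h, stdint.h, ...
DARWIN_TARGET="arm64-apple-macos"                       # assumed target triple

# Compile flags: block host headers, then add only tinystd and clang's builtins.
darwin_cxxflags() {
  echo "--target=$DARWIN_TARGET -nostdinc -isystem $TINYSTD_DIR -isystem $CLANG_BUILTINS"
}

# Link flags: no SDK stub libraries; undefined symbols are left for dyld to resolve.
darwin_ldflags() {
  echo "--target=$DARWIN_TARGET -fuse-ld=lld -shared -nostdlib -Wl,-undefined,dynamic_lookup"
}

# Dry run: print the commands rather than executing them.
echo "clang++-18 $(darwin_cxxflags) -c example.cpp -o example.o"
echo "clang++-18 $(darwin_ldflags) example.o -o libvec.dylib"
```

Printing the commands keeps the sketch independent of whether `clang++-18` and lld are installed on the machine reading it.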
Additional cleanup in this PR
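- Added `-DNDEBUG` and `-Winline` to CXXFLAGS
- Removed `#pragma` target push/pop from `_2.cpp` files (the Makefile compiles them with the correct `-march`)
- Removed the standalone Gradle build (`gradlew`/`build.gradle`), replaced by `make local`/`make install`
- Removed the per-architecture Dockerfiles (`Dockerfile.aarch64`, `Dockerfile.amd64`)

ISA tiers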
Source files are split by naming convention. `*_2.cpp` files implement tier-2 ISA paths and are compiled with a higher `-march`; all other files use the baseline.
| Platform | Baseline (`*_1.cpp`, `caps.cpp`, …) | Tier 2 (`*_2.cpp`) |
|---|---|---|
| darwin-aarch64 | `armv8.2-a+dotprod` | `armv8.2-a+sve` |
| linux-aarch64 | `armv8.2-a+dotprod` | `armv8.2-a+sve` |
| linux-x64 | `core-avx2` | `icelake-client` |

The Makefile compiles each group to `.o` files separately, then links them into the shared library.

Workflows
Publish a new release
Requires `ARTIFACTORY_API_KEY` to be set. Bump `VERSION` in `publish_vec_binaries.sh` first.

Local test build (no upload)
Produces `vec-<VERSION>-local.zip` in the working directory.

Local build with upload
Build using the local Docker image, but upload to Artifactory. Useful when you don't have access to the registry image but want to publish.
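The three publishing modes above can be summarized as a flag-to-behavior mapping. The sketch below is an assumption about how `--local` and `--force-upload` combine, inferred from the descriptions in this section; `publish_mode` is a hypothetical name and the real `publish_vec_binaries.sh` may handle its flags differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR).
# Prints "<image-source> <upload-decision>" for a given set of flags.
publish_mode() {
  local use_local=false force_upload=false arg
  for arg in "$@"; do
    case "$arg" in
      --local)        use_local=true ;;
      --force-upload) force_upload=true ;;
      *) echo "error: unknown argument '$arg'" >&2; return 1 ;;
    esac
  done
  if [ "$use_local" = true ] && [ "$force_upload" = false ]; then
    echo "local no-upload"    # local test build
  elif [ "$use_local" = true ]; then
    echo "local upload"       # local build with upload (assumed flag combination)
  else
    echo "registry upload"    # regular release
  fi
}
```

Under these assumptions, running with no flags corresponds to publishing a release, `--local` to the test build, and `--local --force-upload` to the local build with upload.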
Update compiler versions
Bump `VERSION` in `build_cross_toolchain_image.sh`, update package versions in `Dockerfile.cross-toolchain`, then rebuild and push the image. After pushing, update `TOOLCHAIN_IMAGE` in `publish_vec_binaries.sh`.

Verification
Tested on all three platforms:
Local development
The standalone Gradle build (`gradlew`/`build.gradle`) has been removed. Use the Makefile instead (`make local` / `make install`).