ci: migrate native-Go + aerospike CI jobs to arm64; bump aerospike 8.0→8.1 #1047
The x86 teranode-runner-16-core OOM-killed the make test compile (go test -race -coverpkg=./... over the whole repo, ~13GB peak) during the build phase. Move the two make-test jobs to teranode-runner-16-core-arm, which has more headroom and runs natively on arm64 (verified locally: full repo compiles incl. CGO/BDK, postgres testcontainers work). make test uses the testtxmetacache tag (postgres only, no aerospike images), so no arm64 image gaps. Other jobs stay on x86 (8-core-arm not yet available; image-pulling jobs would need a multi-arch teranode image).
🤖 Claude Code Review
Status: Complete
No issues found. The PR makes well-justified infrastructure changes:
Summary:
Changes verified:
Bump the make-test jobs to teranode-runner-32-core-arm (more cores -> faster -race+coverage build/test, more RAM headroom). Add a temporary telemetry sampler to the PR test step: logs runner specs and peak mem/disk during the run, surfaced as a job annotation, to size the runner. Remove the telemetry before merge.
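A minimal sketch of what such a temporary sampler step could look like, assuming a Linux runner with `free`, GNU `df`, and `awk` available. The step name, the 5-second interval, and the annotation title are illustrative assumptions, not the step actually used in this PR.

```yaml
steps:
  - name: Run make test with a temporary telemetry sampler
    shell: bash
    run: |
      # Log the runner specs once, up front.
      echo "cpus: $(nproc)"
      free -g
      df -h .

      samples="$(mktemp)"
      # Background sampler: record used memory (MiB) and used disk (MiB) every 5s.
      (
        while true; do
          mem=$(free -m | awk '/^Mem:/ {print $3}')
          disk=$(df -m --output=used . | tail -n 1 | tr -d ' ')
          echo "$mem $disk" >> "$samples"
          sleep 5
        done
      ) &
      sampler=$!

      # Run the tests, keeping the exit code so the sampler can be stopped first.
      status=0
      make test || status=$?
      kill "$sampler" 2>/dev/null || true

      # Reduce the samples to their peaks and surface them as a job annotation.
      read -r peak_mem peak_disk < <(awk '$1>m{m=$1} $2>d{d=$2} END{print m+0, d+0}' "$samples")
      echo "::notice title=Runner telemetry::peak mem ${peak_mem} MiB, peak disk ${peak_disk} MiB"
      exit "$status"
```

The `::notice` workflow command is what turns the peaks into an annotation on the job summary; the step still exits with the `make test` status, so the sampler cannot mask a test failure.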
Benchmark Comparison Report
Baseline: Current:
Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-06-08 08:59 UTC
Switch the container-free / native-Go jobs to arm64 (cheaper, ample RAM; fixes the x86 16-core make-test OOM without -p caps or build-cache bulk):
- test (make test) 16-core -> 16-core-arm (postgres testcontainers, arm64)
- go-test (main, make test) 16-core -> 16-core-arm
- golangci-lint 8-core -> 8-core-arm (pure Go analysis, no containers)
- sonar 4-core -> 4-core-arm (scanner CLI + artifacts, no build)
Telemetry confirmed: 32-core gave no speedup (I/O/container-bound), so 16-core is the cost/perf sweet spot; reverted from the 32-core trial.
Kept on x86 (arm64 image blockers, NOT teranode which is multi-arch):
- smoketest / sequential / legacy-sync: bitcoinsv/bitcoin-sv (x86-only) + aerospike
- prunertest / chainintegrity: aerospike/aerospike-server (arm64 unverified)
These can move once SV-node has an arm64 image and aerospike arm64 is confirmed.
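For orientation, a runner move of this kind usually comes down to a one-line `runs-on` change per job. The fragment below is a hedged sketch: the job id and steps are assumptions, and only the `teranode-runner-16-core-arm` label is taken from this PR.

```yaml
jobs:
  test:
    # Previously: runs-on: teranode-runner-16-core (x86, OOM during the race+coverage compile)
    runs-on: teranode-runner-16-core-arm
    steps:
      - uses: actions/checkout@v4
      # make test builds with the testtxmetacache tag, so it needs postgres only.
      - run: make test
```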
aerospike/aerospike-server:8.1 has an arm64 image, so:
- bump 8.0->8.1 in all official refs (test compose files, chainintegrity +
3blasters compose, longtest aerospike8 test). Custom ghcr 8.0.0-3 and the
already-8.1 deploy manifests left as-is.
- move the aerospike-only, SV-node-free jobs to arm64:
prunertest (pr + main) 8-core -> 8-core-arm
chainintegrity (pr) 8-core -> 8-core-arm
chainintegrity / -3blasters 16-core -> 16-core-arm
These use aerospike (now arm64), teranode:latest (multi-arch), postgres +
redpanda (arm64) — no SV-node.
Still on x86 (bitcoinsv/bitcoin-sv is x86-only): smoketest, legacy-sync,
sequential — they stand up a real SV-Node container.
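In a compose file, the bump is a one-line image tag change. A minimal sketch, assuming a service named `aerospike` (the real service names in the repo's compose files may differ):

```yaml
services:
  aerospike:
    # Was aerospike/aerospike-server:8.0; per this PR, 8.1 publishes an arm64 image,
    # so the same file can serve both x86 and arm64 runners.
    image: aerospike/aerospike-server:8.1
```

Leaving out an explicit `platform:` key lets Docker pick the variant that matches the host architecture, which is why the still-x86 consumers of the shared files keep working.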
Address review: the gennodes docker-compose templates (used by make gen-multinode) still pinned aerospike-server:8.0, leaving generated compose files inconsistent with the rest of the 8.1 bump.
ordishs
left a comment
Approve — clean, well-documented CI-only change.
Verified against the tree:
- aerospike bump is complete; only the custom ghcr.io/...:8.0.0-3 and already-8.1 deploy manifests remain (as the PR states).
- x86 retention is correct: smoketest/legacy-sync/sequential are exactly the SV-node-dependent jobs.
- Bumping shared compose files doesn't break the still-x86 consumers (nightly chainintegrity, longtest) since 8.1 is multi-arch.
Conditional on the test-plan checkboxes going green — particularly the prunertest/chainintegrity arm runs that prove the aerospike 8.1 arm64 image + compose stack come up.
Minor follow-ups (non-blocking): the separate sonarqube job in teranode_main_tests.yaml stays on x86 while sonar-pr-analyze moves to arm — same workload class, could move for consistency.



Summary
Migrate the CI jobs that can safely run on arm64 to arm runners, and bump aerospike 8.0 → 8.1 (which has an arm64 image). This fixes the x86 make test OOM (arm runners have ample RAM) without -p caps or build-cache churn, and is cheaper.

Why
On x86 teranode-runner-16-core, make test (go test -race -coverpkg=./... over the whole repo) was SIGTERM'd ~3min in, during the compile: that cold race+coverage build peaks at ~13GB and OOM'd the runner. Telemetry on a 32-core arm run showed ~70GB peak / 125GB, and 32-core gave no speedup (the suite is I/O/container-bound, not CPU-bound), so 16-core arm is the cost/perf sweet spot. arm runners are cheaper/min and have the headroom the x86 runner lacked.

Runner matrix
Moved to arm64 (verified arm-safe: compiles incl. CGO/BDK, containers are arm64-available):
- test, go-test (make test): testtxmetacache tag (no aerospike)
- golangci-lint
- sonar
- prunertest (pr + main)
- chainintegrity (pr)
- chainintegrity, chainintegrity-3blasters

Kept on x86: smoketest, legacy-sync, sequential. They stand up a real bitcoinsv/bitcoin-sv SV-Node container, which is x86-only (no arm64 image). teranode being multi-arch doesn't unblock these; the SV-node image does.

aerospike 8.0 → 8.1
Bumped all official aerospike/aerospike-server:8.0 refs (test compose, chainintegrity + 3blasters compose, longtest aerospike8 test, and the gennodes templates used by make gen-multinode). Left the custom ghcr.io/bsv-blockchain/aerospike-server:8.0.0-3 and the already-8.1 deploy manifests as-is.

Test plan
- test job green on arm64 (no compile-phase SIGTERM)
- prunertest + chainintegrity green on arm64: confirms aerospike 8.1 arm64 + the compose stack come up
- smoketest / legacy-sync / sequential once an arm64 SV-node image exists
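One way to track the image blockers is a small job that reports whether each image publishes an arm64 variant. This is a sketch under stated assumptions: the job id, the `ubuntu-latest` runner, and the `:latest` tag on bitcoinsv/bitcoin-sv are hypothetical, and the check only recognizes multi-arch manifest lists (a single-arch image shows up as "not found" whatever its architecture).

```yaml
jobs:
  check-arm64-images:
    runs-on: ubuntu-latest
    steps:
      - name: Report whether each image publishes a linux/arm64 manifest
        shell: bash
        run: |
          for image in aerospike/aerospike-server:8.1 bitcoinsv/bitcoin-sv:latest; do
            # Capture first so a failed lookup just yields an empty manifest.
            manifest="$(docker manifest inspect "$image" 2>/dev/null || true)"
            if grep -Eq '"architecture":[[:space:]]*"arm64"' <<<"$manifest"; then
              echo "::notice::$image lists an arm64 manifest"
            else
              echo "::warning::$image: no arm64 manifest found"
            fi
          done
```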