Skip to content

aerospike: optional native operate-path for mod-teranode UDFs#828

Open
icellan wants to merge 98 commits into
mainfrom
feat/teranode-native-ops
Open

aerospike: optional native operate-path for mod-teranode UDFs#828
icellan wants to merge 98 commits into
mainfrom
feat/teranode-native-ops

Conversation

@icellan

@icellan icellan commented May 7, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds an opt-in path that invokes mod-teranode UDF functions through the native operate-path (TeranodeModifyOp, wire op type 200) instead of the legacy UDF executor. The native path:

  • Bypasses the Aerospike UDF executor (its dedicated thread pool, heap, per-record handoff overhead)
  • Runs under the same lock as native ops like LIST_APPEND / CDT_MODIFY (finer-grained than UDF context lock)
  • Batches with other native ops in the same wire request

Expected to give meaningful concurrency wins under heavy batch load. The feature is off by default; existing clusters keep working unchanged.

Required server

The BSV fork of aerospike-server-private (branch feat/teranode-native-op or merged equivalent), which adds the matching server-side dispatcher (AS_MSG_OP_TERANODE_MODIFY = 200, sub-op id table 1..13).

Clusters running stock Aerospike or older versions of the BSV fork are unaffected — the capability probe (described below) detects unsupported servers and falls back to the UDF path.

Important: mixed-version Aerospike clusters are NOT supported. The capability probe samples one partition's master only; a mid-rolling-upgrade cluster can produce false-positive or false-negative probe results. Upgrade the aerospike-server cluster fully before flipping aerospike_use_native_teranode_ops = true. See the setting's longdesc for details.

Decision wrapping

Each call site delegates path selection to a single helper:

batchRecords[i] = s.teranodeBatchRecord(
    batchUDFPolicy, usePackage, key, subOpSetMined, "setMined",
    minedBlockInfo.BlockID,
    minedBlockInfo.BlockHeight,
    // ...plain Go values; helper handles wrapping for either path
)

The helper returns either an aerospike.NewBatchWrite(... TeranodeModifyOp(...)) or aerospike.NewBatchUDF(...) BatchRecord, depending on:

  1. Setting aerospike_use_native_teranode_ops (default false)
  2. Capability probe at store init: runs a valid setLocked sub-op via TeranodeModifyOp against a short-lived probe key. A patched BSV server recognises wire op 200, executes the sub-op, and returns a structured response — that's a "supported" signal. A stock/unpatched server rejects the unknown opcode with PARAMETER_ERROR, which is treated as "not supported". Anything else (timeout, other error, connection drop) is also treated as "not supported", biasing toward false-negatives so the fallback never runs against a server that doesn't support the op.

If either gate is closed, the helper produces the same NewBatchUDF call the existing code did. Behavior is byte-identical to the current code path on clusters that don't enable the setting or don't run the BSV server fork.

Split into separate PRs for independent review

Three orthogonal themes that were originally bundled in this branch have been factored out so reviewers can sign off on each in isolation. Their content is still present on this branch so PR #828 remains independently buildable; once each lands on main, the next merge from main will drop the duplicate hunks naturally.

Known follow-ups (deferred to separate PRs)

  • Runtime PARAMETER_ERROR → UDF fallback in teranodeBatchRecord / executeTeranodeOp. Closes the rolling-upgrade gap where the capability probe passes (one partition's master is on the BSV fork) but a runtime write hits a stock-Aerospike master. Mirrors the codes.Unimplemented fallback in fix(blockassembly): fall back to row-oriented AddTxBatch on Unimplemented #897. Will be a separate PR.
  • unspend SpendingData ownership check on the native path — issue #899. PR #766 added a mandatory SpendingData arg to the UDF unspend for ownership verification; the native-op subOpUnspend = 3 dispatcher needs the same arg before the native path can safely apply it. Until the server-fork side ships the matching dispatcher change, the native path skips the check — a security regression vs the UDF path. Workaround: keep aerospike_use_native_teranode_ops = false on any cluster where the ownership check matters.
  • Unit-level test for teranodeBatchRecord UDF-vs-native path routing — currently the path selection is only exercised by integration tests against a live BSV-forked cluster.
  • Tag the aerospike-client-go fork at 72616cb3187f (e.g. v8.7.1-bsv3) and re-pin via tag. Currently pinned via Go pseudo-version, which is integrity-safe (go.sum hash) but operationally unclear.

What's in the diff

Foundation (commit 1)

  • stores/utxo/aerospike/native_op.go (new): sub-op id constants, msgpack [sub_op_id, args] payload encoder, teranodeBatchRecord / executeTeranodeOp wrappers, and the capability probe.
  • stores/utxo/aerospike/aerospike.go: adds Store.useNativeTeranodeOps cached field and s.initNativeTeranodeOps(ctx) invocation in New(...).
  • settings/aerospike_settings.go + settings/settings.go: adds AerospikeSettings.UseNativeTeranodeOps bool (default false), keyed aerospike_use_native_teranode_ops.

Cutovers (commit 2)

11 mod-teranode UDF call sites swapped from aerospike.NewBatchUDF(...) / s.client.Execute(...) to s.teranodeBatchRecord(...) / s.executeTeranodeOp(...):

File Function Sub-op id
set_mined.go setMined 4
aerospike.go preserveUntil 9
alert_system.go (×3) freeze, unfreeze, reassign 5, 6, 7
conflicting.go setConflicting 8
locked.go setLocked 10
spend.go (×3) incrementSpentExtraRecs (×2), spendMulti 11, 11, 2
un_spend.go unspend (single Execute) 3

Net effect: -3 lines across the call-site files (the wrapper absorbs aerospike.NewValue(...) / aerospike.BoolValue(...) ladders, so each call site shrinks).

Module path switch (commit 3)

  • Renames all github.com/aerospike/aerospike-client-go/v8 imports across 55 files to github.com/bsv-blockchain/aerospike-client-go/v8.
  • Drops the local-filesystem replace directive.

The fork is consumed as a regular dependency at the Go pseudo-version v8.7.1-bsv2.0.20260518145742-72616cb3187f via the Go module proxy. This is 8 commits ahead of the v8.7.1-bsv2 tag — the upstream fork has not yet tagged this commit. Integrity is preserved via the go.sum hash; operationally we should tag (e.g. v8.7.1-bsv3) before merge and re-pin to the tag.

The fork is functionally a strict superset of upstream; on clusters that don't enable the new path, the binary behaves identically to one built against upstream v8.7.0.

Sub-op id assignment (frozen wire contract)

sub_op_id UDF function
1 spend
2 spendMulti
3 unspend
4 setMined
5 freeze
6 unfreeze
7 reassign
8 setConflicting
9 preserveUntil
10 setLocked
11 incrementSpentExtraRec
12 setDeleteAtHeight (reserved)
13 addDeletedChildren

These IDs must match the SUBOP_TABLE in the BSV-forked server's mod_teranode_native_op.c. Never renumber; never reuse a freed slot.

icellan and others added 9 commits May 6, 2026 15:13
Adds optional support for invoking mod-teranode functions via the
native operate-path (TeranodeModifyOp, wire op type 200) instead of
the legacy UDF path. The new path bypasses the Aerospike UDF
executor and runs under the same lock as native ops like
LIST_APPEND.

Required server: the BSV fork of aerospike-server-private
(feat/teranode-native-op or merged equivalent) on every node.

Decision is wrapped at one place — call sites use
s.teranodeBatchRecord(...) instead of aerospike.NewBatchUDF(...),
and the helper picks the path based on:
  1. Setting aerospike_use_native_teranode_ops (default false)
  2. A startup capability probe against the cluster

If either is false, calls fall back to UDF transparently. Clusters
without the BSV fork keep working unchanged.

Foundation:
- go.mod: replace aerospike-client-go/v8 with the BSV fork
- AerospikeSettings.UseNativeTeranodeOps (default false)
- stores/utxo/aerospike/native_op.go: sub_op id constants, msgpack
  payload encoder, teranodeBatchRecord/executeTeranodeOp wrappers,
  capability probe (sends a malformed payload, expects PARAMETER_ERROR
  from the BSV dispatcher)
- Store.useNativeTeranodeOps cached at construction

Cutovers:
- setMined

Remaining cutovers (follow-up commits): preserveUntil, freeze,
unfreeze, reassign, setConflicting, setLocked, spendMulti, unspend,
incrementSpentExtraRecs (×2 sites).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Routes preserveUntil, freeze, unfreeze, reassign, setConflicting,
setLocked, incrementSpentExtraRecs (×2), spendMulti, and unspend
through the s.teranodeBatchRecord / s.executeTeranodeOp helpers
added in the previous commit. Each call site shrinks because the
aerospike.NewValue / aerospike.BoolValue wrappers are absorbed by
the helper.

Behavior is unchanged when aerospike_use_native_teranode_ops is
false (the default) or when the cluster's capability probe rejects
the new opcode — both result in the helper producing the same
NewBatchUDF / client.Execute call as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The BSV fork of aerospike-client-go-v8 is now published at
github.com/bsv-blockchain/aerospike-client-go/v8 v8.7.1-bsv1 — adds
TeranodeModifyOp / TeranodeReadOp wire opcodes 200/201 alongside
the upstream API.

Mechanical sed across 55 files renaming all imports from
github.com/aerospike/aerospike-client-go/v8 to
github.com/bsv-blockchain/aerospike-client-go/v8. Build, vet, and
the package test suite all green.

Drops the local-filesystem replace directive — the require directive
now resolves directly via the Go module proxy. The previously-pinned
upstream github.com/aerospike/aerospike-client-go/v8 v8.4.2 remains
as an indirect dependency of unrelated transitive imports.

The fork is functionally a strict superset of upstream: when
aerospike_use_native_teranode_ops is false (default) and/or the
cluster's capability probe rejects the new opcode, Teranode's calls
go through unchanged NewBatchUDF / client.Execute paths, behaving
identically to upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Picks up 15d73f5 on the BSV go-client fork which fixes routing
classification for TeranodeReadOp. The original v8.7.1-bsv1 had
TeranodeReadOp falling into the operate_args.go default branch and
being routed via PartitionForWrite because the client uses explicit
op-type switches that didn't include _TERANODE_READ. Caught on PR
review of bsv-blockchain/aerospike-client-go#1.

No source changes here — only the version pin moves and go.sum
updates with the new tarball hash. Build, vet, mod-tidy all green.
@github-actions

github-actions Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

Current Review: No new blocking issues found.

The native operate-path feature (TeranodeModifyOp, wire op 200) is cleanly gated: off by default, with a startup capability probe and transparent UDF fallback. Path selection is centralised in teranodeBatchRecord / executeTeranodeOp, args are forwarded as plain Go values, and the response parser (parseLuaMapResponseInto plus the luaResponse* helpers) now tolerates both the UDF path map type and the native dispatcher typed maps/int slices. Concurrency looks sound: useNativeTeranodeOps and the shared nativeOpBatchWritePolicy are set once during New() before the store is used, and the policy is read-only thereafter.

A few things worth keeping in mind (no action required, all documented):

History:

  • ✅ Resolved (prior rounds): native-path WritePolicy now respected; PreserveTransactions append pattern avoids nil batch entries; probe-doc wording corrected.
  • ✅ Re-verified false-positive: native spendMulti MapValue slice msgpack encoding (MapValue is a named map type, so it encodes as an array of maps).

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an opt-in “native operate-path” for mod-teranode operations in the Aerospike UTXO store (via TeranodeModifyOp, wire op 200) with a startup capability probe and transparent fallback to the legacy Lua UDF path. The PR also migrates imports to the github.com/bsv-blockchain/aerospike-client-go/v8 fork and hardens/extends response parsing + diagnostics to accommodate native-path return types.

Changes:

  • Introduces native-op wrappers (teranodeBatchRecord, executeTeranodeOp) plus capability probing and msgpack payload encoding.
  • Refactors multiple call sites to route through the wrappers and improves batch-result diagnostics / response parsing.
  • Switches Aerospike client imports to the BSV fork and updates module dependencies/tests accordingly.

Reviewed changes

Copilot reviewed 82 out of 83 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
util/uaerospike/mock.go Switch Aerospike client imports to BSV fork.
util/uaerospike/client.go Switch Aerospike client imports to BSV fork.
util/uaerospike/client_test.go Switch Aerospike client imports to BSV fork.
util/aerospike.go Switch Aerospike client imports to BSV fork.
util/aerospike_test.go Switch Aerospike client imports to BSV fork.
test/utils/containers/container_manager.go Switch Aerospike client imports to BSV fork.
test/sequentialtest/large_tx_reorg/helpers.go Switch Aerospike client imports to BSV fork.
test/longtest/stores/utxo/aerospike/aerospike8_test.go Switch Aerospike client imports to BSV fork.
test/e2e/daemon/wip/aerospike_helpers.go Switch Aerospike client imports to BSV fork.
test/e2e/daemon/ready/catchup_test.go Switch Aerospike client imports to BSV fork.
test/e2e/daemon/ready/aerospike_helpers.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/verify_children_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/unmined_iterator.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/unmined_iterator_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/un_spend.go Route single-record unspend through executeTeranodeOp and improve error context.
stores/utxo/aerospike/test_helper.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/teranode.go Broaden Lua/native response parsing to tolerate native numeric/map/slice types.
stores/utxo/aerospike/teranode_test.go Add tests for native-style response shapes (typed slices, int64 keys/values, string maps).
stores/utxo/aerospike/spend.go Route spend-related UDF calls via wrapper; add robust per-record diagnostics and nil-safety.
stores/utxo/aerospike/spend_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/spend_expressions.go Improve error propagation/diagnostics and nil-handling in expression spend path.
stores/utxo/aerospike/set_mined.go Route setMined batch UDF calls via wrapper; improve per-record error/parse diagnostics.
stores/utxo/aerospike/set_mined_expressions.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/set_mined_expressions_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/refactored_methods_test.go Update tests for modified setMined batch error helper signature/message.
stores/utxo/aerospike/pruner/pruner_service.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/pruner/pruner_service_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/pruner/mock_index_waiter_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/pruner/external_pruning_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/preserve_expressions.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/preserve_expressions_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/native_op.go New native-op implementation: sub-op IDs, msgpack encoding, wrappers, capability probe, init hook.
stores/utxo/aerospike/native_op_test.go Add tests for payload encoding shape + result-bin naming invariants.
stores/utxo/aerospike/mock.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/longest_chain.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/longest_chain_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/locked.go Route setLocked via wrapper and improve batch-result validation/diagnostics.
stores/utxo/aerospike/index_test.go Use shared Aerospike test-container helper (supports image overrides).
stores/utxo/aerospike/get.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/duplicate_spend_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/delete.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/create.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/create_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/create_external_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/container_helper_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/container_helper_internal_test.go New helper to run Aerospike containers with env-controlled image/platform.
stores/utxo/aerospike/consistency_scan.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/consistency_scan_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/conflicting.go Route setConflicting batch UDF calls via wrapper.
stores/utxo/aerospike/batch_diagnostics.go New diagnostics helpers for BatchRecord/Record/bin/value shape logging.
stores/utxo/aerospike/batch_diagnostics_test.go Add tests for diagnostics helper output and nil-safety.
stores/utxo/aerospike/alert_system2_test.go Switch Aerospike client imports to BSV fork.
stores/utxo/aerospike/alert_system.go Route freeze/unfreeze/reassign through wrapper and harden batch result parsing.
stores/utxo/aerospike/aerospike.go Initialize native-op support during store construction; route preserveUntil via wrapper; improve diagnostics.
stores/utxo/aerospike/aerospike_test.go Use shared Aerospike test-container helper; optionally assert native-op enablement via env.
stores/utxo/aerospike/aerospike_server_test.go Switch Aerospike client imports to BSV fork.
settings/settings.go Add aerospike_use_native_teranode_ops setting wiring (default false).
settings/aerospike_settings.go Document and add UseNativeTeranodeOps config field.
services/asset/httpimpl/GetTxMetaByTXID.go Switch Aerospike client imports to BSV fork.
services/asset/httpimpl/GetTxMetaByTXID_test.go Switch Aerospike client imports to BSV fork.
go.mod Add BSV Aerospike client fork dependency; add msgpack; remove local replace; update indirects.
go.sum Update dependency checksums for new fork + msgpack and bumped transitive deps.
cmd/monitor/monitor.go Switch Aerospike client imports to BSV fork.
cmd/aerospikereader/aerospike_reader.go Switch Aerospike client imports to BSV fork.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread stores/utxo/aerospike/native_op.go
Comment thread stores/utxo/aerospike/native_op.go
Comment thread stores/utxo/aerospike/aerospike.go Outdated
@github-actions

github-actions Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-828 (ac371b4)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 138
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.742µ 1.961µ ~ 0.400
Block_ValidOrderAndBlessed_DiskVsMemory/leaves=1024/memory-4 11.93m 12.21m ~ 0.700
Block_ValidOrderAndBlessed_DiskVsMemory/leaves=1024/disk_1-4 12.32m 12.60m ~ 0.200
Block_ValidOrderAndBlessed_DiskVsMemory/leaves=1024/disk_2-4 12.74m 12.82m ~ 0.400
Block_ValidOrderAndBlessed_DiskVsMemory/leaves=16384/memo... 26.98m 28.20m ~ 0.200
Block_ValidOrderAndBlessed_DiskVsMemory/leaves=16384/disk... 34.55m 34.91m ~ 0.700
Block_ValidOrderAndBlessed_DiskVsMemory/leaves=16384/disk... 35.51m 35.67m ~ 0.200
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 64.02n 64.10n ~ 0.600
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 63.96n 64.04n ~ 0.100
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 64.14n 64.12n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 30.39n 29.88n ~ 0.700
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 51.84n 51.55n ~ 0.400
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 123.4n 109.1n ~ 0.200
MiningCandidate_Stringify_Short-4 245.9n 240.7n ~ 0.700
MiningCandidate_Stringify_Long-4 1.609µ 1.639µ ~ 0.100
MiningSolution_Stringify-4 862.8n 845.1n ~ 0.100
BlockInfo_MarshalJSON-4 1.603µ 1.598µ ~ 0.700
NewFromBytes-4 128.5n 129.1n ~ 0.200
AddTxBatchColumnar_Validation-4 2.451µ 2.443µ ~ 1.000
OffsetValidationLoop-4 640.5n 641.1n ~ 1.000
Mine_EasyDifficulty-4 67.00µ 66.72µ ~ 1.000
Mine_WithAddress-4 6.944µ 6.971µ ~ 0.200
BlockAssembler_AddTx-4 0.02453n 0.02800n ~ 0.700
AddNode-4 10.72 11.00 ~ 0.700
AddNodeWithMap-4 11.81 11.87 ~ 0.700
DirectSubtreeAdd/4_per_subtree-4 56.76n 60.36n ~ 0.100
DirectSubtreeAdd/64_per_subtree-4 28.73n 28.93n ~ 1.000
DirectSubtreeAdd/256_per_subtree-4 28.47n 27.60n ~ 0.200
DirectSubtreeAdd/1024_per_subtree-4 26.43n 26.41n ~ 1.000
DirectSubtreeAdd/2048_per_subtree-4 26.10n 26.08n ~ 1.000
SubtreeProcessorAdd/4_per_subtree-4 246.1n 241.6n ~ 0.100
SubtreeProcessorAdd/64_per_subtree-4 241.0n 237.8n ~ 0.400
SubtreeProcessorAdd/256_per_subtree-4 235.8n 236.5n ~ 0.700
SubtreeProcessorAdd/1024_per_subtree-4 232.6n 233.7n ~ 1.000
SubtreeProcessorAdd/2048_per_subtree-4 226.7n 228.1n ~ 0.600
SubtreeProcessorRotate/4_per_subtree-4 232.0n 231.0n ~ 0.400
SubtreeProcessorRotate/64_per_subtree-4 231.0n 230.0n ~ 1.000
SubtreeProcessorRotate/256_per_subtree-4 233.1n 229.0n ~ 0.100
SubtreeProcessorRotate/1024_per_subtree-4 228.1n 229.1n ~ 0.500
SubtreeNodeAddOnly/4_per_subtree-4 55.12n 55.10n ~ 1.000
SubtreeNodeAddOnly/64_per_subtree-4 36.09n 36.04n ~ 0.700
SubtreeNodeAddOnly/256_per_subtree-4 35.03n 35.11n ~ 0.700
SubtreeNodeAddOnly/1024_per_subtree-4 34.45n 34.50n ~ 0.800
SubtreeCreationOnly/4_per_subtree-4 110.9n 110.0n ~ 0.600
SubtreeCreationOnly/64_per_subtree-4 349.9n 352.1n ~ 0.400
SubtreeCreationOnly/256_per_subtree-4 1.214µ 1.221µ ~ 1.000
SubtreeCreationOnly/1024_per_subtree-4 3.834µ 4.009µ ~ 1.000
SubtreeCreationOnly/2048_per_subtree-4 6.782µ 7.182µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 228.1n 228.6n ~ 0.400
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 229.7n 229.1n ~ 1.000
ParallelGetAndSetIfNotExists/1k_nodes-4 8.686m 11.798m ~ 0.100
ParallelGetAndSetIfNotExists/10k_nodes-4 12.73m 13.02m ~ 1.000
ParallelGetAndSetIfNotExists/50k_nodes-4 15.37m 16.42m ~ 0.100
ParallelGetAndSetIfNotExists/100k_nodes-4 19.50m 19.61m ~ 0.700
SequentialGetAndSetIfNotExists/1k_nodes-4 8.109m 9.878m ~ 0.100
SequentialGetAndSetIfNotExists/10k_nodes-4 13.05m 12.77m ~ 0.400
SequentialGetAndSetIfNotExists/50k_nodes-4 22.86m 23.64m ~ 0.400
SequentialGetAndSetIfNotExists/100k_nodes-4 32.17m 31.59m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 9.015m 11.325m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 16.82m 19.21m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 20.73m 18.83m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 10.44m 11.87m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 17.44m 16.00m ~ 0.200
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 68.99m 62.55m ~ 0.400
DiskTxMap_SetIfNotExists-4 3.850µ 4.060µ ~ 0.100
DiskTxMap_SetIfNotExists_Parallel-4 3.577µ 3.556µ ~ 0.400
DiskTxMap_ExistenceOnly-4 343.9n 353.9n ~ 0.200
Queue-4 188.1n 190.0n ~ 0.400
AtomicPointer-4 3.260n 3.241n ~ 0.400
TxMapSetIfNotExists-4 50.20n 49.78n ~ 0.100
TxMapSetIfNotExistsDuplicate-4 41.37n 41.28n ~ 1.000
ChannelSendReceive-4 560.4n 567.9n ~ 0.100
CalcBlockWork-4 474.8n 481.6n ~ 0.700
CalculateWork-4 641.3n 646.9n ~ 0.700
CheckOldBlockIDs/on-chain-prefetch/1000-4 59.25µ 69.37µ ~ 0.700
CheckOldBlockIDs/off-chain-prefetch/1000-4 50.79µ 49.48µ ~ 0.400
CheckOldBlockIDs/on-chain-prefetch/10000-4 442.3µ 442.5µ ~ 1.000
CheckOldBlockIDs/off-chain-prefetch/10000-4 352.9µ 354.2µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_10-4 1.426µ 1.476µ ~ 0.200
BuildBlockLocatorString_Helpers/Size_100-4 13.54µ 13.82µ ~ 0.400
BuildBlockLocatorString_Helpers/Size_1000-4 130.3µ 136.1µ ~ 0.100
CatchupWithHeaderCache-4 104.6m 104.6m ~ 0.100
_BufferPoolAllocation/16KB-4 4.180µ 4.191µ ~ 1.000
_BufferPoolAllocation/32KB-4 10.01µ 10.80µ ~ 1.000
_BufferPoolAllocation/64KB-4 21.13µ 17.40µ ~ 0.100
_BufferPoolAllocation/128KB-4 37.05µ 35.25µ ~ 0.400
_BufferPoolAllocation/512KB-4 131.8µ 121.9µ ~ 0.100
_BufferPoolConcurrent/32KB-4 22.78µ 23.74µ ~ 0.100
_BufferPoolConcurrent/64KB-4 36.95µ 32.85µ ~ 0.100
_BufferPoolConcurrent/512KB-4 156.0µ 160.7µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/16KB-4 695.9µ 748.5µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 687.9µ 666.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 694.3µ 672.5µ ~ 1.000
_SubtreeDeserializationWithBufferSizes/128KB-4 695.4µ 682.1µ ~ 0.200
_SubtreeDeserializationWithBufferSizes/512KB-4 690.3µ 724.7µ ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/16KB-4 36.92m 36.75m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/32KB-4 36.85m 36.74m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/64KB-4 36.64m 36.87m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/128KB-4 36.72m 36.56m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/512KB-4 36.65m 36.36m ~ 0.100
_PooledVsNonPooled/Pooled-4 740.6n 741.9n ~ 0.700
_PooledVsNonPooled/NonPooled-4 8.060µ 8.996µ ~ 0.400
_MemoryFootprint/Current_512KB_32concurrent-4 7.278µ 7.110µ ~ 0.700
_MemoryFootprint/Proposed_32KB_32concurrent-4 10.17µ 10.08µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 9.831µ 9.663µ ~ 0.100
_prepareTxsPerLevel-4 416.4m 430.7m ~ 0.700
_prepareTxsPerLevelOrdered-4 4.104m 3.988m ~ 0.100
_prepareTxsPerLevel_Comparison/Original-4 415.5m 429.6m ~ 0.200
_prepareTxsPerLevel_Comparison/Optimized-4 4.110m 3.895m ~ 0.100
SubtreeSizes/10k_tx_4_per_subtree-4 1.385m 1.327m ~ 0.400
SubtreeSizes/10k_tx_16_per_subtree-4 323.5µ 315.5µ ~ 0.400
SubtreeSizes/10k_tx_64_per_subtree-4 77.14µ 75.12µ ~ 0.100
SubtreeSizes/10k_tx_256_per_subtree-4 19.30µ 18.59µ ~ 0.100
SubtreeSizes/10k_tx_512_per_subtree-4 9.692µ 9.224µ ~ 0.100
SubtreeSizes/10k_tx_1024_per_subtree-4 4.786µ 4.581µ ~ 0.100
SubtreeSizes/10k_tx_2k_per_subtree-4 2.372µ 2.300µ ~ 0.100
BlockSizeScaling/10k_tx_64_per_subtree-4 75.50µ 73.40µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 18.94µ 18.57µ ~ 0.100
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.674µ 4.632µ ~ 0.100
BlockSizeScaling/50k_tx_64_per_subtree-4 395.4µ 386.5µ ~ 0.200
BlockSizeScaling/50k_tx_256_per_subtree-4 94.35µ 93.21µ ~ 0.200
BlockSizeScaling/50k_tx_1024_per_subtree-4 23.17µ 22.84µ ~ 0.100
SubtreeAllocations/small_subtrees_exists_check-4 156.3µ 155.7µ ~ 0.700
SubtreeAllocations/small_subtrees_data_fetch-4 163.1µ 162.4µ ~ 1.000
SubtreeAllocations/small_subtrees_full_validation-4 325.9µ 321.9µ ~ 0.100
SubtreeAllocations/medium_subtrees_exists_check-4 9.125µ 9.149µ ~ 1.000
SubtreeAllocations/medium_subtrees_data_fetch-4 9.517µ 9.605µ ~ 0.700
SubtreeAllocations/medium_subtrees_full_validation-4 18.99µ 18.67µ ~ 0.100
SubtreeAllocations/large_subtrees_exists_check-4 2.196µ 2.210µ ~ 0.600
SubtreeAllocations/large_subtrees_data_fetch-4 2.337µ 2.306µ ~ 0.700
SubtreeAllocations/large_subtrees_full_validation-4 4.719µ 4.681µ ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 332.3µ 335.6µ ~ 1.000
StoreBlock_Sequential/AboveCSVHeight-4 332.9µ 336.5µ ~ 0.100
GetUtxoHashes-4 281.6n 279.1n ~ 0.400
GetUtxoHashes_ManyOutputs-4 48.73µ 45.93µ ~ 0.100
_NewMetaDataFromBytes-4 212.5n 216.6n ~ 0.100
_Bytes-4 394.4n 398.0n ~ 0.700
_MetaBytes-4 138.0n 139.8n ~ 0.200

Threshold: >10% with p < 0.05 | Generated: 2026-06-12 14:52 UTC

Production asset pods were OOMKilling under load on /api/v1/subtree_data,
manifesting downstream as nginx "upstream prematurely closed connection
while reading response header from upstream". Goroutine profiles showed
60K+ goroutines accumulating in the chunk-fetch fan-out before SIGKILL.

The earlier streaming work bounded per-request memory but did nothing to
cap concurrent on-demand subtreeData file creations: each one holds
chunkSize tx-metadata batches in memory, multiplied by however many
clients arrive at once. With large transactions and slow clients the
process trivially exceeds even a 64Gi limit.

Asset side - admission control:
- New asset_concurrency_subtree_data_create setting (default 4) gates
  the dualStreamWithFileCreation path with non-blocking TryAcquire.
  When the cap is reached, requests get HTTP 503 with Retry-After: 1
  instead of waiting up to 30s for a permit.
- Restructured GetSubtreeDataReader to check Exists first without
  holding the reader semaphore. File-exists fast path uses the existing
  reader sem; on-demand creation uses the new create sem.
- Defensive cap on the pending chunk map in
  writeTransactionsViaSubtreeStoreStreaming (2 * concurrency); aborts
  with a clear error if a future scheduler regression grows it.
- ctx check every 256 txs in writeChunkToWriter so a client disconnect
  releases chunkMetaSlice promptly instead of waiting for the next pipe
  write to fail.

HTTP utility - typed 503 + retry helper:
- buildHTTPError now produces errors.ErrServiceUnavailable on 503 so
  callers can errors.Is on it.
- New DoHTTPRequestBodyReaderWithRetry: exponential backoff (250ms ->
  5s, max 6 attempts), honors Retry-After header, retries only on 503.
  Non-503 errors and ctx cancellation return immediately.

Callers - retry on peer 503:
- subtreevalidation/SubtreeValidation.go (getSubtreeMissingTxs)
- subtreevalidation/check_block_subtrees.go (CheckBlockSubtrees)
- blockvalidation/get_blocks.go (fetchSubtreeDataFromPeer)

Tests:
- 7 new unit tests for the retry helper covering success, retry-then-
  succeed, attempt exhaustion, Retry-After honoring, no-retry on non-
  503, ctx cancellation, and parseRetryAfter parsing. Race-clean.

Verified: go build, go vet, go test ./util/ -race all clean.
icellan added 3 commits May 8, 2026 12:47
Picks up the latest changes on github.com/bsv-blockchain/aerospike-client-go/v8.
go.sum tarball hash refreshed; no source changes in this commit.

Pin: v8.7.1-bsv2.0.20260508103936-b18e5d80a9e4 (commit b18e5d8).
Adds aerospike_enable_client_metrics (default true, preserving prior behavior).
When disabled, util/aerospike.go skips client.EnableMetrics(nil) and the stats
polling goroutine entirely.

Motivation: under sustained heavy batch load we have observed thousands of
goroutines blocked on a single sync.RWMutex.Lock() inside aerospike-client-go's
per-node nodeStats updateOrInsert path, called from writeCommand.parseResult
on every batch result. EnableMetrics is what wires that tracking up; turning
it off bypasses the contention point entirely at the cost of losing the
per-cluster Prometheus stats this Go process exposes.

Operational guidance:
- Keep enabled in normal production where observability matters.
- Flip to false on clusters where nodeStats mutex contention is observed
  (goroutine profile shows many waiters on the same mutex address inside
  aerospike-client-go internals) until a client-side fix lands upstream.
…ines

Production observation on dev-scale-2: subtree-validator's "Tx Meta read
from Kafka /second" Grafana panel oscillates between 0 and 2.4M tx/s
with mean ~1.5M, repeatedly dropping to zero. One month ago the same
metric was a steady ~900K/s with no zero-drops. Cache hit rate sits
at ~50% even for new (not-yet-mined) transactions, and subtree
validation takes 30-49s per subtree (target: ~2s).

Root cause is in services/subtreevalidation/txmetaHandler.go:

- Each Kafka message spawns a goroutine via `go func()` (unbounded
  fan-out, since 9f4f1e5 in Dec 2025).
- Each Kafka message carries a binary BATCH of N entries (since
  4dcf264 in Dec 2025), but the handler loops calling
  SetCacheFromBytes once per entry sequentially.
- Each SetCacheFromBytes call takes a per-bucket write lock in
  improved_cache.bucket.Set (improved_cache.go:799).

Under burst load the cumulative effect is thousands of goroutines
serializing through the cache's bucket locks. The cache's own
sharded-bucket parallelism (8192 buckets in SetMulti) is wasted
because each goroutine takes locks one at a time. Throughput
collapses to single-writer speed, recovers when contention drains,
collapses again — the visible 0/2.4M oscillation pattern.

The franz Kafka library switch in #611 (2026-03-27) is the most
likely trigger for the recent regression, but the underlying design
flaw predates that.

Fix:

- New txmetaCacheJob struct: parsed Kafka batch (deep-copied keys/
  values, separated ADD/DELETE).
- Bounded worker pool consumes parsed jobs from a buffered channel.
  Workers call SetCacheMulti ONCE per batch — letting the cache's
  bucket fan-out parallelize internally, taking each touched bucket
  lock once per Kafka message instead of once per entry.
- Handler now does cheap parsing on the Kafka consumer goroutine,
  then enqueues onto the work channel. Channel-full enqueue BLOCKS,
  applying backpressure to Kafka rather than letting goroutines
  pile up unboundedly.
- Shutdown is driven by closing the work channel (not ctx cancel)
  so workers drain queued items before exiting. sync.Once on the
  close keeps Stop() idempotent for tests with deferred cleanup.

Settings:
- subtreevalidation_txmetaCacheKafkaWorkers (default 8)
- subtreevalidation_txmetaCacheKafkaQueueSize (default 256)

Interface:
- SetCacheMulti added to txMetaCacheOps. TxMetaCache already has it.

Tests:
- Existing TestServer_txmetaHandler covers nil/short/ADD/DELETE
  paths; updated to assert SetCacheMulti (not SetCacheFromBytes)
  on ADD and run against the worker pool.
- New TestServer_txmetaHandler_BatchesIntoSingleSetCacheMulti
  guards the regression: 50 entries -> ONE SetCacheMulti call.
- New TestParseTxmetaBatch covers empty/zero-entry/multi-entry/
  truncated batches and verifies keys/values are deep-copied so
  later mutation of the source buffer doesn't corrupt parsed jobs.
- Race detector clean.

Note: this fix does NOT remove the cache writeback in
TxMetaCache.BatchDecorate (cache-aside on store fetches) — that
path is needed for nodes whose Kafka publishers don't carry every
transaction.

Verified: go build, go vet, go test -race for the changed packages
all green; pre-commit hooks pass.
icellan and others added 3 commits May 8, 2026 15:02
Restores the partition-level consume parallelism that the franz-go
switch (#611, 2026-03-27) silently removed, and fixes the producer-side
sticky-partitioner skew that produces bursty per-partition load.

Three changes, all on the txmeta hot path:

1) services/validator/Validator.go — Kafka producer now sets Key to a
   tx hash (the first hash in the batch for batched sends, the txn's
   own hash for single sends). Previously Key was nil, which under
   franz-go's default StickyKeyPartitioner is equivalent to a
   StickyPartitioner: every batch lands on the same partition until
   the linger/batch threshold trips, producing the bursty oscillation
   we observed in production.

   With a non-nil key, StickyKeyPartitioner hashes the key onto a
   partition deterministically. tx hashes are uniformly distributed,
   so partition usage is now uniform.

2) util/kafka/kafka_consumer.go (Start) — replaces the single-goroutine
   fetches.EachRecord(...) loop with fetches.EachPartition(...) +
   one goroutine per partition per fetch. Within a partition records
   are still processed sequentially so per-partition order is
   preserved; across partitions the work runs in parallel. partitionWg
   is awaited before the next PollFetches so in-flight goroutine count
   is bounded by partition count.

   This restores N-way concurrency that the previous Kafka library
   provided implicitly via per-partition consumer goroutines.

3) util/kafka/kafka_consumer.go (NewKafkaConsumerGroup) — when
   AutoCommitEnabled is true, registers kgo.AutoCommitMarks() and the
   per-partition loop calls client.MarkCommitRecords(record) only
   after consumerFn returns nil. With the previous default auto-commit
   any record that consumerFn errored on was still being committed by
   the auto-commit timer because franz-go tracks "iterated" not
   "succeeded". This change makes auto-commit consistent with the
   manual-commit path's semantics.

Verified:
- go build / go vet across changed packages — clean
- go test -short -race ./util/kafka/... — pass (perf tests are
  -short-skipped; they require a real broker)
- go test -race ./services/subtreevalidation/{TestServer_txmetaHandler,
  TestParseTxmetaBatch} — pass (worker pool from #834 still works
  under the new parallel feeder)
- Pre-commit hooks (gofmt / gci / golangci-lint) — pass

Notes:
- Producer compression intentionally NOT enabled — Bitcoin tx data
  is high-entropy, compression would just burn CPU.
- AutoCommitMarks closes the gap where errored records were still
  committed; it does NOT close the gap where async handlers (txmeta
  worker pool from #834) return before the cache write completes.
  Per discussion this is acceptable: cache misses fall through to
  the UTXO store transparently.
CI on PR #828 caught three e2e tests panicking with "send on closed
channel" originating from txmetaHandler.go:148:

    legacy-sync       TestBIP68_TimeBased_Accept
    smoketest         TestBIP68_HeightBased_Accept
    prunertest        TestPrunerUnminedParentRetention

All three use the in-memory Kafka consumer which delivers a final
message AFTER Server.Stop() has returned from txmetaConsumerClient.
Close(). The handler running on that delivery race-loses with
stopTxmetaCacheWorkers's close(txmetaCacheJobCh) and panics.

Fix: a sync.RWMutex (txmetaCacheCloseMu) plus a closed flag
(txmetaCacheClosed) coordinate the two paths:

- Senders (txmetaHandler) take the read lock, check closed, send.
- Closer (stopTxmetaCacheWorkers) takes the write lock, flips the
  flag, closes the channel — all under the write lock.

Read-lock-then-check-then-send is atomic from the closer's POV: a
sender either sees closed=false and completes its send before the
write lock is granted, or sees closed=true and bails. No path can
land on close(ch) followed by send(ch).

Read locks are uncontended on the steady-state hot path and the
close runs once per Server lifetime, so overhead is negligible.

Test: TestServer_txmetaHandler_ShutdownRace stresses 16 sender
goroutines against a concurrent stop(); 10× -race iterations clean.

Verified:
- go build / go vet — clean
- go test -count=10 -race -run "TestServer_txmetaHandler|
  TestParseTxmetaBatch" ./services/subtreevalidation/ — pass
- Pre-commit hooks — pass
icellan added 3 commits May 22, 2026 15:44
… size, clamp NaN/Inf

Three correctness fixes from review of the original PR:

1. settings/settings.go was missing the imperative loader entry for
   aerospike_semaphore_multiplier, so the struct-tag default of "1.0"
   never took effect — settings.NewSettings() returned 0.0 and every
   deployment without an explicit override silently ran with the
   in-process semaphore disabled. Add the getFloat64 call so the
   documented 1.0 default actually applies.

2. Client.GetConnectionQueueSize returned cap(nil) = 0 when the
   semaphore was deliberately disabled, which made the pruner's
   recommendedMax = 0 × threshold and collapsed chunkGroupLimit to 1.
   Store the resolved underlying connection-queue size on Client and
   report it when the semaphore is nil so external pool-capacity
   heuristics survive an opt-out.

3. buildConnSemaphore did not guard NaN/+Inf or runaway scaled values
   before make(chan, ...). NaN now disables the semaphore (treated as
   garbage input); +Inf and any scaled value above maxSemaphoreCapacity
   (1<<20) are clamped so a typo like 1.0e10 cannot allocate a multi-GB
   channel.

Adds a settings.NewSettings() regression test that asserts the runtime
default is 1.0 — the existing tests bypass the broken seam by calling
newClientConfig / buildConnSemaphore directly. Also extends
TestBuildConnSemaphore with NaN/+Inf/clamp cases and adds an explicit
fallback test for GetConnectionQueueSize.
buildOOBFixture and buildInRangeFixture created their SyncedMap with
limit=2 and then inserted exactly 2 entries. The _NilParentTx tests
subsequently Set on an existing key; once len == limit, go-tx-map's
setUnlocked evicts a random item from the map before writing — which
~50% of the time drops the child the SUT is about to look up, leaving
the test failing with "tx <hash> not found in txMap". Reproduced
locally with -count=20 under -race.

Use an unbounded SyncedMap in the fixtures (the production path sizes
by block tx count, not 2) and leave a comment explaining the eviction
trap so future fixtures don't fall into the same hole.

Duplicates the fix in PR #931 so this PR's CI is unblocked
independently; will fold into theirs if #931 lands first.
# Conflicts:
#	go.mod
#	services/legacy/netsync/handle_block_test.go
#	stores/txmetacache/improved_cache_test.go
icellan and others added 5 commits May 28, 2026 12:24
# Conflicts:
#	go.mod
#	go.sum
#	settings/aerospike_settings.go
#	settings/settings.go
#	stores/txmetacache/cache_backend.go
#	stores/txmetacache/improved_cache_meta.go
#	stores/txmetacache/pointer_cache.go
#	stores/txmetacache/pointer_cache_test.go
#	stores/txmetacache/txmetacache.go
#	stores/txmetacache/txmetacache_test.go
#	stores/utxo/aerospike/spend_expressions.go
PR #957 landed on main using the stock aerospike-client-go/v8 import
path. pr-828's whole utxo/aerospike package uses the BSV fork
(github.com/bsv-blockchain/aerospike-client-go/v8), which adds the
TeranodeModifyOp wire opcode that the native-ops path needs. Mixing
stock and fork in the same package would break type compatibility
across files.

Same surgical edit as was applied to spend_expressions.go during the
prior conflict resolution. The BSV fork is API-superset of stock for
everything circuit_breaker.go touches, so the change is import-path-only.
@sonarqubecloud

Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
52.5% Coverage on New Code (required ≥ 80%)
3.1% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

# Conflicts:
#	stores/utxo/aerospike/locked.go
#	stores/utxo/aerospike/spend.go
#	util/aerospike.go
@socket-security

socket-security Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedgolang/​github.com/​vmihailenco/​msgpack/​v5@​v5.4.197100100100100

View full report

@icellan icellan marked this pull request as ready for review June 8, 2026 12:36
@icellan icellan requested a review from oskarszoon June 8, 2026 12:37

@oskarszoon oskarszoon left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review against 2988d2df7. Prior round cleared: the three Copilot findings (WritePolicy, probe doc/impl, PreserveTransactions nil-slice) all verify as fixed, merge conflict resolved, fork re-pinned to v8.7.1-bsv3. One blocker remains.

Blocking — native unspend drops the #766 SpendingData ownership check (#899). un_spend.go:164-180 routes through executeTeranodeOp(subOpUnspend) when aerospike_use_native_teranode_ops=true. The native subOpUnspend=3 dispatcher doesn't enforce the ownership comparison the Lua path does (teranode.lua:513-540), so with the flag on an unspend can clear a spending_data slot it doesn't own → resurrects a spent UTXO and corrupts spentUtxos accounting on the reorg / ProcessConflicting path. Default-false plus a doc note isn't a sufficient guard for consensus state — one flag flip disables the check silently. Fence unspend to the UDF path until the server dispatcher lands #899 (keeps the native win for the other ops), or gate the setting on a server version that includes the fix.

Before this flag is trusted in prod (non-blocking for a default-false merge):

  • Startup probe validates only setLocked; one bool gates all sub-ops and there's no runtime PARAMETER_ERROR→UDF fallback. Probe pass ≠ proof the other sub-ops decode/enforce correctly. (native_op.go:194-291)
  • Native path msgpack-marshals raw Go types (uint32 vs int) untranslated, unlike the UDF path's normalised values. A numeric-type mismatch on any sub-op (e.g. spendMulti's uint32 blockHeight) passes the probe and then fails/corrupts at the first real op. Normalise numeric args to int64, or accept both MP_INT/MP_UINT server-side, and verify every sub-op against the C SUBOP_TABLE.
  • Path selection + probe error-classification have zero unit coverage (Store.client is concrete *uaerospike.Client, no mock seam). Add an interface seam + a conformance test (spend as B, attempt unspend with A's SpendingData, assert the UTXO stays spent) alongside the server change.

@ordishs ordishs left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — native operate-path for mod-teranode UDFs

The headline feature (opt-in TeranodeModifyOp wire-200 path with a single decision point in teranodeBatchRecord / executeTeranodeOp, a startup capability probe biased toward false-negatives, transparent UDF fallback, shared BatchWritePolicy, and a msgpack-tolerant response parser) is well-architected and conservatively gated. Off by default → byte-identical to current behaviour. Init ordering is correct: client/namespace/setName/logger are all set before initNativeTeranodeOps runs, and the probe never touches the logger when the setting is off.

Acknowledging up front that this PR bundles several independently-shippable changes specifically because they've soaked in production for a couple of weeks. That's real evidence, and it retires my weakest argument — so the points below are deliberately split into "what soak time covers" vs "what it doesn't."

What production soak reasonably retires

The nil-hardening, diagnostics helpers (describe*, with tests), streamOrAbort, the catchup ErrServiceErrorErrExternal reclassification, and the Prometheus metric rename/removal all ship live regardless of the flag. Two weeks of exposure is worth more than fresh review eyes, so I'm downgrading "these should be separate PRs" from a blocker to a preference. One caveat that soak doesn't cover: this repo is consumed downstream, so the metric rename (set_tx_meta_cache_kafka_kafka_batch + new _kafka_count) and the removal of tx_meta_cache.hit_old_tx will break other operators' dashboards even though they're already reconciled here. Worth a CHANGELOG/release-note callout.

What soak does NOT automatically cover — depends on flag state

The entire native path is short-circuited to the identical UDF code when the flag is off. So the key question is whether aerospike_use_native_teranode_ops was actually true in the soaked deployment:

  • If it was OFF — everything except the native path has production evidence, but the headline feature has zero production exposure.
  • If it was ON — the native path soaked (great), but two concerns were then live in production:

1. unspend ownership check skipped on the native path (security regression, #899). un_spend.go forwards SpendingData to subOpUnspend=3, but the server-fork dispatcher doesn't yet validate it. PR #766 made that ownership check mandatory on the UDF path; the native path weakens spend-reversal authorization. Unspend fires on reorgs / block invalidation — so even "flag on in prod" only counts as tested here if a reorg actually occurred in that window. This is the one place the PR can weaken a correctness guarantee rather than just add a code path, so it's my primary ask.

2. No runtime PARAMETER_ERROR → UDF fallback. The probe samples one partition master. If it passes but a runtime spend/setMined lands on a stock-Aerospike master (mid-rolling-upgrade or a rolled-back node), the write hard-fails on a consensus-critical path with no fallback. A uniform fork-only cluster never exercises this, so soak time doesn't retire it — the gap is still there for the next rolling upgrade.

Asks before this is enabled anywhere new (neither requires splitting the PR)

  1. Confirm the flag state in the soaked deployment, and whether unspend/reorg was exercised under it.
  2. Add one line to the setting's longdesc flagging the #899 unspend-ownership gap — right now that caveat lives only in the PR body and the issue, not where an operator toggling the flag will see it.

Smaller notes

  • No CI-level test of msgpack wire-contract fidelity per sub-op (esp. spendMulti's []aerospike.MapValue with non-string keys). The frozen wire contract could drift silently; a golden-bytes encode test per sub-op would guard it. Acknowledged as a follow-up.
  • Probe writes into the production namespace/set (unique per-process key + 60s TTL — reasoning is sound); worth confirming it can't trip secondary-index population on that set.
  • The new ErrExternal catchup branch in blockvalidation/Server.go changes peer-failover behaviour and has no test.
  • Sub-op id 12 (setDeleteAtHeight) reserved-but-unused may trip an unused-const linter.

Verdict

Comfortable with this merging as a production-validated batch — dropping the split-it-up insistence. The two items I'd still want resolved before the flag is enabled on any new cluster are #1 and #2 above, and the soak history only addresses them if the flag was on and a reorg/unspend actually occurred. Static review only; I did not run the suite (the native path needs the BSV-forked server to exercise).

The native subOpUnspend=3 dispatcher in the BSV fork of aerospike-server now
enforces the #766 SpendingData ownership check. Update the stale un_spend.go
comment that said the dispatcher still needed the fix, document the minimum
server-build requirement in the UseNativeTeranodeOps longdesc, and drop the
private-repo reference from the native_op.go header comment.
@icellan

icellan commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Re: the native unspend ownership blocker (#899)

This is now resolved server-side — the native subOpUnspend=3 dispatcher in the BSV fork of aerospike-server enforces the same SpendingData ownership comparison as the Lua path (size-guarded memcmp), and a non-owner unspend is a silent no-op that still runs the setDeleteAtHeight housekeeping, matching the #766 ProcessConflicting semantics.

On the teranode side, no client change was required: un_spend.go already forwards the args in the matching order — offset, utxoHash, spendingData(36), currentBlockHeight, blockHeightRetention — identical across the UDF and native paths. The conformance test (unspend_ownership_test.go: spend as B, attempt unspend with A's SpendingData, assert the UTXO stays spent) is on the branch, and the server-side equivalent passes against a build that includes the fix.

This PR updates the now-stale un_spend.go comment and the UseNativeTeranodeOps longdesc to document the minimum server-build requirement (commit 60ff9f9a3).

On the non-blocking follow-ups (probe validates setLocked only; no runtime PARAMETER_ERROR→UDF fallback; numeric arg normalisation): these remain operator-gated by the documented "upgrade the cluster fully before flipping the flag, default stays false" requirement. Happy to add a probe-level ownership/version check as a follow-up if a programmatic guard is preferred over the doc-only gate.

@icellan icellan requested a review from oskarszoon June 9, 2026 13:36
Comment thread stores/utxo/aerospike/spend.go
@icellan icellan requested a review from ordishs June 12, 2026 08:38
@sonarqubecloud

Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
52.6% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants