fix(stores/utxo/aerospike): align frozen UTXO checks with SQL and bitcoin-sv#949
Conversation
…coin-sv The Aerospike UTXO store's two spend paths (Lua and expression-based) diverged from the SQL store and bitcoin-sv at the freeze-expiry boundary. Reference behaviour: the freeze window is closed at start, open at stop — a UTXO becomes spendable AT spendableIn (see svnode CFrozenTXOCheck). Two related fixes: 1. Lua boundary off-by-one (T5a) teranode.lua used `>=` which kept the UTXO frozen at currentBlockHeight == spendableIn. SQL uses strict `<` consistently across all three call sites (sql.go:2071, 2557, 3346). Flipped to `>` so the UTXO becomes spendable AT spendableIn, matching SQL/svnode. 2. Expression-path strictness (T25) The expression filter rejected any record with utxoSpendableIn set, regardless of the per-offset value, because Aerospike expressions cannot inspect map values. Kept the conservative ExpBinExists guard for safety (a record needing per-offset comparison must never reach ListSetOp through the expression path) and added retry-through-Lua for FILTERED_OUT records: processSpendBatchResultsExpressions now collects filtered items and re-dispatches them via the Lua UDF, which compares per-offset and produces the correct accept/reject decision along with the precise rejection reason. Extracted the Lua execution from sendSpendBatchLua into executeLuaSpendBatch so both the primary Lua route and the expression retry share the same implementation. Tests cover the boundary (at, below, above spendableIn) and parity between Lua and expression paths for both past and future spendableHeight values; all run against a real Aerospike container via testcontainers-go. Closes #4640
|
🤖 Claude Code Review Status: Complete Current Review: No issues found. This PR implements a careful fix for the frozen UTXO boundary checks with strong safeguards: Strengths:
Design tradeoffs acknowledged:
The PR follows AGENTS.md principles: minimal changes, test-first approach, clear tradeoffs stated upfront. |
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-26 15:13 UTC |
…h Lua retry
Operators monitoring the expression-path-vs-Lua fallback rate had no
quantitative signal that the conservative utxoSpendableIn guard was hot.
Add two counters:
- utxo_spend_expression_lua_retry: batches that triggered any retry
(rate divided by utxo_spend_batch gives the batch-level retry
fraction).
- utxo_spend_expression_lua_retry_records: individual records
re-dispatched through Lua (rate divided by utxo_spend gives the
per-record retry fraction).
Tighten the comment in processSpendBatchResultsExpressions to reflect
that the retry is one batched Lua call per expression batch, not one
per record.
|
oskarszoon
left a comment
There was a problem hiding this comment.
Approve. Consensus-safe.
Consensus check
Verified against svnode src/frozentxo_db.h:226-262, 367-371 + frozentxo.cpp:37-83. HeightInterval is half-open [start, stop) (frozentxo_db.h:224 explicit). IsFrozenOnConsensus returns true iff nHeight >= start && nHeight < stop. TXO spendable AT stop.
The Lua >= → > change at teranode.lua:373 aligns with svnode and the SQL store (sql.go:2071, 2557, 3346, all strict <). Boundary directionality:
| height vs spendableHeight | svnode | SQL | Aerospike pre | Aerospike post |
|---|---|---|---|---|
| below | reject | reject | reject | reject |
| at | accept | accept | reject | accept |
| above | accept | accept | accept | accept |
No height at which the fix could flip Aerospike to looser-than-network. The only change is closing the one-block false-reject at the boundary.
Expression path (T25): defense-in-depth ExpNot(ExpBinExists(UtxoSpendableIn)) retained; FILTERED_OUT records dispatch to executeLuaSpendBatch which does the now-correct per-offset comparison. Lua is authoritative for any record the expression can't evaluate. Error classification end-to-end verified: LuaErrorCodeFrozen / LuaErrorCodeFrozenUntil both satisfy errors.Is(_, errors.ErrFrozen); Spent/Conflicting/Locked/Creating/CoinbaseImmature/TxNotFound/InvalidSpend/UtxoNotFound/UtxoHashMismatch/UtxoInvalidSize all classify correctly. Replaces the pre-fix generic "spend validation failed".
Go review
executeLuaSpendBatchextraction is clean — single source of truth, both primary and retry paths share policy, circuit-breaker, logging.- FILTERED_OUT retry collection and re-dispatch: exactly one response per item; empty case handled; mixed-result batches preserve genuine failures while retried records get precise reasons; index mapping in
prepareSpendBatchesis self-consistent against the slice passed. - No new race window — Lua UDF is atomic per record and re-checks freeze state itself, so the expression → Lua hand-off has no TOCTOU.
- Metrics: two counters (batch + record), no label cardinality issues,
promauto+sync.Once. - Boundary test covers both paths (lua / expressions), three heights at spendableHeight ± 1 and ==, plus two parity tests; asserts
errors.ErrFrozenclassification.
Follow-ups (not blockers)
- Pre-existing P1, more reachable now.
IncrementSpentRecordsMultiatspend.go:151does.(map[interface{}]interface{})without anok-check. Process panics if Lua returns an unexpected type on error. The new expression-to-Lua retry path makes this more reachable. Add anok-check or wrap in a defensive switch — same PR or a follow-up. - Guard comment in
spend_expressions.gocould cross-reference the retry dispatch at lines 523-527 — cosmetic. - No mixed-offset test (utxoBatchSize > 1 with one offset spendable + one frozen in the same record). The Lua per-offset iteration at
teranode.lua:371-383isn't exercised that way. - Tests are self-consistency between Aerospike paths, not directly svnode-anchored. Optional belt-and-braces pin against svnode reference behaviour at the same heights.
utxoSpendableInis single-interval today; svnode'senforceAtHeightis an array of intervals. Pre-existing limitation, worth a tracking issue but out of scope for this PR.



Summary
The Aerospike UTXO store's two spend paths (Lua and expression-based) diverged from the SQL store and bitcoin-sv at the freeze-expiry boundary. An Aerospike-backed node was strictly stricter than the rest of the network: it refused to mine or accept transactions that SQL/svnode considered valid.
Reference behaviour comes from svnode's
CFrozenTXOCheck(bitcoin-sv/src/frozentxo_db.h):enforceAtHeightintervals are evaluated half-open withnHeight < i.stop— the freeze window is closed at start, open at stop. A UTXO becomes spendable ATstop.Two related fixes ship together so all three storage paths (SQL, Aerospike Lua, Aerospike expression) make identical accept/reject decisions.
Fix 1 — Aerospike Lua boundary off-by-one (T5a)
stores/utxo/aerospike/teranode.lua:373used>=which kept the UTXO frozen atcurrentBlockHeight == spendableIn. The SQL store uses strict<in all three call sites (sql.go:2071, 2557, 3346). Flipped to>so the UTXO becomes spendable ATspendableIn, matching SQL/svnode.Fix 2 — Aerospike expression-path strictness (T25)
stores/utxo/aerospike/spend_expressions.gorejected ANY record withutxoSpendableInset, regardless of the per-offset value, because Aerospike filter expressions cannot inspect map values — only check bin existence.Picked Option B2a from the issue (over-retry through Lua):
ExpNot(ExpBinExists(UtxoSpendableIn))guard as defense in depth — a record needing per-offset comparison must never reachListSetOpthrough the expression path.processSpendBatchResultsExpressionsnow collectsFILTERED_OUTrecords and re-dispatches them via a single batched Lua call, which compares per-offset and produces the correct accept/reject decision with the precise rejection reason (ErrFrozen,ErrSpent,ErrTxConflicting, etc.) — rather than the previous generic "spend validation failed".sendSpendBatchLuaintoexecuteLuaSpendBatchso both the primary Lua route and the expression retry share the same implementation.Cost: one additional batched Lua call per expression batch that has any
FILTERED_OUTrecords. The cost is independent of how many records were filtered out — they all go through in the same batch — and is only paid when at least one record needed re-evaluation. In steady state most spends pass the expression filter, so this is paid only on already-failing records. The conservative guard means the safety property holds even if the retry logic has a bug.Observability
Two new Prometheus counters give operators visibility into how often the expression path is falling back to Lua — the only quantitative signal that the conservative
utxoSpendableInguard is hot:teranode_aerospike_utxo_spend_expression_lua_retry— counter incremented per expression batch that triggered a retry (rate divided byutxo_spend_batchgives the batch-level retry fraction).teranode_aerospike_utxo_spend_expression_lua_retry_records— counter incremented per record re-dispatched through Lua (rate divided byutxo_spendgives the per-record retry fraction).Test plan
New tests in
stores/utxo/aerospike/spend_spendable_in_boundary_test.go, all run against a real Aerospike container via testcontainers-go:TestStore_SpendableInBoundary— boundary atspendableHeight - 1(reject),== spendableHeight(accept after fix),+ 1(accept). Runs against both Lua and expression paths.TestStore_SpendableInExpressionParity_PastValueAccepted— record withutxoSpendableInset to a long-past value is spendable through both paths.TestStore_SpendableInExpressionParity_FutureValueRejected— record withutxoSpendableInstrictly in the future is rejected through both paths withErrFrozen(the safety property the guard enforces).go test ./stores/utxo/aerospike/...suite passes (178s with containers).go test ./stores/utxo/sql/....go test -tags testtxmetacache ./services/validator/....Acceptance criteria (from bitcoin-sv/teranode#4640)
blockHeight == spendableIn, all three store paths accept the spend.blockHeight == spendableIn - 1, all three reject.UtxoSpendableInset whose values are all past is spendable through both Aerospike paths.UtxoSpendableInmap contains a value> currentBlockHeightfor the output being spent.Closes bitcoin-sv/teranode#4640