fix(blockvalidation): arena-backed tx decode to eliminate catch-up OOM#929
Merged
oskarszoon merged 10 commits intoMay 22, 2026
Merged
Conversation
Brings in bt.Arena, *.ReadFromWithArena, Tx.HashTxIDInto, and a script-length cap (1 GiB) on existing ReadFrom paths. No teranode code uses the new API yet; subsequent commits migrate hot decode paths in subtreevalidation, blockpersister, asset, and legacy/netsync to per-subtree/per-block arena pools.
sync.Pool of bt.Arena instances reused across subtree decode loops. Get on decode start, Put on completion — putSubtreeArena runs ResetAndShrink(64 MiB) so a one-off oversized decode doesn't bloat the pool's idle footprint. Hot-path migration in subsequent commits.
readTransactionsFromSubtreeDataStream now decodes via tx.ReadFromWithArena and hashes via tx.HashTxIDInto with a reusable scratch buffer. Each parallel subtree-decode goroutine gets its own arena via the sync.Pool helper added in the previous commit; arenas are released after processTransactionsInLevels consumes the batch's txs. processSubtreeDataStream and extractAndCollectTransactions gain a *bt.Arena parameter to thread the per-batch arena through. Catches the catch-up critical hot path that bsv-blockchain#920 identified.
The per-call arena pattern would force a defensive copy of script bytes before return, defeating the win. Bulk-stream sites (the catch-up hot loop in check_block_subtrees.go) are where the arena pays off; one-shot missing-tx fetches stay on the standard ReadFrom path.
ProcessSubtreeUTXOStreaming now decodes via tx.ReadFromWithArena and hashes the integrity check via tx.HashTxIDInto with a reusable scratch. A new per-package sync.Pool of bt.Arena instances is allocated once at loop start and Put when the loop returns — tx pointers are consumed inside each iteration (utxoDiff.ProcessTx copies fields into a UTXO struct that is then serialised + written), so the arena lifetime contract is satisfied.
GetLegacyBlock's inner tx streaming loop now decodes via tx.ReadFromWithArena and calls arena.Reset between txs so peak memory is bounded by the largest single tx, not the cumulative block size. The arena is acquired once at the top of the streaming goroutine and returned to the pool on exit.
handle_block.go's coinbase decode and manager.go's inbound tx decode are one-shot operations where the *bt.Tx must outlive the function frame. An arena allocated here would have to be Put before return, aliasing the tx's script slices to soon-to-be-reused memory. The per-block tx loops in legacy ingestion work with bsvutil.Tx (the legacy wire wrapper, not bt.Tx) and never round-trip through go-bt decode — the eventual heavy decode of those subtree-resident txs runs in services/subtreevalidation, which already uses the per-subtree arena from earlier in this PR series.
Synthetic 100 MiB subtree (1000 txs × 100 KB OP_RETURN each) decoded via the arena path. Asserts post-decode HeapInuse delta is well under the wire size — proving that arena reuse bounds peak memory by the slab cap + per-tx struct overhead, not by cumulative script bytes. Observed: 97 MiB wire, 28 MiB HeapInuse delta (vs ~97 MiB without arena). Skipped under -short.
Contributor
|
🤖 Claude Code Review Status: Complete Current Review: No new issues found. The PR implements arena-backed transaction decoding correctly with proper lifecycle management across all paths. Previously Reported:
Notes:
|
CI bot flagged a use-after-free concern keyed on UTXODiff.Add, but ProcessSubtreeUTXOStreaming uses *utxopersister.UTXOSet (not the unrelated services/blockpersister/utxoset/model.UTXODiff type, which is only imported by its own internal tests). UTXOSet.ProcessTx serialises via UTXOWrapper.Bytes -> UTXO.Bytes, which heap-copies script bytes via append, then writes through a bufio.Writer that copies into its own buffer. No arena-backed slice survives the function frame. Adding the comment so the next reader doesn't have to retrace the chain.
Contributor
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-22 08:39 UTC |
icellan
approved these changes
May 21, 2026
|
freemans13
approved these changes
May 22, 2026
This was referenced Jun 1, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



Summary
Closes #920. Eliminates the per-script
make([]byte, l)allocation hotspot ingo-bttx decode that was responsible for ~70% of the heap during testnet catch-up sync (causing OOM at ~13 GiB RSS around block ~5000).Uses the new
bt.Arena+Tx.ReadFromWithArena+Tx.HashTxIDIntoAPIs landed in go-bt v2.6.4 (bsv-blockchain/go-bt#146).What's in this PR
go-bttov2.6.4(introducesbt.Arena,*.ReadFromWithArena,Tx.HashTxIDInto, and a 1 GiB script-length cap on existingReadFrompaths).sync.Poolofbt.Arenainsubtreevalidation,blockpersister,asset/repository. Get on decode start, Put on completion viaResetAndShrink(64 MiB)so a one-off large decode doesn't bloat idle pool footprint.subtreevalidation/check_block_subtrees.go:readTransactionsFromSubtreeDataStreamis the catch-up critical loop. Per-subtree arena allocated in each parallel goroutine, kept alive in abatchArenasslice, returned to the pool only afterprocessTransactionsInLevelsconsumes the batch's txs.blockpersister/streaming_process_subtree.go: per-subtree arena aroundProcessSubtreeUTXOStreaming's decode loop.asset/repository/GetLegacyBlock.go: per-block arena around the streaming-reconstruction loop, witharena.Reset()between successfultx.WriteTocalls — peak bounded by the largest single tx, not the cumulative block size.tx.HashTxIDInto(reusable scratch buffer) in the bulk-stream sites — eliminates thetx.Bytes()per-call allocation that contributed 524 MB to the original heap profile.*bt.Txpast the function frame (subtreevalidation.readTxFromReader,legacy/netsync/handle_block.gocoinbase decode,legacy/netsync/manager.goinbound tx decode) intentionally stay on the standardbt.NewTxFromBytes/tx.ReadFrompath with an inline comment explaining why.Performance
TestReadTransactionsFromSubtreeDataStream_MemoryBoundWithArena(synthetic,-shortskipped):Concrete heap-bound:
arena_poolusesResetAndShrink(64 MiB). With up to 8 concurrent catch-up workers (per #911), idle pool baseline ≤ 512 MiB; peak ≤ N × max-subtree-script-bytes + struct overhead.Backwards compatibility
go-bt v2.6.4is additive — no breaking changes for any consumer.processSubtreeDataStreamandextractAndCollectTransactionsgained a*bt.Arenaparameter; all internal callers updated. Test files passnilto preserve the heap-allocating path.*Serverremain stable.Verification
go vet ./...on touched packages — cleango test -count=1 -tags testtxmetacache ./...on touched packages — passgo test -count=1 -race -tags testtxmetacache ./services/subtreevalidation/... ./services/blockpersister/...— passTestArenaPool_GetPutLifecycle,TestArenaPool_ShrinkAfterLargeUse— verify pool contractTestReadTransactionsFromSubtreeDataStream_MemoryBoundWithArena— 28 MiB delta on 97 MiB streambsva-ovh-teranode-ttn-eu-3, restart fresh, catch up past block ~5000, confirm RSS < 8 GiB sustained andgo-bt.(*Output).ReadFromno longer in heap top-10 inuse.Upstream
go-bt PR: bsv-blockchain/go-bt#146 (merged + released as v2.6.4)