Summary
go-bt/v2's tx/output encode path (Output.appendTo, Tx.toBytesHelper, Output.Bytes) accounts for ~50% of legacy's cumulative allocation churn during mainnet IBD. PR #929 (go-bt arena for decode) eliminated the decode-side allocations, but the encode-side hash path still allocates a fresh []byte on every call. Every tx that gets hashed (TxID, UTXO hash) goes through this path; on a busy legacy that's millions of allocations per minute.
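Observed
allocs (cumulative alloc-space) profile of legacy on bsva-ovh-teranode-eu-2, 2026-06-01, mainnet IBD at height 714k:
go-bt/v2.(*Output).appendTo
go-bt/v2.(*Tx).toBytesHelper (cum: 11,090 / 44.08%)
go-wire.(*blockArena).Alloc (already optimized, expected)
bytes.Clone (called from WireTxToGoBtTx, intentional copy)
teranode/util.UTXOHash
go-bt/v2.(*Output).Bytes
go-bt/v2.readArenaScript
aerospike-client-go/v8.batchCommandOperate.parseRecord
Of 25,158 GB cumulative allocation, ~50% (12,463 GB) comes from the encode-side serialise-for-hashing triad: appendTo + toBytesHelper + Output.Bytes.
The decode-side arena (from #929) is working: blockArena.Alloc is only 7.5%. The issue is on the encode side.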
Why this is still happening after #929
PR #929 added bt.Arena + Tx.HashTxIDInto for the bulk-stream decode sites (subtreevalidation, blockpersister, asset/repository). It explicitly excluded single-tx call sites in legacy:
"Single-tx call sites that retain *bt.Tx past the function frame (subtreevalidation.readTxFromReader, legacy/netsync/handle_block.go coinbase decode, legacy/netsync/manager.go inbound tx decode) intentionally stay on the standard bt.NewTxFromBytes / tx.ReadFrom path with an inline comment explaining why."
So the legacy hot path still allocates on every hash. For mainnet IBD this is the dominant churn.
Where the encode path is invoked
Hashing in teranode happens at multiple sites:
tx.TxIDChainHash() → tx.Bytes() → toBytesHelper() → Output.appendTo() for each output
util.UTXOHash(tx, vout) → constructs outpoint || PkScript || satoshis blob → calls Output.Bytes()
Subtree Merkle-root computation includes per-tx hash
Aerospike key derivation uses tx hash bytes
In legacy's HandleBlockDirect chain, each tx in the block is hashed multiple times across validateTransactionsLegacyMode, PreValidateTransactions, createUtxos, and extendTransactions. Each hash allocates a fresh []byte holding the full tx serialization.
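To make the cost concrete, here is a minimal, self-contained sketch of the allocation shape described above. The tx and output types, serialize, txID, and the simplified byte layout are hypothetical stand-ins, not go-bt's real types or wire format; only the shape (serialize into a fresh []byte, then double SHA-256) is meant to mirror tx.TxIDChainHash() → tx.Bytes(). It counts heap allocations with testing.AllocsPerRun when four stages each hash the same tx.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"testing"
)

// output and tx are hypothetical stand-ins for bt.Output and bt.Tx.
type output struct {
	satoshis uint64
	script   []byte
}

type tx struct {
	version uint32
	outputs []output
}

// appendTo appends a simplified encoding of the output (not the real wire format).
func (o *output) appendTo(dst []byte) []byte {
	dst = binary.LittleEndian.AppendUint64(dst, o.satoshis)
	return append(dst, o.script...)
}

// serialize mirrors the allocate-per-call shape: every call returns a fresh []byte.
//
//go:noinline
func (t *tx) serialize() []byte {
	size := 4
	for i := range t.outputs {
		size += 8 + len(t.outputs[i].script)
	}
	buf := make([]byte, 0, size)
	buf = binary.LittleEndian.AppendUint32(buf, t.version)
	for i := range t.outputs {
		buf = t.outputs[i].appendTo(buf)
	}
	return buf
}

// txID double-SHA-256 hashes the serialization; the buffer is garbage afterwards.
func txID(t *tx) [32]byte {
	first := sha256.Sum256(t.serialize())
	return sha256.Sum256(first[:])
}

// allocsForStages reports heap allocations per run when `stages` pipeline
// stages each hash the same tx independently.
func allocsForStages(t *tx, stages int) float64 {
	return testing.AllocsPerRun(100, func() {
		for i := 0; i < stages; i++ {
			_ = txID(t)
		}
	})
}

func sampleTx() *tx {
	return &tx{version: 1, outputs: []output{
		{satoshis: 5000, script: make([]byte, 25)},
		{satoshis: 1200, script: make([]byte, 25)},
	}}
}

func main() {
	fmt.Printf("allocations for 4 hashes of one tx: %.0f\n", allocsForStages(sampleTx(), 4))
}
```

In this toy version each hash costs at least one buffer allocation, so the count scales with the number of stages that re-hash a tx, which is the multiplier the HandleBlockDirect chain introduces.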
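Fix directions
Extend the arena pattern from #929 (fix(blockvalidation): arena-backed tx decode to eliminate catch-up OOM) to cover hash sites in legacy. Pass a *bt.Arena into the HandleBlockDirect pipeline; use tx.HashTxIDInto (already exists per #929) instead of tx.TxIDChainHash where the slice doesn't need to escape.
Pool the hash scratch buffer. A sync.Pool of *bytes.Buffer reused across hash calls in the same goroutine, reset between txs. Net effect: roughly one allocation per goroutine per block instead of one per tx per hash.
Cache the tx hash on first computation. bt.Tx already has a private txHash field (via Tx.TxIDChainHash), but Tx.Bytes() doesn't cache. If subsequent calls in the same block-processing frame end up re-serializing, that's pure waste.
For Output.Bytes specifically, return a slice into a per-block arena (similar to blockArena.Alloc) instead of allocating per call.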
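The scratch-buffer pooling idea could look roughly like the sketch below. hashBufPool, hashPooled, hashFresh, writeTo, and the tx/output types are hypothetical names with a simplified byte layout; none of them are go-bt or teranode APIs. Two things worth noting: sync.Pool is not generic, so Get needs a type assertion, and the runtime may drop idle pooled buffers at GC, so the allocation savings are a best case rather than a guarantee.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sync"
)

// output and tx are hypothetical stand-ins for bt.Output and bt.Tx.
type output struct {
	satoshis uint64
	script   []byte
}

type tx struct {
	version uint32
	outputs []output
}

// writeTo writes a simplified encoding into buf (not the real wire format).
func (t *tx) writeTo(buf *bytes.Buffer) {
	var scratch [8]byte
	binary.LittleEndian.PutUint32(scratch[:4], t.version)
	buf.Write(scratch[:4])
	for i := range t.outputs {
		binary.LittleEndian.PutUint64(scratch[:], t.outputs[i].satoshis)
		buf.Write(scratch[:])
		buf.Write(t.outputs[i].script)
	}
}

// hashBufPool hands out reusable serialization buffers.
var hashBufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// hashPooled serializes into a pooled buffer and double-SHA-256 hashes it.
func hashPooled(t *tx) [32]byte {
	buf := hashBufPool.Get().(*bytes.Buffer)
	buf.Reset() // drop contents from the previous tx, keep the capacity
	t.writeTo(buf)
	first := sha256.Sum256(buf.Bytes())
	hashBufPool.Put(buf) // safe: first is a copy, nothing aliases buf now
	return sha256.Sum256(first[:])
}

// hashFresh is the allocate-per-call baseline, kept for comparison.
func hashFresh(t *tx) [32]byte {
	var buf bytes.Buffer
	t.writeTo(&buf)
	first := sha256.Sum256(buf.Bytes())
	return sha256.Sum256(first[:])
}

func main() {
	t := &tx{version: 1, outputs: []output{{satoshis: 5000, script: make([]byte, 25)}}}
	// Both paths must produce the same hash; only the buffer lifetime differs.
	fmt.Println(hashPooled(t) == hashFresh(t))
}
```

The buffer goes back to the pool only after the first hash has been computed, because the slice returned by buf.Bytes() aliases the pooled memory.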
Verification
Pre/post alloc_space profile on legacy during IBD: Output.appendTo + toBytesHelper + Output.Bytes combined should drop from 50% to <20% of cumulative allocs
GC pressure: gc-pause-ms metric should drop in proportion
Block throughput: blocks/second on busy legacy should rise modestly (GC scavenger spends less time)
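For a quick local sanity check before capturing full profiles, cumulative heap bytes can also be measured in-process. The sketch below is an illustration only, not a substitute for the pprof comparison: allocatedBytes, hashFresh, hashReused, and compare are hypothetical helpers, and the payload is an arbitrary byte slice rather than a real tx. It relies on runtime.MemStats.TotalAlloc, the runtime's cumulative count of bytes allocated for heap objects, which is the same kind of quantity alloc_space reports.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"runtime"
)

var (
	keep []byte   // forces the fresh buffer to escape to the heap
	sink [32]byte // keeps the hash result observable
)

// allocatedBytes returns the heap bytes allocated while f ran.
func allocatedBytes(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

// hashFresh copies payload into a new buffer on every call (the "pre" shape).
func hashFresh(payload []byte) {
	keep = append([]byte(nil), payload...)
	sink = sha256.Sum256(keep)
}

// hashReused overwrites a caller-owned scratch buffer (the "post" shape).
func hashReused(scratch, payload []byte) []byte {
	scratch = append(scratch[:0], payload...)
	sink = sha256.Sum256(scratch)
	return scratch
}

// compare hashes an n-times repeated payload of the given size both ways
// and reports the bytes allocated by each approach.
func compare(n, size int) (fresh, reused uint64) {
	payload := make([]byte, size)
	fresh = allocatedBytes(func() {
		for i := 0; i < n; i++ {
			hashFresh(payload)
		}
	})
	scratch := make([]byte, 0, size) // allocated once, outside the measured region
	reused = allocatedBytes(func() {
		for i := 0; i < n; i++ {
			scratch = hashReused(scratch, payload)
		}
	})
	return fresh, reused
}

func main() {
	fresh, reused := compare(10000, 256)
	fmt.Printf("fresh: %d bytes, reused: %d bytes\n", fresh, reused)
}
```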
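Captured profiles (local)
probe/eu2-mainnet-sync-2026-06-01/legacy/legacy-allocs.pb.gz
probe/eu2-mainnet-sync-2026-06-01/legacy/legacy-heap.pb.gz
Available on request.
Related
Not in this PR's scope
bytes.Clone at 6.6% (487 MB inuse) is from legacy/netsync.WireTxToGoBtTx and is deliberate (it breaks the arena-alias chain; see the doc comment on that function). Not a target.
teranode/util.UTXOHash (6.5% / 1,628 GB) is a separate but adjacent hot path; it can be addressed with the same arena pattern.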