go-bt encode-side allocations still 50% of legacy alloc churn after #929 (decode arena) #1002

@oskarszoon

Summary

go-bt/v2's tx/output encode path (Output.appendTo, Tx.toBytesHelper, Output.Bytes) accounts for ~50% of legacy's cumulative allocation churn during mainnet IBD. PR #929 (go-bt arena for decode) eliminated the decode-side allocations, but the encode-side hash path still allocates a fresh []byte on every call. Every tx that gets hashed (TxID, UTXO hash) goes through this path; on a busy legacy instance that is millions of allocations per minute.

Observed

allocs (cumulative alloc-space) profile of legacy on bsva-ovh-teranode-eu-2, 2026-06-01, mainnet IBD at height 714k:

%        GB allocated   Function
24.09%   6,061          go-bt/v2.(*Output).appendTo
19.47%   4,898          go-bt/v2.(*Tx).toBytesHelper (cum: 11,090 / 44.08%)
7.47%    1,880          go-wire.(*blockArena).Alloc (already optimized, expected)
6.63%    1,667          bytes.Clone (called from WireTxToGoBtTx, intentional copy)
6.47%    1,628          teranode/util.UTXOHash
6.43%    1,617          go-bt/v2.(*Output).Bytes
5.82%    1,463          go-bt/v2.readArenaScript
2.13%    537            aerospike-client-go/v8.batchCommandOperate.parseRecord

Of 25,158 GB cumulative allocation, ~50% (12,576 GB, the sum of the three flat figures above) comes from the encode-side serialise-for-hashing triad: appendTo + toBytesHelper + Output.Bytes.

The decode-side arena (from #929) is working — blockArena.Alloc is only 7.5%. The issue is on the encode side.

Why this is still happening after #929

PR #929 added bt.Arena + Tx.HashTxIDInto for the bulk-stream decode sites (subtreevalidation, blockpersister, asset/repository). It explicitly excluded single-tx call sites in legacy:

Single-tx call sites that retain *bt.Tx past the function frame (subtreevalidation.readTxFromReader, legacy/netsync/handle_block.go coinbase decode, legacy/netsync/manager.go inbound tx decode) intentionally stay on the standard bt.NewTxFromBytes / tx.ReadFrom path with an inline comment explaining why.

So the legacy hot path still allocates on every hash. For mainnet IBD this is the dominant churn.

Where the encode path is invoked

Hashing in teranode happens at multiple sites:

  • tx.TxIDChainHash() → tx.Bytes() → toBytesHelper() → Output.appendTo() for each output
  • util.UTXOHash(tx, vout) → constructs outpoint || PkScript || satoshis blob → calls Output.Bytes()
  • Subtree Merkle-root computation includes per-tx hash
  • Aerospike key derivation uses tx hash bytes

In legacy's HandleBlockDirect chain, each tx in the block is hashed multiple times across validateTransactionsLegacyMode, PreValidateTransactions, createUtxos, extendTransactions. Each hash = fresh []byte of the full tx serialization.
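
To make the cost concrete, here is a minimal sketch of the two shapes an encode-then-hash call can take. The `output` type, its simplified encoding, and the `hashFresh` / `hashInto` helpers are all hypothetical stand-ins for illustration; they are not go-bt's types or its real serialization (which uses a VarInt script length).

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// output is a hypothetical stand-in for a tx output; it is NOT go-bt's bt.Output.
type output struct {
	Satoshis      uint64
	LockingScript []byte
}

// appendTo appends a simplified encoding to dst: 8-byte little-endian value,
// a 1-byte script length (assumes scripts shorter than 256 bytes), then the script.
func (o *output) appendTo(dst []byte) []byte {
	dst = binary.LittleEndian.AppendUint64(dst, o.Satoshis)
	dst = append(dst, byte(len(o.LockingScript)))
	return append(dst, o.LockingScript...)
}

// hashFresh mimics the per-call pattern: a new backing array is allocated
// every time the outputs are serialized for hashing.
func hashFresh(outs []output) [32]byte {
	var buf []byte
	for i := range outs {
		buf = outs[i].appendTo(buf)
	}
	first := sha256.Sum256(buf)
	return sha256.Sum256(first[:])
}

// hashInto serializes into a caller-owned scratch buffer and returns the
// (possibly grown) buffer so the next call can reuse its capacity.
func hashInto(outs []output, scratch []byte) ([32]byte, []byte) {
	scratch = scratch[:0]
	for i := range outs {
		scratch = outs[i].appendTo(scratch)
	}
	first := sha256.Sum256(scratch)
	return sha256.Sum256(first[:]), scratch
}

func main() {
	outs := []output{{Satoshis: 5000, LockingScript: []byte{0x76, 0xa9, 0x14}}}
	scratch := make([]byte, 0, 256)

	a := hashFresh(outs)
	b, scratch := hashInto(outs, scratch)
	fmt.Println("same digest:", a == b, "scratch capacity:", cap(scratch))
}
```

Both functions produce the same digest; the difference is only in who owns the serialization buffer. When the same tx is hashed several times per block, the second shape pays for the buffer once.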

Fix directions

  1. Extend the arena pattern from #929 (fix(blockvalidation): arena-backed tx decode to eliminate catch-up OOM) to cover hash sites in legacy. Pass a *bt.Arena into the HandleBlockDirect pipeline; use tx.HashTxIDInto (already exists per #929) instead of tx.TxIDChainHash where the slice doesn't need to escape.

  2. Pool the hash scratch buffer. Use a sync.Pool of *bytes.Buffer (sync.Pool is not generic, so the element type is asserted on Get), reused across hash calls and reset between txs. Net effect: about one buffer allocation per goroutine per block instead of one per tx per hash.

  3. Cache the tx hash on first computation. bt.Tx already has a private txHash field used by Tx.TxIDChainHash, but Tx.Bytes() doesn't cache its result. If subsequent calls in the same block-processing frame re-serialize the tx, that's pure waste.

  4. For Output.Bytes specifically, return a slice into a per-block arena (similar to blockArena.Alloc) instead of allocating per call.

Verification

  • Pre/post alloc_space profile on legacy during IBD: Output.appendTo + toBytesHelper + Output.Bytes combined should drop from 50% to <20% of cumulative allocs
  • GC pressure: gc-pause-ms metric should drop in proportion
  • Block throughput: blocks/second on busy legacy should rise modestly (GC scavenger spends less time)
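
Alongside the production profiles, a per-call allocation count can be checked locally with `testing.AllocsPerRun` from the standard library. The sketch below uses hypothetical `encodeFresh` / `encodeInto` helpers (not go-bt functions) purely to show the measurement pattern; the real check would call the actual encode paths before and after the change.

```go
package main

import (
	"fmt"
	"testing"
)

// encodeFresh allocates a new slice on every call, like a Bytes()-style API.
func encodeFresh(payload []byte) []byte {
	out := make([]byte, 0, len(payload)+8)
	out = append(out, 0, 0, 0, 0, 0, 0, 0, 0) // placeholder 8-byte value field
	return append(out, payload...)
}

// encodeInto writes the same bytes into a caller-owned buffer, reusing its capacity.
func encodeInto(dst, payload []byte) []byte {
	dst = append(dst[:0], 0, 0, 0, 0, 0, 0, 0, 0)
	return append(dst, payload...)
}

// sink keeps results reachable so the compiler cannot elide the allocations.
var sink []byte

func main() {
	payload := make([]byte, 64)
	scratch := make([]byte, 0, 128) // large enough that encodeInto never grows it

	fresh := testing.AllocsPerRun(1000, func() {
		sink = encodeFresh(payload)
	})
	reused := testing.AllocsPerRun(1000, func() {
		scratch = encodeInto(scratch, payload)
		sink = scratch
	})

	fmt.Printf("allocs/op fresh=%.0f reused=%.0f\n", fresh, reused)
}
```

A regression test built on the same pattern can assert an upper bound on allocations per hash, so the improvement does not silently erode later.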

Captured profiles (local)

  • probe/eu2-mainnet-sync-2026-06-01/legacy/legacy-allocs.pb.gz
  • probe/eu2-mainnet-sync-2026-06-01/legacy/legacy-heap.pb.gz

Available on request.

Related

Not in this PR's scope

  • bytes.Clone at 6.6% (487 MB inuse) is from legacy/netsync.WireTxToGoBtTx and is deliberate (breaks arena-alias chain — see the doc comment on that function). Not a target.
  • teranode/util.UTXOHash (6.5% / 1,628 GB) is a separate but adjacent hot path; can be addressed in the same arena pattern.

Labels

bug (Something isn't working)
