Add opt-in drain mode for the outpoint batcher (off by default) by freemans13 · Pull Request #865 · bsv-blockchain/teranode

freemans13 · 2026-05-14T11:42:12Z

What this changes

Adds a new opt-in setting, utxostore_outpointBatcherDrainMode (default false).

The outpoint batcher looks up parent transaction data (amount + locking script) for the inputs on every tx. It's used by PreviousOutputsDecorate and BatchPreviousOutputsDecorate. Normally, when it has a small batch of inputs it waits up to its full flush timer (default 10ms) before sending the lookup to Aerospike, in case more inputs are coming.

When this setting is on, the batcher drains whatever is already queued (up to the size cap) and fires immediately, instead of waiting on the timer.

It's the same drain-mode mechanism the four existing batchers already expose (Get / Spend / Store / Locked), now wired up for the outpoint batcher too.

When to turn it on

The clean win is cmd/rewindblockchain. That's a single-producer, separate process that calls PreviousOutputsDecorate one tx at a time. Each call offers ~2 inputs — far below the batcher's size cap — so without drain mode every call sits idle for the full 10ms timer before its lookup goes out. Drain mode removes that wait. Benchmarked ~90%+ faster at these low, sequential tx counts.

Turn it on with utxostore_outpointBatcherDrainMode=true.

When to leave it off (the default)

Do not enable this inside a running node to try to speed up the per-tx decorate path. That batcher is shared store-wide, so the toggle hits the concurrent hot path too:

the BatchPreviousOutputsDecorate fan-out (since perf(aerospike): fan out BatchPreviousOutputsDecorate per-tx #893, per-tx calls run via errgroup), and
the Validator's ValidateTransactionBatch errgroup path.

Benchmarking drain on vs off on that concurrent path (BenchmarkBatchPreviousOutputsDecorateDrainMode, added here) shows drain mode is bimodal / heavy-tailed at mid tx counts (~64–256): with drain off latency is rock-stable, with drain on the mean is unpredictable and some runs are several times slower. A node wants predictable decorate latency, so it stays off by default.

The batcher is store-wide — this toggle can't be applied to one caller without affecting the others in the same process. Measure per deployment before flipping it on.

Why off by default

Keeps existing behaviour identical on upgrade.
Small drained batches can hit Aerospike's per-item fallback path on some deployments.
The concurrent-path behaviour above is unpredictable, so opting in should be a deliberate, measured choice.

Files touched

settings/utxostore_settings.go — new setting, following the existing pattern
settings/settings.go — default value (false)
stores/utxo/aerospike/aerospike.go — conditional s.outpointBatcher.SetDrainMode(true) next to the other four, with a comment explaining the on/off tradeoff
stores/utxo/aerospike/aerospike_server_test.go — wiring smoke test (drain mode on, both decorate paths still decorate every input correctly)
stores/utxo/aerospike/outpoint_drain_bench_test.go — drain on-vs-off benchmark across tx counts, the evidence behind the "leave it off in a node" guidance

Testing

go build ./... clean
go vet ./stores/utxo/aerospike/... ./settings/... clean
go test -race -short ./stores/utxo/aerospike/ ./settings/ — passing (one pre-existing env-related failure in TestGetUTXOStoreURL that also fails on main)

The outpoint batcher (used by PreviousOutputsDecorate / BatchPreviousOutputsDecorate to look up parent outputs for tx inputs) was the only UtxoStore batcher missing a drain-mode toggle. The four existing toggles (Get / Spend / Store / Locked) all default to false and let operators opt in per batcher. Add OutpointBatcherDrainMode following the same pattern: - new bool field on UtxoStore settings with key/desc/default/longdesc - default false in settings.go - conditional s.outpointBatcher.SetDrainMode(true) alongside the others Why this matters: BatchPreviousOutputsDecorate hands a whole block's worth of inputs to PreviousOutputsDecorate in a tight loop, then blocks waiting for results. Without drain mode the batcher waits its full 10 ms timer before flushing because no further items arrive, which makes the legacy / quick-validation decorate hot path pay 10 ms per tx. Default kept false for behavioural consistency with the existing drain-mode toggles. Operators who want the speedup flip utxostore_outpointBatcherDrainMode=true. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-14T11:43:31Z

🤖 Claude Code Review

Status: Complete

Current Review:

No issues found. The PR introduces an optional drain mode toggle for the outpoint batcher, following the established pattern of the existing four drain-mode settings. The implementation is clean, well-documented, and thoroughly tested.

Key Strengths:

Consistent Pattern: Follows existing drain-mode toggles (Get/Spend/Store/Locked) with identical structure and naming conventions
Safe Default: Defaults to false to maintain existing behavior and predictable latency on concurrent paths
Honest Documentation: The extensive longdesc accurately reflects post-perf(aerospike): fan out BatchPreviousOutputsDecorate per-tx #893 concurrent behavior and explicitly warns about bimodal/heavy-tailed performance on concurrent paths
Evidence-Based: Includes comprehensive benchmark (BenchmarkBatchPreviousOutputsDecorateDrainMode) that measured the actual behavior across transaction counts
Test Coverage: Happy-path smoke test (TestAerospikeOutpointBatcherDrainMode) verifies the toggle wiring for both single-tx and concurrent paths using require assertions per repo convention
Clear Use Case: Identifies the specific win scenario (cmd/rewindblockchain single-producer sequential caller) vs concurrent node paths where it should stay off

Previous Review Items (Resolved):

All inline comments from Copilot and the author have been addressed and resolved in commits e2757b8 and e86917b.

github-actions · 2026-05-14T11:57:07Z

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-865 (e5f32b9)

Summary

Regressions: 0
Improvements: 0
Unchanged: 144
Significance level: p < 0.05

All benchmark results (sec/op)

Benchmark	Baseline	Current	Change	p-value
_NewBlockFromBytes-4	1.604µ	1.582µ	~	0.400
SplitSyncedParentMap_SetIfNotExists/256_buckets-4	71.16n	71.59n	~	1.000
SplitSyncedParentMap_SetIfNotExists/16_buckets-4	71.19n	71.29n	~	0.700
SplitSyncedParentMap_SetIfNotExists/1_bucket-4	71.23n	71.16n	~	0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets...	33.03n	32.90n	~	0.700
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_...	56.93n	55.63n	~	0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa...	129.9n	128.3n	~	0.400
MiningCandidate_Stringify_Short-4	217.8n	220.5n	~	0.100
MiningCandidate_Stringify_Long-4	1.646µ	1.639µ	~	0.700
MiningSolution_Stringify-4	857.1n	850.8n	~	0.200
BlockInfo_MarshalJSON-4	1.726µ	1.727µ	~	1.000
NewFromBytes-4	130.8n	132.1n	~	0.100
AddTxBatchColumnar_Validation-4	2.604µ	2.653µ	~	0.100
OffsetValidationLoop-4	594.8n	595.4n	~	1.000
Mine_EasyDifficulty-4	60.79µ	61.68µ	~	0.700
Mine_WithAddress-4	7.075µ	7.000µ	~	0.200
BlockAssembler_AddTx-4	0.02742n	0.02914n	~	1.000
AddNode-4	11.23	11.13	~	1.000
AddNodeWithMap-4	12.08	12.17	~	0.400
DirectSubtreeAdd/4_per_subtree-4	58.05n	56.78n	~	0.200
DirectSubtreeAdd/64_per_subtree-4	29.13n	29.15n	~	1.000
DirectSubtreeAdd/256_per_subtree-4	28.08n	27.95n	~	0.100
DirectSubtreeAdd/1024_per_subtree-4	26.65n	26.61n	~	0.400
DirectSubtreeAdd/2048_per_subtree-4	26.27n	26.24n	~	0.300
SubtreeProcessorAdd/4_per_subtree-4	299.7n	297.3n	~	0.700
SubtreeProcessorAdd/64_per_subtree-4	301.4n	289.0n	~	0.100
SubtreeProcessorAdd/256_per_subtree-4	303.8n	289.0n	~	0.100
SubtreeProcessorAdd/1024_per_subtree-4	294.9n	279.3n	~	0.100
SubtreeProcessorAdd/2048_per_subtree-4	299.9n	280.3n	~	0.100
SubtreeProcessorRotate/4_per_subtree-4	283.3n	285.3n	~	0.400
SubtreeProcessorRotate/64_per_subtree-4	285.3n	286.4n	~	0.700
SubtreeProcessorRotate/256_per_subtree-4	281.8n	290.5n	~	0.100
SubtreeProcessorRotate/1024_per_subtree-4	280.5n	295.9n	~	0.100
SubtreeNodeAddOnly/4_per_subtree-4	55.26n	55.38n	~	1.000
SubtreeNodeAddOnly/64_per_subtree-4	36.17n	36.15n	~	1.000
SubtreeNodeAddOnly/256_per_subtree-4	35.28n	35.38n	~	0.400
SubtreeNodeAddOnly/1024_per_subtree-4	34.81n	34.50n	~	0.200
SubtreeCreationOnly/4_per_subtree-4	111.7n	111.6n	~	0.700
SubtreeCreationOnly/64_per_subtree-4	359.5n	363.9n	~	0.700
SubtreeCreationOnly/256_per_subtree-4	1.255µ	1.261µ	~	0.200
SubtreeCreationOnly/1024_per_subtree-4	3.931µ	3.909µ	~	0.400
SubtreeCreationOnly/2048_per_subtree-4	7.127µ	7.121µ	~	1.000
SubtreeProcessorOverheadBreakdown/64_per_subtree-4	288.0n	282.0n	~	0.100
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4	287.6n	285.9n	~	0.400
ParallelGetAndSetIfNotExists/1k_nodes-4	2.060m	2.005m	~	0.100
ParallelGetAndSetIfNotExists/10k_nodes-4	5.258m	5.312m	~	1.000
ParallelGetAndSetIfNotExists/50k_nodes-4	7.201m	7.348m	~	0.400
ParallelGetAndSetIfNotExists/100k_nodes-4	9.886m	10.242m	~	0.100
SequentialGetAndSetIfNotExists/1k_nodes-4	1.804m	1.832m	~	0.400
SequentialGetAndSetIfNotExists/10k_nodes-4	4.618m	4.541m	~	0.700
SequentialGetAndSetIfNotExists/50k_nodes-4	14.47m	13.82m	~	0.100
SequentialGetAndSetIfNotExists/100k_nodes-4	26.09m	25.36m	~	0.400
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4	2.111m	2.048m	~	0.200
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4	8.488m	8.417m	~	0.400
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4	13.75m	13.62m	~	0.400
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4	1.836m	1.800m	~	0.700
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4	9.305m	8.274m	~	0.100
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4	47.65m	45.88m	~	0.100
DiskTxMap_SetIfNotExists-4	4.154µ	4.216µ	~	0.400
DiskTxMap_SetIfNotExists_Parallel-4	3.886µ	3.786µ	~	0.100
DiskTxMap_ExistenceOnly-4	338.3n	521.8n	~	0.400
Queue-4	204.0n	200.5n	~	0.700
AtomicPointer-4	8.134n	8.131n	~	0.500
ReorgOptimizations/DedupFilterPipeline/Old/10K-4	830.3µ	809.5µ	~	0.100
ReorgOptimizations/DedupFilterPipeline/New/10K-4	734.3µ	719.0µ	~	0.200
ReorgOptimizations/AllMarkFalse/Old/10K-4	117.8µ	113.4µ	~	0.100
ReorgOptimizations/AllMarkFalse/New/10K-4	58.59µ	58.15µ	~	0.100
ReorgOptimizations/HashSlicePool/Old/10K-4	58.88µ	55.18µ	~	0.100
ReorgOptimizations/HashSlicePool/New/10K-4	11.81µ	11.79µ	~	0.100
ReorgOptimizations/NodeFlags/Old/10K-4	4.850µ	4.745µ	~	0.100
ReorgOptimizations/NodeFlags/New/10K-4	1.629µ	1.640µ	~	0.400
ReorgOptimizations/DedupFilterPipeline/Old/100K-4	10.61m	10.68m	~	1.000
ReorgOptimizations/DedupFilterPipeline/New/100K-4	11.44m	10.92m	~	0.200
ReorgOptimizations/AllMarkFalse/Old/100K-4	1.139m	1.185m	~	0.100
ReorgOptimizations/AllMarkFalse/New/100K-4	736.8µ	733.8µ	~	0.700
ReorgOptimizations/HashSlicePool/Old/100K-4	586.2µ	582.6µ	~	1.000
ReorgOptimizations/HashSlicePool/New/100K-4	302.0µ	328.4µ	~	0.400
ReorgOptimizations/NodeFlags/Old/100K-4	48.31µ	48.63µ	~	1.000
ReorgOptimizations/NodeFlags/New/100K-4	17.55µ	16.14µ	~	0.100
TxMapSetIfNotExists-4	52.57n	55.20n	~	0.100
TxMapSetIfNotExistsDuplicate-4	48.24n	48.29n	~	1.000
ChannelSendReceive-4	695.6n	686.0n	~	0.100
CalcBlockWork-4	497.3n	477.2n	~	0.700
CalculateWork-4	638.6n	735.9n	~	0.100
BuildBlockLocatorString_Helpers/Size_10-4	1.334µ	1.349µ	~	0.300
BuildBlockLocatorString_Helpers/Size_100-4	12.85µ	13.03µ	~	0.400
BuildBlockLocatorString_Helpers/Size_1000-4	135.5µ	128.4µ	~	0.400
CatchupWithHeaderCache-4	104.3m	104.5m	~	0.700
_prepareTxsPerLevel-4	423.7m	404.6m	~	1.000
_prepareTxsPerLevelOrdered-4	5.357m	3.736m	~	0.100
_prepareTxsPerLevel_Comparison/Original-4	425.8m	417.4m	~	0.400
_prepareTxsPerLevel_Comparison/Optimized-4	5.322m	4.518m	~	0.700
SubtreeSizes/10k_tx_4_per_subtree-4	1.248m	1.250m	~	0.700
SubtreeSizes/10k_tx_16_per_subtree-4	301.7µ	298.4µ	~	0.700
SubtreeSizes/10k_tx_64_per_subtree-4	70.74µ	70.24µ	~	0.100
SubtreeSizes/10k_tx_256_per_subtree-4	17.57µ	17.54µ	~	0.400
SubtreeSizes/10k_tx_512_per_subtree-4	8.761µ	8.634µ	~	0.100
SubtreeSizes/10k_tx_1024_per_subtree-4	4.359µ	4.306µ	~	0.100
SubtreeSizes/10k_tx_2k_per_subtree-4	2.146µ	2.150µ	~	1.000
BlockSizeScaling/10k_tx_64_per_subtree-4	69.11µ	68.66µ	~	0.400
BlockSizeScaling/10k_tx_256_per_subtree-4	17.44µ	17.43µ	~	1.000
BlockSizeScaling/10k_tx_1024_per_subtree-4	4.361µ	4.285µ	~	0.400
BlockSizeScaling/50k_tx_64_per_subtree-4	363.9µ	361.7µ	~	0.700
BlockSizeScaling/50k_tx_256_per_subtree-4	87.17µ	85.82µ	~	0.200
BlockSizeScaling/50k_tx_1024_per_subtree-4	21.68µ	21.28µ	~	0.400
SubtreeAllocations/small_subtrees_exists_check-4	149.5µ	149.0µ	~	1.000
SubtreeAllocations/small_subtrees_data_fetch-4	159.6µ	157.9µ	~	0.700
SubtreeAllocations/small_subtrees_full_validation-4	305.3µ	306.2µ	~	1.000
SubtreeAllocations/medium_subtrees_exists_check-4	8.806µ	8.764µ	~	0.700
SubtreeAllocations/medium_subtrees_data_fetch-4	9.247µ	9.312µ	~	0.700
SubtreeAllocations/medium_subtrees_full_validation-4	17.53µ	17.27µ	~	0.200
SubtreeAllocations/large_subtrees_exists_check-4	2.096µ	2.056µ	~	0.200
SubtreeAllocations/large_subtrees_data_fetch-4	2.215µ	2.254µ	~	0.100
SubtreeAllocations/large_subtrees_full_validation-4	4.325µ	4.338µ	~	0.700
_BufferPoolAllocation/16KB-4	4.052µ	5.096µ	~	0.400
_BufferPoolAllocation/32KB-4	11.30µ	10.43µ	~	0.700
_BufferPoolAllocation/64KB-4	16.65µ	19.86µ	~	0.100
_BufferPoolAllocation/128KB-4	33.18µ	33.32µ	~	0.700
_BufferPoolAllocation/512KB-4	139.9µ	139.9µ	~	1.000
_BufferPoolConcurrent/32KB-4	20.60µ	21.65µ	~	0.700
_BufferPoolConcurrent/64KB-4	32.16µ	31.33µ	~	0.100
_BufferPoolConcurrent/512KB-4	172.0µ	165.1µ	~	0.100
_SubtreeDeserializationWithBufferSizes/16KB-4	716.0µ	689.6µ	~	0.100
_SubtreeDeserializationWithBufferSizes/32KB-4	711.0µ	692.4µ	~	0.200
_SubtreeDeserializationWithBufferSizes/64KB-4	723.6µ	706.8µ	~	0.100
_SubtreeDeserializationWithBufferSizes/128KB-4	706.5µ	698.2µ	~	0.100
_SubtreeDeserializationWithBufferSizes/512KB-4	670.5µ	664.2µ	~	1.000
_SubtreeDataDeserializationWithBufferSizes/16KB-4	38.39m	38.08m	~	0.100
_SubtreeDataDeserializationWithBufferSizes/32KB-4	38.38m	38.10m	~	0.100
_SubtreeDataDeserializationWithBufferSizes/64KB-4	38.66m	38.16m	~	0.100
_SubtreeDataDeserializationWithBufferSizes/128KB-4	38.54m	38.03m	~	0.100
_SubtreeDataDeserializationWithBufferSizes/512KB-4	38.60m	38.22m	~	0.100
_PooledVsNonPooled/Pooled-4	832.4n	825.6n	~	0.100
_PooledVsNonPooled/NonPooled-4	8.455µ	7.688µ	~	0.100
_MemoryFootprint/Current_512KB_32concurrent-4	8.791µ	7.935µ	~	0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4	12.86µ	11.56µ	~	0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4	11.78µ	10.83µ	~	0.100
StoreBlock_Sequential/BelowCSVHeight-4	324.9µ	322.3µ	~	0.700
StoreBlock_Sequential/AboveCSVHeight-4	322.3µ	324.4µ	~	0.100
GetUtxoHashes-4	265.9n	266.0n	~	1.000
GetUtxoHashes_ManyOutputs-4	43.38µ	47.92µ	~	0.200
_NewMetaDataFromBytes-4	230.2n	228.2n	~	0.700
_Bytes-4	406.0n	399.5n	~	0.100
_MetaBytes-4	141.4n	138.2n	~	0.100

Threshold: >10% with p < 0.05 | Generated: 2026-06-03 09:46 UTC

…int-drain-mode

Copilot

Pull request overview

Adds an opt-in configuration flag to enable “drain mode” for the Aerospike outpoint batcher, allowing previous-output decoration lookups to flush immediately instead of waiting for the batching timer window—aimed at reducing per-tx latency in decorate-heavy paths.

Changes:

Introduces utxostore_outpointBatcherDrainMode to UtxoStoreSettings metadata.
Wires the new setting into NewSettings config loading (default false).
Applies the setting in the Aerospike UTXO store initialization via s.outpointBatcher.SetDrainMode(true).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
`stores/utxo/aerospike/aerospike.go`	Enables drain mode on the outpoint batcher when the new setting is true.
`settings/utxostore_settings.go`	Adds the new UTXO store setting metadata entry and long description.
`settings/settings.go`	Loads the new boolean setting with a default of `false`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

freemans13 · 2026-05-22T08:34:09Z

 	SpendBatcherDrainMode                   bool              `key:"utxostore_spendBatcherDrainMode" desc:"Enable drain mode for spend batcher" default:"false" category:"UtxoStore" usage:"Drain mode for UTXO spend operations" type:"bool" longdesc:"### Purpose\nEnables drain mode for the Spend batcher independently of other batchers.\n\n### When to Enable\nOnly if spend items arrive in large bursts and latency is critical.\n\n### When to Disable (Recommended)\nSpend items trickle in one-at-a-time as individual transactions complete Get+Validate. Drain mode fires single-item batches, causing Aerospike executeSingle fallback (one network round-trip per item instead of batched)."`
 	StoreBatcherDrainMode                   bool              `key:"utxostore_storeBatcherDrainMode" desc:"Enable drain mode for store/create batcher" default:"false" category:"UtxoStore" usage:"Drain mode for UTXO create operations" type:"bool" longdesc:"### Purpose\nEnables drain mode for the Create/Store batcher independently of other batchers.\n\n### When to Enable\nCreate receives items in semi-bursts after Spend batches complete. Drain mode can be beneficial here.\n\n### When to Disable\nTimer-based batching accumulates items for better Aerospike batch efficiency."`
 	LockedBatcherDrainMode                  bool              `key:"utxostore_lockedBatcherDrainMode" desc:"Enable drain mode for locked batcher" default:"false" category:"UtxoStore" usage:"Drain mode for UTXO lock operations" type:"bool" longdesc:"### Purpose\nEnables drain mode for the SetLocked batcher independently of other batchers.\n\n### When to Disable (Recommended)\nSetLocked is the last pipeline stage — items arrive one-at-a-time as individual transactions complete BA.Store. Drain mode fires single-item batches, causing Aerospike executeSingle fallback. This was measured at 21% CPU waste on propagation pods at 200k TPS."`
+	OutpointBatcherDrainMode                bool              `key:"utxostore_outpointBatcherDrainMode" desc:"Enable drain mode for outpoint batcher" default:"false" category:"UtxoStore" usage:"Drain mode for previous-outputs decorate operations" type:"bool" longdesc:"### Purpose\nEnables drain mode for the outpoint batcher independently of other batchers. The outpoint batcher is used by PreviousOutputsDecorate / BatchPreviousOutputsDecorate to look up parent outputs (amount + locking script) for transaction inputs.\n\n### When to Enable\nThe outpoint batcher receives items in bursts — BatchPreviousOutputsDecorate hands a whole block's worth of inputs to PreviousOutputsDecorate in a tight loop, then blocks waiting for results. Without drain mode the batcher waits OutpointBatcherDurationMillis (default 10ms) for the batch to fill before flushing, even though no more items are coming. Drain mode flushes the burst immediately and removes that per-tx 10ms timer wait.\n\n### When to Disable (Default)\nKept off by default for consistency with the other drain-mode toggles and to avoid behavioural changes on operators who haven't measured. Single-item burst sizes are rare on this batcher (a tx with 1 input still drains its 1 input as a 1-item batch — Aerospike executeSingle fallback applies), so enabling it is mostly safe but worth measuring per-deployment.\n\n### Related Settings\n- **OutpointBatcherSize** (default 100) — size cap; drain mode still respects this.\n- **OutpointBatcherDurationMillis** (default 10ms) — timer; drain mode bypasses it."`


Addressed in e2757b8d0. The longdesc no longer describes BatchPreviousOutputsDecorate as a tight serial loop; it now reflects #893's per-tx errgroup fan-out and the resulting size-based batch fill. The "when to enable" guidance is reframed around the remaining single-producer call sites (cmd/rewindblockchain, the two per-tx fallbacks in legacy/netsync/handle_block.go), and "when to disable" calls out the executeSingle fallback risk in the concurrent path explicitly.

freemans13 · 2026-05-22T08:34:18Z

 		s.lockedBatcher.SetDrainMode(true)
 	}
+	if tSettings.UtxoStore.OutpointBatcherDrainMode {
+		s.outpointBatcher.SetDrainMode(true)
+	}


Addressed in e2757b8d0. The comment above the drain-mode toggles now covers the outpoint case explicitly and notes the post-#893 nuance: under BatchPreviousOutputsDecorate's concurrent producers the batcher fills by size (drain mode would shrink those wide batches), but the single-producer fallbacks still idle on the 10 ms timer (drain mode helps there).

…int-drain-mode

oskarszoon

Request changes — recommend closing in favour of #893.

Now that #893 is merged the underlying symptom (low per-tx producer concurrency → batcher idling on the 10ms timer) is fixed by fan-out at the producer side: per-tx PreviousOutputsDecorate calls run via errgroup, the batcher fills by size, and the timer rarely fires.

With concurrent producers, flipping this flag would invert the win — it would force many small BatchOperate round-trips instead of one wide one. The drain-mode pattern is right for batchers with a single producer, not after #893.

Suggest closing this PR.

ordishs

LGTM — small, surgical, follows the established per-batcher drain-mode pattern, and gated behind an opt-in default-false toggle so upgrade behaviour is unchanged. Two minor suggestions:

Stale comment in stores/utxo/aerospike/aerospike.go:378-381. The block above the drain-mode toggles enumerates the four existing batchers ("beneficial for Get, Create — harmful for Spend, SetLocked"). With this PR landed, the comment no longer covers the full set. Worth a one-line tweak, e.g. "Outpoint receives input-burst traffic, similar to Get."
Add a smoke test for the new toggle. The fixtures already exist in aerospike_server_test.go:1359 / :1429. A short subtest that constructs the store with OutpointBatcherDrainMode=true and re-runs PreviousOutputsDecorate would lock in that the toggle is wired correctly and that drain-mode doesn't break the happy path. Optional but cheap insurance — none of the four existing drain-mode toggles ship with a dedicated unit test either, so it'd raise the bar slightly for this whole family.

Quick sanity-check question: was the 19× number measured on top of #864's per-tx fanout (already visible at get.go:1191-1206), or on a pre-#864 base? The PR description suggests the former but worth confirming so the composition story is unambiguous.

freemans13 · 2026-05-22T08:26:41Z

Thanks @oskarszoon — fair on the substance, but I'd push back on closing.

You're right that after #893 the BatchPreviousOutputsDecorate hot path no longer pays the per-tx 10 ms timer cost; the 10.70 s → 0.57 s number in the description was measured against the pre-#893 sequential producer (the bench harness in the closed #864) and shouldn't be read as a current claim.

But this PR doesn't enable drain mode for anyone. It only adds the toggle, default false, matching the existing per-batcher pattern (getBatcherDrainMode, spendBatcherDrainMode, storeBatcherDrainMode, lockedBatcherDrainMode). None of those four are universal wins either — their own longdesc blocks call out cases where they regress (SetLocked was measured at 21 % CPU waste at 200 k TPS, etc.). They exist as per-deployment tuning levers, not opinions about what's correct.

The outpoint batcher was the only one in that group without a corresponding toggle. The original motivation (sequential producer) is gone, but the symmetry argument still holds: an operator who wants to A/B drain-mode behaviour across the five batchers can now do that uniformly. And there are remaining single-producer call sites where the batcher still idles 10 ms — cmd/rewindblockchain and the two per-tx fallbacks in legacy/netsync/handle_block.go (lines 1362 and 1647). Not hot paths, but not non-existent either.

Default-off toggle that just fills out the per-batcher matrix — I'd rather land it than leave the next person to wonder why every batcher has the knob except this one.

@ordishs

Updates from @ordishs and Copilot reviews: - Extend the comment above the drain-mode toggles in aerospike.go to cover the new outpoint case and the post-bsv-blockchain#893 nuance. - Rewrite the OutpointBatcherDrainMode longdesc so the 'when to enable / when to disable' guidance reflects the post-bsv-blockchain#893 fan-out path rather than the pre-bsv-blockchain#893 sequential producer. - Add TestAerospikeOutpointBatcherDrainMode happy-path smoke test asserting the toggle stays wired (per-tx + concurrent decorate).

freemans13 · 2026-05-22T08:33:55Z

Thanks @ordishs — all three points addressed in e2757b8d0:

Stale comment (stores/utxo/aerospike/aerospike.go:378-381) — extended to mention the outpoint batcher and the post-perf(aerospike): fan out BatchPreviousOutputsDecorate per-tx #893 nuance (concurrent producers fill by size; single-producer fallbacks still idle on the timer).
Smoke test — added TestAerospikeOutpointBatcherDrainMode in aerospike_server_test.go next to TestAerospikeWithBatchSize. Builds a store with OutpointBatcherDrainMode=true and exercises both the per-tx PreviousOutputsDecorate and the concurrent BatchPreviousOutputsDecorate paths. The toggle has no exported getter so the test is "decorate still works with the flag on" rather than "the flag is set" — its job is to catch the wiring being deleted or misnamed in aerospike.go.
The 19× number — measured against the pre-Look up parent transactions in parallel instead of one at a time #864 / pre-perf(aerospike): fan out BatchPreviousOutputsDecorate per-tx #893 base (single sequential producer, ~5 ms timer per tx). After @oskarszoon's perf(aerospike): fan out BatchPreviousOutputsDecorate per-tx #893 the same workload (BatchPreviousOutputsDecorate) is already fast without this toggle, and flipping drain mode on in that path now hurts rather than helps (wide BatchOperate → many narrow ones). I've rewritten the OutpointBatcherDrainMode longdesc to reflect that — "when to enable" now points at the remaining single-producer call sites (cmd/rewindblockchain and the two per-tx fallbacks in legacy/netsync/handle_block.go), and "when to disable" explicitly names perf(aerospike): fan out BatchPreviousOutputsDecorate per-tx #893 and the fan-out path.

…fan-out; align docs Add BenchmarkBatchPreviousOutputsDecorateDrainMode comparing the outpoint batcher's drain mode on vs off across the concurrent BatchPreviousOutputsDecorate path (bsv-blockchain#893 errgroup fan-out), using distinct parents per input so batch width maps to real BatchOperate record counts. Findings (count=10 + benchstat, distinct-parent inputs): drain mode is neutral-to-faster in the mean at every tx count (-8% at 2856 tx to -94% at 16 tx), but bimodal/heavy-tailed at mid tx counts (~64-256). So the concurrent node path keeps drain mode default off (predictable latency), and the clean win is the single-producer, separate-process cmd/rewindblockchain caller. Update the aerospike.go comment and the setting longdesc to match the measured behaviour and to note the batcher is store-wide (the toggle cannot be applied to one caller without affecting the other in the same process).

…e' into stu/aerospike-outpoint-drain-mode

freemans13 · 2026-06-02T19:31:23Z

@oskarszoon — took your "this inverts #893's win" seriously and benchmarked it rather than argue it. Added BenchmarkBatchPreviousOutputsDecorateDrainMode (in the latest push) — it runs the concurrent BatchPreviousOutputsDecorate fan-out with drain on vs off, using a distinct parent tx per input so batch width maps to real BatchOperate record counts (otherwise sendOutpointBatch's uniqueTxHashes dedup hides the effect). count=10 + benchstat, warmup per tier.

First, a correction to my own earlier framing: drain mode doesn't dispatch one outpoint per Put. go-batcher/v2 greedily drains everything currently queued (up to the size cap) and fires without the timer wait — so under your fan-out there genuinely is a pile to drain.

Median ns/op, drain=false → drain=true:

txs	drain=false	drain=true	Δ mean	drain=true stability
16	6.57 ms	0.40 ms	−94%	stable
64	7.40 ms	1.18 ms	−84%	bimodal (0.9→52 ms)
256	8.87 ms	4.97 ms	−44%	bimodal (~2.6 / ~7 ms)
1024	15.0 ms	8.80 ms	−41%	tight ±3%
2856	23.2 ms	21.4 ms	−8%	±19%

So the mean doesn't regress on the concurrent path — it's neutral-to-faster everywhere (the drain=false low-tier cost is literally the 10 ms timer idling). But your instinct wasn't wrong: drain mode goes bimodal/heavy-tailed at mid tx counts (~64–256), occasionally 10–50× its own median, where drain=false is rock-stable. A node's concurrent decorate path wants predictable latency, so that instability is a good reason to keep it default-off — which I've now made the documented rationale, dropping the weak "legacy per-tx fallback" enable-case (it shares the process with the hot path anyway).

Caveat I can't measure locally: this is loopback Aerospike, so it under-weights the per-round-trip network RTT that's central to your "many small round-trips" point — on a real cluster the high tiers could move toward you.

Net: the only use I'm claiming is the single-producer, separate-process caller cmd/rewindblockchain, where it's the −94% timer-elimination win with no concurrent fan-out to fragment. Given it's default-off and scoped that way, are you OK to unblock — or do you still prefer we close it and leave rewindblockchain on the timer?

oskarszoon

Re-reviewed with a fresh benchmark run and a look at the actual go-batcher drain loop. Withdrawing my earlier close recommendation — it was based on a wrong model of drain mode. The drain loop greedily drains the channel up to the size cap before firing (go-batcher batcher.go:464-480), so under #893's concurrent fan-out the batches stay wide rather than collapsing to one-item round-trips. The bench bears this out (drain faster at the 64/256/2856 tx tiers), and cmd/rewindblockchain is a genuine serial single-producer caller where each PreviousOutputsDecorate otherwise eats the full 10ms timer — a real win no existing knob covers. The wiring is correct, default-off, race-free, and can't drop/dup/misorder lookups.

One required fix before merge:

The "neutral-to-faster in the mean at every tx count" claim (aerospike.go comment + longdesc) isn't supported by the benchmark. On the concurrent path drain=false is rock-stable but drain=true is bimodal with an unpredictable mean — at txs=16 and 64, two of three runs come in ~5× slower; txs=2856 had a 118ms spike in one run. The "bimodal/heavy-tailed" wording you already have is right; the "neutral-to-faster in the mean" next to it will lead operators to flip this on a node/concurrent path expecting a free win. Drop that phrase and keep the honest framing: bimodal/unpredictable on the concurrent path, clean win only for the single-producer (rewindblockchain) caller.

Minor: the longdesc warns about the legacy per-tx fallback but not the Validator's ValidateTransactionBatch errgroup path (also concurrent); and aerospike_server_test.go:1812+ uses assert where the repo wants require.

Fix the doc claim and I'll approve.

Address review feedback on PR bsv-blockchain#865: - Drop the unsupported "neutral-to-faster in the mean" claim from the aerospike.go comment and the outpointBatcherDrainMode longdesc. The concurrent BatchPreviousOutputsDecorate path is bimodal/heavy-tailed with drain on (drain=false rock-stable, drain=true unpredictable mean); the clean win is the single-producer cmd/rewindblockchain caller. - Name the Validator's ValidateTransactionBatch errgroup path alongside the BatchPreviousOutputsDecorate hot path as a concurrent caller that should keep drain off, not just the legacy per-tx fallback. - Use require (not assert) in TestAerospikeOutpointBatcherDrainMode per repo test convention. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

freemans13 · 2026-06-03T09:32:06Z

Thanks @oskarszoon — and thanks for re-reviewing and withdrawing the close recommendation. Fixed in e86917b:

Dropped the "neutral-to-faster in the mean" claim from both the aerospike.go comment and the outpointBatcherDrainMode longdesc. Both now state the honest framing: on the concurrent BatchPreviousOutputsDecorate path drain mode is bimodal/heavy-tailed at mid tx counts (~64-256) — drain=false rock-stable, drain=true unpredictable mean (some runs several× slower) — so a node keeps it off; the clean win is the single-producer cmd/rewindblockchain caller.
Named the Validator's ValidateTransactionBatch errgroup path in the longdesc's "When to Disable" section alongside the BatchPreviousOutputsDecorate hot path, so the concurrent-caller warning isn't limited to the legacy per-tx fallback.
Switched assert → require in TestAerospikeOutpointBatcherDrainMode per the repo convention.

Ready for another look when you have a moment.

sonarqubecloud · 2026-06-03T09:43:52Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
92.3% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

oskarszoon

Re-approving on e86917b. The "neutral-to-faster in the mean" overclaim is gone; the aerospike.go comment + the longdesc now describe drain mode on the concurrent path honestly — bimodal/heavy-tailed at mid tx counts, drain=true with an unpredictable mean (some runs several× slower), default-off for node decorate paths, and the clean win scoped to the single-producer cmd/rewindblockchain caller. You also folded in the Validator ValidateTransactionBatch errgroup-path caveat and the store-wide-toggle note, and fixed the smoke-test require. Doc + test only; the drain wiring is unchanged.

Approving.

freemans13 self-assigned this May 14, 2026

Merge remote-tracking branch 'upstream/main' into stu/aerospike-outpo…

55e3a4f

…int-drain-mode

freemans13 changed the title ~~feat(utxo/aerospike): optional drain mode for outpoint batcher~~ Let the outpoint batcher flush immediately instead of waiting 10ms (opt-in) May 19, 2026

Merge remote-tracking branch 'upstream/main' into stu/aerospike-outpo…

54d059e

…int-drain-mode

freemans13 requested a review from Copilot May 19, 2026 12:28

freemans13 mentioned this pull request May 19, 2026

Look up parent transactions in parallel instead of one at a time #864

Closed

Copilot AI reviewed May 19, 2026

View reviewed changes

Copilot started reviewing on behalf of freemans13 May 19, 2026 12:37 View session

freemans13 added 2 commits May 19, 2026 15:47

Merge remote-tracking branch 'upstream/main' into stu/aerospike-outpo…

3f7a3cf

…int-drain-mode

Merge remote-tracking branch 'upstream/main' into stu/aerospike-outpo…

e9cac90

…int-drain-mode

freemans13 requested review from liam and ordishs May 21, 2026 10:32

oskarszoon requested changes May 21, 2026

View reviewed changes

ordishs approved these changes May 21, 2026

View reviewed changes

liam approved these changes May 21, 2026

View reviewed changes

freemans13 added 4 commits May 29, 2026 12:58

Merge remote-tracking branch 'upstream/main' into HEAD

6927a84

Merge upstream/main into stu/aerospike-outpoint-drain-mode

8ede224

Merge remote-tracking branch 'origin/stu/aerospike-outpoint-drain-mod…

6efa6d6

…e' into stu/aerospike-outpoint-drain-mode

oskarszoon requested changes Jun 2, 2026

View reviewed changes

freemans13 requested a review from oskarszoon June 3, 2026 09:32

freemans13 changed the title ~~Let the outpoint batcher flush immediately instead of waiting 10ms (opt-in)~~ Add opt-in drain mode for the outpoint batcher (off by default) Jun 4, 2026

oskarszoon approved these changes Jun 5, 2026

View reviewed changes

freemans13 merged commit 5c950aa into bsv-blockchain:main Jun 5, 2026
34 checks passed

Conversation

freemans13 commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this changes

When to turn it on

When to leave it off (the default)

Why off by default

Files touched

Testing

Uh oh!

github-actions Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Comparison Report

Summary

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

freemans13 May 22, 2026

Choose a reason for hiding this comment

Uh oh!

freemans13 May 22, 2026

Choose a reason for hiding this comment

Uh oh!

oskarszoon left a comment

Choose a reason for hiding this comment

Uh oh!

ordishs left a comment

Choose a reason for hiding this comment

Uh oh!

freemans13 commented May 22, 2026

Uh oh!

freemans13 commented May 22, 2026

Uh oh!

freemans13 commented Jun 2, 2026

Uh oh!

oskarszoon left a comment

Choose a reason for hiding this comment

Uh oh!

freemans13 commented Jun 3, 2026

Uh oh!

sonarqubecloud Bot commented Jun 3, 2026

Quality Gate passed

Uh oh!

oskarszoon left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

freemans13 commented May 14, 2026 •

edited

Loading

github-actions Bot commented May 14, 2026 •

edited

Loading

github-actions Bot commented May 14, 2026 •

edited

Loading