Skip to content

feat(pruner): skip parent updates for already-pruned parents#628

Merged
freemans13 merged 15 commits into
bsv-blockchain:mainfrom
freemans13:feat/pruner-skip-pruned-parent-updates
May 14, 2026
Merged

feat(pruner): skip parent updates for already-pruned parents#628
freemans13 merged 15 commits into
bsv-blockchain:mainfrom
freemans13:feat/pruner-skip-pruned-parent-updates

Conversation

@freemans13

Copy link
Copy Markdown
Collaborator

Summary

  • On dev-scale-1/2, 25% of Aerospike writes during pruning are KEY_NOT_FOUND errors from updating parents that were already pruned in the same session — tight transaction chains (400K chains × ~3,750 depth from 40 blasters × 10K workers) cause cascading wasteful parent updates
  • Introduces a shared PrunedTxSet (256-shard concurrent map) across all partition workers that tracks TXIDs as they're scanned for pruning
  • Refactors partitionWorker into a two-stage read-ahead pipeline: stage 1 reads from Aerospike and registers TXIDs into the shared set via a 10K-entry buffered channel; stage 2 processes records and checks the set before issuing parent updates — if the parent is found, the update is skipped
  • The read-ahead buffer ensures all workers' readers populate the shared set ahead of any processor, effectively eliminating the cross-worker race condition for the transaction-blaster workload
  • Map is self-managing: entries are removed on lookup (each parent has one child in these chains), keeping peak memory at ~32MB (~400K dangling entries)

Test plan

  • 7 unit tests for PrunedTxSet (add/contains, check-and-remove, length tracking, concurrent access with 100 goroutines × 1000 ops, shard distribution, chain simulation, parent-not-in-block)
  • All tests pass with -race flag
  • Existing TestPruneWithPartitions partition retry tests pass
  • Full make build-teranode passes
  • make lint passes with 0 issues
  • Deploy to dev-scale and verify utxo_pruner_parents_skipped_pruned_total counter accounts for ~25% of parent update attempts
  • Verify utxo_pruner_parents_updated_skipped_total (KEY_NOT_FOUND) drops to near zero

🤖 Generated with Claude Code

freemans13 and others added 3 commits March 26, 2026 18:27
…ned TXIDs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ared PrunedTxSet

- Add PrunedTxSet shared across partition workers to track pruned TXIDs
- Refactor partitionWorker into two-stage pipeline: reader registers TXIDs
  into shared set via 10K-entry buffered channel, processor checks set before
  issuing parent updates
- Skip parent updates when parent TXID found in PrunedTxSet (parent already
  pruned in this session)
- Add prometheus counter utxo_pruner_parents_skipped_pruned_total

Eliminates ~25% wasteful Aerospike writes during pruning of tight transaction
chains (e.g. 400K chains of ~3750 depth in 1.5B-tx blocks from tx blasters).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete


Summary:

This PR introduces a shared PrunedTxSet to eliminate wasteful parent updates during UTXO pruning. The implementation addresses the observed 25% KEY_NOT_FOUND error rate on dev-scale deployments caused by tight transaction chains.

Key Design:

  • Two-stage pipeline: reader registers TXIDs before processor handles deletions
  • Shared 256-shard concurrent map tracks pruned transactions across partition workers
  • Memory-bounded at 10M entries (~1GB worst-case) with graceful degradation
  • Disabled in defensive mode (correctness constraint documented at pruner_service.go:662-687)

All Previously Reported Issues Addressed:

The author has thoroughly addressed all inline comments from prior reviews:

  • Read-ahead buffer now scales with chunkSize (pruner_service.go:490-498)
  • Duplicate Add() no longer increments count (pruned_tx_set.go:80-87)
  • Nil guards added for prunedSet (pruner_service.go:526-528)
  • Cancellation-safe pipeline send/recv (pruner_service.go:511-519, 546-550)
  • Graceful shutdown with recordset.Close() (pruner_service.go:586-587)
  • Hot-path optimization: reuse parentTxID (pruner_service.go:1015, 1021)
  • Defensive mode constraint enforced (pruner_service.go:689)
  • Dead skipParentUpdates field removed
  • Test coverage added (pruned_set_skip_test.go)

Design Rationale Documented:

The author provided clear technical justification for two intentional design choices:

  • CheckAndRemove is destructive (not per-output): Tight chains (99% of observed workload) have one child per parent; fan-out edge case accepts graceful KEY_NOT_FOUND
  • PrunedSet reuse across retries: Idempotent operations + defensive mode OFF makes cross-retry reuse safe and necessary for cross-partition deduplication

No New Issues Found

Code quality is high, test coverage is comprehensive, and the implementation aligns with CLAUDE.md requirements.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the Aerospike UTXO pruner to avoid wasteful parent-record updates when the parent transaction has already been pruned earlier in the same pruning session, targeting high-rate workloads that currently produce many KEY_NOT_FOUND write attempts.

Changes:

  • Adds a shared concurrent PrunedTxSet to track TXIDs scanned for pruning and skip parent updates when the parent is already pruned.
  • Refactors partitionWorker into a two-stage reader/processor pipeline to register TXIDs ahead of chunk processing.
  • Introduces a new Prometheus counter to track “skipped because already pruned” parent updates, and adds unit tests for PrunedTxSet.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
stores/utxo/aerospike/pruner/pruner_service.go Wires shared PrunedTxSet through partition workers, introduces reader/processor pipeline, and skips parent updates for already-pruned parents with a new metric.
stores/utxo/aerospike/pruner/pruned_tx_set.go Adds a sharded concurrent set implementation for tracking pruned TXIDs.
stores/utxo/aerospike/pruner/pruned_tx_set_test.go Adds unit tests for PrunedTxSet behavior and concurrency.
stores/utxo/aerospike/pruner/partition_retry_test.go Updates retry tests to match the new worker signature and initializes the new Prometheus counter in test setup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

chunk := make([]*aerospike.Result, 0, s.chunkSize)
// Two-stage pipeline: reader registers TXIDs in shared set before processor handles them
// This eliminates cross-worker race conditions for parent-update skipping
const readAheadBuffer = 10000

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readAheadBuffer is hard-coded to 10,000 records per partition worker. Since each aerospike.Result holds the full record bins, this can significantly increase peak memory (buffered results in pipeline + Aerospike RecordQueueSize + in-flight chunks), potentially undermining the existing chunking/memory-safety rationale (default chunkSize is 1000). Consider making the buffer size configurable and/or deriving it from s.chunkSize with a conservative cap so memory use scales predictably with settings.

Suggested change
const readAheadBuffer = 10000
// Derive read-ahead buffer size from chunkSize with a conservative cap so memory scales predictably.
readAheadBase := s.chunkSize
if readAheadBase <= 0 {
readAheadBase = 1000 // fall back to default chunk size if unset
}
// Allow modest read-ahead, but cap to avoid excessive buffered records.
readAheadBuffer := int(math.Min(float64(readAheadBase*2), 10000))
if readAheadBuffer < 1 {
readAheadBuffer = 1
}

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — the hardcoded 10K buffer was an oversight. Now deriving it from chunkSize with a 2x multiplier capped at 10K, so memory scales predictably with the existing settings. Applied.

Comment on lines +975 to +979
// Accumulate parent updates, skipping parents already pruned in this session
for _, input := range inputs {
// Check if parent TX was already pruned — if so, skip the update
parentTxID := input.PreviousTxIDChainHash()
if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) {

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prunedSet.CheckAndRemove removes the parent TXID from the set on the first child/input that encounters it. A single parent transaction can have multiple outputs spent by different transactions being pruned in the same session; after the first skip, subsequent inputs referencing the same parent TXID will no longer be skipped and will still attempt the Aerospike update (likely reintroducing KEY_NOT_FOUND churn for fan-out/consolidation patterns). If the intent is to skip all updates once a parent TX is pruned, use a non-destructive lookup (or track a refcount / delayed eviction) instead of removing on first match.

Suggested change
// Accumulate parent updates, skipping parents already pruned in this session
for _, input := range inputs {
// Check if parent TX was already pruned — if so, skip the update
parentTxID := input.PreviousTxIDChainHash()
if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) {
// Accumulate parent updates, skipping parents already pruned in this session.
// Use a local map so that once a parent TX is treated as pruned, all subsequent
// inputs referencing the same parent are also skipped, even though
// prunedSet.CheckAndRemove is destructive.
prunedParentsSeen := make(map[string]struct{})
for _, input := range inputs {
// Check if parent TX was already pruned — if so, skip the update
parentTxID := input.PreviousTxIDChainHash()
parentTxIDStr := parentTxID.String()
// First, consult the local session cache
if _, alreadySeen := prunedParentsSeen[parentTxIDStr]; alreadySeen {
prometheusUtxoParentsSkippedPruned.Inc()
continue
}
// Then, consult the shared pruned set (which may remove on first match)
if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) {
prunedParentsSeen[parentTxIDStr] = struct{}{}

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great observation — you're right that a parent TX with multiple outputs spent in the same session would only get skipped on the first input. Added a local prunedParentsSeen map that caches the result of CheckAndRemove, so all subsequent inputs referencing the same pruned parent are also skipped without hitting the shared set again.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, I've reverted this change. The CheckAndRemove behaviour is intentional — tightly chained transactions are the primary cause of the 25% wasteful Aerospike updates we observed on dev-scale, and these chains almost never have fan-out (each parent has one child spending it within the same block). The remove-on-first-check keeps the map self-managing at ~32MB peak instead of growing monotonically. For the rare fan-out case, the second child simply sends the parent update as normal — which may get KEY_NOT_FOUND, handled gracefully as before. This is a deliberate compromise that addresses 99% of what we observe in production without adding complexity for edge cases.

Comment on lines +53 to +55
sh.m[h] = struct{}{}
sh.mu.Unlock()
s.count.Add(1)

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrunedTxSet.Add increments count unconditionally even if the TXID is already present in the shard map. This makes Len() drift upward on duplicate adds (which can happen across retries or if the same TXID is observed twice), and CheckAndRemove will only decrement once. Update Add to only increment when inserting a new key (check _, exists := sh.m[h] under the shard lock), and add a small unit test for duplicate-add behavior to prevent regressions.

Suggested change
sh.m[h] = struct{}{}
sh.mu.Unlock()
s.count.Add(1)
_, exists := sh.m[h]
if !exists {
sh.m[h] = struct{}{}
}
sh.mu.Unlock()
if !exists {
s.count.Add(1)
}

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right — duplicate adds would cause Len() to drift upward since the count incremented unconditionally. Fixed Add to check for existence under the shard lock and only increment when inserting a new key. Also added a TestPrunedTxSet_DuplicateAdd test to prevent regressions.

@freemans13 freemans13 self-assigned this Mar 26, 2026
- Derive readAheadBuffer from chunkSize (2x capped at 10K) instead of hardcoded 10K
- Add local prunedParentsSeen map to handle fan-out parents with multiple spent outputs
- Fix duplicate-add count drift in PrunedTxSet.Add with existence check
- Add TestPrunedTxSet_DuplicateAdd regression test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-628 (1f47fab)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 142
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.804µ 1.659µ ~ 0.400
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 61.48n 61.52n ~ 0.500
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 61.47n 61.75n ~ 0.100
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 61.48n 61.48n ~ 0.800
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 29.93n 30.46n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 51.92n 51.07n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 103.8n 104.2n ~ 0.700
MiningCandidate_Stringify_Short-4 261.1n 260.7n ~ 0.400
MiningCandidate_Stringify_Long-4 1.883µ 1.874µ ~ 0.200
MiningSolution_Stringify-4 980.1n 979.3n ~ 0.400
BlockInfo_MarshalJSON-4 1.744µ 1.745µ ~ 0.500
NewFromBytes-4 131.4n 134.7n ~ 0.100
Mine_EasyDifficulty-4 60.53µ 60.77µ ~ 0.700
Mine_WithAddress-4 6.712µ 6.821µ ~ 0.100
BlockAssembler_AddTx-4 0.02944n 0.02799n ~ 1.000
AddNode-4 11.29 11.28 ~ 1.000
AddNodeWithMap-4 11.47 11.54 ~ 1.000
DirectSubtreeAdd/4_per_subtree-4 59.18n 59.68n ~ 1.000
DirectSubtreeAdd/64_per_subtree-4 28.19n 29.54n ~ 0.200
DirectSubtreeAdd/256_per_subtree-4 27.23n 27.16n ~ 0.300
DirectSubtreeAdd/1024_per_subtree-4 26.21n 26.17n ~ 0.700
DirectSubtreeAdd/2048_per_subtree-4 25.88n 25.75n ~ 0.400
SubtreeProcessorAdd/4_per_subtree-4 280.5n 279.9n ~ 1.000
SubtreeProcessorAdd/64_per_subtree-4 275.5n 276.6n ~ 0.700
SubtreeProcessorAdd/256_per_subtree-4 278.5n 277.4n ~ 0.700
SubtreeProcessorAdd/1024_per_subtree-4 270.4n 272.1n ~ 0.200
SubtreeProcessorAdd/2048_per_subtree-4 268.6n 270.7n ~ 0.400
SubtreeProcessorRotate/4_per_subtree-4 276.6n 277.1n ~ 0.700
SubtreeProcessorRotate/64_per_subtree-4 272.6n 278.0n ~ 0.100
SubtreeProcessorRotate/256_per_subtree-4 277.2n 277.1n ~ 1.000
SubtreeProcessorRotate/1024_per_subtree-4 273.4n 275.7n ~ 0.700
SubtreeNodeAddOnly/4_per_subtree-4 54.07n 54.32n ~ 0.700
SubtreeNodeAddOnly/64_per_subtree-4 34.22n 34.20n ~ 0.800
SubtreeNodeAddOnly/256_per_subtree-4 33.14n 33.26n ~ 0.400
SubtreeNodeAddOnly/1024_per_subtree-4 32.61n 32.45n ~ 0.200
SubtreeCreationOnly/4_per_subtree-4 112.9n 112.2n ~ 0.100
SubtreeCreationOnly/64_per_subtree-4 394.6n 390.8n ~ 0.700
SubtreeCreationOnly/256_per_subtree-4 1.317µ 1.360µ ~ 0.700
SubtreeCreationOnly/1024_per_subtree-4 4.392µ 4.361µ ~ 1.000
SubtreeCreationOnly/2048_per_subtree-4 7.985µ 7.826µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 274.2n 274.9n ~ 1.000
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 276.5n 276.2n ~ 1.000
ParallelGetAndSetIfNotExists/1k_nodes-4 827.9µ 828.3µ ~ 1.000
ParallelGetAndSetIfNotExists/10k_nodes-4 1.625m 1.378m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 6.833m 6.726m ~ 0.200
ParallelGetAndSetIfNotExists/100k_nodes-4 13.70m 13.71m ~ 1.000
SequentialGetAndSetIfNotExists/1k_nodes-4 670.1µ 673.9µ ~ 0.200
SequentialGetAndSetIfNotExists/10k_nodes-4 2.905m 2.864m ~ 1.000
SequentialGetAndSetIfNotExists/50k_nodes-4 11.16m 10.51m ~ 0.100
SequentialGetAndSetIfNotExists/100k_nodes-4 21.16m 19.96m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 643.9µ 642.6µ ~ 0.400
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 4.244m 4.220m ~ 1.000
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 17.17m 16.89m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 729.3µ 711.9µ ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 5.990m 5.852m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 38.96m 38.64m ~ 0.100
DiskTxMap_SetIfNotExists-4 3.741µ 3.503µ ~ 0.100
DiskTxMap_SetIfNotExists_Parallel-4 3.408µ 3.407µ ~ 0.700
DiskTxMap_ExistenceOnly-4 306.4n 311.9n ~ 0.700
Queue-4 191.3n 188.6n ~ 0.100
AtomicPointer-4 5.097n 4.524n ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 848.3µ 845.1µ ~ 0.700
ReorgOptimizations/DedupFilterPipeline/New/10K-4 785.7µ 812.6µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 112.2µ 107.5µ ~ 0.400
ReorgOptimizations/AllMarkFalse/New/10K-4 61.85µ 62.35µ ~ 0.100
ReorgOptimizations/HashSlicePool/Old/10K-4 53.68µ 53.55µ ~ 1.000
ReorgOptimizations/HashSlicePool/New/10K-4 11.54µ 10.54µ ~ 0.100
ReorgOptimizations/NodeFlags/Old/10K-4 4.676µ 4.485µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 1.577µ 1.588µ ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 9.356m 9.232m ~ 0.200
ReorgOptimizations/DedupFilterPipeline/New/100K-4 10.139m 9.821m ~ 0.200
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.123m 1.107m ~ 1.000
ReorgOptimizations/AllMarkFalse/New/100K-4 687.2µ 678.7µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/100K-4 558.6µ 506.7µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 335.2µ 334.3µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/100K-4 49.91µ 49.35µ ~ 0.100
ReorgOptimizations/NodeFlags/New/100K-4 17.37µ 17.14µ ~ 0.700
TxMapSetIfNotExists-4 51.02n 51.10n ~ 0.400
TxMapSetIfNotExistsDuplicate-4 38.32n 38.27n ~ 1.000
ChannelSendReceive-4 601.2n 605.4n ~ 0.700
CalcBlockWork-4 486.9n 493.0n ~ 0.400
CalculateWork-4 681.4n 666.0n ~ 0.700
BuildBlockLocatorString_Helpers/Size_10-4 1.033µ 1.025µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_100-4 9.958µ 10.007µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_1000-4 98.38µ 97.25µ ~ 0.400
CatchupWithHeaderCache-4 103.6m 103.7m ~ 0.700
_BufferPoolAllocation/16KB-4 3.180µ 3.235µ ~ 0.700
_BufferPoolAllocation/32KB-4 7.121µ 6.263µ ~ 0.100
_BufferPoolAllocation/64KB-4 14.50µ 16.73µ ~ 0.200
_BufferPoolAllocation/128KB-4 31.76µ 29.04µ ~ 0.100
_BufferPoolAllocation/512KB-4 96.39µ 109.54µ ~ 0.400
_BufferPoolConcurrent/32KB-4 18.64µ 15.98µ ~ 0.100
_BufferPoolConcurrent/64KB-4 29.21µ 25.67µ ~ 0.100
_BufferPoolConcurrent/512KB-4 131.9µ 133.1µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/16KB-4 584.1µ 622.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 582.9µ 616.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 580.8µ 617.2µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 599.1µ 622.0µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 591.6µ 618.7µ ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/16KB-4 36.01m 35.81m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/32KB-4 36.01m 35.76m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/64KB-4 35.33m 35.84m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/128KB-4 35.52m 35.81m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/512KB-4 35.05m 35.24m ~ 0.100
_PooledVsNonPooled/Pooled-4 827.8n 828.0n ~ 1.000
_PooledVsNonPooled/NonPooled-4 7.050µ 6.595µ ~ 0.200
_MemoryFootprint/Current_512KB_32concurrent-4 6.949µ 6.845µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 8.778µ 9.240µ ~ 0.200
_MemoryFootprint/Alternative_64KB_32concurrent-4 8.375µ 8.504µ ~ 0.100
_prepareTxsPerLevel-4 399.0m 399.0m ~ 1.000
_prepareTxsPerLevelOrdered-4 3.996m 3.962m ~ 0.700
_prepareTxsPerLevel_Comparison/Original-4 399.6m 402.8m ~ 0.700
_prepareTxsPerLevel_Comparison/Optimized-4 3.608m 3.495m ~ 0.100
SubtreeSizes/10k_tx_4_per_subtree-4 1.266m 1.346m ~ 0.200
SubtreeSizes/10k_tx_16_per_subtree-4 307.0µ 304.5µ ~ 0.700
SubtreeSizes/10k_tx_64_per_subtree-4 71.45µ 72.30µ ~ 0.100
SubtreeSizes/10k_tx_256_per_subtree-4 17.64µ 17.80µ ~ 0.700
SubtreeSizes/10k_tx_512_per_subtree-4 8.845µ 8.858µ ~ 1.000
SubtreeSizes/10k_tx_1024_per_subtree-4 4.387µ 4.464µ ~ 0.200
SubtreeSizes/10k_tx_2k_per_subtree-4 2.184µ 2.166µ ~ 1.000
BlockSizeScaling/10k_tx_64_per_subtree-4 69.90µ 69.17µ ~ 0.200
BlockSizeScaling/10k_tx_256_per_subtree-4 17.52µ 17.49µ ~ 0.200
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.390µ 4.307µ ~ 0.400
BlockSizeScaling/50k_tx_64_per_subtree-4 368.4µ 365.7µ ~ 0.700
BlockSizeScaling/50k_tx_256_per_subtree-4 88.38µ 87.66µ ~ 0.700
BlockSizeScaling/50k_tx_1024_per_subtree-4 21.65µ 21.58µ ~ 1.000
SubtreeAllocations/small_subtrees_exists_check-4 149.4µ 151.5µ ~ 0.400
SubtreeAllocations/small_subtrees_data_fetch-4 159.9µ 159.9µ ~ 0.700
SubtreeAllocations/small_subtrees_full_validation-4 311.6µ 311.1µ ~ 1.000
SubtreeAllocations/medium_subtrees_exists_check-4 8.818µ 8.928µ ~ 0.700
SubtreeAllocations/medium_subtrees_data_fetch-4 9.311µ 9.401µ ~ 0.200
SubtreeAllocations/medium_subtrees_full_validation-4 17.43µ 17.44µ ~ 0.700
SubtreeAllocations/large_subtrees_exists_check-4 2.098µ 2.081µ ~ 0.700
SubtreeAllocations/large_subtrees_data_fetch-4 2.232µ 2.259µ ~ 0.100
SubtreeAllocations/large_subtrees_full_validation-4 4.379µ 4.348µ ~ 0.200
StoreBlock_Sequential/BelowCSVHeight-4 244.6µ 244.9µ ~ 1.000
StoreBlock_Sequential/AboveCSVHeight-4 244.0µ 233.2µ ~ 0.400
GetUtxoHashes-4 256.4n 255.7n ~ 0.200
GetUtxoHashes_ManyOutputs-4 43.96µ 43.57µ ~ 0.700
_NewMetaDataFromBytes-4 239.7n 238.7n ~ 0.400
_Bytes-4 637.5n 634.9n ~ 1.000
_MetaBytes-4 582.6n 572.6n ~ 0.100

Threshold: >10% with p < 0.05 | Generated: 2026-05-14 13:27 UTC

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

stores/utxo/aerospike/pruner/pruner_service.go:590

  • Chunk flushing still depends on s.chunkSize being > 0 (if len(chunk) >= s.chunkSize). If chunkSize is ever configured as 0/negative (you already handle <=0 for read-ahead sizing), this condition becomes always true and will submit a goroutine per record, which can overwhelm the system. Consider validating/normalizing chunkSize once (e.g., during service init) so both chunking and RecordQueueSize have a safe minimum > 0.
		chunk = append(chunk, rec)
		if len(chunk) >= s.chunkSize {
			submitChunk(chunk)
			chunk = make([]*aerospike.Result, 0, s.chunkSize)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if txIDBytes, ok := rec.Record.Bins[s.fieldTxID].([]byte); ok && len(txIDBytes) == 32 {
var h chainhash.Hash
copy(h[:], txIDBytes)
prunedSet.Add(h)

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prunedSet is dereferenced without a nil check (prunedSet.Add(h)). Since the signature allows prunedSet *PrunedTxSet, a nil value would panic here (while later usage in processRecordChunk is guarded). Either make prunedSet required or guard before calling Add.

Suggested change
prunedSet.Add(h)
if prunedSet != nil {
prunedSet.Add(h)
}

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch -- added the nil guard around prunedSet.Add(h) to match the existing nil checks in processRecordChunk.

Comment on lines +543 to +547
}
}

pipeline <- rec
}

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reader goroutine can block forever on pipeline <- rec if the processor stops reading (e.g., ctx cancellation). Because partitionWorker waits on readerDone, this can deadlock shutdown. Make the send cancellable (e.g., select on ctx.Done() while sending) and ensure the reader can also exit promptly if the recordset stalls.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed -- the bare channel send could deadlock on shutdown. Wrapped pipeline <- rec in a select with ctx.Done(), and also made the recordset.Results() recv cancellable via select so the reader can exit promptly even if the Aerospike recordset stalls.

for rec := range pipeline {
select {
case <-ctx.Done():
readerDone.Wait()

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On ctx cancellation, the processor path waits on readerDone without first ensuring the reader is unblocked from recordset.Results() or a blocked send to pipeline. This can deadlock cancellation if the Aerospike recordset is stalled or if the pipeline buffer is full. Consider creating a derived context for the worker and calling its cancel/closing the recordset before waiting, so the reader goroutine is guaranteed to exit.

Suggested change
readerDone.Wait()

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about the shutdown deadlock. The processor now calls recordset.Close() on ctx cancellation before waiting on readerDone, which unblocks the reader from both recordset.Results() and the pipeline send. The reader's recv is also now wrapped in a select with ctx.Done() for belt-and-suspenders safety.

…ional

The fan-out case (one parent with multiple children in the same block)
is extremely rare in the transaction-blaster workload. The remove-on-first-check
design keeps the map self-managing at ~32MB peak. For the rare fan-out case,
subsequent children send the parent update normally (KEY_NOT_FOUND handled
gracefully as before).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

continue
}

keySource := uaerospike.CalculateKeySource(input.PreviousTxIDChainHash(), input.PreviousTxOutIndex, s.utxoBatchSize)

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor hot-path inefficiency: PreviousTxIDChainHash() is called twice per input (once for parentTxID, then again for CalculateKeySource). Reuse parentTxID for CalculateKeySource to avoid the extra method call/allocation in this inner loop.

Suggested change
keySource := uaerospike.CalculateKeySource(input.PreviousTxIDChainHash(), input.PreviousTxOutIndex, s.utxoBatchSize)
keySource := uaerospike.CalculateKeySource(parentTxID, input.PreviousTxOutIndex, s.utxoBatchSize)

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot — was calling PreviousTxIDChainHash() twice in the hot path unnecessarily. Now reusing the parentTxID variable for CalculateKeySource.

Comment on lines +525 to +534
// Register TXID in shared set before forwarding (if record is valid)
if rec.Err == nil && rec.Record != nil && rec.Record.Bins != nil {
if txIDBytes, ok := rec.Record.Bins[s.fieldTxID].([]byte); ok && len(txIDBytes) == 32 {
var h chainhash.Hash
copy(h[:], txIDBytes)
if prunedSet != nil {
prunedSet.Add(h)
}
}
}

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prunedSet is populated in the reader for every scanned record before the processor determines whether that record will actually be deleted (e.g., defensive mode can continue without deleting when a child is unstable). In defensive mode this can put TXIDs into prunedSet for records that remain in Aerospike, causing CheckAndRemove to incorrectly skip required parent updates later in the same run. Consider disabling this optimization when s.defensiveEnabled is true, or only registering TXIDs after the record has been confirmed for deletion (so the set represents “will be pruned”, not merely “was scanned”).

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really good catch — this is a genuine correctness issue. In defensive mode, the reader registers TXIDs before the processor decides whether to actually delete the record (child stability check can skip it). A skipped record stays in Aerospike but its TXID would be in the set, incorrectly suppressing a future parent update. Fixed by disabling the PrunedTxSet entirely when defensive mode is enabled. Defensive mode is about safety during reorgs, not throughput, so this is the right tradeoff.

}

// Shared set tracking TXIDs scanned for pruning — used to skip wasteful parent updates
prunedSet := NewPrunedTxSet(256)

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prunedSet is allocated unconditionally, but it’s only used inside the !s.skipParentUpdates path. When SkipParentUpdates is enabled, this adds avoidable per-record overhead (hash copy + sharded map insert) and retains memory with no effect. Consider only creating/passing a PrunedTxSet when parent updates are enabled (otherwise pass nil).

Suggested change
prunedSet := NewPrunedTxSet(256)
var prunedSet *PrunedTxSet
if !s.skipParentUpdates {
prunedSet = NewPrunedTxSet(256)
}

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — when skipParentUpdates is enabled, the set is never consulted so the per-record Add() overhead is pure waste. Now only allocating the PrunedTxSet when parent updates are enabled (and defensive mode is off). Passes nil otherwise, which the reader and processor already guard against.

freemans13 and others added 5 commits March 26, 2026 19:05
- Reuse parentTxID variable instead of calling PreviousTxIDChainHash() twice
- Only allocate PrunedTxSet when parent updates are enabled (nil when skipParentUpdates)
- Disable PrunedTxSet in defensive mode to prevent incorrect parent update skips
  when records are scanned but not deleted (child stability check fails)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sonarqubecloud

sonarqubecloud Bot commented May 6, 2026

Copy link
Copy Markdown

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

connectionPoolWarningThreshold float64 // Threshold for connection pool auto-adjustment (0.0-1.0)
utxoSetTTL bool // Use TTL expiration instead of hard delete
partitionWorkerFn func(ctx context.Context, blockHeight uint32, partitionStart int, partitionCount int) (int64, int64, error)
skipParentUpdates bool // Skip parent update operations and input fetching

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — confirmed dead code. skipParentUpdates was never initialized in NewService and no setting feeds it. Removed the field and simplified the conditional to if !s.defensiveEnabled { prunedSet = NewPrunedTxSet(256) }. The comment was misleading too, so it's gone with the field.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the dead skipParentUpdates field in b72cff7 — the gate at the only usage site is now simply !s.defensiveEnabled, matching the actual semantics ("only allocate the dedup set when defensive mode is off"). The field had no setter, no settings binding, and the surviving boolean expression contributed no information.

Comment on lines +991 to +1000
// Accumulate parent updates, skipping parents already pruned in this session
for _, input := range inputs {
keySource := uaerospike.CalculateKeySource(input.PreviousTxIDChainHash(), input.PreviousTxOutIndex, s.utxoBatchSize)
// Check if parent TX was already pruned — if so, skip the update
parentTxID := input.PreviousTxIDChainHash()
if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) {
prometheusUtxoParentsSkippedPruned.Inc()
continue
}

keySource := uaerospike.CalculateKeySource(parentTxID, input.PreviousTxOutIndex, s.utxoBatchSize)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TestProcessRecordChunk_SkipsParentsInPrunedSet and TestProcessRecordChunk_EmptyPrunedSetIsNoOp in pruned_set_skip_test.go. The first verifies that when both parents of a child's inputs are in PrunedTxSet, processRecordChunk accumulates zero parent updates and utxo_pruner_parents_skipped_pruned_total increments once per input. The second verifies an empty set is a no-op. Both run as pure unit tests (no testcontainers) because flushCleanupBatches short-circuits on empty parentUpdates and SkipDeletions gates the deletion path.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TestProcessRecordChunk_SkipsParentsInPrunedSet and TestProcessRecordChunk_EmptyPrunedSetIsNoOp in stores/utxo/aerospike/pruner/pruned_set_skip_test.go (b72cff7). The first test pre-populates the shared PrunedTxSet with both parent TXIDs of a synthetic child record, drives processRecordChunk directly, and asserts that utxo_pruner_parents_skipped_pruned_total increments by exactly 2 — once per input — while the parent-update map stays empty so flushCleanupBatches never touches Aerospike. The second test confirms no metric movement when the set is empty (coinbase-style input list). Both pass under go test -race.

- Remove dead skipParentUpdates field; simplify prunedSet allocation to
  gate on !defensiveEnabled only.
- Add TestProcessRecordChunk_SkipsParentsInPrunedSet and
  TestProcessRecordChunk_EmptyPrunedSetIsNoOp covering the parent-update
  skip path via processRecordChunk + utxo_pruner_parents_skipped_pruned_total.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
// in Aerospike.
var prunedSet *PrunedTxSet
if !s.defensiveEnabled {
prunedSet = NewPrunedTxSet(256)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Minor] Potential correctness issue with prunedSet reuse across retries

The prunedSet is allocated once and shared across all retry attempts. On retry after a timeout, the same prunedSet instance is passed to workers, but it may still contain TXIDs from successfully completed partitions or partially processed records before the timeout.

This could lead to false skips if a parent TXID remains in the set but the parent was not actually deleted due to timeout. The retry will incorrectly skip updating that parent.

Suggested fix: Create a fresh prunedSet for each retry attempt inside the retry loop, or document why reuse is safe if there is a guarantee.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question — worth documenting. Reuse across retries is intentional, and the correctness argument hinges on the fact that prunedSet is only ever non-nil when defensive mode is OFF. In that mode the deletedChildren bin update being skipped has no behavioural consequence (it's only consulted by the defensive safety check). Add is idempotent so re-scanning a timed-out partition is safe, and CheckAndRemove being destructive is the documented bound on the set's memory footprint. Reallocating per attempt would break the cross-partition dedup that's the whole point of the optimisation. I've extended the comment block at the allocation site to spell this out.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

Address Claude Code Review concern about cross-retry reuse: the set is
intentionally shared because (1) Add is idempotent, (2) CheckAndRemove
being destructive is fine — at worst one wasted parent update, never a
correctness issue, and (3) the suppressed deletedChildren bin update is
only consulted in defensive mode, which is the only mode where prunedSet
is nil. Reallocating per attempt would break the cross-partition dedup
that drove this optimisation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

@oskarszoon oskarszoon left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve in principle — PrunedTxSet + the SkipParentUpdates gate is the right shape. Two questions before merge plus a small ask:

  • SkipParentUpdates=true + file-store deployment: pruner_service.go:947-949 classifies all external-store entries as pruned. Is that intentional (file-store entries are externally managed so the parent-update gate is irrelevant), or do file-store nodes need a separate handling path? A one-line guard either way would make the intent unambiguous.
  • PrunedTxSet is sharded into 256 buckets. Was that count picked against observed pruner concurrency, or default? On low-concurrency deployments, over-sharding mostly adds GC pressure without throughput benefit — happy if 256 is measured, worth tuning if not.
  • The new prometheusUtxoParentsUpdated/Skipped counters expose the gate behaviour but the PR doesn't note an expected baseline skip-percentage for operators. A sentence in the longdesc ("expect ~X% skipped after warmup") would make the metric actionable.

@freemans13 freemans13 requested a review from ordishs May 14, 2026 10:36

@ordishs ordishs left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve with comments

Core design is sound — gating on defensiveEnabled is the right invariant, and the retry-reuse comment in PruneWithPartitions is a model for documenting subtle concurrency contracts. Two-stage reader/processor pipeline handles cancellation correctly and won't leak goroutines (the processor always drains until the reader closes the channel, so every readerDone.Wait() is non-blocking by the time it executes).

Strengths

  • Clean module isolation: PrunedTxSet is a small, well-named API in its own file.
  • Defensive-mode gating is correct: prunedSet is nil whenever the deletedChildren bin is consulted, so skipped parent updates never affect a code path that reads the bin.
  • Solid concurrency test: TestPrunedTxSet_ConcurrentAccess (100×1000 ops) under -race, plus a chain-simulation test that exercises the actual usage pattern.
  • Cancellation is wired through both selects in the reader.

Suggested cleanups (non-blocking)

1. assert vs require in pruned_tx_set_test.go
Project rule (.claude/rules/testing.md): "Use require from testify (not assert)". pruned_set_skip_test.go follows this; pruned_tx_set_test.go uses assert throughout. Mechanical replace.

2. Unbounded set growth under non-chain workloads
The "~32 MB / ~400K dangling entries" estimate is workload-specific (tight chains → peak dangling = chain count, internal nodes get consumed). For a workload where most parents are from prior blocks, every TXID Added stays dangling, and Len() approaches N (total TXs pruned per session). At 10M records that's ~400 MB. Worth either:

  • A sentence in the PruneWithPartitions comment acknowledging the unbounded-in-N memory cost, or
  • A soft cap (e.g. degrade to no-op once Len() > some_threshold) for safety on workloads that don't match the dev-scale chain topology.

3. math.Min(float64(readAheadBase*2), 10000) (pruner_service.go:489)
Go 1.21+ builtin min(readAheadBase*2, 10000) — same behavior, no float round-trip.

4. mask uint8 is tightly coupled to the 256 shard cap (pruned_tx_set.go:13,42)
Works only because n is capped at 256. If anyone later raises the cap, this becomes a silent bug. Either widen to uint32 (no perf cost — still h[0] & mask) or add a comment tying the type to the cap.

5. Reader ordering with rec.Err != nil
The reader Adds the TXID to the set before checking rec.Err, but the Add is already gated on rec.Err == nil at line ~521, so this is fine — just worth a glance to confirm no test depended on the prior chunk-ordering of error records.

6. Hash distribution by first byte (pruned_tx_set.go:42)
Fine for SHA-256-derived TXIDs (uniform first byte). A line in the docstring noting "relies on uniform hash distribution; not safe for non-cryptographic keys" would protect future reuse.

7. Cross-worker dedup isn't directly tested
Set semantics and in-chunk skip path are covered; the cross-worker race (the motivation for the two-stage pipeline) is only validated via the deployed metric. Not blocking — hard to test deterministically — but worth flagging that the dev-scale utxo_pruner_parents_skipped_pruned_total check is the real correctness signal.

Correctness verification

Re-checked the retry edge case: parent's deletion times out in attempt 1, child already CheckAndRemoved it. Parent's deletedChildren bin is left slightly stale on retry. Safe because the only consumer of that bin is the defensive-mode verification (processRecordChunk Step 1), and defensive mode is off whenever prunedSet is non-nil. The PR comment captures this exactly.

LGTM after the assert → require swap and ideally a note (or cap) on worst-case memory.


Generated by Claude Code

Address review feedback from @ordishs on bsv-blockchain#628:

- Soft cap on PrunedTxSet (10M entries) so memory stays bounded on
  workloads that don't fit the tight-chain pattern. Sessions on
  dev-scale-1 prune ~500M records, so an unbounded set could grow to
  many GB if most parents live in prior blocks. Once the cap is hit,
  Add() becomes a no-op and the skip optimisation silently degrades to
  baseline — no correctness impact.
- Reuse the cap comment to document the unbounded-in-N concern in
  PruneWithPartitions, alongside the existing retry-reuse rationale.
- assert -> require in pruned_tx_set_test.go (project rule).
- math.Min -> builtin min (Go 1.21+).
- Doc the mask uint8 / 256-shard coupling and the SHA-256 first-byte
  distribution assumption on PrunedTxSet.
- Add TestPrunedTxSet_SoftCap and TestPrunedTxSet_UnlimitedWhenCapZero.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@freemans13

Copy link
Copy Markdown
Collaborator Author

Thanks for the careful read @ordishs. Walking through each point — fixed in adc7dbe:

1. assertrequire in pruned_tx_set_test.go — Fixed, mechanical replace. Good catch, somehow missed the rule on the original pass.

2. Unbounded set growth under non-chain workloads — Real problem, you nailed it. Pulled the actual numbers from dev-scale-1: pruner sessions delete ~493M records on average (sum(rate(utxo_pruner_records_deleted_total[1h])) / sum(rate(pruner_duration_seconds_count[1h]))), so my original "~32 MB / ~400K dangling" estimate was off by three orders of magnitude. At even 100 bytes/entry, a worst-case session-sized set would be ~50 GB. Added a soft cap (prunedTxSetMaxEntries = 10_000_000): once Len() reaches it, Add() becomes a silent no-op and the skip optimisation degrades to baseline for the rest of the session. No correctness impact — the cap just stops adding new entries. Comment block above the allocation site documents the cap and the unbounded-in-N concern.

3. math.Min(...) → builtin min(...) — Fixed.

4. mask uint8 ↔ 256-shard coupling — Kept the uint8 (saves a few bytes in the struct layout and matches the cap exactly) and added a comment on the field tying the type width to the 256 cap, so a future bump fails the build instead of silently truncating.

5. Reader ordering with rec.Err != nil — Re-checked, confirmed safe. The Add is gated on rec.Err == nil && rec.Record != nil && rec.Record.Bins != nil immediately before, so error records never get registered. No code change needed.

6. Hash distribution docstring — Added a paragraph on PrunedTxSet noting that sharding via h[0]&mask relies on uniform first-byte distribution and is safe for SHA-256-derived keys only.

7. Cross-worker dedup test — Agreed, deterministic test is hard given the two-stage pipeline's race-by-design. Leaving the deployed utxo_pruner_parents_skipped_pruned_total metric as the real signal — it's been confirming the skip path is firing as expected on dev-scale-1.

Re-tested: go test -race ./stores/utxo/aerospike/pruner/... → 44 pass, including new TestPrunedTxSet_SoftCap and TestPrunedTxSet_UnlimitedWhenCapZero.

@sonarqubecloud

Copy link
Copy Markdown

@freemans13 freemans13 merged commit 12b6408 into bsv-blockchain:main May 14, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants