feat(pruner): skip parent updates for already-pruned parents#628
Conversation
…ned TXIDs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ared PrunedTxSet - Add PrunedTxSet shared across partition workers to track pruned TXIDs - Refactor partitionWorker into two-stage pipeline: reader registers TXIDs into shared set via 10K-entry buffered channel, processor checks set before issuing parent updates - Skip parent updates when parent TXID found in PrunedTxSet (parent already pruned in this session) - Add prometheus counter utxo_pruner_parents_skipped_pruned_total Eliminates ~25% wasteful Aerospike writes during pruning of tight transaction chains (e.g. 400K chains of ~3750 depth in 1.5B-tx blocks from tx blasters). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
🤖 Claude Code Review Status: Complete Summary: This PR introduces a shared PrunedTxSet to eliminate wasteful parent updates during UTXO pruning. The implementation addresses the observed 25% KEY_NOT_FOUND error rate on dev-scale deployments caused by tight transaction chains. Key Design:
All Previously Reported Issues Addressed: The author has thoroughly addressed all inline comments from prior reviews:
Design Rationale Documented: The author provided clear technical justification for two intentional design choices:
No New Issues Found Code quality is high, test coverage is comprehensive, and the implementation aligns with CLAUDE.md requirements. |
There was a problem hiding this comment.
Pull request overview
This PR optimizes the Aerospike UTXO pruner to avoid wasteful parent-record updates when the parent transaction has already been pruned earlier in the same pruning session, targeting high-rate workloads that currently produce many KEY_NOT_FOUND write attempts.
Changes:
- Adds a shared concurrent
PrunedTxSetto track TXIDs scanned for pruning and skip parent updates when the parent is already pruned. - Refactors
partitionWorkerinto a two-stage reader/processor pipeline to register TXIDs ahead of chunk processing. - Introduces a new Prometheus counter to track “skipped because already pruned” parent updates, and adds unit tests for
PrunedTxSet.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
stores/utxo/aerospike/pruner/pruner_service.go |
Wires shared PrunedTxSet through partition workers, introduces reader/processor pipeline, and skips parent updates for already-pruned parents with a new metric. |
stores/utxo/aerospike/pruner/pruned_tx_set.go |
Adds a sharded concurrent set implementation for tracking pruned TXIDs. |
stores/utxo/aerospike/pruner/pruned_tx_set_test.go |
Adds unit tests for PrunedTxSet behavior and concurrency. |
stores/utxo/aerospike/pruner/partition_retry_test.go |
Updates retry tests to match the new worker signature and initializes the new Prometheus counter in test setup. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| chunk := make([]*aerospike.Result, 0, s.chunkSize) | ||
| // Two-stage pipeline: reader registers TXIDs in shared set before processor handles them | ||
| // This eliminates cross-worker race conditions for parent-update skipping | ||
| const readAheadBuffer = 10000 |
There was a problem hiding this comment.
readAheadBuffer is hard-coded to 10,000 records per partition worker. Since each aerospike.Result holds the full record bins, this can significantly increase peak memory (buffered results in pipeline + Aerospike RecordQueueSize + in-flight chunks), potentially undermining the existing chunking/memory-safety rationale (default chunkSize is 1000). Consider making the buffer size configurable and/or deriving it from s.chunkSize with a conservative cap so memory use scales predictably with settings.
| const readAheadBuffer = 10000 | |
| // Derive read-ahead buffer size from chunkSize with a conservative cap so memory scales predictably. | |
| readAheadBase := s.chunkSize | |
| if readAheadBase <= 0 { | |
| readAheadBase = 1000 // fall back to default chunk size if unset | |
| } | |
| // Allow modest read-ahead, but cap to avoid excessive buffered records. | |
| readAheadBuffer := int(math.Min(float64(readAheadBase*2), 10000)) | |
| if readAheadBuffer < 1 { | |
| readAheadBuffer = 1 | |
| } |
There was a problem hiding this comment.
Good catch — the hardcoded 10K buffer was an oversight. Now deriving it from chunkSize with a 2x multiplier capped at 10K, so memory scales predictably with the existing settings. Applied.
| // Accumulate parent updates, skipping parents already pruned in this session | ||
| for _, input := range inputs { | ||
| // Check if parent TX was already pruned — if so, skip the update | ||
| parentTxID := input.PreviousTxIDChainHash() | ||
| if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) { |
There was a problem hiding this comment.
prunedSet.CheckAndRemove removes the parent TXID from the set on the first child/input that encounters it. A single parent transaction can have multiple outputs spent by different transactions being pruned in the same session; after the first skip, subsequent inputs referencing the same parent TXID will no longer be skipped and will still attempt the Aerospike update (likely reintroducing KEY_NOT_FOUND churn for fan-out/consolidation patterns). If the intent is to skip all updates once a parent TX is pruned, use a non-destructive lookup (or track a refcount / delayed eviction) instead of removing on first match.
| // Accumulate parent updates, skipping parents already pruned in this session | |
| for _, input := range inputs { | |
| // Check if parent TX was already pruned — if so, skip the update | |
| parentTxID := input.PreviousTxIDChainHash() | |
| if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) { | |
| // Accumulate parent updates, skipping parents already pruned in this session. | |
| // Use a local map so that once a parent TX is treated as pruned, all subsequent | |
| // inputs referencing the same parent are also skipped, even though | |
| // prunedSet.CheckAndRemove is destructive. | |
| prunedParentsSeen := make(map[string]struct{}) | |
| for _, input := range inputs { | |
| // Check if parent TX was already pruned — if so, skip the update | |
| parentTxID := input.PreviousTxIDChainHash() | |
| parentTxIDStr := parentTxID.String() | |
| // First, consult the local session cache | |
| if _, alreadySeen := prunedParentsSeen[parentTxIDStr]; alreadySeen { | |
| prometheusUtxoParentsSkippedPruned.Inc() | |
| continue | |
| } | |
| // Then, consult the shared pruned set (which may remove on first match) | |
| if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) { | |
| prunedParentsSeen[parentTxIDStr] = struct{}{} |
There was a problem hiding this comment.
Great observation — you're right that a parent TX with multiple outputs spent in the same session would only get skipped on the first input. Added a local prunedParentsSeen map that caches the result of CheckAndRemove, so all subsequent inputs referencing the same pruned parent are also skipped without hitting the shared set again.
There was a problem hiding this comment.
On reflection, I've reverted this change. The CheckAndRemove behaviour is intentional — tightly chained transactions are the primary cause of the 25% wasteful Aerospike updates we observed on dev-scale, and these chains almost never have fan-out (each parent has one child spending it within the same block). The remove-on-first-check keeps the map self-managing at ~32MB peak instead of growing monotonically. For the rare fan-out case, the second child simply sends the parent update as normal — which may get KEY_NOT_FOUND, handled gracefully as before. This is a deliberate compromise that addresses 99% of what we observe in production without adding complexity for edge cases.
| sh.m[h] = struct{}{} | ||
| sh.mu.Unlock() | ||
| s.count.Add(1) |
There was a problem hiding this comment.
PrunedTxSet.Add increments count unconditionally even if the TXID is already present in the shard map. This makes Len() drift upward on duplicate adds (which can happen across retries or if the same TXID is observed twice), and CheckAndRemove will only decrement once. Update Add to only increment when inserting a new key (check _, exists := sh.m[h] under the shard lock), and add a small unit test for duplicate-add behavior to prevent regressions.
| sh.m[h] = struct{}{} | |
| sh.mu.Unlock() | |
| s.count.Add(1) | |
| _, exists := sh.m[h] | |
| if !exists { | |
| sh.m[h] = struct{}{} | |
| } | |
| sh.mu.Unlock() | |
| if !exists { | |
| s.count.Add(1) | |
| } |
There was a problem hiding this comment.
You're right — duplicate adds would cause Len() to drift upward since the count incremented unconditionally. Fixed Add to check for existence under the shard lock and only increment when inserting a new key. Also added a TestPrunedTxSet_DuplicateAdd test to prevent regressions.
- Derive readAheadBuffer from chunkSize (2x capped at 10K) instead of hardcoded 10K - Add local prunedParentsSeen map to handle fan-out parents with multiple spent outputs - Fix duplicate-add count drift in PrunedTxSet.Add with existence check - Add TestPrunedTxSet_DuplicateAdd regression test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-14 13:27 UTC |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
stores/utxo/aerospike/pruner/pruner_service.go:590
- Chunk flushing still depends on
s.chunkSizebeing > 0 (if len(chunk) >= s.chunkSize). IfchunkSizeis ever configured as 0/negative (you already handle<=0for read-ahead sizing), this condition becomes always true and will submit a goroutine per record, which can overwhelm the system. Consider validating/normalizingchunkSizeonce (e.g., during service init) so both chunking andRecordQueueSizehave a safe minimum > 0.
chunk = append(chunk, rec)
if len(chunk) >= s.chunkSize {
submitChunk(chunk)
chunk = make([]*aerospike.Result, 0, s.chunkSize)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if txIDBytes, ok := rec.Record.Bins[s.fieldTxID].([]byte); ok && len(txIDBytes) == 32 { | ||
| var h chainhash.Hash | ||
| copy(h[:], txIDBytes) | ||
| prunedSet.Add(h) |
There was a problem hiding this comment.
prunedSet is dereferenced without a nil check (prunedSet.Add(h)). Since the signature allows prunedSet *PrunedTxSet, a nil value would panic here (while later usage in processRecordChunk is guarded). Either make prunedSet required or guard before calling Add.
| prunedSet.Add(h) | |
| if prunedSet != nil { | |
| prunedSet.Add(h) | |
| } |
There was a problem hiding this comment.
Good catch -- added the nil guard around prunedSet.Add(h) to match the existing nil checks in processRecordChunk.
| } | ||
| } | ||
|
|
||
| pipeline <- rec | ||
| } |
There was a problem hiding this comment.
The reader goroutine can block forever on pipeline <- rec if the processor stops reading (e.g., ctx cancellation). Because partitionWorker waits on readerDone, this can deadlock shutdown. Make the send cancellable (e.g., select on ctx.Done() while sending) and ensure the reader can also exit promptly if the recordset stalls.
There was a problem hiding this comment.
Agreed -- the bare channel send could deadlock on shutdown. Wrapped pipeline <- rec in a select with ctx.Done(), and also made the recordset.Results() recv cancellable via select so the reader can exit promptly even if the Aerospike recordset stalls.
| for rec := range pipeline { | ||
| select { | ||
| case <-ctx.Done(): | ||
| readerDone.Wait() |
There was a problem hiding this comment.
On ctx cancellation, the processor path waits on readerDone without first ensuring the reader is unblocked from recordset.Results() or a blocked send to pipeline. This can deadlock cancellation if the Aerospike recordset is stalled or if the pipeline buffer is full. Consider creating a derived context for the worker and calling its cancel/closing the recordset before waiting, so the reader goroutine is guaranteed to exit.
| readerDone.Wait() |
There was a problem hiding this comment.
Good point about the shutdown deadlock. The processor now calls recordset.Close() on ctx cancellation before waiting on readerDone, which unblocks the reader from both recordset.Results() and the pipeline send. The reader's recv is also now wrapped in a select with ctx.Done() for belt-and-suspenders safety.
…ional The fan-out case (one parent with multiple children in the same block) is extremely rare in the transaction-blaster workload. The remove-on-first-check design keeps the map self-managing at ~32MB peak. For the rare fan-out case, subsequent children send the parent update normally (KEY_NOT_FOUND handled gracefully as before). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| continue | ||
| } | ||
|
|
||
| keySource := uaerospike.CalculateKeySource(input.PreviousTxIDChainHash(), input.PreviousTxOutIndex, s.utxoBatchSize) |
There was a problem hiding this comment.
Minor hot-path inefficiency: PreviousTxIDChainHash() is called twice per input (once for parentTxID, then again for CalculateKeySource). Reuse parentTxID for CalculateKeySource to avoid the extra method call/allocation in this inner loop.
| keySource := uaerospike.CalculateKeySource(input.PreviousTxIDChainHash(), input.PreviousTxOutIndex, s.utxoBatchSize) | |
| keySource := uaerospike.CalculateKeySource(parentTxID, input.PreviousTxOutIndex, s.utxoBatchSize) |
There was a problem hiding this comment.
Good spot — was calling PreviousTxIDChainHash() twice in the hot path unnecessarily. Now reusing the parentTxID variable for CalculateKeySource.
| // Register TXID in shared set before forwarding (if record is valid) | ||
| if rec.Err == nil && rec.Record != nil && rec.Record.Bins != nil { | ||
| if txIDBytes, ok := rec.Record.Bins[s.fieldTxID].([]byte); ok && len(txIDBytes) == 32 { | ||
| var h chainhash.Hash | ||
| copy(h[:], txIDBytes) | ||
| if prunedSet != nil { | ||
| prunedSet.Add(h) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
prunedSet is populated in the reader for every scanned record before the processor determines whether that record will actually be deleted (e.g., defensive mode can continue without deleting when a child is unstable). In defensive mode this can put TXIDs into prunedSet for records that remain in Aerospike, causing CheckAndRemove to incorrectly skip required parent updates later in the same run. Consider disabling this optimization when s.defensiveEnabled is true, or only registering TXIDs after the record has been confirmed for deletion (so the set represents “will be pruned”, not merely “was scanned”).
There was a problem hiding this comment.
Really good catch — this is a genuine correctness issue. In defensive mode, the reader registers TXIDs before the processor decides whether to actually delete the record (child stability check can skip it). A skipped record stays in Aerospike but its TXID would be in the set, incorrectly suppressing a future parent update. Fixed by disabling the PrunedTxSet entirely when defensive mode is enabled. Defensive mode is about safety during reorgs, not throughput, so this is the right tradeoff.
| } | ||
|
|
||
| // Shared set tracking TXIDs scanned for pruning — used to skip wasteful parent updates | ||
| prunedSet := NewPrunedTxSet(256) |
There was a problem hiding this comment.
prunedSet is allocated unconditionally, but it’s only used inside the !s.skipParentUpdates path. When SkipParentUpdates is enabled, this adds avoidable per-record overhead (hash copy + sharded map insert) and retains memory with no effect. Consider only creating/passing a PrunedTxSet when parent updates are enabled (otherwise pass nil).
| prunedSet := NewPrunedTxSet(256) | |
| var prunedSet *PrunedTxSet | |
| if !s.skipParentUpdates { | |
| prunedSet = NewPrunedTxSet(256) | |
| } |
There was a problem hiding this comment.
Agreed — when skipParentUpdates is enabled, the set is never consulted so the per-record Add() overhead is pure waste. Now only allocating the PrunedTxSet when parent updates are enabled (and defensive mode is off). Passes nil otherwise, which the reader and processor already guard against.
- Reuse parentTxID variable instead of calling PreviousTxIDChainHash() twice - Only allocate PrunedTxSet when parent updates are enabled (nil when skipParentUpdates) - Disable PrunedTxSet in defensive mode to prevent incorrect parent update skips when records are scanned but not deleted (child stability check fails) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…uned-parent-updates
…uned-parent-updates
…uned-parent-updates
|
…uned-parent-updates
| connectionPoolWarningThreshold float64 // Threshold for connection pool auto-adjustment (0.0-1.0) | ||
| utxoSetTTL bool // Use TTL expiration instead of hard delete | ||
| partitionWorkerFn func(ctx context.Context, blockHeight uint32, partitionStart int, partitionCount int) (int64, int64, error) | ||
| skipParentUpdates bool // Skip parent update operations and input fetching |
There was a problem hiding this comment.
Good catch — confirmed dead code. skipParentUpdates was never initialized in NewService and no setting feeds it. Removed the field and simplified the conditional to if !s.defensiveEnabled { prunedSet = NewPrunedTxSet(256) }. The comment was misleading too, so it's gone with the field.
There was a problem hiding this comment.
Removed the dead skipParentUpdates field in b72cff7 — the gate at the only usage site is now simply !s.defensiveEnabled, matching the actual semantics ("only allocate the dedup set when defensive mode is off"). The field had no setter, no settings binding, and the surviving boolean expression contributed no information.
| // Accumulate parent updates, skipping parents already pruned in this session | ||
| for _, input := range inputs { | ||
| keySource := uaerospike.CalculateKeySource(input.PreviousTxIDChainHash(), input.PreviousTxOutIndex, s.utxoBatchSize) | ||
| // Check if parent TX was already pruned — if so, skip the update | ||
| parentTxID := input.PreviousTxIDChainHash() | ||
| if prunedSet != nil && prunedSet.CheckAndRemove(*parentTxID) { | ||
| prometheusUtxoParentsSkippedPruned.Inc() | ||
| continue | ||
| } | ||
|
|
||
| keySource := uaerospike.CalculateKeySource(parentTxID, input.PreviousTxOutIndex, s.utxoBatchSize) |
There was a problem hiding this comment.
Added TestProcessRecordChunk_SkipsParentsInPrunedSet and TestProcessRecordChunk_EmptyPrunedSetIsNoOp in pruned_set_skip_test.go. The first verifies that when both parents of a child's inputs are in PrunedTxSet, processRecordChunk accumulates zero parent updates and utxo_pruner_parents_skipped_pruned_total increments once per input. The second verifies an empty set is a no-op. Both run as pure unit tests (no testcontainers) because flushCleanupBatches short-circuits on empty parentUpdates and SkipDeletions gates the deletion path.
There was a problem hiding this comment.
Added TestProcessRecordChunk_SkipsParentsInPrunedSet and TestProcessRecordChunk_EmptyPrunedSetIsNoOp in stores/utxo/aerospike/pruner/pruned_set_skip_test.go (b72cff7). The first test pre-populates the shared PrunedTxSet with both parent TXIDs of a synthetic child record, drives processRecordChunk directly, and asserts that utxo_pruner_parents_skipped_pruned_total increments by exactly 2 — once per input — while the parent-update map stays empty so flushCleanupBatches never touches Aerospike. The second test confirms no metric movement when the set is empty (coinbase-style input list). Both pass under go test -race.
- Remove dead skipParentUpdates field; simplify prunedSet allocation to gate on !defensiveEnabled only. - Add TestProcessRecordChunk_SkipsParentsInPrunedSet and TestProcessRecordChunk_EmptyPrunedSetIsNoOp covering the parent-update skip path via processRecordChunk + utxo_pruner_parents_skipped_pruned_total. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| // in Aerospike. | ||
| var prunedSet *PrunedTxSet | ||
| if !s.defensiveEnabled { | ||
| prunedSet = NewPrunedTxSet(256) |
There was a problem hiding this comment.
[Minor] Potential correctness issue with prunedSet reuse across retries
The prunedSet is allocated once and shared across all retry attempts. On retry after a timeout, the same prunedSet instance is passed to workers, but it may still contain TXIDs from successfully completed partitions or partially processed records before the timeout.
This could lead to false skips if a parent TXID remains in the set but the parent was not actually deleted due to timeout. The retry will incorrectly skip updating that parent.
Suggested fix: Create a fresh prunedSet for each retry attempt inside the retry loop, or document why reuse is safe if there is a guarantee.
There was a problem hiding this comment.
Good question — worth documenting. Reuse across retries is intentional, and the correctness argument hinges on the fact that prunedSet is only ever non-nil when defensive mode is OFF. In that mode the deletedChildren bin update being skipped has no behavioural consequence (it's only consulted by the defensive safety check). Add is idempotent so re-scanning a timed-out partition is safe, and CheckAndRemove being destructive is the documented bound on the set's memory footprint. Reallocating per attempt would break the cross-partition dedup that's the whole point of the optimisation. I've extended the comment block at the allocation site to spell this out.
Address Claude Code Review concern about cross-retry reuse: the set is intentionally shared because (1) Add is idempotent, (2) CheckAndRemove being destructive is fine — at worst one wasted parent update, never a correctness issue, and (3) the suppressed deletedChildren bin update is only consulted in defensive mode, which is the only mode where prunedSet is nil. Reallocating per attempt would break the cross-partition dedup that drove this optimisation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uned-parent-updates
oskarszoon
left a comment
There was a problem hiding this comment.
Approve in principle — PrunedTxSet + the SkipParentUpdates gate is the right shape. Two questions before merge plus a small ask:
SkipParentUpdates=true+ file-store deployment:pruner_service.go:947-949classifies all external-store entries aspruned. Is that intentional (file-store entries are externally managed so the parent-update gate is irrelevant), or do file-store nodes need a separate handling path? A one-line guard either way would make the intent unambiguous.PrunedTxSetis sharded into 256 buckets. Was that count picked against observed pruner concurrency, or default? On low-concurrency deployments, over-sharding mostly adds GC pressure without throughput benefit — happy if 256 is measured, worth tuning if not.- The new
prometheusUtxoParentsUpdated/Skippedcounters expose the gate behaviour but the PR doesn't note an expected baseline skip-percentage for operators. A sentence in the longdesc ("expect ~X% skipped after warmup") would make the metric actionable.
ordishs
left a comment
There was a problem hiding this comment.
Approve with comments
Core design is sound — gating on defensiveEnabled is the right invariant, and the retry-reuse comment in PruneWithPartitions is a model for documenting subtle concurrency contracts. Two-stage reader/processor pipeline handles cancellation correctly and won't leak goroutines (the processor always drains until the reader closes the channel, so every readerDone.Wait() is non-blocking by the time it executes).
Strengths
- Clean module isolation:
PrunedTxSetis a small, well-named API in its own file. - Defensive-mode gating is correct:
prunedSetisnilwhenever thedeletedChildrenbin is consulted, so skipped parent updates never affect a code path that reads the bin. - Solid concurrency test:
TestPrunedTxSet_ConcurrentAccess(100×1000 ops) under-race, plus a chain-simulation test that exercises the actual usage pattern. - Cancellation is wired through both selects in the reader.
Suggested cleanups (non-blocking)
1. assert vs require in pruned_tx_set_test.go
Project rule (.claude/rules/testing.md): "Use require from testify (not assert)". pruned_set_skip_test.go follows this; pruned_tx_set_test.go uses assert throughout. Mechanical replace.
2. Unbounded set growth under non-chain workloads
The "~32 MB / ~400K dangling entries" estimate is workload-specific (tight chains → peak dangling = chain count, internal nodes get consumed). For a workload where most parents are from prior blocks, every TXID Added stays dangling, and Len() approaches N (total TXs pruned per session). At 10M records that's ~400 MB. Worth either:
- A sentence in the PruneWithPartitions comment acknowledging the unbounded-in-N memory cost, or
- A soft cap (e.g. degrade to no-op once
Len() > some_threshold) for safety on workloads that don't match the dev-scale chain topology.
3. math.Min(float64(readAheadBase*2), 10000) (pruner_service.go:489)
Go 1.21+ builtin min(readAheadBase*2, 10000) — same behavior, no float round-trip.
4. mask uint8 is tightly coupled to the 256 shard cap (pruned_tx_set.go:13,42)
Works only because n is capped at 256. If anyone later raises the cap, this becomes a silent bug. Either widen to uint32 (no perf cost — still h[0] & mask) or add a comment tying the type to the cap.
5. Reader ordering with rec.Err != nil
The reader Adds the TXID to the set before checking rec.Err, but the Add is already gated on rec.Err == nil at line ~521, so this is fine — just worth a glance to confirm no test depended on the prior chunk-ordering of error records.
6. Hash distribution by first byte (pruned_tx_set.go:42)
Fine for SHA-256-derived TXIDs (uniform first byte). A line in the docstring noting "relies on uniform hash distribution; not safe for non-cryptographic keys" would protect future reuse.
7. Cross-worker dedup isn't directly tested
Set semantics and in-chunk skip path are covered; the cross-worker race (the motivation for the two-stage pipeline) is only validated via the deployed metric. Not blocking — hard to test deterministically — but worth flagging that the dev-scale utxo_pruner_parents_skipped_pruned_total check is the real correctness signal.
Correctness verification
Re-checked the retry edge case: parent's deletion times out in attempt 1, child already CheckAndRemoved it. Parent's deletedChildren bin is left slightly stale on retry. Safe because the only consumer of that bin is the defensive-mode verification (processRecordChunk Step 1), and defensive mode is off whenever prunedSet is non-nil. The PR comment captures this exactly.
LGTM after the assert → require swap and ideally a note (or cap) on worst-case memory.
Generated by Claude Code
Address review feedback from @ordishs on bsv-blockchain#628: - Soft cap on PrunedTxSet (10M entries) so memory stays bounded on workloads that don't fit the tight-chain pattern. Sessions on dev-scale-1 prune ~500M records, so an unbounded set could grow to many GB if most parents live in prior blocks. Once the cap is hit, Add() becomes a no-op and the skip optimisation silently degrades to baseline — no correctness impact. - Reuse the cap comment to document the unbounded-in-N concern in PruneWithPartitions, alongside the existing retry-reuse rationale. - assert -> require in pruned_tx_set_test.go (project rule). - math.Min -> builtin min (Go 1.21+). - Doc the mask uint8 / 256-shard coupling and the SHA-256 first-byte distribution assumption on PrunedTxSet. - Add TestPrunedTxSet_SoftCap and TestPrunedTxSet_UnlimitedWhenCapZero. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Thanks for the careful read @ordishs. Walking through each point — fixed in adc7dbe: 1. 2. Unbounded set growth under non-chain workloads — Real problem, you nailed it. Pulled the actual numbers from dev-scale-1: pruner sessions delete ~493M records on average ( 3. 4. 5. Reader ordering with 6. Hash distribution docstring — Added a paragraph on 7. Cross-worker dedup test — Agreed, deterministic test is hard given the two-stage pipeline's race-by-design. Leaving the deployed Re-tested: |
|



Summary
PrunedTxSet(256-shard concurrent map) across all partition workers that tracks TXIDs as they're scanned for pruningpartitionWorkerinto a two-stage read-ahead pipeline: stage 1 reads from Aerospike and registers TXIDs into the shared set via a 10K-entry buffered channel; stage 2 processes records and checks the set before issuing parent updates — if the parent is found, the update is skippedTest plan
PrunedTxSet(add/contains, check-and-remove, length tracking, concurrent access with 100 goroutines × 1000 ops, shard distribution, chain simulation, parent-not-in-block)-raceflagTestPruneWithPartitionspartition retry tests passmake build-teranodepassesmake lintpasses with 0 issuesutxo_pruner_parents_skipped_pruned_totalcounter accounts for ~25% of parent update attemptsutxo_pruner_parents_updated_skipped_total(KEY_NOT_FOUND) drops to near zero🤖 Generated with Claude Code