Skip to content

fix: wait for BlockValidation to process invalid moveBack blocks before loading unmined txs#691

Merged
freemans13 merged 5 commits into
bsv-blockchain:mainfrom
freemans13:fix/reset-invalid-moveback-txs-unmined
Apr 16, 2026
Merged

fix: wait for BlockValidation to process invalid moveBack blocks before loading unmined txs#691
freemans13 merged 5 commits into
bsv-blockchain:mainfrom
freemans13:fix/reset-invalid-moveback-txs-unmined

Conversation

@freemans13

@freemans13 freemans13 commented Apr 13, 2026

Copy link
Copy Markdown
Collaborator

Summary

During a reorg reset, invalid moveBack blocks are handled by BlockValidation's background job (setTxMinedStatus(unsetMined=true)), which sets unmined_since on their transactions. Previously, reset() called loadUnminedTransactions without waiting for this background processing to complete, so those txs still had unmined_since=NULL and were invisible to the unmined iterator.

Changes:

  • Add waitForBlockMinedSet — polls until BlockValidation has finished processing each invalid moveBack block (exponential backoff, context-cancellable)
  • Call it for each invalid moveBack block before loadUnminedTransactions runs
  • On context cancellation: return error. On retries exhausted: proceed with warning (txs may be recovered on subsequent resets)
  • Short-circuit on non-retriable errors (ErrBlockNotFound)

Test plan

  • Existing block assembly tests pass
  • make lint — 0 issues
  • Deploy to teratestnet and verify reset correctly recovers txs from invalid moveBack blocks

🤖 Generated with Claude Code

…during reset

After InvalidateBlock, reset() skipped invalid moveBack blocks assuming
BlockValidation's async setTxMinedStatus(unsetMined=true) would handle
them. But that runs asynchronously via a BlockMinedUnset notification
and may not have completed when reset() calls loadUnminedTransactions().

The result: transactions from invalidated blocks retain unmined_since=NULL
(mined state) in the UTXO store, so loadUnminedTransactions() misses them
and they are silently lost from block assembly.

Fix: remove the invalid block skip — MarkTransactionsOnLongestChain is
idempotent, so double-marking is safe and ensures txs are recoverable.

Fixes TestLongestChainPostgres/invalid_block which regressed in PR bsv-blockchain#679.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

Issues Found:

  1. [Major] Related documentation misleads about mined_set semantics (stores/blockchain/sql/GetBlockIsMined.go - not in PR)

    The PR uses GetBlockIsMined to wait for BlockValidation processing, but the existing documentation for this method incorrectly describes mined_set as indicating "a block was mined by the local node" or "created by the local mining process".

    Actual semantics: mined_set=true means BlockValidation has completed setTxMinedStatus processing for that block, NOT that the block was mined locally. This flag is set for ALL blocks (locally mined AND network-received) after processing.

    Why this matters for this PR: The PR correctly uses GetBlockIsMined to wait for BlockValidation to finish processing invalid blocks from the network, which proves the flag is NOT about mining origin. Future developers reading the GetBlockIsMined docs might be confused about why this PR uses it for network blocks.

    Evidence:

    • BlockValidation.go:1014 calls SetBlockMinedSet after processing ANY block
    • blockchain_reference.md:325 correctly describes it as a processing completion flag
    • This PR (line 412-413) correctly explains it as waiting for setTxMinedStatus completion

    Consider filing a follow-up issue to fix the misleading documentation in GetBlockIsMined.go.

PR Code Quality: Implementation is solid - proper error handling, exponential backoff with max cap, short-circuit for non-retriable errors, and clear comments about best-effort semantics. The logic correctly addresses the race condition between reset and BlockValidation async processing.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a reset-time reorg edge case in block assembly where transactions from invalidated (moveBack) blocks could remain marked as mined (unmined_since=NULL) due to asynchronous BlockValidation timing, causing them to be missed by loadUnminedTransactions() and dropped from assembly.

Changes:

  • Stop skipping meta.Invalid moveBack blocks when collecting txs to mark as unmined during BlockAssembler.reset().
  • Add inline rationale explaining the async timing risk and why marking here is safe.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// a BlockMinedUnset notification and may not have completed yet. If we
// skip, loadUnminedTransactions() won't find these txs (unmined_since
// still NULL) and they'll be silently lost from block assembly.
// MarkTransactionsOnLongestChain is idempotent, so double-marking is safe.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — "idempotent" was misleading. Reworded the comment to clarify that it's safe for correctness (ensures unmined_since is non-NULL) even though repeated calls may bump the height value.

@github-actions

github-actions Bot commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-691 (ce972d2)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 142
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.691µ 1.907µ ~ 0.100
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 61.87n 61.49n ~ 0.100
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 61.39n 61.48n ~ 0.400
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 61.49n 61.62n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 29.98n 31.24n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 52.52n 51.60n ~ 0.700
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 108.8n 106.5n ~ 0.200
MiningCandidate_Stringify_Short-4 266.3n 266.7n ~ 1.000
MiningCandidate_Stringify_Long-4 1.892µ 1.906µ ~ 0.100
MiningSolution_Stringify-4 970.7n 980.6n ~ 0.100
BlockInfo_MarshalJSON-4 1.743µ 1.757µ ~ 0.100
NewFromBytes-4 128.6n 145.0n ~ 0.600
Mine_EasyDifficulty-4 57.54µ 57.65µ ~ 0.700
Mine_WithAddress-4 4.636µ 4.639µ ~ 1.000
BlockAssembler_AddTx-4 0.02677n 0.02893n ~ 0.400
AddNode-4 10.94 10.50 ~ 0.400
AddNodeWithMap-4 10.79 11.17 ~ 0.700
DirectSubtreeAdd/4_per_subtree-4 62.66n 64.72n ~ 0.200
DirectSubtreeAdd/64_per_subtree-4 29.11n 28.58n ~ 0.700
DirectSubtreeAdd/256_per_subtree-4 27.22n 27.34n ~ 0.100
DirectSubtreeAdd/1024_per_subtree-4 26.24n 26.27n ~ 0.400
DirectSubtreeAdd/2048_per_subtree-4 25.75n 25.87n ~ 0.200
SubtreeProcessorAdd/4_per_subtree-4 314.4n 314.5n ~ 1.000
SubtreeProcessorAdd/64_per_subtree-4 313.3n 312.7n ~ 1.000
SubtreeProcessorAdd/256_per_subtree-4 311.6n 313.4n ~ 1.000
SubtreeProcessorAdd/1024_per_subtree-4 311.6n 314.2n ~ 1.000
SubtreeProcessorAdd/2048_per_subtree-4 312.1n 318.2n ~ 0.200
SubtreeProcessorRotate/4_per_subtree-4 318.0n 316.9n ~ 1.000
SubtreeProcessorRotate/64_per_subtree-4 319.8n 319.3n ~ 0.700
SubtreeProcessorRotate/256_per_subtree-4 316.1n 315.5n ~ 0.600
SubtreeProcessorRotate/1024_per_subtree-4 310.5n 309.5n ~ 1.000
SubtreeNodeAddOnly/4_per_subtree-4 66.70n 66.63n ~ 1.000
SubtreeNodeAddOnly/64_per_subtree-4 39.20n 39.43n ~ 0.400
SubtreeNodeAddOnly/256_per_subtree-4 37.98n 37.94n ~ 1.000
SubtreeNodeAddOnly/1024_per_subtree-4 37.29n 37.19n ~ 0.700
SubtreeCreationOnly/4_per_subtree-4 165.4n 164.1n ~ 0.100
SubtreeCreationOnly/64_per_subtree-4 620.5n 686.5n ~ 0.400
SubtreeCreationOnly/256_per_subtree-4 1.986µ 1.993µ ~ 1.000
SubtreeCreationOnly/1024_per_subtree-4 4.796µ 5.098µ ~ 0.400
SubtreeCreationOnly/2048_per_subtree-4 8.273µ 8.224µ ~ 0.700
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 310.0n 310.0n ~ 1.000
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 320.1n 312.2n ~ 0.400
ParallelGetAndSetIfNotExists/1k_nodes-4 980.1µ 1003.2µ ~ 0.100
ParallelGetAndSetIfNotExists/10k_nodes-4 1.966m 2.013m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 8.941m 9.000m ~ 1.000
ParallelGetAndSetIfNotExists/100k_nodes-4 17.90m 17.68m ~ 0.100
SequentialGetAndSetIfNotExists/1k_nodes-4 772.0µ 773.4µ ~ 1.000
SequentialGetAndSetIfNotExists/10k_nodes-4 3.074m 3.210m ~ 0.100
SequentialGetAndSetIfNotExists/50k_nodes-4 11.19m 11.25m ~ 0.700
SequentialGetAndSetIfNotExists/100k_nodes-4 21.10m 20.87m ~ 0.200
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 1.045m 1.040m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 4.848m 4.814m ~ 0.200
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 19.55m 19.46m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 859.5µ 854.5µ ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 6.363m 6.307m ~ 0.200
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 40.58m 40.59m ~ 1.000
DiskTxMap_SetIfNotExists-4 3.621µ 3.547µ ~ 0.700
DiskTxMap_SetIfNotExists_Parallel-4 3.403µ 3.417µ ~ 0.400
DiskTxMap_ExistenceOnly-4 308.3n 306.0n ~ 1.000
Queue-4 196.0n 194.4n ~ 0.400
AtomicPointer-4 4.114n 4.272n ~ 0.700
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 885.2µ 875.5µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/10K-4 866.1µ 906.5µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 111.2µ 118.5µ ~ 0.100
ReorgOptimizations/AllMarkFalse/New/10K-4 62.48µ 62.34µ ~ 0.400
ReorgOptimizations/HashSlicePool/Old/10K-4 71.14µ 70.86µ ~ 0.700
ReorgOptimizations/HashSlicePool/New/10K-4 11.65µ 10.60µ ~ 0.100
ReorgOptimizations/NodeFlags/Old/10K-4 6.047µ 6.073µ ~ 0.800
ReorgOptimizations/NodeFlags/New/10K-4 2.267µ 2.356µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 11.39m 10.34m ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/100K-4 10.764m 9.916m ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.161m 1.177m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 680.7µ 683.2µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/100K-4 634.9µ 655.7µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 291.1µ 317.5µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/100K-4 57.17µ 56.71µ ~ 0.700
ReorgOptimizations/NodeFlags/New/100K-4 19.20µ 20.11µ ~ 0.200
TxMapSetIfNotExists-4 51.45n 51.56n ~ 1.000
TxMapSetIfNotExistsDuplicate-4 38.06n 38.54n ~ 1.000
ChannelSendReceive-4 594.3n 610.2n ~ 0.100
CalcBlockWork-4 503.0n 539.2n ~ 1.000
CalculateWork-4 677.6n 679.0n ~ 0.200
BuildBlockLocatorString_Helpers/Size_10-4 1.393µ 1.622µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_100-4 12.21µ 12.50µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_1000-4 122.0µ 123.4µ ~ 0.100
CatchupWithHeaderCache-4 104.2m 104.2m ~ 0.700
_BufferPoolAllocation/16KB-4 3.533µ 3.310µ ~ 0.700
_BufferPoolAllocation/32KB-4 7.581µ 8.629µ ~ 0.700
_BufferPoolAllocation/64KB-4 15.30µ 15.40µ ~ 0.700
_BufferPoolAllocation/128KB-4 29.59µ 27.05µ ~ 0.100
_BufferPoolAllocation/512KB-4 119.0µ 113.4µ ~ 0.100
_BufferPoolConcurrent/32KB-4 17.63µ 18.70µ ~ 0.100
_BufferPoolConcurrent/64KB-4 28.25µ 29.78µ ~ 0.700
_BufferPoolConcurrent/512KB-4 149.7µ 142.4µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/16KB-4 685.5µ 617.9µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 633.9µ 647.1µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/64KB-4 643.2µ 617.9µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/128KB-4 635.5µ 624.6µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/512KB-4 651.3µ 650.5µ ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/16KB-4 35.76m 35.10m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/32KB-4 35.49m 35.94m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/64KB-4 35.39m 35.03m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/128KB-4 35.75m 35.14m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/512KB-4 36.28m 34.70m ~ 0.100
_PooledVsNonPooled/Pooled-4 736.9n 739.2n ~ 0.400
_PooledVsNonPooled/NonPooled-4 7.117µ 8.282µ ~ 0.100
_MemoryFootprint/Current_512KB_32concurrent-4 7.564µ 7.431µ ~ 0.700
_MemoryFootprint/Proposed_32KB_32concurrent-4 11.58µ 10.14µ ~ 0.400
_MemoryFootprint/Alternative_64KB_32concurrent-4 11.55µ 10.63µ ~ 0.100
_prepareTxsPerLevel-4 404.1m 400.8m ~ 0.700
_prepareTxsPerLevelOrdered-4 3.420m 3.567m ~ 0.100
_prepareTxsPerLevel_Comparison/Original-4 411.7m 413.1m ~ 1.000
_prepareTxsPerLevel_Comparison/Optimized-4 3.445m 3.474m ~ 0.400
SubtreeSizes/10k_tx_4_per_subtree-4 1.278m 1.273m ~ 1.000
SubtreeSizes/10k_tx_16_per_subtree-4 301.7µ 299.4µ ~ 0.700
SubtreeSizes/10k_tx_64_per_subtree-4 71.62µ 70.27µ ~ 0.100
SubtreeSizes/10k_tx_256_per_subtree-4 17.80µ 17.65µ ~ 0.500
SubtreeSizes/10k_tx_512_per_subtree-4 8.989µ 8.727µ ~ 0.100
SubtreeSizes/10k_tx_1024_per_subtree-4 4.408µ 4.297µ ~ 0.700
SubtreeSizes/10k_tx_2k_per_subtree-4 2.188µ 2.177µ ~ 0.400
BlockSizeScaling/10k_tx_64_per_subtree-4 70.36µ 69.19µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 17.60µ 17.38µ ~ 0.200
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.339µ 4.275µ ~ 0.100
BlockSizeScaling/50k_tx_64_per_subtree-4 372.5µ 363.1µ ~ 0.100
BlockSizeScaling/50k_tx_256_per_subtree-4 88.22µ 88.15µ ~ 1.000
BlockSizeScaling/50k_tx_1024_per_subtree-4 22.05µ 21.53µ ~ 0.700
SubtreeAllocations/small_subtrees_exists_check-4 150.4µ 148.3µ ~ 0.100
SubtreeAllocations/small_subtrees_data_fetch-4 157.4µ 159.6µ ~ 0.100
SubtreeAllocations/small_subtrees_full_validation-4 306.7µ 309.4µ ~ 0.100
SubtreeAllocations/medium_subtrees_exists_check-4 9.012µ 8.801µ ~ 0.100
SubtreeAllocations/medium_subtrees_data_fetch-4 9.230µ 9.246µ ~ 1.000
SubtreeAllocations/medium_subtrees_full_validation-4 17.47µ 17.26µ ~ 0.400
SubtreeAllocations/large_subtrees_exists_check-4 2.130µ 2.100µ ~ 0.700
SubtreeAllocations/large_subtrees_data_fetch-4 2.206µ 2.215µ ~ 1.000
SubtreeAllocations/large_subtrees_full_validation-4 4.379µ 4.288µ ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 301.7µ 297.1µ ~ 0.100
StoreBlock_Sequential/AboveCSVHeight-4 299.3µ 297.8µ ~ 0.700
GetUtxoHashes-4 253.6n 254.4n ~ 0.400
GetUtxoHashes_ManyOutputs-4 46.92µ 45.76µ ~ 0.200
_NewMetaDataFromBytes-4 238.8n 237.0n ~ 0.700
_Bytes-4 622.9n 609.9n ~ 0.400
_MetaBytes-4 552.1n 552.9n ~ 1.000

Threshold: >10% with p < 0.05 | Generated: 2026-04-14 19:40 UTC

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a reset/reorg recovery edge case where transactions from invalidated (meta.Invalid) moveBack blocks could remain in a “mined” state (unmined_since = NULL) due to async timing, causing loadUnminedTransactions() to miss them and drop them from block assembly.

Changes:

  • Removes the meta.Invalid skip for moveBack blocks when collecting tx hashes to mark as unmined.
  • Adds clarifying inline comments explaining the async BlockMinedUnset timing hazard and why re-marking is correctness-safe.
Comments suppressed due to low confidence (1)

services/blockassembly/BlockAssembler.go:472

  • If GetSubtrees fails for a moveBack block, reset() currently logs and continues, which means those block’s tx hashes never get passed to MarkTransactionsOnLongestChain(false). Since loadUnminedTransactions() only scans txs with unmined_since set, this can permanently “lose” those txs from assembly after the reset. Consider failing the reset (returning an error) when subtree retrieval fails for any moveBack block, or otherwise ensuring txs from that block are still marked unmined so correctness isn’t dependent on subtree-store availability.
			block := blockWithMeta.block
			blockSubtrees, err := block.GetSubtrees(ctx, b.logger, b.subtreeStore, b.settings.Block.GetAndValidateSubtreesConcurrency)
			if err != nil {
				b.logger.Warnf("[BlockAssembler][Reset] error getting subtrees for moveBack block %s: %v (will skip)", block.Hash().String(), err)
				continue

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +458 to +466
// Do NOT skip invalid blocks here. BlockValidation handles them via
// setTxMinedStatus(unsetMined=true), but that runs asynchronously via
// a BlockMinedUnset notification and may not have completed yet. If we
// skip, loadUnminedTransactions() won't find these txs (unmined_since
// still NULL) and they'll be silently lost from block assembly.
// Double-marking is safe for correctness: MarkTransactionsOnLongestChain
// ensures unmined_since is non-NULL (the key invariant for
// loadUnminedTransactions), even though repeated calls may bump the
// unmined_since height value.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that a dedicated test for this scenario would be valuable. The existing TestParentTransactionNETCalculationDuringReset and the sequential-postgres TestLongestChainPostgres/invalid_block test do exercise this code path, but a focused unit test for reset() with meta.Invalid moveBack blocks asserting unmined_since is set would be a good regression guard. Will track as a follow-up.

…minedTransactions

The previous approach of removing the invalid block skip was insufficient
because GetSubtrees may fail for invalidated blocks. Instead, explicitly
wait for BlockValidation's setTxMinedStatus(unsetMined=true) to complete
for each invalid moveBack block before proceeding.

InvalidateBlock sets mined_set=false and sends a BlockMinedUnset
notification. The async setTxMinedStatus handler processes it — unsetting
mined status for the block's txs (setting unmined_since) — then sets
mined_set=true. waitForBlockMinedSet polls until mined_set=true,
guaranteeing loadUnminedTransactions will find the txs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@freemans13 freemans13 force-pushed the fix/reset-invalid-moveback-txs-unmined branch from 02e6585 to b00e13a Compare April 14, 2026 01:02
@freemans13 freemans13 changed the title fix: do not skip invalid moveBack blocks when marking txs as unmined during reset fix: reconcile SI-scan-missed parents in validateParentChain Apr 14, 2026
@freemans13 freemans13 force-pushed the fix/reset-invalid-moveback-txs-unmined branch from 91a8be8 to b00e13a Compare April 14, 2026 18:08
@freemans13 freemans13 changed the title fix: reconcile SI-scan-missed parents in validateParentChain fix: reset handling for invalid moveBack blocks and unmined tx recovery Apr 14, 2026
@freemans13 freemans13 requested a review from Copilot April 14, 2026 18:23

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens block-assembly reset behavior around reorgs involving invalid “moveBack” blocks so that transactions from those blocks are reliably recovered into the unmined pool (via unmined_since) before loadUnminedTransactions() runs.

Changes:

  • Add a reset-time wait loop for invalid moveBack blocks to allow BlockValidation’s async setTxMinedStatus(unsetMined=true) to complete before reloading unmined transactions.
  • Clarify comments around skipping invalid blocks during moveBack tx unmining (BlockValidation already handles them).
  • Introduce waitForBlockMinedSet() using the shared retry utility and BlockValidation retry settings.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +568 to +572
},
retry.WithBackoffDurationType(b.settings.BlockValidation.IsParentMinedRetryBackoffDuration),
retry.WithBackoffMultiplier(b.settings.BlockValidation.IsParentMinedRetryBackoffMultiplier),
retry.WithRetryCount(b.settings.BlockValidation.IsParentMinedRetryMaxRetry),
)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added retry.WithMessage to identify these retries in logs. Also switched from linear to exponential backoff with a 2s max cap, which should bring worst-case total wait from ~80s down to ~30s per block.

blockHash := blockWithMeta.block.Hash()
b.logger.Infof("[BlockAssembler][Reset] waiting for invalid block %s to be processed by BlockValidation", blockHash.String())
if waitErr := b.waitForBlockMinedSet(ctx, blockHash); waitErr != nil {
b.logger.Warnf("[BlockAssembler][Reset] timeout waiting for invalid block %s mined_set: %v (proceeding anyway)", blockHash.String(), waitErr)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Reworded the comment to say "best-effort wait" and added explicit handling for context cancellation — if ctx is cancelled, we now return an error instead of proceeding. For other failures (e.g., retries exhausted), we log a warning and proceed since those txs can be recovered on the next reset cycle.

Comment on lines +419 to +421
if waitErr := b.waitForBlockMinedSet(ctx, blockHash); waitErr != nil {
b.logger.Warnf("[BlockAssembler][Reset] timeout waiting for invalid block %s mined_set: %v (proceeding anyway)", blockHash.String(), waitErr)
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — changed "timeout waiting" to "gave up waiting" which covers both retry exhaustion and other error scenarios.

return true, nil
},
retry.WithBackoffDurationType(b.settings.BlockValidation.IsParentMinedRetryBackoffDuration),
retry.WithBackoffMultiplier(b.settings.BlockValidation.IsParentMinedRetryBackoffMultiplier),

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed — switched to exponential backoff (factor 2.0) with a 2s max cap. With 20ms base and 45 retries, worst-case total is now ~90s (most of which is at the 2s cap) vs the previous ~80s linear, but individual retries ramp up much faster to the useful polling interval. Also dropped the linear backoff multiplier since it's not used with exponential mode.

- Reword "must wait" to "best-effort wait" with context cancellation as hard stop
- Change "timeout" log to "gave up waiting" for accuracy
- Add retry.WithMessage for log correlation
- Switch to exponential backoff (factor 2.0, max 2s) for waitForBlockMinedSet

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves BlockAssembler.reset() reorg/reset recovery by ensuring transactions from invalid moveBack blocks have their mined status fully unset by BlockValidation before unmined transactions are reloaded, preventing stale unmined_since and missed recovery.

Changes:

  • Adds a best-effort wait during reset for invalid moveBack blocks to reach mined_set=true (i.e., BlockValidation finished setTxMinedStatus(unsetMined=true)).
  • Keeps skipping invalid moveBack blocks during explicit “mark moveBack txs unmined” processing, but now does so after the new wait to ensure tx state is updated.
  • Introduces waitForBlockMinedSet() using the shared util/retry helper with exponential backoff and a capped max delay.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 480 to 483
if blockWithMeta.meta.Invalid {
// Skip invalid blocks - BlockValidation already handled them via unsetMined=true
// Skip invalid blocks — BlockValidation has already handled them via
// setTxMinedStatus(unsetMined=true) which we waited for above.
continue

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — the PR description was misleading. Updated it to accurately describe the behavior: we still skip invalid blocks in the tx collection loop (BlockValidation handles them), but we now wait for BlockValidation to finish processing before loadUnminedTransactions runs. The description now clarifies this is a best-effort wait with context cancellation as a hard stop.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves reset-time recovery of transactions from invalid moveBack blocks by adding a best-effort synchronization point to ensure BlockValidation has finished unsetting mined status (setting unmined_since) before loadUnminedTransactions() runs.

Changes:

  • Add a best-effort wait during reset() for invalid moveBack blocks to reach mined_set=true (i.e., BlockValidation finished setTxMinedStatus(unsetMined=true)).
  • Introduce waitForBlockMinedSet() using the shared util/retry helper with exponential backoff and capped delay.
  • Clarify comments around skipping invalid blocks in the moveBack tx collection loop (BlockValidation remains responsible for unmining them).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +416 to +424
for _, blockWithMeta := range moveBackBlocksWithMeta {
if blockWithMeta.meta.Invalid {
blockHash := blockWithMeta.block.Hash()
b.logger.Infof("[BlockAssembler][Reset] waiting for invalid block %s to be processed by BlockValidation", blockHash.String())
if waitErr := b.waitForBlockMinedSet(ctx, blockHash); waitErr != nil {
if ctx.Err() != nil {
return errors.NewProcessingError("[Reset] context cancelled while waiting for invalid block mined_set", waitErr)
}
b.logger.Warnf("[BlockAssembler][Reset] gave up waiting for invalid block %s mined_set: %v (proceeding anyway — txs may be recovered on next reset)", blockHash.String(), waitErr)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point about sequential waiting. In practice, having multiple invalid moveBack blocks during a single reset is rare (typically 0-1), so the sequential approach keeps the code simple. If this becomes a real bottleneck, adding concurrent waiting with a semaphore would be straightforward. Leaving as-is for now.

Comment on lines +562 to +566
_, err := retry.Retry(ctx, b.logger, func() (bool, error) {
isMined, err := b.blockchainClient.GetBlockIsMined(ctx, blockHash)
if err != nil {
return false, err
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Added short-circuiting for ErrBlockNotFound by cancelling the retry context immediately — avoids burning the full retry budget on a block that will never appear. Other transient errors (RPC failures, etc.) still get retried normally.

- Short-circuit retries on non-retriable errors (ErrBlockNotFound) in
  waitForBlockMinedSet by cancelling the retry context immediately

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves BlockAssembler.reset() behavior during reorg resets by best-effort waiting for BlockValidation to finish asynchronously unmining transactions from invalid moveBack blocks before loadUnminedTransactions() runs, preventing missed recovery of unmined txs.

Changes:

  • Add a best-effort per-invalid-block wait (with hard-stop on context cancellation) before reloading unmined transactions.
  • Introduce waitForBlockMinedSet() using util/retry to poll mined_set until BlockValidation has completed processing.
  • Update reset comments to clarify responsibility boundaries and the best-effort nature of the wait.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return false, err
}
if !isMined {
return false, errors.NewBlockParentNotMinedError(

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BlockParentNotMinedError usage here mirrors the exact same pattern used in BlockValidation.go (line ~1676) for the same purpose — polling mined_set. The error never escapes the retry loop (it's only used as a retry signal), so it won't trigger incorrect alerts/metrics. Introducing a new error type would add complexity without practical benefit given the error stays internal to the retry. Keeping consistent with the existing BlockValidation pattern.

// Short-circuit on non-retriable errors (block doesn't exist in DB)
if errors.Is(err, errors.ErrBlockNotFound) {
nonRetriableErr = errors.NewProcessingError(
"[waitForBlockMinedSet] block %s not found — cannot wait for mined_set", blockHash.String())

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — now passing the original err as the last param to NewProcessingError so it gets wrapped and the original cause is preserved in logs.

- Wrap original error in NewProcessingError for ErrBlockNotFound
  short-circuit to preserve cause context in logs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sonarqubecloud

Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
31.4% Coverage on New Code (required ≥ 80%)
C Maintainability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

@freemans13 freemans13 changed the title fix: reset handling for invalid moveBack blocks and unmined tx recovery fix: wait for BlockValidation to process invalid moveBack blocks before loading unmined txs Apr 14, 2026
@freemans13 freemans13 requested a review from icellan April 16, 2026 07:53
@icellan

icellan commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

@freemans13 Please don't merge until we've merged #704

@freemans13 freemans13 merged commit 31d7e63 into bsv-blockchain:main Apr 16, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants