Skip to content

perf(netsync): broaden quickValidationMode to cover CATCHINGBLOCKS + behind-tip#723

Closed
freemans13 wants to merge 19 commits into
bsv-blockchain:mainfrom
freemans13:stu/netsync-quick-validation-catchup
Closed

perf(netsync): broaden quickValidationMode to cover CATCHINGBLOCKS + behind-tip#723
freemans13 wants to merge 19 commits into
bsv-blockchain:mainfrom
freemans13:stu/netsync-quick-validation-catchup

Conversation

@freemans13

@freemans13 freemans13 commented Apr 18, 2026

Copy link
Copy Markdown
Collaborator

Problem

Legacy netsync has a quickValidationMode (parallel UTXO create-then-spend, skip checkSubtreeFromBlock) that activates in LEGACYSYNCING. The safety argument: bsvd netsync only enters that state while catching up to a hardcoded checkpoint via headers-first sync, so blocks in that window are anchored to the checkpoint hash.

When teranode runs both legacy and p2p services and the p2p path drives catch-up, the FSM transitions to CATCHINGBLOCKS instead. If catch-up is still within the checkpoint-covered range, the same safety argument holds — but the gating never activated the quick path there.

Live profiling on 2026-04-17 showed the quick path ran on <1% of catch-up blocks, leaving checkSubtreeFromBlock (~337 ms/block mainnet, 2.4 s/block testnet) on the hot path.

Fix

Activate quickValidationMode in CATCHINGBLOCKS too, with the same checkpoint-anchor requirement: the block being validated must be at or below the chain's final hardcoded checkpoint. Past the final checkpoint, full subtree validation always runs — tip blocks must not be fast-validated on trust.

New helper shouldUseQuickValidationMode(inLegacySync, inCatchingBlocks, belowFinalCheckpoint); prepareSubtrees now uses a single GetFSMCurrentState call instead of two sequential IsFSMCurrentState calls.

Behaviour

FSM state block ≤ final checkpoint block > final checkpoint
LEGACYSYNCING quick (unchanged) n/a — bsvd only enters below checkpoint
CATCHINGBLOCKS quick (new) full validation (unchanged)
RUNNING / IDLE full validation (unchanged)

No new operator settings — the checkpoint anchor is load-bearing and isn't overridable.

Testing

Unit tests for shouldUseQuickValidationMode; existing services/legacy/netsync tests pass under `-race`; `make lint` clean.

@github-actions

github-actions Bot commented Apr 18, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete


Current Review:

No blocking issues found. The PR successfully addresses the performance optimization goal and includes proper defensive checks in the new code paths.

Key Changes Verified:

  • ✅ Defensive nil-safety checks added to extendFromTxMap (lines 1064-1078)
  • ✅ BatchPreviousOutputsDecorate properly parallelized for Aerospike (stores/utxo/aerospike/get.go:1198-1223)
  • ✅ findLocalSubtreeFile correctly checks both FileTypeSubtreeToCheck and FileTypeSubtree
  • ✅ Test coverage added for findLocalSubtreeFile including the quickValidationMode regression case
  • ✅ Bounds checking uses uint32 comparison to avoid 32-bit overflow

Notes:

  • Pre-existing code (ExtendTransaction function at lines 1286-1377) lacks the same defensive checks but is out of scope for this performance-focused PR. Recommend filing a separate security hardening issue.

History:

  • ✅ Fixed: Multiple issues from earlier reviews (nil checks, race conditions, documentation accuracy, batching performance) have been resolved through iterative improvements.

@github-actions

github-actions Bot commented Apr 18, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-723 (be7a3ba)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 142
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.187µ 1.191µ ~ 1.000
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 55.23n 54.87n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 54.82n 55.17n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 54.84n 54.85n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 24.21n 24.80n ~ 0.700
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 40.70n 41.28n ~ 0.200
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 94.16n 94.99n ~ 1.000
MiningCandidate_Stringify_Short-4 171.9n 171.6n ~ 0.700
MiningCandidate_Stringify_Long-4 1.231µ 1.221µ ~ 0.200
MiningSolution_Stringify-4 639.7n 636.7n ~ 0.700
BlockInfo_MarshalJSON-4 1.315µ 1.315µ ~ 1.000
NewFromBytes-4 126.0n 124.9n ~ 0.200
Mine_EasyDifficulty-4 61.02µ 61.50µ ~ 0.700
Mine_WithAddress-4 7.009µ 6.776µ ~ 0.100
DirectSubtreeAdd/4_per_subtree-4 59.48n 59.70n ~ 0.700
DirectSubtreeAdd/64_per_subtree-4 28.20n 28.20n ~ 1.000
DirectSubtreeAdd/256_per_subtree-4 27.03n 27.07n ~ 0.400
DirectSubtreeAdd/1024_per_subtree-4 26.07n 26.05n ~ 0.900
DirectSubtreeAdd/2048_per_subtree-4 25.71n 25.66n ~ 0.100
SubtreeProcessorAdd/4_per_subtree-4 280.0n 281.7n ~ 0.700
SubtreeProcessorAdd/64_per_subtree-4 272.7n 272.5n ~ 0.600
SubtreeProcessorAdd/256_per_subtree-4 275.1n 275.6n ~ 1.000
SubtreeProcessorAdd/1024_per_subtree-4 267.3n 267.8n ~ 1.000
SubtreeProcessorAdd/2048_per_subtree-4 269.6n 270.7n ~ 0.700
SubtreeProcessorRotate/4_per_subtree-4 272.0n 274.4n ~ 0.700
SubtreeProcessorRotate/64_per_subtree-4 272.8n 268.5n ~ 0.100
SubtreeProcessorRotate/256_per_subtree-4 271.8n 272.2n ~ 1.000
SubtreeProcessorRotate/1024_per_subtree-4 271.5n 271.3n ~ 0.400
SubtreeNodeAddOnly/4_per_subtree-4 53.85n 53.83n ~ 1.000
SubtreeNodeAddOnly/64_per_subtree-4 34.05n 34.18n ~ 0.200
SubtreeNodeAddOnly/256_per_subtree-4 33.06n 33.33n ~ 0.200
SubtreeNodeAddOnly/1024_per_subtree-4 32.57n 32.53n ~ 0.800
SubtreeCreationOnly/4_per_subtree-4 112.0n 112.3n ~ 0.400
SubtreeCreationOnly/64_per_subtree-4 383.7n 382.5n ~ 0.800
SubtreeCreationOnly/256_per_subtree-4 1.292µ 1.301µ ~ 0.400
SubtreeCreationOnly/1024_per_subtree-4 4.294µ 4.380µ ~ 0.100
SubtreeCreationOnly/2048_per_subtree-4 7.642µ 7.746µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 268.0n 270.6n ~ 0.400
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 268.8n 271.4n ~ 0.100
ParallelGetAndSetIfNotExists/1k_nodes-4 807.4µ 578.1µ ~ 0.100
ParallelGetAndSetIfNotExists/10k_nodes-4 1.594m 1.316m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 6.842m 6.673m ~ 0.100
ParallelGetAndSetIfNotExists/100k_nodes-4 13.50m 13.41m ~ 0.400
SequentialGetAndSetIfNotExists/1k_nodes-4 661.5µ 652.4µ ~ 0.100
SequentialGetAndSetIfNotExists/10k_nodes-4 2.899m 2.734m ~ 0.100
SequentialGetAndSetIfNotExists/50k_nodes-4 10.51m 10.29m ~ 0.100
SequentialGetAndSetIfNotExists/100k_nodes-4 19.94m 19.82m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 625.2µ 624.4µ ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 4.166m 4.174m ~ 1.000
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 16.63m 16.63m ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 683.3µ 690.1µ ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 5.700m 5.683m ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 37.52m 37.98m ~ 0.100
BlockAssembler_AddTx-4 0.02292n 0.02137n ~ 0.400
AddNode-4 8.950 8.600 ~ 1.000
AddNodeWithMap-4 8.647 8.837 ~ 1.000
DiskTxMap_SetIfNotExists-4 3.646µ 3.752µ ~ 0.100
DiskTxMap_SetIfNotExists_Parallel-4 3.521µ 3.635µ ~ 0.700
DiskTxMap_ExistenceOnly-4 318.4n 315.7n ~ 1.000
Queue-4 186.4n 186.1n ~ 1.000
AtomicPointer-4 3.701n 3.703n ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 829.9µ 805.0µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/10K-4 772.6µ 760.4µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 125.7µ 108.1µ ~ 0.100
ReorgOptimizations/AllMarkFalse/New/10K-4 64.80µ 64.40µ ~ 0.400
ReorgOptimizations/HashSlicePool/Old/10K-4 53.22µ 54.35µ ~ 0.400
ReorgOptimizations/HashSlicePool/New/10K-4 11.04µ 11.11µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/10K-4 4.494µ 4.371µ ~ 0.400
ReorgOptimizations/NodeFlags/New/10K-4 1.561µ 1.589µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 9.080m 9.259m ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/100K-4 9.173m 9.169m ~ 1.000
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.138m 1.068m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 706.1µ 704.2µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/100K-4 474.5µ 582.2µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 202.6µ 200.4µ ~ 1.000
ReorgOptimizations/NodeFlags/Old/100K-4 43.35µ 48.05µ ~ 0.100
ReorgOptimizations/NodeFlags/New/100K-4 14.74µ 16.35µ ~ 0.100
TxMapSetIfNotExists-4 46.56n 46.43n ~ 0.700
TxMapSetIfNotExistsDuplicate-4 38.75n 39.19n ~ 0.100
ChannelSendReceive-4 593.9n 580.0n ~ 0.100
CalcBlockWork-4 568.0n 563.8n ~ 0.700
CalculateWork-4 768.8n 769.4n ~ 1.000
BuildBlockLocatorString_Helpers/Size_10-4 1.328µ 1.334µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_100-4 13.26µ 12.50µ ~ 0.200
BuildBlockLocatorString_Helpers/Size_1000-4 129.5µ 159.1µ ~ 0.700
CatchupWithHeaderCache-4 104.3m 104.5m ~ 0.400
_BufferPoolAllocation/16KB-4 3.477µ 3.496µ ~ 0.400
_BufferPoolAllocation/32KB-4 9.487µ 8.906µ ~ 0.400
_BufferPoolAllocation/64KB-4 16.96µ 15.65µ ~ 0.700
_BufferPoolAllocation/128KB-4 30.42µ 30.13µ ~ 1.000
_BufferPoolAllocation/512KB-4 116.6µ 126.2µ ~ 0.100
_BufferPoolConcurrent/32KB-4 19.20µ 17.77µ ~ 0.100
_BufferPoolConcurrent/64KB-4 30.36µ 27.91µ ~ 0.100
_BufferPoolConcurrent/512KB-4 151.0µ 143.2µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/16KB-4 649.5µ 623.3µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 639.0µ 623.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 624.1µ 621.0µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/128KB-4 633.6µ 626.0µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 648.8µ 647.2µ ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/16KB-4 35.69m 35.81m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/32KB-4 35.79m 35.92m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/64KB-4 35.36m 35.70m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/128KB-4 35.84m 35.59m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/512KB-4 35.67m 35.40m ~ 0.200
_PooledVsNonPooled/Pooled-4 741.8n 735.7n ~ 0.200
_PooledVsNonPooled/NonPooled-4 6.982µ 7.074µ ~ 0.700
_MemoryFootprint/Current_512KB_32concurrent-4 6.822µ 6.882µ ~ 0.700
_MemoryFootprint/Proposed_32KB_32concurrent-4 10.083µ 9.368µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 10.299µ 9.367µ ~ 0.100
_prepareTxsPerLevel-4 429.0m 428.2m ~ 1.000
_prepareTxsPerLevelOrdered-4 3.790m 4.124m ~ 0.400
_prepareTxsPerLevel_Comparison/Original-4 427.9m 432.4m ~ 0.200
_prepareTxsPerLevel_Comparison/Optimized-4 3.737m 3.766m ~ 1.000
SubtreeSizes/10k_tx_4_per_subtree-4 1.250m 1.297m ~ 1.000
SubtreeSizes/10k_tx_16_per_subtree-4 300.9µ 297.9µ ~ 0.700
SubtreeSizes/10k_tx_64_per_subtree-4 71.50µ 71.52µ ~ 1.000
SubtreeSizes/10k_tx_256_per_subtree-4 17.77µ 17.87µ ~ 0.700
SubtreeSizes/10k_tx_512_per_subtree-4 8.901µ 8.790µ ~ 0.700
SubtreeSizes/10k_tx_1024_per_subtree-4 4.376µ 4.401µ ~ 0.400
SubtreeSizes/10k_tx_2k_per_subtree-4 2.177µ 2.172µ ~ 0.400
BlockSizeScaling/10k_tx_64_per_subtree-4 68.95µ 70.06µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 17.37µ 17.38µ ~ 1.000
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.334µ 4.351µ ~ 0.400
BlockSizeScaling/50k_tx_64_per_subtree-4 370.1µ 368.6µ ~ 0.400
BlockSizeScaling/50k_tx_256_per_subtree-4 87.71µ 86.65µ ~ 0.200
BlockSizeScaling/50k_tx_1024_per_subtree-4 21.36µ 21.48µ ~ 0.400
SubtreeAllocations/small_subtrees_exists_check-4 148.4µ 149.7µ ~ 0.400
SubtreeAllocations/small_subtrees_data_fetch-4 156.8µ 157.5µ ~ 0.700
SubtreeAllocations/small_subtrees_full_validation-4 307.2µ 307.4µ ~ 0.700
SubtreeAllocations/medium_subtrees_exists_check-4 8.809µ 8.929µ ~ 0.100
SubtreeAllocations/medium_subtrees_data_fetch-4 9.291µ 9.411µ ~ 0.400
SubtreeAllocations/medium_subtrees_full_validation-4 17.49µ 17.43µ ~ 1.000
SubtreeAllocations/large_subtrees_exists_check-4 2.093µ 2.084µ ~ 0.600
SubtreeAllocations/large_subtrees_data_fetch-4 2.229µ 2.234µ ~ 1.000
SubtreeAllocations/large_subtrees_full_validation-4 4.347µ 4.359µ ~ 1.000
StoreBlock_Sequential/BelowCSVHeight-4 305.1µ 305.0µ ~ 1.000
StoreBlock_Sequential/AboveCSVHeight-4 314.6µ 315.1µ ~ 1.000
GetUtxoHashes-4 274.4n 276.1n ~ 1.000
GetUtxoHashes_ManyOutputs-4 52.27µ 45.71µ ~ 0.100
_NewMetaDataFromBytes-4 232.3n 231.2n ~ 1.000
_Bytes-4 612.3n 612.7n ~ 1.000
_MetaBytes-4 569.2n 563.2n ~ 0.700

Threshold: >10% with p < 0.05 | Generated: 2026-05-14 08:54 UTC

freemans13 added a commit to freemans13/teranode that referenced this pull request Apr 19, 2026
@freemans13 freemans13 self-assigned this Apr 20, 2026
freemans13 and others added 7 commits April 21, 2026 23:19
prepareSubtrees previously gated quickValidationMode on FSM == LEGACYSYNCING
only. In production setups that restart from a pre-existing chain, the FSM
begins in RUNNING and only transitions to CATCHINGBLOCKS (not LEGACYSYNCING)
during peer-driven catch-up, so the quick path ran on <1% of catch-up blocks
in live profiling (captured 2026-04-17).

Use the new shouldUseQuickValidationMode helper to activate the quick path
on LEGACYSYNCING, CATCHINGBLOCKS (gated by
legacy_quickValidationInCatchingBlocks, default true), and unconditionally
when legacy_forceQuickValidation=true.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Subtrees

A subtree written to the local store may carry either FileType — one of
`FileTypeSubtreeToCheck` (downloaded from a peer, pending validation) or
`FileTypeSubtree` (already validated; produced by block assembly, block
persister, or legacy catch-up's quickValidationMode). Both files carry the
same tx-hash list.

CheckBlockSubtrees only consulted `FileTypeSubtreeToCheck`. If we already
had `FileTypeSubtree` locally, the lookup still missed and we fell through
to an HTTP fetch from `request.BaseUrl`. For `BaseUrl == "legacy"` the
synthesised URL has no scheme, so the fetch errors with
"unsupported protocol scheme" and the whole block fails validation. In
practice this made every legacy-catch-up block fail the first time —
retries eventually succeed via a different path, turning what should be a
fast sync into a slow retry loop.

Broadening quickValidationMode to cover CATCHINGBLOCKS exposed this
latent bug: pre-existing LEGACYSYNCING-only gating hid it because mid-sync
nodes rarely hit that state.

Consult both file types via a small helper before falling back to HTTP.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`extendTransactions` previously issued one PreviousOutputsDecorate DB call
per transaction. On a 20,981-tx testnet block we observed this step take
3m 32s — 94% of total block-processing time — because the non-local-parent
path scaled linearly with block size: one IN-clause query per tx, serialised
only by the outpoint-batcher goroutine limit.

Split extendTransactions into two phases:

  1. Same-block parents via txMap, in per-tx goroutines (unchanged behaviour,
     waits for the parent tx to finish being extended). Pulled out into
     `extendFromTxMap`.
  2. Non-local parents via a single `BatchPreviousOutputsDecorate` call. The
     batch skips inputs already populated by phase 1, so only still-unfilled
     inputs hit the database. For a 20k-tx block that's one chunked query
     instead of 20k.

On a batch-decoration error that resembles "parent pruned / missing", fall
back to the original per-tx path so the DAH'd-parent recovery
(`utxoStore.Get(txHash, fields.Tx)`) still applies.

Also convert `BatchPreviousOutputsDecorate` to use a composite
`(t.hash, o.idx)` IN predicate — previously flagged as a TODO in the file
because the bulk path wasn't on the legacy-sync hot path. Now that it is,
the composite predicate avoids scanning every output of every referenced
parent, which is the behaviour that made PreviousOutputsDecorate dominate
heap on data-carrier-heavy blocks.

Expected impact on testnet 20k-tx block: extendTransactions 3m 32s → <10s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous implementation polled prevTxWrapper.Tx.IsExtended() on a 10ms
tick before reading the parent's Outputs. That's unnecessary: Outputs are
populated at wire-parse time and never mutated afterwards; IsExtended
only checks whether the parent's *inputs* have PreviousTxScript set,
which is unrelated to the parent's outputs that the child needs.

The wait also introduced a deadlock under the new two-phase
extendTransactions flow: a pure-non-local-parent tx only becomes
"extended" after phase 2 (batch DB call). Phase 1 goroutines waiting
for such a parent would time out at 120s and fail the block. Observed
on testnet block 1630240.

Drop the poll and read the parent's outputs directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
quickValidationMode skips checkSubtreeFromBlock, which is only safe when the
block being validated is covered by the legacy headers-first checkpoint anchor.
LEGACYSYNCING carries that guarantee by construction (bsvd netsync only enters
the state below the final checkpoint). CATCHINGBLOCKS does not: the state flips
purely on "peer is >10 blocks ahead", which also fires past the final checkpoint
on a restart near tip, on a brief disconnect, or against a forked peer. In that
regime, skipping subtree validation would accept subtree content on trust.

Tighten the helper: CATCHINGBLOCKS activation additionally requires the block
being processed to be at or below chainParams.Checkpoints' final entry.
legacy_forceQuickValidation remains the explicit opt-out for trusted networks.

Tests updated to cover the past-final-checkpoint CATCHINGBLOCKS case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@freemans13 freemans13 force-pushed the stu/netsync-quick-validation-catchup branch from 99c27bf to c37e522 Compare April 22, 2026 06:25
Comment thread services/legacy/netsync/handle_block.go Outdated
// pure-non-local-parent tx only becomes "extended" after phase 2 runs.
//
// No lock needed — each goroutine writes to a unique input index.
tx.Inputs[i].PreviousTxSatoshis = prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex].Satoshis

@github-actions github-actions Bot Apr 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

[Critical] Missing bounds check before array access - This issue has been fixed.

The code now includes proper bounds checking at services/legacy/netsync/handle_block.go:1041-1049:

  • Line 1041-1043: Checks if prevTxWrapper.Tx is nil
  • Line 1045-1047: Validates PreviousTxOutIndex is in range
  • Line 1049-1054: Checks for nil output and nil locking script

Malformed transactions will now return a processing error instead of panicking the node.

Comment thread services/legacy/netsync/handle_block.go Outdated
var inLegacySync, inCatchingBlocks bool
currentState, fsmErr := sm.blockchainClient.GetFSMCurrentState(sm.ctx)
if fsmErr != nil {
sm.logger.Errorf("[prepareSubtrees] Failed to get FSM state: %s", fsmErr)

@github-actions github-actions Bot Apr 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope / no longer applicable

The FSM query code mentioned in this comment doesn't appear in the current version of the PR. This issue appears to have been addressed through refactoring or is no longer relevant to the current implementation.

freemans13 and others added 2 commits April 21, 2026 23:36
With the final-checkpoint gate in place, the only scenario the force flag
covered was "skip subtree validation on tip blocks on trust," which is a
security regression with no defensible production use case. Remove the
setting and the forceFlag input on shouldUseQuickValidationMode.

Teranode has no precedent for blanket trust overrides of validation; narrow
knobs like legacy_config_DisableCheckpoints and blockvalidation_skipCheckParentMined
target specific, well-understood scenarios. The checkpoint anchor is load-bearing
for the quick path's safety argument, so it stays mandatory.

legacy_quickValidationInCatchingBlocks is preserved as the disable knob for
the CATCHINGBLOCKS path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
With CATCHINGBLOCKS gated on belowFinalCheckpoint, the post-gating code path
is identical to LEGACYSYNCING (which has no disable knob). There's no
FSM-state-dependent behavior left that would warrant a separate kill switch
for CATCHINGBLOCKS.

Collapse shouldUseQuickValidationMode to three inputs (inLegacySync,
inCatchingBlocks, belowFinalCheckpoint) and remove the setting and
wiring. The setting was introduced on this branch, so no migration needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

[Major] PR description is stale

The PR description documents two settings that were removed in later commits:

  • legacy_quickValidationInCatchingBlocks — removed in commit "refactor(netsync): drop legacy_quickValidationInCatchingBlocks setting"
  • legacy_forceQuickValidation — removed in commit "refactor(netsync): drop legacy_forceQuickValidation override"

The current implementation gates quick-validation mode purely on FSM state and checkpoint position, with no configurable overrides. The PR description should be updated to match the final implementation.

Impact: Misleading for reviewers and future readers trying to understand what changed.

Run gci on settings.go and legacy_settings.go to fix field-alignment drift
from the removed ForceQuickValidation and QuickValidationCatchupInCatchingBlocks
fields. No functional change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends legacy netsync’s quickValidationMode eligibility to include CATCHINGBLOCKS when the block being validated is at/below the chain’s final hardcoded checkpoint, and adds related performance improvements to reduce hot-path subtree and UTXO-decoration overhead during catch-up.

Changes:

  • Broaden quickValidationMode gating via shouldUseQuickValidationMode(...), using a single GetFSMCurrentState call and checkpoint-height anchoring.
  • Avoid unnecessary subtree HTTP fetches by checking both FileTypeSubtreeToCheck and FileTypeSubtree locally first.
  • Improve catch-up performance by batching previous-output decoration across transactions and tightening SQL lookups to exact (tx_hash, vout) pairs.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
stores/utxo/sql/sql.go Switch bulk previous-output lookup to a composite (t.hash, o.idx) IN (...) to avoid scanning unreferenced outputs and improve decoration throughput.
services/subtreevalidation/check_block_subtrees.go Prefer local subtree reads (validated or pending-validation file types) before attempting HTTP fetches.
services/legacy/netsync/handle_block.go Add shouldUseQuickValidationMode, compute below-final-checkpoint gating, and refactor transaction extension to a 2-phase (txMap then batched DB) flow.
services/legacy/netsync/handle_block_test.go Add unit tests for shouldUseQuickValidationMode.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +360 to +369
belowFinalCheckpoint := false
if cps := sm.chainParams.Checkpoints; len(cps) > 0 {
belowFinalCheckpoint = block.Height() <= cps[len(cps)-1].Height
}

// quick validation mode is used when we are in legacy mode
// we can skip some of the processing since we assume the block is valid
quickValidationMode := legacyMode
quickValidationMode := shouldUseQuickValidationMode(
inLegacySync,
inCatchingBlocks,
belowFinalCheckpoint,
)

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved - The PR now includes the disableCheckpoints field on SyncManager and gates the belowFinalCheckpoint check accordingly (services/legacy/netsync/handle_block.go:364-368).

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed: plumbed DisableCheckpoints from Config onto SyncManager.disableCheckpoints and in prepareSubtrees we now force belowFinalCheckpoint=false when checkpoints are disabled, so the CATCHINGBLOCKS path cannot skip subtree validation in that mode.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +1012 to +1014
g.Go(func() error {
txWrapper.SomeParentsInBlock = true

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved - The data race is fixed. txWrapper.SomeParentsInBlock = true is now set once from the caller goroutine before spawning the per-input goroutines (services/legacy/netsync/handle_block.go:1020).

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — moved the txWrapper.SomeParentsInBlock = true write out of the goroutine. It's now set once from the caller goroutine before g.Go(...), so concurrent writes from the per-input goroutines are gone. Verified via go test -race on the netsync package.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +976 to +978
// block (available via txMap). It waits up to 120 seconds for each same-block parent
// to be fully extended itself, which is necessary because child and parent may be
// processed by different goroutines in the enclosing errgroup.

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved - The doc comment has been updated to match the current implementation, which reads parent outputs immediately without waiting (services/legacy/netsync/handle_block.go:980-987).

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the doc comment to match the new behavior: we no longer wait/poll for parent extension — outputs are available at parse time, so we read them immediately.

- Plumb DisableCheckpoints from Config onto SyncManager; force
  belowFinalCheckpoint=false when checkpoints are disabled so the
  CATCHINGBLOCKS quickValidationMode path cannot skip subtree
  validation when the operator passes --nocheckpoints.
- Move txWrapper.SomeParentsInBlock = true out of the per-input
  goroutine in extendFromTxMap to eliminate a data race under -race.
- Update extendFromTxMap doc comment to reflect that parent outputs
  are now read immediately (no 120s wait).

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +923 to +926
// served from the in-memory txMap, so no DB work is needed here. We run per-tx
// goroutines (bounded by OutpointBatcherSize) because each tx's own inputs are
// populated independently and may need to wait for a same-block parent to finish
// being extended first.

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Phase 1 comment says per-tx goroutines "may need to wait for a same-block parent to finish being extended first", but the new implementation no longer waits/polls on parent extension (extendFromTxMap reads parent outputs immediately). Please update/remove the comment to match the current behavior so future changes don't reintroduce the old deadlock workaround.

Suggested change
// served from the in-memory txMap, so no DB work is needed here. We run per-tx
// goroutines (bounded by OutpointBatcherSize) because each tx's own inputs are
// populated independently and may need to wait for a same-block parent to finish
// being extended first.
// served directly from the in-memory txMap, so no DB work is needed here. We run
// per-tx goroutines (bounded by OutpointBatcherSize) because each transaction's
// inputs can be populated independently from txMap lookups.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the Phase 1 comment to remove the stale "wait for a same-block parent" wording. Now reads: goroutines are bounded by OutpointBatcherSize and each tx's inputs are populated independently from txMap lookups — no waiting/polling.

Comment thread stores/utxo/sql/sql.go Outdated
Comment on lines +3739 to +3740
// Inline composite-IN construction keeps this function standalone;
// mirrors the pattern used in PreviousOutputsDecorate above.

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline placeholder construction is described as "mirrors the pattern used in PreviousOutputsDecorate above", but PreviousOutputsDecorate currently uses buildINClause with a simple t.hash IN (...) predicate (not composite tuple placeholders). Consider either switching PreviousOutputsDecorate to the same composite-IN helper/pattern or rewording this comment to avoid implying the two implementations are structurally identical.

Suggested change
// Inline composite-IN construction keeps this function standalone;
// mirrors the pattern used in PreviousOutputsDecorate above.
// Build a composite (t.hash, o.idx) IN clause inline here to keep this
// function standalone.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded the comment as suggested. It no longer implies structural equivalence with PreviousOutputsDecorate — it just states that a composite (t.hash, o.idx) IN clause is built inline to keep the function standalone. I'll leave PreviousOutputsDecorate as-is in this PR (its single-column t.hash IN (...) form is fine for that call site's semantics).

Comment on lines +183 to 203
// A subtree may be available locally under either:
// - FileTypeSubtreeToCheck — fetched from a peer, pending validation
// - FileTypeSubtree — already validated (e.g. legacy catch-up's
// quickValidationMode validated txs inline
// before writing the subtree)
// We must consult both before falling back to an HTTP fetch. Otherwise
// CheckBlockSubtrees will try to HTTP-download a subtree we already have
// — and for baseURL="legacy" the synthetic URL has no scheme, so the
// request fails outright.
localFileType, localExists, err := u.findLocalSubtreeFile(gCtx, subtreeHash)
if err != nil {
return errors.NewProcessingError("[CheckBlockSubtrees][%s] failed to check if subtree exists in store", subtreeHash.String(), err)
}

var subtreeToCheck *subtreepkg.Subtree

if subtreeToCheckExists {
// get the subtreeToCheck from the store
subtreeReader, err := u.subtreeStore.GetIoReader(gCtx, subtreeHash[:], fileformat.FileTypeSubtreeToCheck)
if localExists {
// read from whichever local file we found
subtreeReader, err := u.subtreeStore.GetIoReader(gCtx, subtreeHash[:], localFileType)
if err != nil {
return errors.NewStorageError("[CheckBlockSubtrees][%s] failed to get subtree from store", subtreeHash.String(), err)

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior: CheckBlockSubtrees now reads the subtree structure from FileTypeSubtree as well as FileTypeSubtreeToCheck. The existing unit tests only cover the FileTypeSubtreeToCheck path; please add a test case where only FileTypeSubtree exists (and FileTypeSubtreeToCheck is absent) to ensure we don’t regress back to the HTTP fetch path (especially for baseURL="legacy").

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TestFindLocalSubtreeFile with three subtests: (1) only FileTypeSubtreeToCheck present, (2) only FileTypeSubtree present (the new legacy quickValidationMode path), and (3) neither present. The FileTypeSubtree-only case is the regression guard you asked for — it fails if we ever drop back to FileTypeSubtreeToCheck-only lookup which would reintroduce the HTTP fallback for baseURL="legacy".

Comment on lines +470 to +482
return "", false, err
}
if exists {
return fileformat.FileTypeSubtreeToCheck, true, nil
}
exists, err = u.subtreeStore.Exists(ctx, subtreeHash[:], fileformat.FileTypeSubtree)
if err != nil {
return "", false, err
}
if exists {
return fileformat.FileTypeSubtree, true, nil
}
return "", false, nil

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findLocalSubtreeFile returns a raw empty string ("") for the FileType on error/not-found paths. Since FileTypeUnknown is defined as the empty value, returning fileformat.FileTypeUnknown would make the intent clearer and keep return values consistently typed.

Suggested change
return "", false, err
}
if exists {
return fileformat.FileTypeSubtreeToCheck, true, nil
}
exists, err = u.subtreeStore.Exists(ctx, subtreeHash[:], fileformat.FileTypeSubtree)
if err != nil {
return "", false, err
}
if exists {
return fileformat.FileTypeSubtree, true, nil
}
return "", false, nil
return fileformat.FileTypeUnknown, false, err
}
if exists {
return fileformat.FileTypeSubtreeToCheck, true, nil
}
exists, err = u.subtreeStore.Exists(ctx, subtreeHash[:], fileformat.FileTypeSubtree)
if err != nil {
return fileformat.FileTypeUnknown, false, err
}
if exists {
return fileformat.FileTypeSubtree, true, nil
}
return fileformat.FileTypeUnknown, false, nil

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion — switched all three not-found / error paths in findLocalSubtreeFile to return fileformat.FileTypeUnknown explicitly instead of the raw empty string. Keeps intent clear and return values consistently typed.

- handle_block.go: remove stale "wait for same-block parent" wording
  from the Phase 1 comment in extendTransactions; the new flow reads
  parent outputs immediately from txMap with no polling.
- check_block_subtrees.go: return fileformat.FileTypeUnknown instead of
  a raw empty string from findLocalSubtreeFile on not-found/error
  paths.
- check_block_subtrees_test.go: add TestFindLocalSubtreeFile covering
  FileTypeSubtreeToCheck-present, FileTypeSubtree-only (legacy quick
  validation path), and neither-present cases.
- stores/utxo/sql/sql.go: reword inline composite-IN comment so it
  describes what the code does instead of implying structural
  equivalence with PreviousOutputsDecorate.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +272 to +286
// - belowFinalCheckpoint: the block being validated is at or below the
// chain's final hardcoded checkpoint height. When true, the headers-first
// chain guarantees subtree-content integrity and skipping subtree-level
// validation is safe. Above the final checkpoint (or when checkpoints are
// disabled) no such anchor exists and full subtree validation is required.
//
// CATCHINGBLOCKS activation additionally requires belowFinalCheckpoint so the
// checkpoint-anchor safety argument matches LEGACYSYNCING. There is no
// override for running quickValidationMode past the final checkpoint: subtree
// validation on tip blocks protects against malicious or forked peers and
// must not be skipped.
func shouldUseQuickValidationMode(inLegacySync, inCatchingBlocks, belowFinalCheckpoint bool) bool {
if inLegacySync {
return true
}

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment says that when checkpoints are disabled, full subtree validation is required (no checkpoint anchor), but shouldUseQuickValidationMode currently returns true whenever inLegacySync is true regardless of belowFinalCheckpoint (which the caller forces to false when checkpoints are disabled). Please align the comment and logic: either (a) update the comment to clarify that the checkpoints-disabled restriction only applies to the CATCHINGBLOCKS branch, or (b) incorporate the checkpoints-disabled/belowFinalCheckpoint guard into the LEGACYSYNCING branch as well.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expanded the doc comment: (a) called out that the caller forces belowFinalCheckpoint=false when checkpoints are disabled, and (b) added a paragraph explaining why LEGACYSYNCING doesn't re-check it here — the DisableCheckpoints guard in manager.go (around line 2302) means nextCheckpoint stays nil and headers-first mode never activates, so the checkpoints-disabled guard is applied upstream. Keeping the LEGACYSYNCING branch unconditional matches bsvd's model; the comment now makes that reasoning explicit instead of implicit.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +1031 to +1033
tx.Inputs[i].PreviousTxSatoshis = prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex].Satoshis
tx.Inputs[i].PreviousTxScript = bscript.NewFromBytes(*prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex].LockingScript)

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extendFromTxMap indexes prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex] without validating that PreviousTxOutIndex is in-range. A malformed/hostile block could trigger an out-of-bounds panic here (DoS) before the block is rejected. Please add a bounds check (and a nil output/script check if applicable) and return a validation/processing error instead of panicking.

Suggested change
tx.Inputs[i].PreviousTxSatoshis = prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex].Satoshis
tx.Inputs[i].PreviousTxScript = bscript.NewFromBytes(*prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex].LockingScript)
if prevTxWrapper == nil || prevTxWrapper.Tx == nil {
return errors.NewProcessingError("failed to extend transaction %s: previous transaction %s is missing", tx.TxIDChainHash(), prevTxHash)
}
prevOutIdx := int(input.PreviousTxOutIndex)
if prevOutIdx < 0 || prevOutIdx >= len(prevTxWrapper.Tx.Outputs) {
return errors.NewProcessingError("failed to extend transaction %s: previous output index %d out of range for transaction %s", tx.TxIDChainHash(), input.PreviousTxOutIndex, prevTxHash)
}
prevOutput := prevTxWrapper.Tx.Outputs[prevOutIdx]
if prevOutput == nil {
return errors.NewProcessingError("failed to extend transaction %s: previous output %d is nil for transaction %s", tx.TxIDChainHash(), input.PreviousTxOutIndex, prevTxHash)
}
if prevOutput.LockingScript == nil {
return errors.NewProcessingError("failed to extend transaction %s: previous output %d has nil locking script for transaction %s", tx.TxIDChainHash(), input.PreviousTxOutIndex, prevTxHash)
}
tx.Inputs[i].PreviousTxSatoshis = prevOutput.Satoshis
tx.Inputs[i].PreviousTxScript = bscript.NewFromBytes(*prevOutput.LockingScript)

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good defensive hardening — added the checks inside the goroutine: (1) prevTxWrapper.Tx nil, (2) PreviousTxOutIndex out of range vs len(prevTxWrapper.Tx.Outputs), (3) prevOutput nil or LockingScript nil. Each returns a NewProcessingError so the block is rejected cleanly instead of panicking. Tests still green under -race.

Comment thread stores/utxo/sql/sql.go Outdated
Comment on lines +3719 to +3720
// maxINClauseSize bounds the total parameter count. Each pair contributes
// two parameters, so chunk size must be halved.

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says maxINClauseSize bounds the total parameter count, but the constant is defined as the maximum number of placeholders in a single IN clause (chosen to stay under SQLite's 999 bind-param limit with headroom). Since this code now uses two bind params per tuple, reword the comment to reflect the actual constraint (e.g., maxINClauseSize is an item-count limit derived from the bind-param cap, so we use maxINClauseSize/2 tuples).

Suggested change
// maxINClauseSize bounds the total parameter count. Each pair contributes
// two parameters, so chunk size must be halved.
// maxINClauseSize is the per-IN-clause placeholder limit derived from the
// database bind-parameter cap. Each (hash, idx) pair uses two parameters,
// so each chunk can contain at most maxINClauseSize/2 composite tuples.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the suggested wording. The comment now states that maxINClauseSize is the per-IN-clause placeholder limit derived from the bind-parameter cap, and that each composite (hash, idx) pair uses two params so each chunk can hold at most maxINClauseSize/2 tuples.

- handle_block.go: expand shouldUseQuickValidationMode doc comment to
  clarify that the checkpoints-disabled guard is applied by the
  caller (for the CATCHINGBLOCKS path) and upstream in manager.go
  for LEGACYSYNCING (nextCheckpoint stays nil + headers-first never
  activates when DisableCheckpoints is true).
- handle_block.go: bounds-check PreviousTxOutIndex and nil-check
  prevTxWrapper.Tx / prevOutput / LockingScript in extendFromTxMap
  so a malformed or hostile block returns a processing error
  instead of panicking with a slice OOB.
- stores/utxo/sql/sql.go: reword the pairChunkSize comment to state
  that maxINClauseSize is a placeholder/bind-param limit, and that
  each (hash, idx) tuple uses two params.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +953 to +961
tx := txWrapper.Tx
txs = append(txs, tx)

g.Go(func() error {
if err := sm.extendFromTxMap(gCtx, tx, txMap); err != nil {
return errors.NewTxError("failed to extend transaction from txMap", err)
}
return nil
})

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In extendTransactions phase 1, the goroutine closure captures the loop-local variable tx without re-binding it per iteration. Because tx is assigned each loop iteration, goroutines can end up all operating on the last transaction (classic loop-variable capture bug). Rebind tx inside the loop (e.g., tx := tx) or pass it as a parameter to the closure before calling extendFromTxMap.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but I believe this is actually safe. Two reasons:

  1. tx is not a loop variable — it's declared inside the loop body via tx := txWrapper.Tx (line 953). Each iteration creates a fresh variable, so the closure on line 956 captures that iteration's tx. The classic gotcha applies to the loop variables in the for header (idx, wireTx), not to := declarations inside the loop body.

  2. Even if it were a loop variable, this repo is on Go 1.26 (go.mod line 3), so per-iteration scoping has been the default since the Go 1.22 LoopVar change.

For safety I did run the netsync suite under -race after my round 1 fix and saw no races, and the integration tests exercise this path with blocks containing same-block parent/child dependencies.

Leaving as-is unless you can point out a specific case where the current code misbehaves.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +968 to +981
// Phase 2: for inputs whose parents are NOT same-block, batch the decoration into
// a single IN-clause DB query per chunk instead of issuing one per tx. For a
// 20k-tx block this collapses ~20k round-trips into O(N / maxINClauseSize).
//
// BatchPreviousOutputsDecorate skips inputs that already have PreviousTxScript set,
// so Phase 1's work is preserved. If it returns a processing/not-found error the
// most likely cause is a parent that's been pruned (DAH'd) because the child
// already had a prior processing pass. Fall back to per-tx decoration so the
// existing recovery path (utxoStore.Get on the child itself) can still kick in.
if batchErr := sm.utxoStore.BatchPreviousOutputsDecorate(ctx, txs); batchErr != nil {
if errors.Is(batchErr, errors.ErrProcessing) || errors.Is(batchErr, errors.ErrTxNotFound) {
return sm.extendPerTxFallback(ctx, txs)
}
return errors.NewProcessingError("failed to batch-decorate previous outputs", batchErr)

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extendTransactions phase 2 now calls utxoStore.BatchPreviousOutputsDecorate(ctx, txs) once for the whole block. For the Aerospike UTXO store, BatchPreviousOutputsDecorate currently just loops PreviousOutputsDecorate per-tx sequentially (stores/utxo/aerospike/get.go:1183-1189), which removes the previous per-tx concurrency and prevents the outpoint batcher from coalescing requests across transactions. Consider either (a) making the Aerospike BatchPreviousOutputsDecorate enqueue all inputs across all txs before waiting, or (b) keeping a concurrent per-tx call path here when the backing store doesn’t implement a true bulk operation, to avoid a performance regression on large blocks.

Suggested change
// Phase 2: for inputs whose parents are NOT same-block, batch the decoration into
// a single IN-clause DB query per chunk instead of issuing one per tx. For a
// 20k-tx block this collapses ~20k round-trips into O(N / maxINClauseSize).
//
// BatchPreviousOutputsDecorate skips inputs that already have PreviousTxScript set,
// so Phase 1's work is preserved. If it returns a processing/not-found error the
// most likely cause is a parent that's been pruned (DAH'd) because the child
// already had a prior processing pass. Fall back to per-tx decoration so the
// existing recovery path (utxoStore.Get on the child itself) can still kick in.
if batchErr := sm.utxoStore.BatchPreviousOutputsDecorate(ctx, txs); batchErr != nil {
if errors.Is(batchErr, errors.ErrProcessing) || errors.Is(batchErr, errors.ErrTxNotFound) {
return sm.extendPerTxFallback(ctx, txs)
}
return errors.NewProcessingError("failed to batch-decorate previous outputs", batchErr)
// Phase 2: for inputs whose parents are NOT same-block, decorate previous outputs
// with bounded per-tx concurrency. This preserves the pre-existing concurrency
// characteristics for stores whose BatchPreviousOutputsDecorate implementation is
// only a sequential wrapper around PreviousOutputsDecorate, while still leaving
// Phase 1's same-block population intact.
if err = sm.extendPreviousOutputsConcurrent(ctx, txs, outpointBatcherSize); err != nil {
return err
}
return nil
}
func (sm *SyncManager) extendPreviousOutputsConcurrent(ctx context.Context, txs []*bt.Tx, limit int) error {
g, gCtx := errgroup.WithContext(ctx)
util.SafeSetLimit(g, limit)
for _, tx := range txs {
tx := tx
g.Go(func() error {
if err := sm.utxoStore.PreviousOutputsDecorate(gCtx, tx); err != nil {
return err
}
return nil
})
}
if err := g.Wait(); err != nil {
if errors.Is(err, errors.ErrProcessing) || errors.Is(err, errors.ErrTxNotFound) {
return sm.extendPerTxFallback(ctx, txs)
}
return errors.NewProcessingError("failed to decorate previous outputs", err)

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed by parallelizing BatchPreviousOutputsDecorate inside the Aerospike store itself (rather than in the caller) so the SQL store keeps its true bulk IN-clause path and Aerospike regains cross-tx batching via its shared outpointBatcher.

Details:

  • Wrapped the per-tx PreviousOutputsDecorate calls in an errgroup bounded by s.settings.UtxoStore.OutpointBatcherSize (default 100) so goroutine count stays bounded on large blocks while still letting many txs' inputs pipeline into the same batcher flush.
  • BatchPreviousOutputsDecorate's contract (dedup via PreviousTxScript != nil, error semantics) is unchanged.

This preserves the single-pass DB read for SQL and recovers the cross-tx outpoint coalescing for Aerospike without having to special-case the call site.

Comment thread services/legacy/netsync/handle_block.go Outdated
Comment on lines +1045 to +1049
if int(input.PreviousTxOutIndex) >= len(prevTxWrapper.Tx.Outputs) {
return errors.NewProcessingError("failed to extend transaction %s: previous output index %d out of range for transaction %s (outputs=%d)", tx.TxIDChainHash(), input.PreviousTxOutIndex, prevTxHash, len(prevTxWrapper.Tx.Outputs))
}

prevOutput := prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex]

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extendFromTxMap bounds-checkes PreviousTxOutIndex by casting it to int (if int(input.PreviousTxOutIndex) >= len(...)), but then indexes the slice using the original uint32 value. On 32-bit builds, the uint32 -> int cast can overflow to a negative value and bypass the check, leading to an out-of-bounds panic (DoS) for a hostile/malformed block. Compare using unsigned types (e.g., if input.PreviousTxOutIndex >= uint32(len(outputs))) or otherwise guard the cast before indexing.

Suggested change
if int(input.PreviousTxOutIndex) >= len(prevTxWrapper.Tx.Outputs) {
return errors.NewProcessingError("failed to extend transaction %s: previous output index %d out of range for transaction %s (outputs=%d)", tx.TxIDChainHash(), input.PreviousTxOutIndex, prevTxHash, len(prevTxWrapper.Tx.Outputs))
}
prevOutput := prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex]
if input.PreviousTxOutIndex >= uint32(len(prevTxWrapper.Tx.Outputs)) {
return errors.NewProcessingError("failed to extend transaction %s: previous output index %d out of range for transaction %s (outputs=%d)", tx.TxIDChainHash(), input.PreviousTxOutIndex, prevTxHash, len(prevTxWrapper.Tx.Outputs))
}
outputIndex := int(input.PreviousTxOutIndex)
prevOutput := prevTxWrapper.Tx.Outputs[outputIndex]

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point — switched the bounds check to a uint32 comparison: if input.PreviousTxOutIndex >= uint32(len(prevTxWrapper.Tx.Outputs)). Removes the 32-bit-build int() overflow hole. Also dropped the subsequent int() cast since we index with the uint32 directly.

- stores/utxo/aerospike/get.go: parallelize BatchPreviousOutputsDecorate
  so per-tx PreviousOutputsDecorate calls run concurrently (bounded by
  UtxoStore.OutpointBatcherSize), letting the shared outpointBatcher
  coalesce inputs across txs on large blocks. The previous sequential
  loop regressed cross-tx batching under the new netsync two-phase
  flow.
- services/legacy/netsync/handle_block.go: compare PreviousTxOutIndex
  as uint32 against len(Outputs) instead of casting to int, so 32-bit
  builds can't overflow past the bounds check. Indexing uses the
  uint32 value directly.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@freemans13 freemans13 requested a review from icellan April 22, 2026 11:02
@freemans13 freemans13 force-pushed the stu/netsync-quick-validation-catchup branch from 2fcc259 to f5072d3 Compare April 23, 2026 15:58
Comment thread services/legacy/netsync/handle_block.go Outdated
// (IsExtended checks the parent's *inputs*, not its outputs) and caused
// a deadlock under the two-phase flow in extendTransactions, where a
// pure-non-local-parent tx only becomes "extended" after phase 2 runs.
tx.Inputs[i].PreviousTxSatoshis = prevTxWrapper.Tx.Outputs[input.PreviousTxOutIndex].Satoshis

@github-actions github-actions Bot Apr 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope for this PR

This issue exists in pre-existing code (ExtendTransaction function) that is not modified by this PR. The PR correctly adds defensive nil-safety checks to the new extendFromTxMap function (lines 1064-1078), which is the code path this PR introduces/modifies.

The ExtendTransaction function (lines 1286-1377) should have the same defensive checks, but that's a separate issue to be addressed in a dedicated security hardening PR, not bundled into this performance optimization work.

Recommendation: File a separate issue to harden ExtendTransaction with the same defensive checks (nil Tx, bounds check using uint32 comparison, nil output/LockingScript checks) to maintain security consistency across both code paths.

…eckpoint

Extends the legacy catch-up quick path to also fire in CATCHINGBLOCKS when
the block being validated is at or below the chain's final hardcoded
checkpoint. The checkpoint anchor gives the same headers-first safety
guarantee as LEGACYSYNCING, so skipping subtree-level re-validation for
those blocks is safe. Past the final checkpoint we fall back to the full
path — subtree validation on tip blocks must not be skipped.

--nocheckpoints disables the anchor entirely; in that mode quickValidationMode
can only activate via LEGACYSYNCING (which bsvd itself gates on checkpoint
coverage upstream in manager.go).

Inlines the branching directly in prepareSubtrees rather than introducing
a helper — the condition is a two-case OR.

Also:
- Wire Config.DisableCheckpoints into SyncManager so prepareSubtrees can
  honour --nocheckpoints.
- subtreevalidation: accept FileTypeSubtree as a local source in
  CheckBlockSubtrees. legacy catch-up's quickValidationMode writes the
  subtree under FileTypeSubtree (not FileTypeSubtreeToCheck) because the
  txs were already validated inline, so the old Exists-check missed it
  and fell back to an HTTP download that cannot succeed for the
  synthetic baseURL="legacy".
- aerospike: run BatchPreviousOutputsDecorate per-tx in parallel via
  errgroup so the shared outpointBatcher can coalesce inputs across txs.
  The sequential loop regressed throughput on large blocks.
@freemans13 freemans13 force-pushed the stu/netsync-quick-validation-catchup branch from f5072d3 to bff8321 Compare April 23, 2026 16:07
freemans13 added 2 commits May 6, 2026 13:52
…alidation-catchup

# Conflicts:
#	services/legacy/netsync/handle_block.go
#	stores/utxo/sql/sql.go
…chup' into stu/netsync-quick-validation-catchup

# Conflicts:
#	services/legacy/netsync/handle_block.go

@oskarszoon oskarszoon left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has a merge conflict with main — please rebase before review.

@sonarqubecloud

Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
55.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@freemans13

Copy link
Copy Markdown
Collaborator Author

Closing — the core change here (broadening quickValidationMode to CATCHINGBLOCKS) was superseded by #847 (merged 2026-05-12), which takes a cleaner approach: gate on highest-checkpoint height rather than FSM state, and drops the per-block IsFSMCurrentState RPC entirely. The disableCheckpoints plumbing in this PR is moot under that approach.

The three independent improvements bundled here will be re-opened as focused PRs:

  • defensive nil-checks in extendFromTxMap (prevTxWrapper.Tx, prevOutput, LockingScript) — not covered by fix(legacy/netsync): bounds-check parent output index in ExtendTransaction #768's bounds-check
  • findLocalSubtreeFile in CheckBlockSubtrees — prevents an HTTP fallback when the subtree exists locally as FileTypeSubtree
  • concurrent BatchPreviousOutputsDecorate on the Aerospike store, with a safer concurrency cap

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants