Skip to content

refactor(subtreevalidation): remove redundant orphanage; rely on sequential revalidation#978

Merged
ordishs merged 6 commits into
bsv-blockchain:mainfrom
ordishs:fix/4641-remove-subtreevalidation-orphanage
May 29, 2026
Merged

refactor(subtreevalidation): remove redundant orphanage; rely on sequential revalidation#978
ordishs merged 6 commits into
bsv-blockchain:mainfrom
ordishs:fix/4641-remove-subtreevalidation-orphanage

Conversation

@ordishs

@ordishs ordishs commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Summary

Remove the Orphanage mechanism from services/subtreevalidation end to end — type, callers, settings, tests, and docs. Net diff: 15 files, +45 / -1632.

Cross-subtree parent ordering in the block path is already handled by validateMissingSubtreesWithOrderedRetry (Phase-3 ordered sequential revalidation). The function's own contract states: "One ordered pass is therefore sufficient; any remaining failure is a real validation error, not an ordering artefact." processOrphans ran after Phase-3, so it was a redundant trailing sweep.

Why

Review of the current code identified three independent mechanisms that already guarantee parent-before-child resolution without the orphanage:

  1. Level-based topological processing within a subtree/batch — parents blessed before children.
  2. All-missing-parent deferral in processTransactionsInLevels — driven by missingParentErrors, not the orphanage. Hands cross-batch parents to the sequential pass.
  3. Phase-3 sequential revalidation — walks failed subtrees in block order; by the time subtree N retries, every earlier subtree has been validated.

Additional smells supporting redundancy:

  • In the Kafka path, adding to the orphanage didn't salvage the failing subtree (errorsFound still triggered a failure return); the only re-bless (processOrphans) was never called there.
  • The orphanage was gated on FSM == RUNNING, so it was inactive during initial sync — exactly when out-of-order is most likely — yet sync works.

Kafka-path missing-parent behavior is intentionally unchanged (subtree still fails; block path remains the backstop).

Test Plan

  • go build ./...
  • go vet -tags testtxmetacache ./services/subtreevalidation/... ./settings/... — clean
  • go test -race -tags testtxmetacache ./services/subtreevalidation/... — pass (~162s)
  • go test -race ./settings/... — pass (~1.8s)
  • golangci-lint run ./services/subtreevalidation/... ./settings/... — only pre-existing prealloc findings in untouched test files
  • Re-framed contract test TestProcessTransactionsInLevels/MissingParentErrors continues to pass, now asserting the deferral instead of orphanage membership

Notes for operators

  • Deployed settings.conf / env vars naming subtreevalidation_orphanageTimeout or subtreevalidation_orphanageMaxSize will be silently ignored — no startup break.
  • Monitoring or dashboards keyed on the old "added N to orphanage" log line should switch to "deferred to sequential revalidation".

Closes #4641

@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

No issues found. This PR successfully removes the redundant orphanage mechanism with clean refactoring and comprehensive documentation updates.

Review Summary:

Correctness: The removal is well-justified. The PR description clearly documents that three independent mechanisms already guarantee parent-before-child resolution:

  • Level-based topological processing within subtrees
  • All-missing-parent deferral in processTransactionsInLevels
  • Phase-3 sequential revalidation in block order

Code Quality:

  • Clean deletions of orphanage code (166 lines from orphanage.go, 1057 lines of tests)
  • Proper error handling maintained throughout
  • Simplified logging from "added to orphanage" to "deferred to sequential revalidation"
  • No orphanage initialization, no processOrphans calls, no FSM state checks

Documentation Accuracy:

  • All references removed from subtreevalidation_reference.md, subtreeValidation.md, and settings docs
  • UTXO settings doc correctly updated: removed orphanage reference, replaced with "sequential revalidation pass"
  • Inline comments accurately reflect new missing-parent handling

Testing:

  • Test plan shows comprehensive validation: build, vet, race tests, golangci-lint all pass
  • Reframed contract test now asserts deferral instead of orphanage membership

Compatibility:

  • Silent backward compatibility maintained: old settings (subtreevalidation_orphanageTimeout, subtreevalidation_orphanageMaxSize) are ignored without startup breaks
  • Clear operator notes for monitoring transition

The refactoring is minimal, focused, and safe.

@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-978 (ea31a6b)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 144
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.758µ 1.768µ ~ 0.700
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 61.75n 61.87n ~ 1.000
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 61.67n 61.69n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 61.62n 61.81n ~ 0.200
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 31.66n 32.28n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 53.02n 53.24n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 109.9n 107.8n ~ 0.700
MiningCandidate_Stringify_Short-4 260.0n 263.6n ~ 0.100
MiningCandidate_Stringify_Long-4 1.932µ 1.957µ ~ 0.100
MiningSolution_Stringify-4 989.9n 993.1n ~ 0.700
BlockInfo_MarshalJSON-4 1.814µ 1.808µ ~ 1.000
NewFromBytes-4 113.8n 114.6n ~ 0.200
AddTxBatchColumnar_Validation-4 2.314µ 2.407µ ~ 0.100
OffsetValidationLoop-4 634.2n 517.5n ~ 0.100
Mine_EasyDifficulty-4 66.88µ 66.82µ ~ 0.700
Mine_WithAddress-4 6.981µ 7.057µ ~ 0.200
DirectSubtreeAdd/4_per_subtree-4 59.26n 58.52n ~ 1.000
DirectSubtreeAdd/64_per_subtree-4 30.00n 31.71n ~ 0.100
DirectSubtreeAdd/256_per_subtree-4 28.97n 30.54n ~ 0.100
DirectSubtreeAdd/1024_per_subtree-4 27.93n 29.37n ~ 0.100
DirectSubtreeAdd/2048_per_subtree-4 27.57n 28.89n ~ 0.100
SubtreeProcessorAdd/4_per_subtree-4 285.3n 285.5n ~ 1.000
SubtreeProcessorAdd/64_per_subtree-4 282.4n 282.3n ~ 0.700
SubtreeProcessorAdd/256_per_subtree-4 282.4n 280.5n ~ 1.000
SubtreeProcessorAdd/1024_per_subtree-4 273.6n 276.4n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 270.8n 277.8n ~ 0.100
SubtreeProcessorRotate/4_per_subtree-4 278.1n 278.3n ~ 1.000
SubtreeProcessorRotate/64_per_subtree-4 276.0n 277.1n ~ 0.400
SubtreeProcessorRotate/256_per_subtree-4 274.9n 279.3n ~ 0.100
SubtreeProcessorRotate/1024_per_subtree-4 273.1n 281.8n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 54.65n 54.34n ~ 0.400
SubtreeNodeAddOnly/64_per_subtree-4 34.27n 34.17n ~ 1.000
SubtreeNodeAddOnly/256_per_subtree-4 33.30n 33.65n ~ 0.100
SubtreeNodeAddOnly/1024_per_subtree-4 32.53n 32.67n ~ 0.100
SubtreeCreationOnly/4_per_subtree-4 114.1n 113.5n ~ 1.000
SubtreeCreationOnly/64_per_subtree-4 405.6n 400.8n ~ 0.200
SubtreeCreationOnly/256_per_subtree-4 1.376µ 1.351µ ~ 0.100
SubtreeCreationOnly/1024_per_subtree-4 4.485µ 4.384µ ~ 0.200
SubtreeCreationOnly/2048_per_subtree-4 8.474µ 8.095µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 272.8n 275.6n ~ 0.700
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 274.3n 275.0n ~ 0.700
ParallelGetAndSetIfNotExists/1k_nodes-4 2.243m 2.218m ~ 0.200
ParallelGetAndSetIfNotExists/10k_nodes-4 5.511m 5.380m ~ 0.700
ParallelGetAndSetIfNotExists/50k_nodes-4 8.398m 7.307m ~ 0.100
ParallelGetAndSetIfNotExists/100k_nodes-4 10.97m 10.33m ~ 0.100
SequentialGetAndSetIfNotExists/1k_nodes-4 1.958m 1.958m ~ 1.000
SequentialGetAndSetIfNotExists/10k_nodes-4 4.454m 4.386m ~ 0.100
SequentialGetAndSetIfNotExists/50k_nodes-4 12.91m 12.90m ~ 1.000
SequentialGetAndSetIfNotExists/100k_nodes-4 22.95m 23.38m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 2.356m 2.262m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 8.456m 8.306m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 14.19m 13.22m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 2.008m 2.026m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 7.644m 7.558m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 41.73m 42.66m ~ 0.400
BlockAssembler_AddTx-4 0.02896n 0.02712n ~ 0.700
AddNode-4 12.00 12.48 ~ 0.100
AddNodeWithMap-4 12.52 12.70 ~ 0.400
DiskTxMap_SetIfNotExists-4 4.283µ 4.205µ ~ 0.400
DiskTxMap_SetIfNotExists_Parallel-4 3.906µ 4.034µ ~ 0.400
DiskTxMap_ExistenceOnly-4 381.9n 435.8n ~ 0.100
Queue-4 207.4n 211.6n ~ 0.200
AtomicPointer-4 8.174n 8.133n ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 872.6µ 846.8µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/10K-4 824.1µ 749.8µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 127.6µ 133.7µ ~ 0.400
ReorgOptimizations/AllMarkFalse/New/10K-4 58.29µ 58.69µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/10K-4 68.14µ 57.82µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/10K-4 11.81µ 11.80µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/10K-4 5.281µ 5.081µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 1.719µ 1.703µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 12.41m 11.97m ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/100K-4 13.18m 12.94m ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.237m 1.250m ~ 1.000
ReorgOptimizations/AllMarkFalse/New/100K-4 733.4µ 739.7µ ~ 0.200
ReorgOptimizations/HashSlicePool/Old/100K-4 639.3µ 618.7µ ~ 0.700
ReorgOptimizations/HashSlicePool/New/100K-4 357.1µ 328.7µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/100K-4 48.28µ 49.83µ ~ 0.200
ReorgOptimizations/NodeFlags/New/100K-4 16.46µ 17.67µ ~ 0.100
TxMapSetIfNotExists-4 53.28n 53.28n ~ 0.800
TxMapSetIfNotExistsDuplicate-4 48.28n 48.21n ~ 1.000
ChannelSendReceive-4 680.3n 688.5n ~ 0.100
CalcBlockWork-4 497.6n 471.0n ~ 0.600
CalculateWork-4 628.6n 628.6n ~ 1.000
BuildBlockLocatorString_Helpers/Size_10-4 1.357µ 1.352µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_100-4 12.96µ 12.92µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_1000-4 149.0µ 139.3µ ~ 0.400
CatchupWithHeaderCache-4 104.8m 104.9m ~ 0.400
_BufferPoolAllocation/16KB-4 4.031µ 4.022µ ~ 1.000
_BufferPoolAllocation/32KB-4 8.520µ 11.498µ ~ 0.200
_BufferPoolAllocation/64KB-4 17.34µ 20.69µ ~ 0.100
_BufferPoolAllocation/128KB-4 33.00µ 36.69µ ~ 0.100
_BufferPoolAllocation/512KB-4 115.7µ 142.5µ ~ 0.100
_BufferPoolConcurrent/32KB-4 20.73µ 23.11µ ~ 0.100
_BufferPoolConcurrent/64KB-4 32.32µ 35.44µ ~ 0.700
_BufferPoolConcurrent/512KB-4 152.9µ 152.0µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/16KB-4 655.8µ 673.5µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/32KB-4 677.7µ 666.7µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/64KB-4 697.5µ 644.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 700.9µ 649.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 646.3µ 675.8µ ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/16KB-4 37.01m 36.79m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/32KB-4 37.31m 36.76m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/64KB-4 37.31m 36.95m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/128KB-4 36.85m 36.82m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/512KB-4 36.67m 36.26m ~ 0.700
_PooledVsNonPooled/Pooled-4 746.2n 740.6n ~ 0.100
_PooledVsNonPooled/NonPooled-4 8.547µ 8.539µ ~ 1.000
_MemoryFootprint/Current_512KB_32concurrent-4 7.084µ 6.653µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 12.216µ 9.801µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 10.433µ 9.462µ ~ 0.100
_prepareTxsPerLevel-4 406.7m 411.5m ~ 0.700
_prepareTxsPerLevelOrdered-4 3.460m 3.765m ~ 0.700
_prepareTxsPerLevel_Comparison/Original-4 412.7m 412.4m ~ 1.000
_prepareTxsPerLevel_Comparison/Optimized-4 3.453m 3.472m ~ 0.700
SubtreeSizes/10k_tx_4_per_subtree-4 1.350m 1.301m ~ 0.400
SubtreeSizes/10k_tx_16_per_subtree-4 301.6µ 305.6µ ~ 0.700
SubtreeSizes/10k_tx_64_per_subtree-4 74.18µ 73.47µ ~ 0.400
SubtreeSizes/10k_tx_256_per_subtree-4 18.29µ 18.25µ ~ 1.000
SubtreeSizes/10k_tx_512_per_subtree-4 9.060µ 8.943µ ~ 0.200
SubtreeSizes/10k_tx_1024_per_subtree-4 4.489µ 4.401µ ~ 0.100
SubtreeSizes/10k_tx_2k_per_subtree-4 2.198µ 2.195µ ~ 0.700
BlockSizeScaling/10k_tx_64_per_subtree-4 70.70µ 69.92µ ~ 0.400
BlockSizeScaling/10k_tx_256_per_subtree-4 18.57µ 17.47µ ~ 0.100
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.408µ 4.399µ ~ 1.000
BlockSizeScaling/50k_tx_64_per_subtree-4 372.2µ 368.5µ ~ 0.700
BlockSizeScaling/50k_tx_256_per_subtree-4 88.08µ 88.35µ ~ 0.700
BlockSizeScaling/50k_tx_1024_per_subtree-4 21.71µ 21.97µ ~ 0.100
SubtreeAllocations/small_subtrees_exists_check-4 151.6µ 152.9µ ~ 0.700
SubtreeAllocations/small_subtrees_data_fetch-4 158.8µ 160.4µ ~ 0.400
SubtreeAllocations/small_subtrees_full_validation-4 311.7µ 311.2µ ~ 1.000
SubtreeAllocations/medium_subtrees_exists_check-4 8.967µ 9.008µ ~ 1.000
SubtreeAllocations/medium_subtrees_data_fetch-4 9.382µ 9.326µ ~ 0.200
SubtreeAllocations/medium_subtrees_full_validation-4 17.61µ 17.60µ ~ 1.000
SubtreeAllocations/large_subtrees_exists_check-4 2.107µ 2.126µ ~ 0.700
SubtreeAllocations/large_subtrees_data_fetch-4 2.246µ 2.261µ ~ 0.700
SubtreeAllocations/large_subtrees_full_validation-4 4.390µ 4.370µ ~ 0.700
StoreBlock_Sequential/BelowCSVHeight-4 314.0µ 311.2µ ~ 0.400
StoreBlock_Sequential/AboveCSVHeight-4 312.0µ 311.8µ ~ 0.100
GetUtxoHashes-4 268.9n 269.1n ~ 0.700
GetUtxoHashes_ManyOutputs-4 51.20µ 45.03µ ~ 0.100
_NewMetaDataFromBytes-4 213.2n 214.8n ~ 0.700
_Bytes-4 398.2n 398.5n ~ 0.900
_MetaBytes-4 139.0n 139.3n ~ 0.700

Threshold: >10% with p < 0.05 | Generated: 2026-05-28 14:02 UTC

@ordishs

ordishs commented May 28, 2026

Copy link
Copy Markdown
Collaborator Author

Thanks for the review. Triage of the medium/minor findings:

Stale orphanage references in stores/utxo/aerospike/ — fixed in c8ae514. Both the infrastructureResultCodes doc comment and the matching utxo_settings.md section now credit the actual handler ("per-record error paths in the calling code — missing parents flow into subtreevalidation's sequential revalidation pass; Lua business-rule rejections surface to the caller") instead of the deleted orphanage. The exclusion rationale itself was correct and is unchanged.

Test coverage of the Kafka-failure path. Took the suggestion to confirm existing coverage rather than add a new bespoke test:

  • Block path's deferral + Phase-3 resolution (the load-bearing logic): TestProcessTransactionsInLevels/MissingParentErrors (re-framed in this PR) plus TestCheckBlockSubtrees and TestCheckBlockSubtrees_SiblingFailureDoesNotCancelInFlight in services/subtreevalidation/check_block_subtrees_test.go cover Phase-2/Phase-3 directly.
  • End-to-end out-of-order multi-node scenarios are exercised implicitly by test/multinode/ and test/e2e/chainintegrity/chain_integrity_test.go.
  • There is no PR-introduced bespoke "Kafka subtree fails → later block resolves the orphan" integration test. Worth filing as a hardening follow-up if reviewers want a named regression test for that specific interaction, but I didn't want to bundle new test infrastructure into a deletion PR.

Verification disclosure. Test plan in the PR description ran go build, go vet -tags testtxmetacache, go test -race -tags testtxmetacache, and golangci-lint. Additionally:

  • staticcheck was not run locally due to a Go-version mismatch between the project's pinned staticcheck and the local toolchain (subagent flagged this during Task 3 verification). CI's staticcheck step is the source of truth.
  • govulncheck and gosec were intentionally skipped: this is a pure deletion that adds no dependencies and no new logic, so they can only re-surface pre-existing findings already present on main.

@ordishs ordishs requested review from freemans13 and icellan and removed request for freemans13 May 28, 2026 13:51
@sonarqubecloud

Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
40.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@ordishs ordishs merged commit e125d1e into bsv-blockchain:main May 29, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants