Skip to content

fix(utxo/aerospike): clear locked flag on pagination records when mining (#1037)#1039

Merged
oskarszoon merged 3 commits into
bsv-blockchain:mainfrom
oskarszoon:fix/sync-locked
Jun 5, 2026
Merged

fix(utxo/aerospike): clear locked flag on pagination records when mining (#1037)#1039
oskarszoon merged 3 commits into
bsv-blockchain:mainfrom
oskarszoon:fix/sync-locked

Conversation

@oskarszoon

@oskarszoon oskarszoon commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #1037 — fresh legacy IBD wedges forever at testnet block 604,315 with TX_LOCKED on two parent UTXOs that read as unlocked.

Root cause: SetMinedMulti (the "mined ⇒ spendable" safety net) only cleared the locked flag on the master UTXO record. For external/paginated transactions (>utxoBatchSize outputs) the pagination/extra records kept locked=true forever, so a child spending an output that lives on a pagination record (vout ≥ utxoBatchSize) failed permanently with TX_LOCKED.

Causal chain (verified on the wedged node)

  1. A tx is created WithLocked(true) on every record (master + pagination) by the block-validation quick-validate pipeline (quick_validate.go:1147) and the validator 2PC.
  2. The child-aware unlock (SetLocked(false), which recurses into pagination records) only runs on success. quick_validate.go:440-443 documents that on error the unlock is never called; recovery is expected from SetMinedMulti on retry (block 604,314 = internal id 635313 fails at SetMinedMulti, then permanently at block-add BLOCK_EXISTS, so the unlock never runs).
  3. SetMinedMulti (lua setMined UDF teranode.lua:647 and the Go expression batch set_mined_expressions.go:276) is keyed on hash[:] = master only, so the documented recovery never touches the pagination records.

Live aerospikereader on both stuck parents:

record locked generation
master (vouts 0–511) false 15
extra rec 1 (vouts 512+) true 2

Children spend cdb5409c… vout 827 and 1519b4cd… vout 575 — both >512, both on the locked pagination record. The master reads unlocked while SPEND_BATCH_LUA reports the parent locked. Node runs the lua path (EnableSetMinedFilterExpressions=false).

Fix

SetMinedMulti now clears locked on all of a mined tx's records, restoring the documented invariant.

  • teranode.lua setMined surfaces totalExtraRecs as childCount on every mine so the client knows how many pagination records to unlock. Lua version bumped v59v60 so the UDF re-registers on startup.
  • New clearLockedOnRecordsMulti helper (unfiltered UPDATE_ONLY Locked=false, tolerant of KEY_NOT_FOUND) clears pagination records. The lock-clearing follow-up is returned from the result processors as a lockClearWork value and executed by the public SetMinedMulti / SetMinedMultiWithExpressions methods (applyLockClearWork), so the processors do no follow-up I/O and stay unit-testable.
  • Expression path, FILTERED_OUT records (blockID already present, where the filter skips the whole write incl. Locked=false and the reads): fully unlocked via SetLocked(false), which reads the child count from the master and clears every record. Closes the filter-gating bug.

Testing

  • go build ./..., go vet ./stores/utxo/aerospike/, gofmt -l: clean.
  • Regression tests (set_mined_locked_test.go, lua + expression paths):
    • TestSetMinedMultiClearsLockOnPaginationRecords — mine clears all records.
    • TestSetMinedMultiWithExpressionsClearsLockWhenBlockIDFilteredOut — master unlocked on a FILTERED_OUT write.
    • TestSetMinedMultiHealsAlreadyMinedLockedPaginationRecord — re-mine of an already-mined tx whose pagination record is still locked heals it (this failed on the expression path before the FILTERED_OUT fix).
  • Full stores/utxo/aerospike package suite: ok.

Deploy / remediation — important

This fix is triggered by SetMinedMulti. It prevents the permanent lock and heals an orphan whenever SetMinedMulti re-runs on the affected tx (i.e. while its block is (re)processed during IBD).

It does not passively heal an instance that is already wedged with the affected block as the accepted tip: on such a node only the failing block (604,315) loops, its parents' block (604,314) is never reprocessed, so SetMinedMulti is never called on the parents and nothing clears the stale pagination lock. To clear an already-wedged instance, do one of:

  • Reset + resync (simplest). On a fresh sync the parents' block is processed under the new code, so the create→fail→retry→SetMinedMulti path heals the orphan as the block is applied — no permanent wedge.
  • Force-reprocess the parents' block without a full resync: invalidateblock <604314 hash> then reconsiderblock <604314 hash>, which re-runs SetMinedMulti on the parents and clears the pagination locks.

Requires the v60 bump (included) so the UDF re-registers on startup.

Server investigation was read-only (docker ps/inspect/logs, asinfo, aerospikereader, getfsmstate, settings dump) — no writes, no FSM changes, no restarts.

…ing (bsv-blockchain#1037)

SetMinedMulti — the "mined => spendable" safety net — only cleared the
`locked` flag on the master UTXO record. For external/paginated transactions
(>utxoBatchSize outputs) the pagination/extra records kept locked=true forever.

A tx is created WithLocked(true) on EVERY record by the block-validation
quick-validate pipeline and the validator 2PC. Those paths unlock via
SetLocked(false), which does recurse into child records — but only on success.
quick_validate documents that on error the unlock is never called and recovery
comes from SetMinedMulti on retry. Because SetMinedMulti was keyed on the master
record only, the pagination records were never unlocked, so a child spending an
output that lives on a pagination record (vout >= utxoBatchSize) failed
permanently with TX_LOCKED, wedging legacy IBD in a reset loop.

Fix: SetMinedMulti now clears locked on all of a mined tx's records.
- teranode.lua setMined surfaces totalExtraRecs as childCount on every mine so
  the client knows how many pagination records to unlock (lua version bumped to
  v60 so the UDF re-registers).
- New clearLockedOnRecordsMulti helper: unfiltered UPDATE_ONLY Locked=false batch
  over the pagination records, tolerant of KEY_NOT_FOUND. The lock-clear I/O is
  performed by the public SetMinedMulti / SetMinedMultiWithExpressions methods so
  the result processors stay free of follow-up writes and remain unit-testable.
- Expression path also clears the master lock on FILTERED_OUT records (blockID
  already present), where the blockID filter would otherwise skip the Locked=false
  op entirely.

Regression test (set_mined_locked_test.go) covers both the Lua UDF and
filter-expression paths plus the FILTERED_OUT case; it fails without this change.
@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

Summary

This PR correctly fixes issue #1037 where legacy IBD wedged forever at block 604,315 with TX_LOCKED on pagination records. The root cause analysis is accurate and the fix is sound.

Findings

No issues found. The implementation demonstrates careful engineering:

  • Both code paths covered: Lua UDF and filter-expression paths both clear pagination locks
  • FILTERED_OUT edge case handled: Expression path uses SetLocked(false) when blockID filter skips the write
  • Comprehensive testing: Three regression tests cover the main bug, the filter-gating bug, and the re-mine healing scenario
  • Testable design: Lock-clearing I/O separated from result processors for unit-testability
  • Error handling: clearLockedOnRecordsMulti tolerates KEY_NOT_FOUND appropriately
  • Lua version bump: v59→v60 ensures UDF re-registers on startup
  • Documentation: Extensive inline comments explain the causal chain and design decisions

The fix properly maintains the invariant that mining a transaction makes ALL outputs spendable by clearing locks on master + pagination records.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-1039 (fe21f3e)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 148
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.598µ 1.596µ ~ 0.800
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 71.01n 71.15n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 71.30n 71.71n ~ 0.100
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 71.06n 71.14n ~ 0.400
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 32.81n 32.59n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 55.93n 54.78n ~ 0.700
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 128.1n 130.7n ~ 0.300
MiningCandidate_Stringify_Short-4 217.2n 217.2n ~ 0.800
MiningCandidate_Stringify_Long-4 1.650µ 1.645µ ~ 1.000
MiningSolution_Stringify-4 850.1n 848.8n ~ 0.400
BlockInfo_MarshalJSON-4 1.717µ 1.721µ ~ 0.100
NewFromBytes-4 124.4n 124.3n ~ 1.000
AddTxBatchColumnar_Validation-4 2.510µ 2.516µ ~ 1.000
OffsetValidationLoop-4 719.4n 719.6n ~ 1.000
Mine_EasyDifficulty-4 60.45µ 60.81µ ~ 0.200
Mine_WithAddress-4 6.810µ 7.012µ ~ 1.000
BlockAssembler_AddTx-4 0.02708n 0.02818n ~ 0.300
AddNode-4 12.46 12.21 ~ 0.100
AddNodeWithMap-4 13.21 13.49 ~ 0.700
DirectSubtreeAdd/4_per_subtree-4 58.15n 58.30n ~ 1.000
DirectSubtreeAdd/64_per_subtree-4 31.65n 32.56n ~ 0.100
DirectSubtreeAdd/256_per_subtree-4 31.05n 30.66n ~ 0.700
DirectSubtreeAdd/1024_per_subtree-4 29.35n 29.27n ~ 0.200
DirectSubtreeAdd/2048_per_subtree-4 28.87n 28.93n ~ 0.100
SubtreeProcessorAdd/4_per_subtree-4 281.0n 285.3n ~ 1.000
SubtreeProcessorAdd/64_per_subtree-4 276.0n 277.7n ~ 0.100
SubtreeProcessorAdd/256_per_subtree-4 278.3n 283.5n ~ 0.100
SubtreeProcessorAdd/1024_per_subtree-4 270.6n 270.9n ~ 0.700
SubtreeProcessorAdd/2048_per_subtree-4 269.4n 270.9n ~ 0.700
SubtreeProcessorRotate/4_per_subtree-4 276.9n 275.9n ~ 0.400
SubtreeProcessorRotate/64_per_subtree-4 273.3n 279.5n ~ 0.400
SubtreeProcessorRotate/256_per_subtree-4 274.7n 280.7n ~ 0.100
SubtreeProcessorRotate/1024_per_subtree-4 275.5n 271.9n ~ 0.400
SubtreeNodeAddOnly/4_per_subtree-4 54.01n 54.62n ~ 0.100
SubtreeNodeAddOnly/64_per_subtree-4 34.31n 34.38n ~ 0.400
SubtreeNodeAddOnly/256_per_subtree-4 33.28n 33.46n ~ 0.300
SubtreeNodeAddOnly/1024_per_subtree-4 32.75n 32.59n ~ 0.500
SubtreeCreationOnly/4_per_subtree-4 114.4n 114.4n ~ 0.700
SubtreeCreationOnly/64_per_subtree-4 403.5n 405.1n ~ 1.000
SubtreeCreationOnly/256_per_subtree-4 1.356µ 1.364µ ~ 0.700
SubtreeCreationOnly/1024_per_subtree-4 4.403µ 4.434µ ~ 0.400
SubtreeCreationOnly/2048_per_subtree-4 8.110µ 8.231µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 275.2n 277.6n ~ 0.400
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 274.2n 277.4n ~ 0.100
ParallelGetAndSetIfNotExists/1k_nodes-4 2.216m 2.193m ~ 0.100
ParallelGetAndSetIfNotExists/10k_nodes-4 5.525m 5.450m ~ 0.400
ParallelGetAndSetIfNotExists/50k_nodes-4 7.781m 7.580m ~ 0.700
ParallelGetAndSetIfNotExists/100k_nodes-4 10.59m 10.80m ~ 0.700
SequentialGetAndSetIfNotExists/1k_nodes-4 1.948m 1.933m ~ 0.100
SequentialGetAndSetIfNotExists/10k_nodes-4 4.493m 4.550m ~ 0.700
SequentialGetAndSetIfNotExists/50k_nodes-4 12.56m 13.76m ~ 0.100
SequentialGetAndSetIfNotExists/100k_nodes-4 22.50m 24.02m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 2.260m 2.258m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 8.527m 8.311m ~ 0.400
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 13.71m 13.06m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 1.961m 1.951m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 7.819m 7.424m ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 42.33m 41.25m ~ 0.100
DiskTxMap_SetIfNotExists-4 3.390µ 3.476µ ~ 0.700
DiskTxMap_SetIfNotExists_Parallel-4 3.379µ 3.186µ ~ 0.100
DiskTxMap_ExistenceOnly-4 303.0n 294.7n ~ 0.100
Queue-4 190.5n 188.0n ~ 0.700
AtomicPointer-4 4.404n 4.266n ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 852.0µ 857.5µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/10K-4 802.0µ 812.0µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 105.8µ 109.2µ ~ 0.700
ReorgOptimizations/AllMarkFalse/New/10K-4 61.57µ 61.47µ ~ 1.000
ReorgOptimizations/HashSlicePool/Old/10K-4 65.00µ 56.62µ ~ 0.700
ReorgOptimizations/HashSlicePool/New/10K-4 11.37µ 12.00µ ~ 0.100
ReorgOptimizations/NodeFlags/Old/10K-4 4.553µ 4.724µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 1.579µ 1.654µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 9.088m 9.231m ~ 0.700
ReorgOptimizations/DedupFilterPipeline/New/100K-4 8.981m 9.040m ~ 0.400
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.039m 1.083m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 680.2µ 681.6µ ~ 1.000
ReorgOptimizations/HashSlicePool/Old/100K-4 600.4µ 618.3µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 301.4µ 338.3µ ~ 0.100
ReorgOptimizations/NodeFlags/Old/100K-4 48.09µ 49.83µ ~ 0.100
ReorgOptimizations/NodeFlags/New/100K-4 16.66µ 17.22µ ~ 0.100
TxMapSetIfNotExists-4 51.91n 52.07n ~ 0.100
TxMapSetIfNotExistsDuplicate-4 40.02n 39.90n ~ 0.700
ChannelSendReceive-4 595.2n 627.4n ~ 0.100
CalcBlockWork-4 363.1n 364.7n ~ 1.000
CalculateWork-4 489.5n 498.9n ~ 0.700
CheckOldBlockIDs/on-chain-prefetch/1000-4 55.38µ 54.82µ ~ 0.200
CheckOldBlockIDs/off-chain-prefetch/1000-4 45.23µ 45.39µ ~ 0.400
CheckOldBlockIDs/on-chain-prefetch/10000-4 417.7µ 542.9µ ~ 0.100
CheckOldBlockIDs/off-chain-prefetch/10000-4 332.8µ 348.0µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_10-4 1.355µ 1.448µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_100-4 12.90µ 14.02µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_1000-4 127.4µ 139.0µ ~ 0.100
CatchupWithHeaderCache-4 104.5m 104.7m ~ 0.100
_BufferPoolAllocation/16KB-4 3.856µ 3.836µ ~ 0.500
_BufferPoolAllocation/32KB-4 8.344µ 8.519µ ~ 0.700
_BufferPoolAllocation/64KB-4 18.39µ 15.72µ ~ 0.700
_BufferPoolAllocation/128KB-4 33.63µ 27.16µ ~ 0.200
_BufferPoolAllocation/512KB-4 111.0µ 110.4µ ~ 0.700
_BufferPoolConcurrent/32KB-4 20.67µ 18.75µ ~ 0.200
_BufferPoolConcurrent/64KB-4 30.38µ 29.78µ ~ 0.200
_BufferPoolConcurrent/512KB-4 150.1µ 146.3µ ~ 0.200
_SubtreeDeserializationWithBufferSizes/16KB-4 649.7µ 704.2µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 637.5µ 695.4µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 650.5µ 703.9µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 650.1µ 709.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 625.4µ 624.4µ ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/16KB-4 36.83m 36.59m ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/32KB-4 36.88m 36.29m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/64KB-4 36.89m 36.36m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/128KB-4 36.85m 36.29m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/512KB-4 36.70m 36.00m ~ 0.100
_PooledVsNonPooled/Pooled-4 835.3n 831.0n ~ 0.100
_PooledVsNonPooled/NonPooled-4 8.871µ 7.939µ ~ 0.100
_MemoryFootprint/Current_512KB_32concurrent-4 7.123µ 7.360µ ~ 0.700
_MemoryFootprint/Proposed_32KB_32concurrent-4 10.014µ 9.642µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 9.576µ 9.106µ ~ 0.200
_prepareTxsPerLevel-4 404.0m 407.3m ~ 0.400
_prepareTxsPerLevelOrdered-4 3.544m 3.910m ~ 0.400
_prepareTxsPerLevel_Comparison/Original-4 397.0m 406.7m ~ 0.200
_prepareTxsPerLevel_Comparison/Optimized-4 3.545m 3.709m ~ 0.100
SubtreeSizes/10k_tx_4_per_subtree-4 1.255m 1.275m ~ 0.400
SubtreeSizes/10k_tx_16_per_subtree-4 295.8µ 299.9µ ~ 0.700
SubtreeSizes/10k_tx_64_per_subtree-4 71.02µ 71.01µ ~ 1.000
SubtreeSizes/10k_tx_256_per_subtree-4 17.70µ 17.96µ ~ 0.400
SubtreeSizes/10k_tx_512_per_subtree-4 8.828µ 8.825µ ~ 0.700
SubtreeSizes/10k_tx_1024_per_subtree-4 4.451µ 4.379µ ~ 0.700
SubtreeSizes/10k_tx_2k_per_subtree-4 2.199µ 2.156µ ~ 0.100
BlockSizeScaling/10k_tx_64_per_subtree-4 70.92µ 68.99µ ~ 0.700
BlockSizeScaling/10k_tx_256_per_subtree-4 18.06µ 17.31µ ~ 0.100
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.399µ 4.331µ ~ 0.200
BlockSizeScaling/50k_tx_64_per_subtree-4 368.2µ 368.4µ ~ 0.400
BlockSizeScaling/50k_tx_256_per_subtree-4 87.38µ 87.56µ ~ 1.000
BlockSizeScaling/50k_tx_1024_per_subtree-4 21.42µ 21.69µ ~ 0.100
SubtreeAllocations/small_subtrees_exists_check-4 149.9µ 151.0µ ~ 0.700
SubtreeAllocations/small_subtrees_data_fetch-4 157.7µ 158.4µ ~ 0.700
SubtreeAllocations/small_subtrees_full_validation-4 309.6µ 307.6µ ~ 0.200
SubtreeAllocations/medium_subtrees_exists_check-4 8.899µ 8.910µ ~ 1.000
SubtreeAllocations/medium_subtrees_data_fetch-4 9.283µ 9.251µ ~ 1.000
SubtreeAllocations/medium_subtrees_full_validation-4 17.55µ 17.31µ ~ 0.700
SubtreeAllocations/large_subtrees_exists_check-4 2.097µ 2.104µ ~ 0.700
SubtreeAllocations/large_subtrees_data_fetch-4 2.222µ 2.205µ ~ 1.000
SubtreeAllocations/large_subtrees_full_validation-4 4.321µ 4.400µ ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 341.1µ 336.5µ ~ 0.200
StoreBlock_Sequential/AboveCSVHeight-4 339.7µ 339.6µ ~ 1.000
GetUtxoHashes-4 282.5n 276.0n ~ 1.000
GetUtxoHashes_ManyOutputs-4 45.71µ 45.90µ ~ 0.700
_NewMetaDataFromBytes-4 213.5n 213.6n ~ 1.000
_Bytes-4 403.1n 398.0n ~ 1.000
_MetaBytes-4 140.4n 138.5n ~ 0.200

Threshold: >10% with p < 0.05 | Generated: 2026-06-04 21:06 UTC

…re-mine (bsv-blockchain#1037)

Follow-up to the pagination-lock fix. Two gaps closed:

1. Expression path, FILTERED_OUT (blockID already present): the batch write —
   and all its reads — are skipped, so neither the child count nor the master's
   lock state are known. The previous attempt only cleared the master, leaving
   pagination records locked, so an already-orphaned paginated tx never healed on
   re-mine. Now such records go through SetLocked(false), which reads the child
   count from the master and clears every record (master + all pagination records).

2. The lock-clearing follow-up is now returned from the result processors as a
   lockClearWork value and executed by the public SetMinedMulti /
   SetMinedMultiWithExpressions methods via applyLockClearWork, so the processors
   do no follow-up I/O and stay unit-testable.

Adds TestSetMinedMultiHealsAlreadyMinedLockedPaginationRecord, which reproduces
the wedged state (master mined, pagination record still locked) and re-mines with
an already-present blockID; it failed on the expression path before this change.
…in#1037 pagination-lock fix

Findings from a mixture-of-experts review of PR bsv-blockchain#1039.

P1 — close the unit-coverage gap on the load-bearing FILTERED_OUT unlock:
- TestProcessBatchResultsForSetMinedExpressions_FilteredOutSynthesizesMapEntry now
  binds the lockClearWork return and asserts work.fullUnlock contains the hash (and
  work.items is empty). Previously it discarded the return, so deleting the
  work.fullUnlock append in set_mined_expressions.go left the test green; only the
  container integration test (skipped without Docker) guarded it.

P2 / nits:
- teranode.lua: document the childCount == totalExtraRecs invariant behind the
  unconditional FIELD_CHILD_COUNT overwrite on a mine (comment only, no version bump).
- set_mined.go applyLockClearWork: assign the first non-nil error directly instead of
  errors.Join(nil, err), which would drop the rich *errors.Error type.
- set_mined.go clearLockedOnRecordsMulti: guard the child-index loop against a
  non-positive childCount (uint32 wrap), and document the reliance on per-record
  (not top-level) KEY_NOT_FOUND reporting.
- set_mined_expressions.go: collapse the nested `if nrErrors > 0 { if errs != nil }`
  to the single guard the Lua path uses (`errs != nil || nrErrors > 0`).

New integration test TestSetMinedMultiUnlocksDespiteSiblingError (lua + expression):
a paginated tx that mines successfully alongside a sibling hash that errors must
still have its pagination records unlocked — guards the "runs even when err != nil"
contract that prevents a single bad tx from stranding a healthy parent's lock.
@sonarqubecloud

sonarqubecloud Bot commented Jun 4, 2026

Copy link
Copy Markdown

@ordishs ordishs left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved.

Verified the fix against the actual record semantics rather than the description:

  • Expression path writes Locked=false on the master (set_mined_expressions.go:276) and reads back TotalExtraRecs (:266).
  • Lua childCount invariant holds: setDeleteAtHeight returns childCount == BIN_TOTAL_EXTRA_RECS for external records, and the new unconditional overwrite sits outside the signal block, so DAH follow-up work is unaffected.
  • fullUnlock via SetLocked(false) recurses into pagination records (locked.go:177-204).
  • GetAerospikeBatchWritePolicy returns a fresh policy, so mutating RecordExistsAction = UPDATE_ONLY is local (no shared-state race).
  • clearLockedOnRecordsMulti mirrors the accepted SetDAHForChildRecordsMulti pattern.

Strengths: thorough root-cause evidence, testable processor/IO separation, and TestSetMinedMultiUnlocksDespiteSiblingError pins the load-bearing 'run lock-clear even on sibling error' behavior.

Non-blocking notes:

  • Write amplification: every mine of an external tx now issues an extra batch write to pagination records unconditionally (heavier on the expression FILTERED_OUT path = full SetLocked per tx). Inherent to the approach; worth monitoring SetMinedBatch metrics + Aerospike write load post-deploy.
  • Ensure CI actually runs the aerospike-tagged tests — all new behavioral tests skip under -short, so they're the only end-to-end protection for this fix.
  • Carry the 'already-wedged tip needs reset+resync or invalidateblock/reconsiderblock on 604,314' caveat into the deploy runbook.

@oskarszoon oskarszoon self-assigned this Jun 5, 2026

@freemans13 freemans13 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved.

Correct, well-reasoned fix for a real production wedge (#1037). Root-cause analysis is thorough and the fix matches it: SetMinedMulti now clears locked on all of a mined tx's records (master + pagination), restoring the documented mined-⇒-spendable invariant. Clean separation — result processors stay write-free and return []lockClearItem, the public methods perform the lock-clear I/O. Lua v59→v60 bump is present. Build passes locally.

Verified the one open question (expression-path FILTERED_OUT only clears the master, not pagination) is not a forward-operation bug: under normal operation the first mine always passes the filter and clears pagination, so the 'blockID present + pagination still locked' state is unreachable except from pre-fix data. The reported incident is on the default lua path, which self-heals correctly on re-mine. Residual gap is a bounded self-heal limitation for EE expression-path nodes carrying pre-fix wedged txs — acceptable to address as a follow-up.

Non-blocking follow-ups for later:

  • Close the expression-path FILTERED_OUT pagination gap (e.g. route filtered txs through the lua UDF for the heal), or document the lua-path remediation.
  • Add a non-integration unit test for clearLockedOnRecordsMulti counting / KEY_NOT_FOUND tolerance (current new tests are Aerospike integration-only, skipped in make test).

@oskarszoon oskarszoon merged commit 88efbce into bsv-blockchain:main Jun 5, 2026
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

legacy: fresh IBD wedges at testnet 604,315 — TX_LOCKED on parent UTXOs created by parallel propagation/peer path, never unlocked

3 participants