Skip to content

fix(utxopersister): drop double-read of fileformat magic in verifyLastSet#971

Merged
liam merged 4 commits into
bsv-blockchain:mainfrom
liam:liam/utxopersister-verifylastset-double-read
May 29, 2026
Merged

fix(utxopersister): drop double-read of fileformat magic in verifyLastSet#971
liam merged 4 commits into
bsv-blockchain:mainfrom
liam:liam/utxopersister-verifylastset-double-read

Conversation

@liam

@liam liam commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Summary

verifyLastSet (and CreateUTXOSet's previous-set read path) was calling fileformat.ReadHeader(r) on a reader that the store layer had already advanced past the 8-byte fileformat header. The redundant read consumed the next 8 bytes of the UTXO set body — the first bytes of the block hash field — and reliably surfaced as:

ERROR | service_manager.go:259 | ServiceManager| Received error: unknown magic: [<8 bytes of block hash>]

The unwrapped fmt.Errorf from pkg/fileformat propagated up through processNextBlock to the errgroup, and ServiceManager treated it as fatal — gracefully shutting down the entire core sidecar (RPC, alert, blockpersister, utxopersister, pruner, legacy) on a peer that received a freshly-mined block.

Why "unknown magic: [<random-looking 8 bytes>]"?

UTXO set file layout (per CreateUTXOSet in UTXOSet.go:543-553):

[8 bytes]  fileformat magic   ("US-1.0  ")
[32 bytes] block hash
[4 bytes]  block height
[32 bytes] previous block hash
[N bytes]  UTXO records

The store layer's GetIoReader advances past the 8-byte magic before returning the reader (and, in the file store, validates magic + FileType — stores/blob/file/file.go:1011-1023; the memory store just strips by size — stores/blob/memory/memory.go:269-271). The two read sites then called ReadHeader again, reading the first 8 bytes of the block hash field — essentially random bytes — and rejecting them as an unknown magic.

The bytes from the original crash log ([202 100 224 254 161 3 101 216]) match the expected shape of a chainhash prefix.

When the crash fires

Only when lastWrittenUTXOSetHash != GenesisHash (Server.go:466) — i.e. once the first non-genesis UTXO set has been written. Before that, verifyLastSet is short-circuited, which is why the crash typically manifested a short while into a node's lifetime rather than at boot.

Originally surfaced by the split-mode chaos suite in #958: when a node mined and broadcast a block after a chaos cycle, peers receiving the block hit the second iteration of processNextBlock, triggered verifyLastSet, and crashed.

Fix

verifyLastSet: Drop the redundant fileformat.ReadHeader call. A successful GetUTXOSetReader is sufficient evidence that the file exists.

CreateUTXOSet's previous-set read path (UTXOSet.go:592): Same double-read pattern, with a worse failure mode. The gate at line 567 had prevented this path from ever being reached against a non-genesis previous set, because verifyLastSet was crashing first. Just removing the ReadHeader here would have unblocked the silent-corruption path (the OUTER NewUTXOWrapperFromReader loop misaligned by 8 bytes, consolidating garbage instead of the previous block's UTXOs). The fix:

  • Drops ReadHeader.
  • Consumes the 68 bytes of per-file metadata (current block hash + height + previous block hash) before the wrapper loop.
  • New behaviour (defense in depth): validates that the stored current-block hash equals c.firstPreviousBlockHash. This is a new failure mode — a corrupt or mis-keyed previous set will now fail loudly with a clear block hash mismatch error where before it would have silently consolidated UTXOs from the wrong ancestor. Surfaced because the bug fix exposes a code path that had been entirely unreachable in production.

Docstring / comment cleanup:

  • verifyLastSet comment on memory-vs-file-store header behaviour rewritten for precision (the memory store advances past the header by size; only the file store validates the magic + FileType).
  • verifyLastSet docstring previously promised a footer integrity check that was never implemented; rewritten to describe what the function actually does.

Regression tests

  • TestVerifyLastSet_ExistingValidSet — happy path; without the fix, fails with the exact production error string. Also asserts NotContains "unknown magic" so future divergent failures don't quietly mask this regression.
  • TestCreateUTXOSet_PreviousSetReadDoesNotDoubleReadMagic — happy path against the second site; without the fix fails with the production error chain (STORAGE_ERROR (69): error reading previous utxo-set header -> UNKNOWN (0): unknown magic: [...]).
  • TestCreateUTXOSet_PreviousSetWrongBlockHash — pins the new defense-in-depth check.

All three new tests verified to fail with the corresponding fix removed and pass with it restored.

Test plan

  • go test ./services/utxopersister/... — full package passes (~60s)
  • go test -race -run 'TestVerifyLastSet|TestCreateUTXOSet' ./services/utxopersister/ — passes under race detector
  • Manually verified each new test fails with the corresponding buggy code restored
  • go build ./services/utxopersister/... clean
  • gci diff clean

Out of scope (separate follow-up)

cmd/utxovalidator/utxo_validator.go:77 has the same double-read pattern against blockStore.GetIoReader. It's a CLI tool not on the hot path; deserves its own small PR.

Companion work

This is the second of the two teranode-side blockers preventing test/multinode_split/scenario_04_block_assembly_isolation from running. The first (fix(utxopersister): nil-guard CreateUTXOSet against empty ConsolidateBlockRange) is in #969. Once both land, the scenario's t.Skip line can be removed.

@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

Current Review:
No critical issues found. The fix correctly addresses the double-read bug that was causing production crashes.

Verified:

  • Root cause analysis is accurate: store layer consumes 8-byte magic, redundant ReadHeader consumed first 8 bytes of block hash field
  • File layout matches documented structure (magic + 32-byte hash + 4-byte height + 32-byte prev hash + UTXOs)
  • Both fix sites (verifyLastSet and CreateUTXOSet) correctly drop the redundant ReadHeader call
  • New validation in CreateUTXOSet (block hash match check) adds defense-in-depth
  • Test coverage comprehensive: regression tests pin the exact production failure mode

Minor documentation inconsistency (pre-existing, not blocking):
The error message at UTXOSet.go:552 says "previous block hash" but actually refers to the current block hash (c.lastBlockHash). This is a minor labeling issue in existing code, not introduced by this PR.

@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-971 (757dc8b)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 144
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.774µ 1.627µ ~ 0.400
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 71.03n 71.17n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 71.28n 71.34n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 71.05n 71.02n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 32.93n 34.07n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 55.88n 58.64n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 133.7n 151.5n ~ 0.100
MiningCandidate_Stringify_Short-4 221.3n 222.5n ~ 0.300
MiningCandidate_Stringify_Long-4 1.652µ 1.668µ ~ 0.300
MiningSolution_Stringify-4 860.7n 866.9n ~ 0.400
BlockInfo_MarshalJSON-4 1.755µ 1.847µ ~ 0.100
NewFromBytes-4 125.6n 146.2n ~ 0.600
AddTxBatchColumnar_Validation-4 2.596µ 2.371µ ~ 0.400
OffsetValidationLoop-4 722.3n 720.6n ~ 0.700
Mine_EasyDifficulty-4 66.03µ 66.10µ ~ 1.000
Mine_WithAddress-4 7.291µ 7.213µ ~ 0.700
DiskTxMap_SetIfNotExists-4 3.926µ 3.525µ ~ 0.200
DiskTxMap_SetIfNotExists_Parallel-4 3.388µ 3.258µ ~ 0.100
DiskTxMap_ExistenceOnly-4 317.3n 303.9n ~ 0.100
Queue-4 188.1n 188.3n ~ 1.000
AtomicPointer-4 4.703n 4.924n ~ 0.400
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 878.1µ 871.4µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/10K-4 843.1µ 848.8µ ~ 0.700
ReorgOptimizations/AllMarkFalse/Old/10K-4 115.1µ 115.0µ ~ 0.700
ReorgOptimizations/AllMarkFalse/New/10K-4 62.32µ 62.74µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/10K-4 59.84µ 57.23µ ~ 1.000
ReorgOptimizations/HashSlicePool/New/10K-4 11.61µ 11.33µ ~ 0.200
ReorgOptimizations/NodeFlags/Old/10K-4 5.149µ 4.592µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 1.650µ 1.583µ ~ 0.200
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 9.590m 9.321m ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/100K-4 9.978m 9.670m ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.089m 1.105m ~ 1.000
ReorgOptimizations/AllMarkFalse/New/100K-4 688.9µ 682.8µ ~ 0.100
ReorgOptimizations/HashSlicePool/Old/100K-4 702.7µ 704.5µ ~ 1.000
ReorgOptimizations/HashSlicePool/New/100K-4 340.1µ 339.1µ ~ 1.000
ReorgOptimizations/NodeFlags/Old/100K-4 50.09µ 51.81µ ~ 0.400
ReorgOptimizations/NodeFlags/New/100K-4 18.17µ 17.40µ ~ 0.700
TxMapSetIfNotExists-4 53.98n 52.76n ~ 0.700
TxMapSetIfNotExistsDuplicate-4 41.91n 40.78n ~ 0.700
ChannelSendReceive-4 639.0n 627.4n ~ 0.100
BlockAssembler_AddTx-4 0.02661n 0.02746n ~ 0.700
AddNode-4 11.29 10.86 ~ 0.400
AddNodeWithMap-4 12.10 10.87 ~ 0.100
DirectSubtreeAdd/4_per_subtree-4 61.25n 56.86n ~ 0.700
DirectSubtreeAdd/64_per_subtree-4 28.83n 29.11n ~ 0.100
DirectSubtreeAdd/256_per_subtree-4 27.63n 27.93n ~ 0.100
DirectSubtreeAdd/1024_per_subtree-4 26.51n 26.51n ~ 0.700
DirectSubtreeAdd/2048_per_subtree-4 26.07n 26.15n ~ 0.100
SubtreeProcessorAdd/4_per_subtree-4 298.7n 314.1n ~ 0.100
SubtreeProcessorAdd/64_per_subtree-4 285.5n 347.2n ~ 0.100
SubtreeProcessorAdd/256_per_subtree-4 295.9n 313.8n ~ 0.100
SubtreeProcessorAdd/1024_per_subtree-4 280.6n 299.9n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 279.7n 302.4n ~ 0.100
SubtreeProcessorRotate/4_per_subtree-4 287.3n 306.8n ~ 0.100
SubtreeProcessorRotate/64_per_subtree-4 284.0n 291.8n ~ 0.100
SubtreeProcessorRotate/256_per_subtree-4 283.9n 285.5n ~ 0.200
SubtreeProcessorRotate/1024_per_subtree-4 287.2n 281.2n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 55.39n 55.20n ~ 1.000
SubtreeNodeAddOnly/64_per_subtree-4 36.18n 36.15n ~ 1.000
SubtreeNodeAddOnly/256_per_subtree-4 35.14n 35.21n ~ 0.400
SubtreeNodeAddOnly/1024_per_subtree-4 34.47n 34.44n ~ 1.000
SubtreeCreationOnly/4_per_subtree-4 109.8n 110.9n ~ 0.100
SubtreeCreationOnly/64_per_subtree-4 350.9n 351.4n ~ 1.000
SubtreeCreationOnly/256_per_subtree-4 1.216µ 1.233µ ~ 0.100
SubtreeCreationOnly/1024_per_subtree-4 3.748µ 3.824µ ~ 0.100
SubtreeCreationOnly/2048_per_subtree-4 6.840µ 6.962µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 280.5n 294.3n ~ 0.100
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 281.3n 324.9n ~ 0.100
ParallelGetAndSetIfNotExists/1k_nodes-4 2.030m 2.042m ~ 0.200
ParallelGetAndSetIfNotExists/10k_nodes-4 5.261m 5.561m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 7.333m 8.232m ~ 0.100
ParallelGetAndSetIfNotExists/100k_nodes-4 10.18m 11.41m ~ 0.100
SequentialGetAndSetIfNotExists/1k_nodes-4 1.789m 1.796m ~ 1.000
SequentialGetAndSetIfNotExists/10k_nodes-4 4.679m 5.229m ~ 0.100
SequentialGetAndSetIfNotExists/50k_nodes-4 14.67m 14.71m ~ 0.700
SequentialGetAndSetIfNotExists/100k_nodes-4 26.52m 26.76m ~ 1.000
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 2.080m 2.063m ~ 0.400
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 8.456m 8.367m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 13.78m 13.81m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 1.885m 1.805m ~ 0.200
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 8.524m 8.435m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 47.66m 49.29m ~ 1.000
CalcBlockWork-4 536.2n 515.0n ~ 0.700
CalculateWork-4 689.4n 710.1n ~ 0.100
BuildBlockLocatorString_Helpers/Size_10-4 1.417µ 1.401µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_100-4 15.70µ 14.81µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_1000-4 129.7µ 132.1µ ~ 0.100
CatchupWithHeaderCache-4 104.8m 104.6m ~ 0.200
_prepareTxsPerLevel-4 418.6m 411.5m ~ 0.400
_prepareTxsPerLevelOrdered-4 3.816m 3.690m ~ 0.400
_prepareTxsPerLevel_Comparison/Original-4 418.4m 423.0m ~ 0.200
_prepareTxsPerLevel_Comparison/Optimized-4 3.595m 3.997m ~ 0.100
_BufferPoolAllocation/16KB-4 3.877µ 3.769µ ~ 0.100
_BufferPoolAllocation/32KB-4 7.877µ 10.682µ ~ 0.100
_BufferPoolAllocation/64KB-4 16.58µ 19.67µ ~ 0.700
_BufferPoolAllocation/128KB-4 30.62µ 32.27µ ~ 0.700
_BufferPoolAllocation/512KB-4 109.3µ 128.9µ ~ 0.100
_BufferPoolConcurrent/32KB-4 19.01µ 18.36µ ~ 0.200
_BufferPoolConcurrent/64KB-4 30.48µ 29.57µ ~ 0.100
_BufferPoolConcurrent/512KB-4 146.6µ 141.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/16KB-4 616.6µ 639.2µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 692.3µ 641.5µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/64KB-4 665.2µ 643.6µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/128KB-4 705.1µ 645.6µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/512KB-4 598.4µ 623.7µ ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/16KB-4 35.94m 36.67m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/32KB-4 35.89m 36.44m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/64KB-4 35.86m 36.15m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/128KB-4 35.88m 36.30m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/512KB-4 35.95m 36.02m ~ 0.700
_PooledVsNonPooled/Pooled-4 831.2n 833.6n ~ 0.200
_PooledVsNonPooled/NonPooled-4 7.582µ 8.465µ ~ 0.100
_MemoryFootprint/Current_512KB_32concurrent-4 6.779µ 7.199µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 9.575µ 10.081µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 8.959µ 9.749µ ~ 0.100
SubtreeSizes/10k_tx_4_per_subtree-4 1.245m 1.294m ~ 0.100
SubtreeSizes/10k_tx_16_per_subtree-4 300.6µ 299.7µ ~ 1.000
SubtreeSizes/10k_tx_64_per_subtree-4 70.12µ 70.46µ ~ 1.000
SubtreeSizes/10k_tx_256_per_subtree-4 17.53µ 17.90µ ~ 0.400
SubtreeSizes/10k_tx_512_per_subtree-4 8.853µ 8.722µ ~ 0.700
SubtreeSizes/10k_tx_1024_per_subtree-4 4.314µ 4.333µ ~ 1.000
SubtreeSizes/10k_tx_2k_per_subtree-4 2.180µ 2.137µ ~ 0.100
BlockSizeScaling/10k_tx_64_per_subtree-4 68.42µ 68.48µ ~ 0.700
BlockSizeScaling/10k_tx_256_per_subtree-4 17.65µ 17.33µ ~ 0.100
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.434µ 4.285µ ~ 0.100
BlockSizeScaling/50k_tx_64_per_subtree-4 365.1µ 368.8µ ~ 0.700
BlockSizeScaling/50k_tx_256_per_subtree-4 89.11µ 86.55µ ~ 0.100
BlockSizeScaling/50k_tx_1024_per_subtree-4 21.14µ 21.37µ ~ 0.200
SubtreeAllocations/small_subtrees_exists_check-4 151.9µ 152.5µ ~ 0.700
SubtreeAllocations/small_subtrees_data_fetch-4 158.2µ 157.4µ ~ 1.000
SubtreeAllocations/small_subtrees_full_validation-4 306.9µ 307.7µ ~ 0.400
SubtreeAllocations/medium_subtrees_exists_check-4 9.075µ 9.097µ ~ 0.400
SubtreeAllocations/medium_subtrees_data_fetch-4 9.396µ 9.308µ ~ 0.700
SubtreeAllocations/medium_subtrees_full_validation-4 17.52µ 17.42µ ~ 1.000
SubtreeAllocations/large_subtrees_exists_check-4 2.131µ 2.132µ ~ 1.000
SubtreeAllocations/large_subtrees_data_fetch-4 2.229µ 2.236µ ~ 0.700
SubtreeAllocations/large_subtrees_full_validation-4 4.340µ 4.341µ ~ 0.700
StoreBlock_Sequential/BelowCSVHeight-4 334.0µ 337.5µ ~ 0.200
StoreBlock_Sequential/AboveCSVHeight-4 340.8µ 334.6µ ~ 0.100
GetUtxoHashes-4 269.7n 270.9n ~ 1.000
GetUtxoHashes_ManyOutputs-4 45.82µ 49.94µ ~ 0.400
_NewMetaDataFromBytes-4 218.4n 214.9n ~ 0.700
_Bytes-4 405.3n 398.2n ~ 0.200
_MetaBytes-4 143.0n 139.2n ~ 0.100

Threshold: >10% with p < 0.05 | Generated: 2026-05-28 18:36 UTC

@liam liam requested review from ordishs and oskarszoon May 28, 2026 10:26

@ordishs ordishs left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving — the fix is clean, well-explained, and the regression test is well-designed (uses real layout bytes so restoring the buggy line reproduces the exact production "unknown magic" failure mode rather than a generic header mismatch).

Strong follow-up request: same bug pattern at UTXOSet.go:592

The identical double-read anti-pattern exists in CreateUTXOSet:

// UTXOSet.go:569 — store layer already strips/validates the header
previousUTXOSetReader, err := us.store.GetIoReader(ctx, c.firstPreviousBlockHash[:], fileformat.FileTypeUtxoSet)
...
previousUTXOSetReader = &readCloserWrapper{
    Reader: bufio.NewReaderSize(previousUTXOSetReader, bufferSize.Int()),
    Closer: previousUTXOSetReader.(io.Closer),
}
defer previousUTXOSetReader.Close()

// UTXOSet.go:592 — same double-read bug
previousHeader, err := fileformat.ReadHeader(previousUTXOSetReader)
if err != nil {
    return errors.NewStorageError("error reading previous utxo-set header", err)
}
if previousHeader.FileType() != fileformat.FileTypeUtxoSet {
    return errors.NewStorageError("previous utxo-set header is not a utxo-set header")
}
  • The bufio.NewReaderSize wrap does not reset position — it buffers reads from a reader that is already past the header.
  • The guard at line 567 (firstPreviousBlockHash != GenesisHash) is the same gate that masks the bug during early lifetime, mirroring verifyLastSet.
  • The FileType check is also redundant — validateFileHeader in the file store already enforces it.
  • Subsequent reads via NewUTXOWrapperFromReader would then be misaligned by 8 bytes, likely silently corrupting consolidation rather than crashing — potentially worse than the original bug.

Given the companion work in #969 is actively exercising CreateUTXOSet, please either fix this in the same PR (it's a one-line removal with the same justification) or open an immediate follow-up before scenario_04 is unskipped.

Minor (non-blocking)

  1. Server.go:646 — "already validates the fileformat header (8-byte magic + matching FileType)" is precise for the file store but the memory store at memory.go:269-271 only strips by header size and does not validate the magic. Consider: "already advances past the fileformat header (and, in the file store, validates the magic + FileType)".

  2. Pre-existing: verifyLastSet docstring still promises footer validation that neither the old nor new code performs — worth a follow-up cleanup.

  3. Consider asserting on the absence of "unknown magic" in the test, to make the regression intent explicit against future divergent failures.

@liam

liam commented May 28, 2026

Copy link
Copy Markdown
Collaborator Author

Pushed 4b88151b0 addressing all review feedback:

Strong follow-up (UTXOSet.go:592) — Fixed in the same PR. The reviewer's read of the surrounding code was exactly right: the OUTER UTXOWrapper loop had never reached non-genesis previous sets in practice because verifyLastSet was crashing first. Just dropping the ReadHeader here would have unblocked the silent-corruption path. The fix now:

  • Drops the redundant ReadHeader call.
  • Consumes the 68-byte metadata block that CreateUTXOSet writes (32B current block hash + 4B height + 32B previous block hash) before the wrapper loop.
  • Validates that the stored current-block hash equals c.firstPreviousBlockHash so file/key confusion fails loudly with a clear error instead of silently consolidating from the wrong ancestor.

Two new regression tests:

  • TestCreateUTXOSet_PreviousSetReadDoesNotDoubleReadMagic — happy path; without the fix it fails with the production "unknown magic" error chain (STORAGE_ERROR (69): error reading previous utxo-set header -> UNKNOWN (0): unknown magic: [...]).
  • TestCreateUTXOSet_PreviousSetWrongBlockHash — pins the new mismatch detection.

Minor (non-blocking) — all addressed:

  • Server.go:646 comment rewritten — was file-store-precise but misleading for memory store. New comment is precise for both without overpromising memory store's magic validation.
  • verifyLastSet docstring rewritten — dropped the footer-validation promise that was never implemented; now describes what the function actually does (existence + openability).
  • NotContains "unknown magic" assertion added to TestVerifyLastSet_ExistingValidSet to make regression intent explicit against future divergent failures.

Out of scope but noted for follow-up: cmd/utxovalidator/utxo_validator.go:77 has the same double-read pattern against blockStore.GetIoReader. It's a CLI tool not on the hot path so leaving it for a separate small PR — happy to pick that up next if useful.

@icellan icellan self-requested a review May 28, 2026 11:29

@oskarszoon oskarszoon left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve. Root cause and fix both correct.

Both file and memory stores already advance past the 8-byte magic before returning a reader (file.go:GetIoReader via validateFileHeader; memory.go:Get:271 slices bd.data[header.Size():]). Calling fileformat.ReadHeader on top consumed the first 8 bytes of the body (start of the stored current-block-hash field) and crashed the sidecar.

  • verifyLastSet: ReadHeader removed; store-layer open already checks existence + FileType == FileTypeUtxoSet. Sufficient.
  • CreateUTXOSet (UTXOSet.go:592-619): new code reads the 68-byte metadata explicitly (32 current hash + 4 height + 32 parent hash) matching the layout written at UTXOSet.go:543-556. The storedCurrentBlockHash.IsEqual(c.firstPreviousBlockHash) check is a real correctness improvement — would have caught file/key confusion silently before.

Tests pin the regression on both paths.

Nit: UTXOSet_test.go:113 uses assert.Contains; repo convention is require.Contains. Safe in practice (follows require.Error) but a one-line fix.

@icellan

icellan commented May 28, 2026

Copy link
Copy Markdown
Contributor

Code Review

Verification (local, on 4b88151b0): go build, go vet, full services/utxopersister/... tests (61s), and go test -race on the new regression tests all pass clean. golangci-lint flags 2 trivial prealloc warnings in the new tests.

Verdict: approve with changes. Diagnosis is right, fix is minimal in the right places, and the new tests pin the exact production failure mode. Couple of items worth addressing before merge:

High — dead reads in CreateUTXOSet (services/utxopersister/UTXOSet.go:610-619)

var (
    storedBlockHeight       uint32
    storedPreviousBlockHash chainhash.Hash
)
if err := binary.Read(previousUTXOSetReader, binary.LittleEndian, &storedBlockHeight); err != nil {
    return errors.NewStorageError("error reading previous utxo-set block height", err)
}
if _, err := io.ReadFull(previousUTXOSetReader, storedPreviousBlockHash[:]); err != nil {
    return errors.NewStorageError("error reading previous utxo-set previous block hash", err)
}

These 36 bytes are read into named variables but never compared against anything. The hash-mismatch check above already establishes the file is correctly keyed, so either validate them against something meaningful (note: c.lastBlockHeight / c.previousBlockHash describe the new set being written, not the previous one's stored values — so they don't make a natural comparison without an extra lookup) or just consume the bytes:

if _, err := io.CopyN(io.Discard, previousUTXOSetReader, 4+32); err != nil {
    return errors.NewStorageError("error skipping previous utxo-set header trailer", err)
}

As-is it reads like a half-finished check and invites future "is this var ever used?" confusion.

Medium — PR description undersells the change

The body says "drop the redundant fileformat.ReadHeader," but CreateUTXOSet also gains a new storedCurrentBlockHash.IsEqual(c.firstPreviousBlockHash) validation. That's a defensible defense-in-depth addition (it catches file/key confusion that the old magic check couldn't), but it's a new failure mode — a corrupt or mis-keyed previous set will now fail loudly where before it silently consolidated the wrong UTXOs. Worth one line in the PR body so reviewers don't gloss over it.

Low — preallocate in new tests

services/utxopersister/UTXOSet_test.go:50 and :95golangci-lint wants:

body := make([]byte, 0, len(previousBlockHash)+len(heightBuf)+len(grandparentHash))

Worth fixing while you're here to land lint-clean.

Out of scope (follow-up issue)

cmd/utxovalidator/utxo_validator.go:77 reproduces the exact same anti-pattern — it calls fileformat.ReadHeader(br) on a reader returned by blockStore.GetIoReader(...). Against a File-backed store this CLI tool will hit the same unknown magic failure mode. Do not bundle here; deserves its own PR.

Other notes

  • Docstring on verifyLastSet honestly notes "the previous docstring also promised a footer integrity check that this function never implemented" — appreciated, that's the right way to handle stale comments.
  • Layout cross-check: write side in CreateUTXOSet (UTXOSet.go:543-553) and the direct-file reader in cmd/seeder/seeder.go:410-436 both agree on magic | currentHash | height | prevHash | UTXOs…. The new read side is aligned.
  • No BSV protocol, consensus, or cryptography surface touched.

@icellan icellan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment

liam added 4 commits May 28, 2026 19:18
…tSet

verifyLastSet was calling fileformat.ReadHeader(r) on a reader that the
store layer had already advanced past the 8-byte fileformat header
(file.go's GetIoReader -> validateFileHeader; the memory store's
GetIoReader returns content with the header already stripped).

The extra read consumed the *next* 8 bytes of the UTXO set body —
which, per the file layout written by CreateUTXOSet, are the first 8
bytes of the block hash field. Block hashes are essentially random
bytes, so this reliably failed with:

  ERROR | service_manager.go:259 | ServiceManager| Received error: \
    unknown magic: [<8 bytes of block hash>]

The unwrapped fmt.Errorf from pkg/fileformat propagated through
processNextBlock back to the errgroup, and ServiceManager treated it
as fatal — gracefully shutting down the entire core sidecar (RPC,
alert, blockpersister, utxopersister, pruner, legacy) along with it.

This only fires when lastWrittenUTXOSetHash != GenesisHash
(Server.go:466), i.e. once the first non-genesis UTXO set has been
written. Before that, verifyLastSet is short-circuited. That's why
the crash typically manifested a short while into a node's lifetime
(after the first block past genesis was persisted) rather than at
boot.

Fix: trust the store layer's header validation and drop the redundant
ReadHeader call. A successful GetUTXOSetReader is sufficient evidence
that the file exists and has the correct FileTypeUtxoSet magic.

Regression test (TestVerifyLastSet_ExistingValidSet) seeds a store
with a valid UTXO set whose post-header body matches the real
on-disk layout (32-byte block hash + 4-byte height); without the fix
the test fails with exactly the production error. The existing
TestVerifyLastSet_NoUTXOSetExists still pins the missing-file path.
CI lint failed on services/utxopersister/Server_test.go with gci
flagging mis-aligned inline comments on consecutive append lines.
…ment cleanup

Addresses review feedback on the verifyLastSet fix.

1. The identical double-read pattern existed in CreateUTXOSet at
   UTXOSet.go:592. The store layer's GetIoReader had already advanced
   past the 8-byte fileformat magic, but the function called
   fileformat.ReadHeader on the wrapped (bufio) reader anyway —
   bufio.NewReaderSize does not reset position, it buffers reads from
   the already-advanced reader. Same gate (firstPreviousBlockHash !=
   GenesisHash) masked it during early lifetime. Once verifyLastSet
   stopped crashing, this path would have been exercised and would
   have either errored loudly ("unknown magic") or — worse — silently
   misaligned the OUTER UTXOWrapper loop by 8 bytes, consolidating
   garbage instead of the previous block's UTXOs.

   Fix: drop the ReadHeader call (the store already validated the
   header in the file store; the memory store advances past it by
   size, which is fine for tests) and consume the 68 bytes of
   per-file metadata that CreateUTXOSet itself writes (32-byte
   current block hash + 4-byte height + 32-byte previous block
   hash) before the wrapper loop. Validate the stored current
   block hash equals c.firstPreviousBlockHash so file/key
   confusion fails loudly instead of silently consolidating from
   the wrong ancestor.

2. Server.go verifyLastSet comment was precise for the file store
   but glossed over that the memory store does not validate the
   magic (only strips by size). Rewritten to be precise without
   misrepresenting either implementation.

3. verifyLastSet docstring previously promised header + footer
   integrity checks that the function never implemented. Rewritten
   to describe what it actually does (existence + openability).

4. TestVerifyLastSet_ExistingValidSet now also asserts that the
   error, if any, does not contain "unknown magic" — making the
   regression intent explicit so future divergent failures
   (e.g. some new store-layer change) aren't confused with this
   specific bug.

5. Two new regression tests for the CreateUTXOSet path:
   - PreviousSetReadDoesNotDoubleReadMagic exercises the
     happy-path read of a valid previous UTXO set; without the
     fix it fails with the production "unknown magic" error chain.
   - PreviousSetWrongBlockHash pins the new validation that catches
     file/key confusion.

All four tests verified to fail without the corresponding fix
restored and pass with it.
- UTXOSet.go: replace the two dead-read named variables with an
  io.CopyN(io.Discard) over the 36 trailing metadata bytes. The
  hash-equality check above is the only meaningful validation; the
  height and grandparent hash have no natural comparison target in
  the new set being written, so consuming them rather than parsing
  avoids "is this variable ever used?" confusion.

- UTXOSet_test.go: preallocate the body slices in the two new
  regression tests to satisfy golangci-lint's prealloc check.
@liam liam force-pushed the liam/utxopersister-verifylastset-double-read branch from cfaefb6 to 7ac456f Compare May 28, 2026 18:21
@sonarqubecloud

Copy link
Copy Markdown

@liam liam requested a review from icellan May 29, 2026 08:59
@liam liam merged commit 3306639 into bsv-blockchain:main May 29, 2026
25 checks passed
liam added a commit to liam/teranode that referenced this pull request May 31, 2026
CreateUTXOSet appends a 16-byte footer (txCount + utxoCount) after the
final UTXOWrapper, but the OUTER loop that reads a previous block's UTXO
set during consolidation only broke on a clean io.EOF. With the footer
present, the next txid read came up 16 bytes short and io.ReadFull
returned io.ErrUnexpectedEOF, which the loop treated as fatal:

  STORAGE_ERROR: error reading previous utxo-set (...) at iteration 1
  -> failed to read txid, expected 32 bytes got 16 -> unexpected EOF

That killed the UTXOPersister service and, via errgroup, the whole core
sidecar on any non-genesis consolidation. The path was dead code until
the verifyLastSet/CreateUTXOSet magic double-read fixes (bsv-blockchain#971) made it
reachable, so this latent bug only surfaced afterwards.

Also break the loop when the wrapper read short-reads the footer. The
match is by substring ("unexpected EOF"): teranode's errors.New flattens
a non-*Error cause to its message, so errors.Is(err, io.ErrUnexpectedEOF)
would reduce to the same check - it is not structural. This is narrower
than cmd/utxovalidator (which also matches "failed to read txid"), so a
real non-EOF read error stays loud. A structural fix - a typed sentinel
from FromReader plus validating records-read == txCount - is a follow-up.

Add a unit test that stages a realistic previous set (header + one
wrapper + footer); it reproduces the exact crash without the fix and
passes with it. The pre-existing previous-set test passed only because
it staged zero wrappers and no footer, hitting a clean io.EOF.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants