Skip to content

fix(utxovalidator): consume fileformat magic at the reader source, not twice in validate#979

Merged
liam merged 2 commits into
bsv-blockchain:mainfrom
liam:liam/utxovalidator-double-read
May 29, 2026
Merged

fix(utxovalidator): consume fileformat magic at the reader source, not twice in validate#979
liam merged 2 commits into
bsv-blockchain:mainfrom
liam:liam/utxovalidator-double-read

Conversation

@liam

@liam liam commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Summary

Same double-read anti-pattern as #971, this time in the utxovalidator CLI tool. Flagged in #971's review as a separate follow-up because this isn't on a hot path.

ValidateUTXOFile accepts a path and produces an io.ReadCloser from one of two sources:

getLocalFileReader  -> os.Open(path)
getBlobStoreReader  -> blockStore.GetIoReader(...)

Only the second consumes the 8-byte fileformat magic before returning (via the file store's validateFileHeader or the memory store's strip-by-size). validateUTXOSet then unconditionally called fileformat.ReadHeader(br), which:

  • Worked for local-file callers — the file still has the magic at offset 0.
  • Crashed for blob-store callers — the next 8 bytes of the body are the first bytes of the block hash field, so the read failed with unknown magic: [<random hash bytes>].

The blob-store path is reachable when the CLI tool is pointed at a 64-char block hash instead of a file path (utxo_validator.go:170-176).

Fix

Uniformise the two readers so both hand validateUTXOSet a body-only reader:

  • getLocalFileReader now consumes and validates the magic itself (returns a reader positioned past the magic — same shape as GetIoReader).
  • validateUTXOSet drops the ReadHeader call entirely.
  • The FileType == FileTypeUtxoSet assertion moves into getLocalFileReader. The blob-store path keeps its existing FileType check via the file store's validateFileHeader.

Regression tests

  • TestValidateUTXOFile_LocalDoesNotDoubleReadMagic — writes a minimal valid UTXO set file to a tempdir and verifies ValidateUTXOFile parses the block hash / height / previous hash correctly. Without the fix, restoring the ReadHeader call in validateUTXOSet causes this test to fail with the exact production error chain (PROCESSING (4): error reading file header -> UNKNOWN (0): unknown magic: [...]).
  • TestValidateUTXOFile_LocalRejectsWrongFileType — writes a file with subtree magic and verifies the FileType check that moved into getLocalFileReader still rejects mistyped files with a clear error.

Test plan

  • go test ./cmd/utxovalidator/... clean
  • go test -race -run TestValidateUTXOFile_Local ./cmd/utxovalidator/ clean
  • go build ./cmd/utxovalidator/... clean
  • gci diff clean
  • Manually verified each new test fails with the buggy ReadHeader line restored

Related

Same anti-pattern, different site as #971 (fix(utxopersister): drop double-read of fileformat magic in verifyLastSet). Reviewer in #971 explicitly asked to land this as a separate small PR.

…t twice in validate

ValidateUTXOFile routes through two reader sources:

  getLocalFileReader -> os.Open(path)
  getBlobStoreReader -> blockStore.GetIoReader(...)

Only the second consumes the 8-byte fileformat magic before returning
(via the file store's validateFileHeader, or the memory store's
strip-by-size). validateUTXOSet was then unconditionally calling
fileformat.ReadHeader(br), which:

- worked for local-file callers (file still has the magic), but
- against the blob store, read the *next* 8 bytes of the body —
  the first bytes of the block hash field — and failed with
  "unknown magic: [<random hash bytes>]"

The blob-store path is the same anti-pattern as the verifyLastSet /
CreateUTXOSet fix in bsv-blockchain#971; flagged in its review as a separate
follow-up because this is a CLI tool not on the hot path.

Fix: uniformise the two readers so both hand validateUTXOSet a
body-only reader.

  - getLocalFileReader now consumes + validates the magic itself
    (returns a reader positioned past the magic, same shape as
    GetIoReader).
  - validateUTXOSet drops its ReadHeader call. Both reader sources
    are now equivalent.
  - The FileType==FileTypeUtxoSet assertion moves into
    getLocalFileReader since that's where the read happens. The
    blob-store path keeps its existing FileType check via the
    file store's validateFileHeader.

Regression tests on the local-file path:

  - LocalDoesNotDoubleReadMagic: a tempfile with valid magic +
    metadata + no wrappers; without the fix, fails with the exact
    production error chain.
  - LocalRejectsWrongFileType: a tempfile with subtree magic;
    pins that the FileType check moved into getLocalFileReader
    still rejects mistyped files with a clear error.
@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

No issues found. The fix correctly unifies the reader contract between local-file and blob-store paths.

Summary:

  • The bug fix is sound: moved magic consumption into getLocalFileReader to match the blob-store contract
  • Both reader sources now consistently return body-only readers positioned past the 8-byte magic
  • Tests properly verify the fix (double-read regression test + file-type validation)
  • Code follows project patterns and includes clear documentation

// not be confused with the original double-read bug.
if err != nil {
assert.NotContains(t, err.Error(), "unknown magic")
}

@github-actions github-actions Bot May 28, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Minor] Lines 82-84 are unreachable since line 79 uses require.NoError which exits the test on error. Consider removing:

if err != nil {
    assert.NotContains(t, err.Error(), "unknown magic")
}

The regression documentation value is already covered by the require statement message at line 79.

@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-979 (da8ed52)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 144
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.634µ 1.611µ ~ 0.100
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 71.39n 71.78n ~ 1.000
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 71.02n 71.22n ~ 0.200
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 71.07n 71.30n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 35.28n 39.09n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 61.95n 62.88n ~ 0.700
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 145.9n 152.8n ~ 1.000
MiningCandidate_Stringify_Short-4 220.8n 219.1n ~ 0.400
MiningCandidate_Stringify_Long-4 1.668µ 1.635µ ~ 0.100
MiningSolution_Stringify-4 869.5n 862.2n ~ 0.400
BlockInfo_MarshalJSON-4 1.760µ 1.828µ ~ 0.100
NewFromBytes-4 123.9n 123.3n ~ 0.100
AddTxBatchColumnar_Validation-4 2.353µ 2.366µ ~ 1.000
OffsetValidationLoop-4 720.6n 720.2n ~ 1.000
Mine_EasyDifficulty-4 60.67µ 60.94µ ~ 0.100
Mine_WithAddress-4 6.780µ 6.781µ ~ 0.700
BlockAssembler_AddTx-4 0.02926n 0.02505n ~ 0.400
AddNode-4 10.75 10.51 ~ 1.000
AddNodeWithMap-4 11.50 10.54 ~ 0.200
DirectSubtreeAdd/4_per_subtree-4 58.43n 57.46n ~ 1.000
DirectSubtreeAdd/64_per_subtree-4 29.09n 28.97n ~ 0.400
DirectSubtreeAdd/256_per_subtree-4 27.96n 28.13n ~ 1.000
DirectSubtreeAdd/1024_per_subtree-4 26.51n 26.43n ~ 0.700
DirectSubtreeAdd/2048_per_subtree-4 26.07n 26.13n ~ 0.600
SubtreeProcessorAdd/4_per_subtree-4 300.8n 288.5n ~ 0.100
SubtreeProcessorAdd/64_per_subtree-4 303.2n 282.3n ~ 0.100
SubtreeProcessorAdd/256_per_subtree-4 291.6n 283.9n ~ 0.100
SubtreeProcessorAdd/1024_per_subtree-4 283.6n 277.5n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 283.7n 277.1n ~ 0.100
SubtreeProcessorRotate/4_per_subtree-4 287.0n 281.1n ~ 0.200
SubtreeProcessorRotate/64_per_subtree-4 297.1n 278.9n ~ 0.200
SubtreeProcessorRotate/256_per_subtree-4 301.7n 280.3n ~ 0.200
SubtreeProcessorRotate/1024_per_subtree-4 284.2n 279.7n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 55.25n 54.81n ~ 0.100
SubtreeNodeAddOnly/64_per_subtree-4 36.14n 36.20n ~ 1.000
SubtreeNodeAddOnly/256_per_subtree-4 35.07n 35.16n ~ 0.300
SubtreeNodeAddOnly/1024_per_subtree-4 34.42n 34.52n ~ 0.500
SubtreeCreationOnly/4_per_subtree-4 111.5n 111.3n ~ 0.400
SubtreeCreationOnly/64_per_subtree-4 356.7n 346.7n ~ 0.100
SubtreeCreationOnly/256_per_subtree-4 1.273µ 1.245µ ~ 0.200
SubtreeCreationOnly/1024_per_subtree-4 3.906µ 3.818µ ~ 0.100
SubtreeCreationOnly/2048_per_subtree-4 7.114µ 6.800µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 282.0n 279.9n ~ 1.000
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 286.8n 277.8n ~ 0.100
ParallelGetAndSetIfNotExists/1k_nodes-4 2.028m 2.007m ~ 0.700
ParallelGetAndSetIfNotExists/10k_nodes-4 5.209m 5.133m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 7.357m 7.130m ~ 0.200
ParallelGetAndSetIfNotExists/100k_nodes-4 10.36m 10.34m ~ 0.400
SequentialGetAndSetIfNotExists/1k_nodes-4 1.810m 1.786m ~ 0.700
SequentialGetAndSetIfNotExists/10k_nodes-4 4.657m 4.533m ~ 0.200
SequentialGetAndSetIfNotExists/50k_nodes-4 14.01m 14.12m ~ 0.700
SequentialGetAndSetIfNotExists/100k_nodes-4 25.66m 26.53m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 2.103m 2.077m ~ 1.000
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 8.469m 8.332m ~ 0.200
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 13.64m 14.04m ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 1.841m 1.835m ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 9.333m 9.642m ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 46.00m 50.88m ~ 0.100
DiskTxMap_SetIfNotExists-4 3.759µ 3.729µ ~ 0.700
DiskTxMap_SetIfNotExists_Parallel-4 3.484µ 3.563µ ~ 0.200
DiskTxMap_ExistenceOnly-4 310.7n 331.5n ~ 0.100
Queue-4 196.4n 191.5n ~ 0.700
AtomicPointer-4 3.668n 3.633n ~ 0.700
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 892.7µ 883.5µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/10K-4 826.2µ 832.4µ ~ 0.700
ReorgOptimizations/AllMarkFalse/Old/10K-4 112.2µ 108.1µ ~ 0.700
ReorgOptimizations/AllMarkFalse/New/10K-4 64.29µ 64.56µ ~ 1.000
ReorgOptimizations/HashSlicePool/Old/10K-4 62.98µ 52.38µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/10K-4 11.10µ 11.25µ ~ 0.700
ReorgOptimizations/NodeFlags/Old/10K-4 5.225µ 4.616µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 2.008µ 1.661µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 10.056m 9.797m ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/100K-4 9.502m 10.877m ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.156m 1.095m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 705.1µ 709.6µ ~ 0.200
ReorgOptimizations/HashSlicePool/Old/100K-4 588.3µ 580.6µ ~ 0.700
ReorgOptimizations/HashSlicePool/New/100K-4 202.7µ 206.6µ ~ 1.000
ReorgOptimizations/NodeFlags/Old/100K-4 49.60µ 50.26µ ~ 0.100
ReorgOptimizations/NodeFlags/New/100K-4 16.98µ 17.97µ ~ 0.100
TxMapSetIfNotExists-4 50.60n 50.36n ~ 1.000
TxMapSetIfNotExistsDuplicate-4 41.65n 42.97n ~ 0.100
ChannelSendReceive-4 600.5n 622.0n ~ 0.200
CalcBlockWork-4 371.3n 366.2n ~ 0.100
CalculateWork-4 497.4n 506.3n ~ 1.000
BuildBlockLocatorString_Helpers/Size_10-4 1.346µ 1.331µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_100-4 14.41µ 15.75µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_1000-4 127.4µ 128.0µ ~ 0.200
CatchupWithHeaderCache-4 104.5m 104.7m ~ 0.200
_BufferPoolAllocation/16KB-4 4.113µ 3.898µ ~ 0.400
_BufferPoolAllocation/32KB-4 9.084µ 9.027µ ~ 0.700
_BufferPoolAllocation/64KB-4 18.36µ 17.12µ ~ 0.700
_BufferPoolAllocation/128KB-4 36.91µ 38.09µ ~ 0.100
_BufferPoolAllocation/512KB-4 121.3µ 131.2µ ~ 0.100
_BufferPoolConcurrent/32KB-4 22.50µ 21.95µ ~ 0.700
_BufferPoolConcurrent/64KB-4 30.94µ 30.17µ ~ 0.100
_BufferPoolConcurrent/512KB-4 146.0µ 148.2µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/16KB-4 663.7µ 632.1µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 664.2µ 645.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 657.9µ 641.9µ ~ 0.200
_SubtreeDeserializationWithBufferSizes/128KB-4 658.5µ 642.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 647.8µ 604.3µ ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/16KB-4 37.12m 36.86m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/32KB-4 36.96m 36.48m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/64KB-4 36.83m 36.54m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/128KB-4 36.54m 36.93m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/512KB-4 36.71m 35.96m ~ 0.100
_PooledVsNonPooled/Pooled-4 833.2n 832.1n ~ 0.700
_PooledVsNonPooled/NonPooled-4 7.945µ 8.238µ ~ 0.400
_MemoryFootprint/Current_512KB_32concurrent-4 7.553µ 7.207µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 9.558µ 9.566µ ~ 1.000
_MemoryFootprint/Alternative_64KB_32concurrent-4 9.543µ 10.068µ ~ 0.200
_prepareTxsPerLevel-4 408.4m 405.8m ~ 0.400
_prepareTxsPerLevelOrdered-4 3.754m 3.725m ~ 0.700
_prepareTxsPerLevel_Comparison/Original-4 398.3m 403.8m ~ 0.700
_prepareTxsPerLevel_Comparison/Optimized-4 3.895m 3.709m ~ 0.700
SubtreeSizes/10k_tx_4_per_subtree-4 963.7µ 975.1µ ~ 0.400
SubtreeSizes/10k_tx_16_per_subtree-4 237.2µ 242.7µ ~ 0.400
SubtreeSizes/10k_tx_64_per_subtree-4 56.69µ 56.08µ ~ 0.400
SubtreeSizes/10k_tx_256_per_subtree-4 14.01µ 14.13µ ~ 0.700
SubtreeSizes/10k_tx_512_per_subtree-4 6.929µ 6.964µ ~ 0.700
SubtreeSizes/10k_tx_1024_per_subtree-4 3.409µ 3.428µ ~ 0.800
SubtreeSizes/10k_tx_2k_per_subtree-4 1.711µ 1.712µ ~ 1.000
BlockSizeScaling/10k_tx_64_per_subtree-4 54.01µ 54.64µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 13.79µ 13.76µ ~ 1.000
BlockSizeScaling/10k_tx_1024_per_subtree-4 3.373µ 3.421µ ~ 0.200
BlockSizeScaling/50k_tx_64_per_subtree-4 294.0µ 288.3µ ~ 0.400
BlockSizeScaling/50k_tx_256_per_subtree-4 68.39µ 69.54µ ~ 0.700
BlockSizeScaling/50k_tx_1024_per_subtree-4 17.00µ 17.10µ ~ 0.700
SubtreeAllocations/small_subtrees_exists_check-4 117.8µ 117.0µ ~ 0.700
SubtreeAllocations/small_subtrees_data_fetch-4 124.5µ 123.4µ ~ 1.000
SubtreeAllocations/small_subtrees_full_validation-4 237.3µ 238.9µ ~ 1.000
SubtreeAllocations/medium_subtrees_exists_check-4 7.111µ 7.041µ ~ 0.100
SubtreeAllocations/medium_subtrees_data_fetch-4 7.311µ 7.396µ ~ 0.700
SubtreeAllocations/medium_subtrees_full_validation-4 13.84µ 13.88µ ~ 0.700
SubtreeAllocations/large_subtrees_exists_check-4 1.680µ 1.659µ ~ 0.100
SubtreeAllocations/large_subtrees_data_fetch-4 1.791µ 1.782µ ~ 1.000
SubtreeAllocations/large_subtrees_full_validation-4 3.505µ 3.416µ ~ 0.700
StoreBlock_Sequential/BelowCSVHeight-4 332.0µ 338.9µ ~ 0.700
StoreBlock_Sequential/AboveCSVHeight-4 331.0µ 330.0µ ~ 0.700
GetUtxoHashes-4 252.0n 252.8n ~ 0.700
GetUtxoHashes_ManyOutputs-4 49.85µ 50.07µ ~ 0.700
_NewMetaDataFromBytes-4 227.3n 227.6n ~ 0.300
_Bytes-4 424.8n 423.7n ~ 0.400
_MetaBytes-4 137.1n 138.5n ~ 0.600

Threshold: >10% with p < 0.05 | Generated: 2026-05-28 18:54 UTC

@liam liam requested a review from sugh01 May 28, 2026 18:38
The if err != nil { assert.NotContains(...) } block sat below
require.NoError, which exits the test on error — so the branch was
unreachable dead code. The require.NoError message already names the
"unknown magic" regression mode, so dropping the dead block doesn't
lose any documentation value.
@sonarqubecloud

Copy link
Copy Markdown

@liam liam requested a review from ordishs May 29, 2026 09:09
@liam liam requested a review from freemans13 May 29, 2026 11:52
@liam liam merged commit d4069bd into bsv-blockchain:main May 29, 2026
31 of 32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants