Skip to content

fix(fileformat): make header readers consistent + accept chunked reads + fuzz#1006

Merged
liam merged 1 commit into
bsv-blockchain:mainfrom
liam:liam/fuzz-fileformat-header
Jun 2, 2026
Merged

fix(fileformat): make header readers consistent + accept chunked reads + fuzz#1006
liam merged 1 commit into
bsv-blockchain:mainfrom
liam:liam/fuzz-fileformat-header

Conversation

@liam

@liam liam commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

Third in the deserializer-fuzzing series (after #1004 and #1005). I expected the fileformat header to be low-yield "insurance" (it already has a length guard) — but a differential fuzz of its two parsers surfaced two real robustness bugs.

Bugs

1. Header.Read used a single r.Read for the 8-byte magic. io.Reader.Read is allowed to return fewer than 8 bytes with a nil error — sockets, TLS conns, io.Pipe, and bufio over a slow source all do this. The code then hit n != 8 and rejected a perfectly valid header that simply arrived in chunks ("expected to read 8 bytes, got 1"). A file or bytes.Reader fills 8 in one call so it was invisible there, but streaming readers fail. → use io.ReadFull.

2. ReadHeaderFromBytes lacked the trailing-0x000x20 normalization that Header.Read has. Older files padded the magic with NULs instead of spaces. Header.Read normalizes them (see the existing TestHeader_Read_BackwardCompatibility); ReadHeaderFromBytes did not — so the same file was accepted via the streaming path and rejected as "unknown magic" via the byte-slice path. That reader/writer disagreement is the same class that has crashed the core sidecar before.

Fix

Extract the normalization into a shared normalizeMagic helper and apply it in both readers; switch Header.Read to io.ReadFull. Existing error-message assertions are preserved.

Tests

  • TestHeaderRead_AcceptsChunkedReader (bug 1): every valid magic parses when delivered one byte at a time via iotest.OneByteReader.
  • TestReadHeaderFromBytes_BackwardCompatibility (bug 2): a NUL-padded magic is accepted with the correct file type.
  • FuzzHeaderParsersAgree: differential fuzz asserting ReadHeaderFromBytes and Header.Read (over a one-byte-at-a-time reader) agree on validity and file type — catches both normalization drift and short-read mishandling.
  • FuzzReadHeaderFromBytes: no-panic.

Verification

  • Both new unit tests fail before the fix (got 1, unknown magic: [66 45 49 46 48 0 0 0]), pass after.
  • FuzzHeaderParsersAgree ~1.8M execs, FuzzReadHeaderFromBytes ~1.3M execs — no crashers.
  • Full pkg/fileformat -race, go vet, gofmt, gci — clean. Existing tests unchanged.

Completes the planned three-target series. Net: 3 PRs, 4 distinct bugs (two unbounded-allocation/OOM, two header-parser inconsistencies), each pinned by a fuzz target + regression test.

Fuzzing the two header parsers against each other surfaced two real
robustness bugs:

1. Header.Read used a single r.Read for the 8-byte magic. io.Reader.Read
   is permitted to return fewer than 8 bytes with a nil error (sockets,
   TLS, io.Pipe, bufio over a slow source), so a valid header arriving in
   chunks was rejected with "expected to read 8 bytes". Use io.ReadFull.

2. ReadHeaderFromBytes did not apply the trailing-0x00 -> 0x20 backward-
   compatibility normalization that Header.Read does, so an older NUL-
   padded magic was accepted via the streaming path but rejected as
   "unknown magic" via the byte-slice path — the same reader/writer
   disagreement class that has crashed the core sidecar before.

Extract the normalization into a shared normalizeMagic helper and apply
it in both readers. Existing error-message assertions are preserved.

Tests:
  - TestHeaderRead_AcceptsChunkedReader (bug 1): every valid magic parses
    when delivered one byte at a time via iotest.OneByteReader.
  - TestReadHeaderFromBytes_BackwardCompatibility (bug 2): a NUL-padded
    magic is accepted with the correct file type.
  - FuzzHeaderParsersAgree: differential fuzz asserting ReadHeaderFromBytes
    and Header.Read (over a one-byte-at-a-time reader) agree on validity
    and file type. ~1.8M execs, no failures.
  - FuzzReadHeaderFromBytes: no-panic, ~1.3M execs.

Third in the deserializer-fuzzing series (after the utxopersister and
block-parse allocation fixes).
@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

Current Review:
No issues found. This PR correctly fixes two real robustness bugs with well-designed tests.

Analysis Summary:

  • Bug Fix 1 (chunked reads): Correctly uses io.ReadFull instead of single r.Read to handle readers that deliver data in chunks
  • Bug Fix 2 (normalization consistency): Properly extracts normalizeMagic helper and applies it in both Header.Read and ReadHeaderFromBytes
  • Test Coverage: Excellent - includes targeted regression tests for both bugs plus differential fuzz testing
  • Code Quality: Clean refactoring that improves maintainability by eliminating parser drift risk

@sonarqubecloud

sonarqubecloud Bot commented Jun 1, 2026

Copy link
Copy Markdown

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-1006 (610cf41)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 144
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.773µ 1.770µ ~ 1.000
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 61.49n 61.76n ~ 1.000
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 61.64n 61.51n ~ 0.500
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 61.37n 61.45n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 30.06n 30.23n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 51.43n 52.06n ~ 0.400
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 104.3n 104.2n ~ 0.500
MiningCandidate_Stringify_Short-4 263.5n 261.0n ~ 0.700
MiningCandidate_Stringify_Long-4 1.912µ 1.917µ ~ 0.400
MiningSolution_Stringify-4 967.1n 973.5n ~ 0.200
BlockInfo_MarshalJSON-4 1.779µ 1.790µ ~ 0.100
NewFromBytes-4 133.0n 132.3n ~ 0.700
AddTxBatchColumnar_Validation-4 2.610µ 2.497µ ~ 0.400
OffsetValidationLoop-4 594.3n 592.7n ~ 0.400
Mine_EasyDifficulty-4 67.26µ 66.99µ ~ 0.800
Mine_WithAddress-4 7.003µ 7.069µ ~ 0.100
BlockAssembler_AddTx-4 0.02554n 0.02945n ~ 0.400
AddNode-4 14.43 14.19 ~ 1.000
AddNodeWithMap-4 14.77 14.71 ~ 0.700
DirectSubtreeAdd/4_per_subtree-4 57.76n 60.56n ~ 1.000
DirectSubtreeAdd/64_per_subtree-4 30.44n 30.77n ~ 0.700
DirectSubtreeAdd/256_per_subtree-4 29.25n 29.54n ~ 0.100
DirectSubtreeAdd/1024_per_subtree-4 28.39n 28.45n ~ 1.000
DirectSubtreeAdd/2048_per_subtree-4 27.84n 27.96n ~ 0.700
SubtreeProcessorAdd/4_per_subtree-4 293.0n 325.6n ~ 0.100
SubtreeProcessorAdd/64_per_subtree-4 291.5n 301.5n ~ 0.100
SubtreeProcessorAdd/256_per_subtree-4 296.5n 296.8n ~ 0.700
SubtreeProcessorAdd/1024_per_subtree-4 286.6n 296.8n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 295.6n 311.6n ~ 0.400
SubtreeProcessorRotate/4_per_subtree-4 309.2n 302.1n ~ 0.700
SubtreeProcessorRotate/64_per_subtree-4 308.0n 294.2n ~ 0.100
SubtreeProcessorRotate/256_per_subtree-4 321.1n 298.3n ~ 0.100
SubtreeProcessorRotate/1024_per_subtree-4 334.4n 298.3n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 55.91n 55.94n ~ 0.700
SubtreeNodeAddOnly/64_per_subtree-4 34.93n 34.87n ~ 1.000
SubtreeNodeAddOnly/256_per_subtree-4 33.82n 33.75n ~ 0.400
SubtreeNodeAddOnly/1024_per_subtree-4 33.14n 32.92n ~ 0.400
SubtreeCreationOnly/4_per_subtree-4 118.6n 117.1n ~ 0.700
SubtreeCreationOnly/64_per_subtree-4 447.1n 460.3n ~ 0.400
SubtreeCreationOnly/256_per_subtree-4 1.499µ 1.447µ ~ 0.200
SubtreeCreationOnly/1024_per_subtree-4 4.796µ 4.559µ ~ 0.100
SubtreeCreationOnly/2048_per_subtree-4 8.910µ 8.915µ ~ 1.000
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 311.5n 278.8n ~ 0.200
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 320.2n 278.3n ~ 0.100
ParallelGetAndSetIfNotExists/1k_nodes-4 2.360m 2.275m ~ 0.100
ParallelGetAndSetIfNotExists/10k_nodes-4 5.817m 5.653m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 8.412m 7.729m ~ 0.100
ParallelGetAndSetIfNotExists/100k_nodes-4 12.35m 10.95m ~ 0.100
SequentialGetAndSetIfNotExists/1k_nodes-4 2.117m 1.972m ~ 0.100
SequentialGetAndSetIfNotExists/10k_nodes-4 6.062m 4.562m ~ 0.100
SequentialGetAndSetIfNotExists/50k_nodes-4 18.79m 12.75m ~ 0.100
SequentialGetAndSetIfNotExists/100k_nodes-4 28.31m 25.46m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 2.506m 2.338m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 8.962m 8.697m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 15.28m 14.97m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 2.132m 2.020m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 10.224m 9.268m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 68.76m 44.19m ~ 0.100
DiskTxMap_SetIfNotExists-4 3.590µ 3.736µ ~ 0.400
DiskTxMap_SetIfNotExists_Parallel-4 3.394µ 3.261µ ~ 0.400
DiskTxMap_ExistenceOnly-4 339.0n 397.6n ~ 0.700
Queue-4 195.9n 194.6n ~ 0.700
AtomicPointer-4 4.590n 4.895n ~ 0.200
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 932.6µ 892.8µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/10K-4 836.3µ 829.6µ ~ 1.000
ReorgOptimizations/AllMarkFalse/Old/10K-4 133.5µ 107.3µ ~ 0.100
ReorgOptimizations/AllMarkFalse/New/10K-4 63.16µ 62.68µ ~ 0.400
ReorgOptimizations/HashSlicePool/Old/10K-4 57.29µ 63.89µ ~ 0.400
ReorgOptimizations/HashSlicePool/New/10K-4 11.77µ 11.47µ ~ 0.200
ReorgOptimizations/NodeFlags/Old/10K-4 4.950µ 5.133µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 1.663µ 1.699µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 10.07m 10.64m ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/100K-4 10.78m 11.09m ~ 0.400
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.192m 1.153m ~ 0.200
ReorgOptimizations/AllMarkFalse/New/100K-4 689.5µ 687.0µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/100K-4 551.4µ 602.6µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 298.7µ 322.1µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/100K-4 54.19µ 51.65µ ~ 0.100
ReorgOptimizations/NodeFlags/New/100K-4 18.73µ 17.96µ ~ 0.200
TxMapSetIfNotExists-4 52.21n 52.31n ~ 1.000
TxMapSetIfNotExistsDuplicate-4 40.24n 40.21n ~ 1.000
ChannelSendReceive-4 612.8n 613.1n ~ 1.000
CalcBlockWork-4 530.9n 543.3n ~ 0.100
CalculateWork-4 741.8n 733.3n ~ 1.000
BuildBlockLocatorString_Helpers/Size_10-4 1.352µ 1.665µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_100-4 12.97µ 12.96µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_1000-4 128.6µ 128.9µ ~ 1.000
CatchupWithHeaderCache-4 105.0m 104.7m ~ 0.100
_BufferPoolAllocation/16KB-4 4.413µ 4.236µ ~ 0.100
_BufferPoolAllocation/32KB-4 9.127µ 9.153µ ~ 0.700
_BufferPoolAllocation/64KB-4 22.81µ 21.12µ ~ 0.700
_BufferPoolAllocation/128KB-4 38.41µ 37.43µ ~ 0.200
_BufferPoolAllocation/512KB-4 146.9µ 131.6µ ~ 0.100
_BufferPoolConcurrent/32KB-4 24.86µ 24.43µ ~ 0.100
_BufferPoolConcurrent/64KB-4 38.41µ 37.92µ ~ 1.000
_BufferPoolConcurrent/512KB-4 154.4µ 158.3µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/16KB-4 648.7µ 653.4µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/32KB-4 671.3µ 648.5µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 679.0µ 655.0µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 677.0µ 652.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 650.4µ 661.1µ ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/16KB-4 38.08m 37.15m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/32KB-4 37.92m 37.44m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/64KB-4 37.85m 37.90m ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/128KB-4 37.51m 37.30m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/512KB-4 37.79m 37.03m ~ 0.200
_PooledVsNonPooled/Pooled-4 741.0n 734.4n ~ 0.700
_PooledVsNonPooled/NonPooled-4 9.213µ 8.471µ ~ 0.200
_MemoryFootprint/Current_512KB_32concurrent-4 7.519µ 7.435µ ~ 1.000
_MemoryFootprint/Proposed_32KB_32concurrent-4 10.90µ 11.10µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 10.52µ 10.18µ ~ 0.400
_prepareTxsPerLevel-4 421.4m 420.4m ~ 0.400
_prepareTxsPerLevelOrdered-4 4.050m 3.720m ~ 0.100
_prepareTxsPerLevel_Comparison/Original-4 422.1m 416.6m ~ 0.200
_prepareTxsPerLevel_Comparison/Optimized-4 3.965m 3.739m ~ 0.100
SubtreeSizes/10k_tx_4_per_subtree-4 1.411m 1.359m ~ 0.700
SubtreeSizes/10k_tx_16_per_subtree-4 335.0µ 320.7µ ~ 0.200
SubtreeSizes/10k_tx_64_per_subtree-4 79.33µ 75.47µ ~ 0.100
SubtreeSizes/10k_tx_256_per_subtree-4 19.80µ 18.91µ ~ 0.100
SubtreeSizes/10k_tx_512_per_subtree-4 9.860µ 9.319µ ~ 0.100
SubtreeSizes/10k_tx_1024_per_subtree-4 4.848µ 4.647µ ~ 0.100
SubtreeSizes/10k_tx_2k_per_subtree-4 2.416µ 2.329µ ~ 0.100
BlockSizeScaling/10k_tx_64_per_subtree-4 77.24µ 73.02µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 19.56µ 18.85µ ~ 0.100
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.844µ 4.634µ ~ 0.100
BlockSizeScaling/50k_tx_64_per_subtree-4 402.5µ 387.4µ ~ 0.100
BlockSizeScaling/50k_tx_256_per_subtree-4 97.15µ 93.49µ ~ 0.100
BlockSizeScaling/50k_tx_1024_per_subtree-4 23.78µ 23.04µ ~ 0.100
SubtreeAllocations/small_subtrees_exists_check-4 163.6µ 156.6µ ~ 0.100
SubtreeAllocations/small_subtrees_data_fetch-4 171.4µ 163.9µ ~ 0.200
SubtreeAllocations/small_subtrees_full_validation-4 331.4µ 322.2µ ~ 0.100
SubtreeAllocations/medium_subtrees_exists_check-4 9.701µ 9.077µ ~ 0.100
SubtreeAllocations/medium_subtrees_data_fetch-4 10.126µ 9.637µ ~ 0.100
SubtreeAllocations/medium_subtrees_full_validation-4 19.45µ 18.60µ ~ 0.100
SubtreeAllocations/large_subtrees_exists_check-4 2.320µ 2.161µ ~ 0.100
SubtreeAllocations/large_subtrees_data_fetch-4 2.431µ 2.353µ ~ 0.100
SubtreeAllocations/large_subtrees_full_validation-4 4.904µ 4.731µ ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 335.5µ 328.4µ ~ 0.400
StoreBlock_Sequential/AboveCSVHeight-4 331.1µ 330.6µ ~ 1.000
GetUtxoHashes-4 270.3n 270.3n ~ 1.000
GetUtxoHashes_ManyOutputs-4 45.25µ 50.94µ ~ 0.100
_NewMetaDataFromBytes-4 215.3n 214.1n ~ 0.200
_Bytes-4 400.7n 394.7n ~ 0.100
_MetaBytes-4 139.0n 139.0n ~ 1.000

Threshold: >10% with p < 0.05 | Generated: 2026-06-01 15:12 UTC

@liam liam requested review from freemans13, ordishs and sugh01 June 1, 2026 16:07

@ordishs ordishs left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve. Tight, well-scoped fix — both robustness bugs are real and the regression tests genuinely fail before / pass after.

Verified locally:

  • go test -race ./pkg/fileformat/ — pass
  • go vet — clean
  • FuzzHeaderParsersAgree — ~3.5M execs, no crashers

Correctness:

  • io.ReadFull is the right primitive (ErrUnexpectedEOF on partial, EOF on zero — both hit the error path).
  • normalizeMagic is a faithful extraction; closes the reader/writer drift between Header.Read and ReadHeaderFromBytes without widening accepted input (normalized magic must still be in the map).
  • Existing error-message assertions still hold.

Minor non-blocking follow-up: TestHeader_Read_UnknownMagic passes a 7-byte input, so it actually exercises the short-read path rather than the unknown-magic path its name implies (pre-existing). Worth a rename or padding to 8 bytes later.

@liam liam merged commit 06bf307 into bsv-blockchain:main Jun 2, 2026
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants