fix(utxopersister): drop double-read of fileformat magic in verifyLastSet#971
Conversation
|
🤖 Claude Code Review Status: Complete Current Review: Verified:
Minor documentation inconsistency (pre-existing, not blocking): |
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-28 18:36 UTC |
ordishs
left a comment
There was a problem hiding this comment.
Approving — the fix is clean, well-explained, and the regression test is well-designed (uses real layout bytes so restoring the buggy line reproduces the exact production "unknown magic" failure mode rather than a generic header mismatch).
Strong follow-up request: same bug pattern at UTXOSet.go:592
The identical double-read anti-pattern exists in CreateUTXOSet:
// UTXOSet.go:569 — store layer already strips/validates the header
previousUTXOSetReader, err := us.store.GetIoReader(ctx, c.firstPreviousBlockHash[:], fileformat.FileTypeUtxoSet)
...
previousUTXOSetReader = &readCloserWrapper{
Reader: bufio.NewReaderSize(previousUTXOSetReader, bufferSize.Int()),
Closer: previousUTXOSetReader.(io.Closer),
}
defer previousUTXOSetReader.Close()
// UTXOSet.go:592 — same double-read bug
previousHeader, err := fileformat.ReadHeader(previousUTXOSetReader)
if err != nil {
return errors.NewStorageError("error reading previous utxo-set header", err)
}
if previousHeader.FileType() != fileformat.FileTypeUtxoSet {
return errors.NewStorageError("previous utxo-set header is not a utxo-set header")
}- The
bufio.NewReaderSizewrap does not reset position — it buffers reads from a reader that is already past the header. - The guard at line 567 (
firstPreviousBlockHash != GenesisHash) is the same gate that masks the bug during early lifetime, mirroringverifyLastSet. - The
FileTypecheck is also redundant —validateFileHeaderin the file store already enforces it. - Subsequent reads via
NewUTXOWrapperFromReaderwould then be misaligned by 8 bytes, likely silently corrupting consolidation rather than crashing — potentially worse than the original bug.
Given the companion work in #969 is actively exercising CreateUTXOSet, please either fix this in the same PR (it's a one-line removal with the same justification) or open an immediate follow-up before scenario_04 is unskipped.
Minor (non-blocking)
-
Server.go:646— "already validates the fileformat header (8-byte magic + matching FileType)" is precise for the file store but the memory store atmemory.go:269-271only strips by header size and does not validate the magic. Consider: "already advances past the fileformat header (and, in the file store, validates the magic + FileType)". -
Pre-existing:
verifyLastSetdocstring still promises footer validation that neither the old nor new code performs — worth a follow-up cleanup. -
Consider asserting on the absence of
"unknown magic"in the test, to make the regression intent explicit against future divergent failures.
|
Pushed Strong follow-up (UTXOSet.go:592) — Fixed in the same PR. The reviewer's read of the surrounding code was exactly right: the OUTER UTXOWrapper loop had never reached non-genesis previous sets in practice because verifyLastSet was crashing first. Just dropping the
Two new regression tests:
Minor (non-blocking) — all addressed:
Out of scope but noted for follow-up: |
oskarszoon
left a comment
There was a problem hiding this comment.
Approve. Root cause and fix both correct.
Both file and memory stores already advance past the 8-byte magic before returning a reader (file.go:GetIoReader via validateFileHeader; memory.go:Get:271 slices bd.data[header.Size():]). Calling fileformat.ReadHeader on top consumed the first 8 bytes of the body (start of the stored current-block-hash field) and crashed the sidecar.
verifyLastSet:ReadHeaderremoved; store-layer open already checks existence +FileType == FileTypeUtxoSet. Sufficient.CreateUTXOSet(UTXOSet.go:592-619): new code reads the 68-byte metadata explicitly (32 current hash + 4 height + 32 parent hash) matching the layout written atUTXOSet.go:543-556. ThestoredCurrentBlockHash.IsEqual(c.firstPreviousBlockHash)check is a real correctness improvement — would have caught file/key confusion silently before.
Tests pin the regression on both paths.
Nit: UTXOSet_test.go:113 uses assert.Contains; repo convention is require.Contains. Safe in practice (follows require.Error) but a one-line fix.
Code ReviewVerification (local, on Verdict: approve with changes. Diagnosis is right, fix is minimal in the right places, and the new tests pin the exact production failure mode. Couple of items worth addressing before merge: High — dead reads in
|
…tSet
verifyLastSet was calling fileformat.ReadHeader(r) on a reader that the
store layer had already advanced past the 8-byte fileformat header
(file.go's GetIoReader -> validateFileHeader; the memory store's
GetIoReader returns content with the header already stripped).
The extra read consumed the *next* 8 bytes of the UTXO set body —
which, per the file layout written by CreateUTXOSet, are the first 8
bytes of the block hash field. Block hashes are essentially random
bytes, so this reliably failed with:
ERROR | service_manager.go:259 | ServiceManager| Received error: \
unknown magic: [<8 bytes of block hash>]
The unwrapped fmt.Errorf from pkg/fileformat propagated through
processNextBlock back to the errgroup, and ServiceManager treated it
as fatal — gracefully shutting down the entire core sidecar (RPC,
alert, blockpersister, utxopersister, pruner, legacy) along with it.
This only fires when lastWrittenUTXOSetHash != GenesisHash
(Server.go:466), i.e. once the first non-genesis UTXO set has been
written. Before that, verifyLastSet is short-circuited. That's why
the crash typically manifested a short while into a node's lifetime
(after the first block past genesis was persisted) rather than at
boot.
Fix: trust the store layer's header validation and drop the redundant
ReadHeader call. A successful GetUTXOSetReader is sufficient evidence
that the file exists and has the correct FileTypeUtxoSet magic.
Regression test (TestVerifyLastSet_ExistingValidSet) seeds a store
with a valid UTXO set whose post-header body matches the real
on-disk layout (32-byte block hash + 4-byte height); without the fix
the test fails with exactly the production error. The existing
TestVerifyLastSet_NoUTXOSetExists still pins the missing-file path.
CI lint failed on services/utxopersister/Server_test.go with gci flagging mis-aligned inline comments on consecutive append lines.
…ment cleanup
Addresses review feedback on the verifyLastSet fix.
1. The identical double-read pattern existed in CreateUTXOSet at
UTXOSet.go:592. The store layer's GetIoReader had already advanced
past the 8-byte fileformat magic, but the function called
fileformat.ReadHeader on the wrapped (bufio) reader anyway —
bufio.NewReaderSize does not reset position, it buffers reads from
the already-advanced reader. Same gate (firstPreviousBlockHash !=
GenesisHash) masked it during early lifetime. Once verifyLastSet
stopped crashing, this path would have been exercised and would
have either errored loudly ("unknown magic") or — worse — silently
misaligned the OUTER UTXOWrapper loop by 8 bytes, consolidating
garbage instead of the previous block's UTXOs.
Fix: drop the ReadHeader call (the store already validated the
header in the file store; the memory store advances past it by
size, which is fine for tests) and consume the 68 bytes of
per-file metadata that CreateUTXOSet itself writes (32-byte
current block hash + 4-byte height + 32-byte previous block
hash) before the wrapper loop. Validate the stored current
block hash equals c.firstPreviousBlockHash so file/key
confusion fails loudly instead of silently consolidating from
the wrong ancestor.
2. Server.go verifyLastSet comment was precise for the file store
but glossed over that the memory store does not validate the
magic (only strips by size). Rewritten to be precise without
misrepresenting either implementation.
3. verifyLastSet docstring previously promised header + footer
integrity checks that the function never implemented. Rewritten
to describe what it actually does (existence + openability).
4. TestVerifyLastSet_ExistingValidSet now also asserts that the
error, if any, does not contain "unknown magic" — making the
regression intent explicit so future divergent failures
(e.g. some new store-layer change) aren't confused with this
specific bug.
5. Two new regression tests for the CreateUTXOSet path:
- PreviousSetReadDoesNotDoubleReadMagic exercises the
happy-path read of a valid previous UTXO set; without the
fix it fails with the production "unknown magic" error chain.
- PreviousSetWrongBlockHash pins the new validation that catches
file/key confusion.
All four tests verified to fail without the corresponding fix
restored and pass with it.
- UTXOSet.go: replace the two dead-read named variables with an io.CopyN(io.Discard) over the 36 trailing metadata bytes. The hash-equality check above is the only meaningful validation; the height and grandparent hash have no natural comparison target in the new set being written, so consuming them rather than parsing avoids "is this variable ever used?" confusion. - UTXOSet_test.go: preallocate the body slices in the two new regression tests to satisfy golangci-lint's prealloc check.
cfaefb6 to
7ac456f
Compare
|
CreateUTXOSet appends a 16-byte footer (txCount + utxoCount) after the final UTXOWrapper, but the OUTER loop that reads a previous block's UTXO set during consolidation only broke on a clean io.EOF. With the footer present, the next txid read came up 16 bytes short and io.ReadFull returned io.ErrUnexpectedEOF, which the loop treated as fatal: STORAGE_ERROR: error reading previous utxo-set (...) at iteration 1 -> failed to read txid, expected 32 bytes got 16 -> unexpected EOF That killed the UTXOPersister service and, via errgroup, the whole core sidecar on any non-genesis consolidation. The path was dead code until the verifyLastSet/CreateUTXOSet magic double-read fixes (bsv-blockchain#971) made it reachable, so this latent bug only surfaced afterwards. Also break the loop when the wrapper read short-reads the footer. The match is by substring ("unexpected EOF"): teranode's errors.New flattens a non-*Error cause to its message, so errors.Is(err, io.ErrUnexpectedEOF) would reduce to the same check - it is not structural. This is narrower than cmd/utxovalidator (which also matches "failed to read txid"), so a real non-EOF read error stays loud. A structural fix - a typed sentinel from FromReader plus validating records-read == txCount - is a follow-up. Add a unit test that stages a realistic previous set (header + one wrapper + footer); it reproduces the exact crash without the fix and passes with it. The pre-existing previous-set test passed only because it staged zero wrappers and no footer, hitting a clean io.EOF.



Summary
verifyLastSet(andCreateUTXOSet's previous-set read path) was callingfileformat.ReadHeader(r)on a reader that the store layer had already advanced past the 8-byte fileformat header. The redundant read consumed the next 8 bytes of the UTXO set body — the first bytes of the block hash field — and reliably surfaced as:The unwrapped
fmt.Errorffrompkg/fileformatpropagated up throughprocessNextBlockto the errgroup, and ServiceManager treated it as fatal — gracefully shutting down the entirecoresidecar (RPC, alert, blockpersister, utxopersister, pruner, legacy) on a peer that received a freshly-mined block.Why "unknown magic: [<random-looking 8 bytes>]"?
UTXO set file layout (per
CreateUTXOSetinUTXOSet.go:543-553):The store layer's
GetIoReaderadvances past the 8-byte magic before returning the reader (and, in the file store, validates magic + FileType —stores/blob/file/file.go:1011-1023; the memory store just strips by size —stores/blob/memory/memory.go:269-271). The two read sites then calledReadHeaderagain, reading the first 8 bytes of the block hash field — essentially random bytes — and rejecting them as an unknown magic.The bytes from the original crash log (
[202 100 224 254 161 3 101 216]) match the expected shape of a chainhash prefix.When the crash fires
Only when
lastWrittenUTXOSetHash != GenesisHash(Server.go:466) — i.e. once the first non-genesis UTXO set has been written. Before that,verifyLastSetis short-circuited, which is why the crash typically manifested a short while into a node's lifetime rather than at boot.Originally surfaced by the split-mode chaos suite in #958: when a node mined and broadcast a block after a chaos cycle, peers receiving the block hit the second iteration of
processNextBlock, triggeredverifyLastSet, and crashed.Fix
verifyLastSet: Drop the redundantfileformat.ReadHeadercall. A successfulGetUTXOSetReaderis sufficient evidence that the file exists.CreateUTXOSet's previous-set read path (UTXOSet.go:592): Same double-read pattern, with a worse failure mode. The gate at line 567 had prevented this path from ever being reached against a non-genesis previous set, becauseverifyLastSetwas crashing first. Just removing theReadHeaderhere would have unblocked the silent-corruption path (the OUTERNewUTXOWrapperFromReaderloop misaligned by 8 bytes, consolidating garbage instead of the previous block's UTXOs). The fix:ReadHeader.c.firstPreviousBlockHash. This is a new failure mode — a corrupt or mis-keyed previous set will now fail loudly with a clearblock hash mismatcherror where before it would have silently consolidated UTXOs from the wrong ancestor. Surfaced because the bug fix exposes a code path that had been entirely unreachable in production.Docstring / comment cleanup:
verifyLastSetcomment on memory-vs-file-store header behaviour rewritten for precision (the memory store advances past the header by size; only the file store validates the magic + FileType).verifyLastSetdocstring previously promised a footer integrity check that was never implemented; rewritten to describe what the function actually does.Regression tests
TestVerifyLastSet_ExistingValidSet— happy path; without the fix, fails with the exact production error string. Also assertsNotContains "unknown magic"so future divergent failures don't quietly mask this regression.TestCreateUTXOSet_PreviousSetReadDoesNotDoubleReadMagic— happy path against the second site; without the fix fails with the production error chain (STORAGE_ERROR (69): error reading previous utxo-set header -> UNKNOWN (0): unknown magic: [...]).TestCreateUTXOSet_PreviousSetWrongBlockHash— pins the new defense-in-depth check.All three new tests verified to fail with the corresponding fix removed and pass with it restored.
Test plan
go test ./services/utxopersister/...— full package passes (~60s)go test -race -run 'TestVerifyLastSet|TestCreateUTXOSet' ./services/utxopersister/— passes under race detectorgo build ./services/utxopersister/...cleangci diffcleanOut of scope (separate follow-up)
cmd/utxovalidator/utxo_validator.go:77has the same double-read pattern againstblockStore.GetIoReader. It's a CLI tool not on the hot path; deserves its own small PR.Companion work
This is the second of the two teranode-side blockers preventing
test/multinode_split/scenario_04_block_assembly_isolationfrom running. The first (fix(utxopersister): nil-guard CreateUTXOSet against empty ConsolidateBlockRange) is in #969. Once both land, the scenario'st.Skipline can be removed.