fix(utxopersister): nil-guard CreateUTXOSet against empty ConsolidateBlockRange#969
Conversation
…BlockRange
processNextBlock could crash with a nil-pointer SIGSEGV at
UTXOSet.go:527 (c.lastBlockHash[:]) during a startup race between
BlockPersister and UTXOPersister, manifesting as:
ERROR: Processing block <nil> height 0
panic: runtime error: invalid memory address or nil pointer dereference
Root cause: consolidator.ConsolidateBlockRange only assigns
c.lastBlockHash at the bottom of its loop body (consolidator.go:234).
The loop:
- 'continue's over the genesis block (consolidator.go:205), and
- 'return err's on per-block errors before line 234, including
errors.ErrNotFound from GetUTXOAdditionsReader when BlockPersister
hasn't yet written that block's UTXO additions file.
processNextBlock then intentionally swallowed ErrNotFound from
ConsolidateBlockRange and fell through to CreateUTXOSet, which
dereferenced the still-nil lastBlockHash. Non-ErrNotFound errors were
also silently dropped via 'return 0, nil', hiding real failures.
Fix:
- processNextBlock now propagates non-ErrNotFound errors instead of
swallowing them.
- After a successful (or ErrNotFound) ConsolidateBlockRange, gate on
c.lastBlockHash != nil; if nil, return errors.ErrNotFound so the
caller treats it as "no new block to process" and waits for the
next trigger rather than driving downstream code with bad state.
- CreateUTXOSet rejects nil c.lastBlockHash explicitly at the entry
alongside its other defensive nil checks, so any future caller
gets a clear error instead of a SIGSEGV.
Regression test pins the CreateUTXOSet defensive check; the
processNextBlock change is small and verifiable by inspection,
exercised in practice by the split-mode chaos suite once that
suite's other teranode-side blocker (legacy 'unknown magic') lands.
|
🤖 Claude Code Review Status: Complete Current Review: This PR correctly fixes a critical nil-pointer dereference bug with strong defensive programming: ✅ Core fix is sound:
✅ Test coverage:
✅ Minimal, focused diff:
No issues found. The fix is correct, well-tested, and follows the project's error-handling conventions. |
|
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-28 10:04 UTC |
ordishs
left a comment
There was a problem hiding this comment.
Approve. Surgical fix for the SIGSEGV with appropriate defense-in-depth: caller gates in processNextBlock and callee adds a defensive nil-check in CreateUTXOSet. The error-propagation tightening also fixes a latent swallowed-error bug. Verified locally: new regression test passes under -race, full package suite green, go vet clean.
Follow-ups worth tracking (non-blocking):
- ConsolidateBlockRange's contract is still fragile — it can return nil while leaving c.lastBlockHash == nil. Consider tightening the contract so future callers don't need to repeat the nil check.
- No unit test covers the new processNextBlock paths (error propagation + empty-range sentinel); the integration repro in scenario_04 is currently t.Skip-ed.
oskarszoon
left a comment
There was a problem hiding this comment.
Approve. Root cause and fix both correct.
ConsolidateBlockRange (consolidator.go:176) leaves c.lastBlockHash == nil when the range contains only genesis (loop skips at :206) or when GetBlockHeadersByHeight returns zero headers. Pre-PR code fell through to CreateUTXOSet → c.lastBlockHash[:] → SIGSEGV. Hits on first startup when the last written UTXO set is genesis — common case.
Inverted error handler also fixed: if !errors.Is(err, errors.ErrNotFound) { return 0, nil } was swallowing real errors as success AND letting ErrNotFound fall through to the nil deref. New version propagates both cases correctly.
Defense-in-depth at both layers is right: caller (Server.go:487-494) returns errors.ErrNotFound, callee (UTXOSet.go:530-532) catches future bypassers.
Nit: UTXOSet_test.go:45 uses assert.Contains; repo convention is require.Contains. Same issue as PR #971.
Minor cleanup candidate (not a blocker): Server.go:478-484 both branches of the if errors.Is(err, errors.ErrNotFound) block now return 0, err. The two-branch form implies a distinction that doesn't exist anymore — fold into one return.


Summary
utxopersister.processNextBlockcould crash with a nil-pointer SIGSEGV atUTXOSet.go:527(c.lastBlockHash[:]) during a startup race betweenBlockPersisterandUTXOPersister. The fix propagates errors correctly and gates downstream code on a validlastBlockHash; a defensive check at theCreateUTXOSetentry covers any future caller.Crash signature
Reproduces during 3-node split-mode bring-up: the height-1 probe block triggers
processNextBlockon each node'scoresidecar beforeBlockPersisterhas written the per-block UTXO additions file. The wholecoreprocess panics, taking RPC + ancillary services down with it.Root cause
consolidator.ConsolidateBlockRangeonly assignsc.lastBlockHashat the bottom of its loop body (consolidator.go:234). The loop:continues over the genesis block (consolidator.go:205)return errs on per-block errors before line 234 — includingerrors.ErrNotFoundfromGetUTXOAdditionsReaderwhen BlockPersister hasn't written that block's additions file yet.processNextBlockhad two issues compounding this:if !errors.Is(err, errors.ErrNotFound) { return 0, nil }— non-NotFound errors were dropped, returning success.CreateUTXOSetgot a nillastBlockHashand dereferenced it.Fix
services/utxopersister/Server.go(processNextBlock):errors.ErrNotFoundifc.lastBlockHashis nil — caller treats that as 'no new block, wait for next trigger' (matches the existingif errors.Is(err, errors.ErrNotFound)branch at the call site).services/utxopersister/UTXOSet.go(CreateUTXOSet):Test plan
go test ./services/utxopersister/...— full package suite passesgo test -race -run TestCreateUTXOSet_NilLastBlockHash ./services/utxopersister/— new regression test passes under the race detectorgo vet ./services/utxopersister/...cleanprocessNextBlockchange is a small, mechanical error-propagation tightening; verifiable by inspectionThe full integration repro lives in
test/multinode_split/scenario_04_block_assembly_isolation, which is currentlyt.Skip-ed pending one other teranode robustness fix (legacyunknown magicparser). Once that lands, deleting thet.Skipline re-enables end-to-end coverage of this fix path.