fix(utxopersister): bound allocations on untrusted UTXO count/script length + fuzz#1004
Conversation
|
🤖 Claude Code Review Status: Complete This PR adds critical fuzzing infrastructure and fixes an unbounded allocation vulnerability in UTXO deserialization. The fix correctly prevents OOM attacks from malicious/corrupted UTXO files. Security Fix Verified:
Test Coverage:
Code Quality: No issues found. The fix is minimal, correct, and well-tested. History:
|
| return errors.NewStorageError("failed to read utxo script (%d bytes)", l, err) | ||
| } | ||
|
|
||
| utxo.Script = script.Bytes() |
There was a problem hiding this comment.
[Minor] Allocation concern with bytes.Buffer
While io.CopyN prevents unbounded reads, bytes.Buffer still allocates the full l bytes internally when growing to accommodate the write. For a truncated file claiming a 32MB script, this will allocate ~32MB before the EOF error.
Consider adding an explicit size check before allocation:
- Cap script size at a reasonable maximum based on consensus limits
- Fail fast on unreasonable sizes before any allocation
- Only allocate what is actually readable from the stream
Example approach: check l against a maxScriptSize constant, return an error if exceeded, then use io.ReadFull with a pre-allocated slice of exactly l bytes.
Impact: Minor - the fix significantly improves the situation, but a consensus-based limit would be more robust.
There was a problem hiding this comment.
This concern appears to be based on a misunderstanding of bytes.Buffer.ReadFrom behavior.
bytes.Buffer.ReadFrom does NOT pre-allocate to the limit size. It grows incrementally in fixed chunks (typically 512 bytes) as data is actually read. The test TestUTXOWrapperFromBytes_HostileSizesAreBounded at line 77 validates exactly this — a truncated file claiming a 33MB script allocates < 4MB before erroring, which would be impossible if the buffer pre-allocated to the claimed size.
The fix is correct as implemented. Resolving.
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-06-01 15:37 UTC |
…length
The UTXO-set wrapper deserializer pre-allocated from length/count fields
read straight out of the file, with no bounds check:
- FromReader: make([]*UTXO, numUTXOs)
- NewUTXOFromReader: make([]byte, scriptLen)
- NewUTXOValueFromReader: make([]byte, scriptLen) (via reusableScript)
A corrupt or truncated .utxo-set whose header claims numUTXOs=0xFFFFFFFF
or scriptLen=0xFFFFFFFF forces a multi-GB allocation before the follow-up
read fails — a cheap OOM that takes down the core sidecar via the
ServiceManager errgroup, the same crash class as the recently-fixed
read-path bugs. recover() does not save an OOM.
Bound the allocations to the bytes actually available:
- cap the speculative []*UTXO capacity (min(numUTXOs, 1024)) and append
as records are read, so a bogus count errors on the first missing
record instead of pre-allocating;
- read the script via io.CopyN, which streams in fixed-size chunks and
stops at EOF, so a huge claimed length on a truncated record errors
out instead of allocating; the value-only path discards via
io.CopyN(io.Discard, ...) (allocation-free) and drops the now-unused
reusableScript buffer.
Add a fuzz harness (FuzzUTXOWrapperFromBytes) that checks no panic and a
re-serialization round-trip invariant, plus a deterministic regression
test asserting a hostile header allocates < 4 MB (was ~256 MB / ~32 MB)
before erroring. Valid-file parsing is unchanged (existing roundtrip
tests still pass).
afbb971 to
03c3641
Compare
|
Thanks — I dug into this, and the allocation is already bounded to the readable input;
Measured directly — a 56-byte input whose header claims a So ~1.4 KB, not 4 GB (or 32 MB) — which is exactly "only allocate what is actually readable from the stream." The deterministic On the consensus-based I've pushed a comment on the |
|
ordishs
left a comment
There was a problem hiding this comment.
Approve. Correct, minimally scoped fix for a real OOM/DoS vector in the .utxo-set deserializer, with strong test coverage.
Verified against the base code and I/O semantics:
- UTXO slice:
make([]*UTXO, 0, min(numUTXOs, 1024))+ append errors on the first missing record instead of pre-sizing. - Script read:
io.CopyN(&bytes.Buffer, r, l)dispatches tobytes.Buffer.ReadFrom, which grows incrementally and stops at EOF — never pre-sizes to l; truncation yields a wrapped StorageError. - Value path:
io.CopyN(io.Discard, r, l)is allocation-free via the pooled ReaderFrom; reusableScript cleanly removed (no other refs). - EOF-marker stream semantics and callers unaffected by the error-type change.
Tests are solid: the regression test fails on the old code (~268MB/~33MB) and passes after, runs deterministically in CI without -fuzz; the round-trip-prefix fuzz invariant genuinely holds.
Non-blocking notes:
- Read path retains bytes.Buffer's backing array (ReadFrom may grow ~2x and won't trim) vs the old exact make([]byte, l) — bounded by real file size, but a slices.Clip would keep peak RSS tight on full-set loads.
- totalAllocDelta reads process-global runtime.TotalAlloc, not goroutine-scoped; safe here given the 4MB headroom.


First in a small series adding Go native fuzzing to teranode's untrusted/persisted deserializers. teranode currently has zero fuzz tests, despite ingesting untrusted bytes from peers and persisted files. This PR targets the UTXO-set wrapper reader and fixes the unbounded-allocation bug it immediately surfaces.
Bug
UTXOWrapperdeserialization pre-allocated from length/count fields read straight out of the file, with no bounds check:FromReader(UTXO.go)make([]*UTXO, numUTXOs)NewUTXOFromReadermake([]byte, scriptLen)NewUTXOValueFromReadermake([]byte, scriptLen)(viareusableScript)A corrupt or truncated
.utxo-setwhose header claimsnumUTXOs = 0xFFFFFFFForscriptLen = 0xFFFFFFFFforces a multi-GB allocation before the follow-up read fails — a cheap OOM that takes down thecoresidecar via the ServiceManager errgroup. This is the same crash class as the recently-fixed utxopersister read-path bugs (#969/#971/#985), andrecover()does not save an OOM.These readers are reachable from the consolidation loop, the seeder,
cmd/utxovalidator, andcmd/filereader.Fix
Bound the allocations to the bytes actually available, never to an untrusted field:
min(numUTXOs, 1024)) andappendas records are read, so a bogus count errors on the first missing record instead of pre-allocating.io.CopyN, which streams in fixed-size chunks and stops at EOF — a huge claimed length on a truncated record errors out instead of allocating. The value-only path discards viaio.CopyN(io.Discard, …)(allocation-free;io.Discard.ReadFromuses a pooled buffer) and drops the now-unusedreusableScriptbuffer.Valid-file parsing is unchanged.
Tests
FuzzUTXOWrapperFromBytes— fuzzesNewUTXOWrapperFromBytes; asserts no panic (enforced by the engine) and a re-serialization round-trip invariant (a parsed wrapper'sBytes()must be a prefix of the input). Seeded with valid round-tripped wrappers + hostile headers. Runs as plain corpus replay in CI (no-fuzzflag needed).TestUTXOWrapperFromBytes_HostileSizesAreBounded— deterministic regression: a hostile header allocates < 4 MB (was ~256 MB for the count case, ~32 MB for the script case) before erroring. Fails on the old code, passes on the fix.Verification
268 MB/33 MBobserved), passes after.go test -fuzz=FuzzUTXOWrapperFromBytes -fuzztime=45s— 136K execs, no crashers.go test -race ./services/utxopersister/— pass (existing roundtrip tests green).go vet,gofmt,gci— clean.Follow-ups (separate PRs): fuzz
model.NewBlockFromBytesandpkg/fileformatheader parsing. Wire/tx parsing lives in the separatego-wire/go-btmodules.