recsplit: sharded FuseFilter#20644
Conversation
There was a problem hiding this comment.
Pull request overview
Adds support for a new RecSplit index format (v2) that uses a sharded fusefilter existence filter to reduce false positives and potentially improve lookup efficiency, while extending test coverage to exercise the new version.
Changes:
- Introduce sharded fusefilter writer/reader (256 shards by top byte of hash) and wire it into RecSplit/Index for dataStructureVersion ≥ 2.
- Refactor fusefilter blob header handling and add regression tests around header field offsets and sharded round-trips.
- Extend RecSplit tests to run the existing two-layer index test suite against v2.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| db/recsplit/recsplit_test.go | Runs the two-layer index test for RecSplit version 2. |
| db/recsplit/recsplit.go | Creates/uses a sharded existence filter writer for v2 and closes it on cleanup. |
| db/recsplit/index.go | Adds sharded existence filter reader initialization and lookup gating for v2+. |
| db/datastruct/fusefilter/fusefilter_writer_test.go | Adds round-trip, truncation, and regression tests for sharded writer/reader and header parsing. |
| db/datastruct/fusefilter/fusefilter_writer.go | Adds sharded writer implementation and refactors header serialization helpers. |
| db/datastruct/fusefilter/fusefilter_reader.go | Adds sharded reader implementation and refactors header parsing. |
Comments suppressed due to low confidence (1)
db/datastruct/fusefilter/fusefilter_reader.go:101
- NewReaderOnBytes slices m into header/data without checking len(m) >= filterBlobHeaderSize, and then slices data[:fingerprintsLen] without validating fingerprintsLen against len(data). With truncated/corrupt input this will panic instead of returning an error; please add explicit bounds checks and return a descriptive error.
func NewReaderOnBytes(m []byte, fName string) (*Reader, int, error) {
filter := &xorfilter.BinaryFuse[uint8]{}
const headerSize = filterBlobHeaderSize
header, data := m[:headerSize], m[headerSize:]
v, features, err := parseHeaderFeatures(header, fName)
if err != nil {
return nil, 0, err
}
filter.SegmentCount = binary.BigEndian.Uint32(header[4:])
filter.SegmentCountLength = binary.BigEndian.Uint32(header[4+4:])
filter.Seed = binary.BigEndian.Uint64(header[4+4+4:])
filter.SegmentLength = binary.BigEndian.Uint32(header[4+4+4+8:])
filter.SegmentLengthMask = binary.BigEndian.Uint32(header[4+4+4+8+4:])
fingerprintsLen := int(binary.BigEndian.Uint64(header[4+4+4+8+4+4:]))
filter.Fingerprints = data[:fingerprintsLen]
return &Reader{inner: filter, version: v, features: features, m: m}, headerSize + fingerprintsLen, nil
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
This PR extends RecSplit’s “less false positives” existence filtering by adding a sharded fusefilter implementation and wiring it into RecSplit/Index for data structure version 2.
Changes:
- Add v2 existence filter plumbing in
recsplit(writer side) andindex(reader side) using a new sharded fusefilter format. - Implement
WriterSharded/ReaderShardedfor fusefilter blobs and refactor common header/feature parsing. - Expand test coverage to exercise RecSplit v2 and the new sharded fusefilter round-trips.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| db/recsplit/recsplit_test.go | Adds TestTwoLayerIndex coverage for version 2. |
| db/recsplit/recsplit.go | Adds v2 existence filter writer and writes it during build. |
| db/recsplit/index.go | Adds v2 existence filter reader and uses it during Lookup/ForceInMem. |
| db/datastruct/fusefilter/fusefilter_writer_test.go | Adds new sharded writer/reader tests and a regression test for segment count fields. |
| db/datastruct/fusefilter/fusefilter_writer.go | Refactors header feature init; implements WriterSharded and shared serialization helper. |
| db/datastruct/fusefilter/fusefilter_reader.go | Refactors header parsing; implements ReaderSharded. |
Comments suppressed due to low confidence (1)
db/datastruct/fusefilter/fusefilter_reader.go:100
NewReaderOnBytesslicesm[:headerSize]and laterdata[:fingerprintsLen]without any length checks. A truncated/corrupted fusefilter blob (or a shard blob insideNewReaderShardedOnBytes) will panic with an out-of-bounds slice instead of returning an error. Please add upfront bounds checks (len(m) >= headerSize, and headerSize+fingerprintsLen <= len(m)) and return a descriptive error when the blob is too small or claims an impossible fingerprints length.
func NewReaderOnBytes(m []byte, fName string) (*Reader, int, error) {
filter := &xorfilter.BinaryFuse[uint8]{}
const headerSize = filterBlobHeaderSize
header, data := m[:headerSize], m[headerSize:]
v, features, err := parseHeaderFeatures(header, fName)
if err != nil {
return nil, 0, err
}
filter.SegmentCount = binary.BigEndian.Uint32(header[4:])
filter.SegmentCountLength = binary.BigEndian.Uint32(header[4+4:])
filter.Seed = binary.BigEndian.Uint64(header[4+4+4:])
filter.SegmentLength = binary.BigEndian.Uint32(header[4+4+4+8:])
filter.SegmentLengthMask = binary.BigEndian.Uint32(header[4+4+4+8+4:])
fingerprintsLen := int(binary.BigEndian.Uint64(header[4+4+4+8+4+4:]))
filter.Fingerprints = data[:fingerprintsLen]
return &Reader{inner: filter, version: v, features: features, m: m}, headerSize + fingerprintsLen, nil
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fixes #20560. Problem: bloatnet 3B keys .efi file building: FuseFIlter used `t2hash=29GB` and `reverseOrder=26GB` we can't mmap FuseFilter without forking it. But maybe it's okey. Maybe it's better to shard FuseFilter rather than mmap large thing. Maybe sharding will improve data-locality. -- Solution: shard `fusefilter` by first byte of `keyHash`. it will reduce building allocation buffers 256x times. And it will not add reading overhead. (`keyHash>>56` is not same as `byte(keyHash)`. it's `hi/lo` bytes.) - Writer: re-using 1 `xorfilter.BuildBinaryFuse` to build all shards - `ShardedReader` is a `[256]Reader`. Lookup: `shards[keyHash>>56].ContainsHash(hash)` Depends on: #20722 --------- Co-authored-by: JkLondon <me@ilyamikheev.com>
Cherry-pick of [be36d4f](be36d4f) (original PR #20644) onto `performance`. ## Summary - Shards `fusefilter` by `keyHash>>56` (256 shards) to cut FuseFilter build-time RAM ~256× on large `.efi` files (3B-key bloatnet case). - Writer reuses one `xorfilter.BuildBinaryFuse` to build all shards; `ShardedReader` is a `[256]Reader` with lookup `shards[keyHash>>56].ContainsHash(hash)`. - Adds `ExistenceFilterVersion = 2` (sharded format), keeps v1 (monolithic) and v0 (legacy) intact. `recsplit.RecSplitArgs.Version` is now `version.DataStructureVersion` (was `uint8`). - Wires `snaptype.BuildIndex` to pick `recsplit.ExistenceFilterVersion` whenever the index version is not `v1.0` (v1.0 keeps inner v=0 for compat). - Brings the full new test set: sharded writer/reader tests, corruption tests, fuzz tests, real-file tests, and benchmarks + `scripts/fusefilter-bench/`. ## Cherry-pick status Clean cherry-pick. No manual conflict resolutions required — `performance` already has all the prerequisites the original commit assumed: - `xorfilter v0.5.1` (provides `MakeBinaryFuseBuilder` / `BuildBinaryFuse`) - Package-scope `headerSize` in `db/datastruct/fusefilter/fusefilter_reader.go` - `offsetFile *os.File` / `offsetWriter *bufio.Writer` in `recsplit.RecSplit` (from PR #20722) Auto-merge handled minor textual deltas in `.gitignore`, `fusefilter_reader.go`, `recsplit/index.go`, and `db/state/domain.go` without conflicts. ## Test plan - [ ] `go build` of affected packages: passes locally (`db/recsplit/...`, `db/datastruct/fusefilter/...`, `db/snaptype/...`) - [ ] `go test ./db/datastruct/fusefilter/... ./db/recsplit/...` - [ ] CI on the branch > Note: `./db/state/statecfg/...` fails to build with `undefined: InitSchemas` on `origin/performance` independently of this PR — pre-existing issue on the base branch, unrelated to the cherry-pick. ## References - Original PR: #20644 - Original issue: #20560 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com> Co-authored-by: JkLondon <me@ilyamikheev.com>
## Summary Re-enable the sharded FuseFilter existence-filter format by setting `ExistenceFilterVersion` back from `1` to `2` in `db/recsplit/recsplit.go`. This is the switch that erigontech#21164 turned off after erigontech#20644 introduced the format. New index files (`.kvi`/`.efi`) will again embed the sharded BinaryFuse filter (256 shards by `keyHash>>56`), which cuts existence-filter build-time RAM ~256× (the motivation was bloatnet 3B-key `.efi` builds). ## Why it's safe - **Read path is version-aware.** `db/recsplit/index.go` reads the embedded `dataStructureVersion` per file and dispatches to v0/v1/v2 readers, so existing v1 files stay readable and new v2 files are read by the matching `case 2` path. - **Accessor/existence files are built locally, not seeded.** `SeedableV3Extensions()` publishes only `.kv/.v/.ef/.ap`; `.kvi/.efi/.kvei` are rebuilt locally via `BuildMissedAccessors`, so this does not affect the shared snapshot set. ## Test plan - [ ] CI green Co-authored-by: JkLondon <me@ilyamikheev.com>
Fixes #20560.
Problem:
bloatnet 3B keys .efi file building: FuseFIlter used
t2hash=29GBandreverseOrder=26GBwe can't mmap FuseFilter without forking it. But maybe it's okey. Maybe it's better to shard FuseFilter rather than mmap large thing. Maybe sharding will improve data-locality.
--
Solution:
shard
fusefilterby first byte ofkeyHash. it will reduce building allocation buffers 256x times. And it will not add reading overhead. (keyHash>>56is not same asbyte(keyHash). it'shi/lobytes.)xorfilter.BuildBinaryFuseto build all shardsShardedReaderis a[256]Reader. Lookup:shards[keyHash>>56].ContainsHash(hash)Depends on: #20722