seg: MatchCmp() unit-tests and zero-copy#19845
Conversation
e13a138 to
63fe77c
Compare
|
@awskii look |
There was a problem hiding this comment.
Pull request overview
This PR improves .kv/.seg key lookups by leveraging MatchCmp() as a zero-copy, early-exit comparator (including for compressed keys), with additional unit tests and benchmarks to validate semantics and measure alloc/latency improvements. It also refactors btindex’s B+tree lookup path to avoid key materialization via Next(nil) in favor of MatchCmp().
Changes:
- Implement zero-copy, early-exit
Getter.MatchCmp()and makeMatchCmpUncompressed()advance on match (reset on mismatch). - Add
Reader.MatchCmp()(handles compressed vs uncompressed keys) andReader.BinarySearch(). - Refactor
btindexB+tree compare/search paths to useReader.MatchCmp(); add extensive Match* tests/benches.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| db/seg/decompress.go | Reworks MatchCmp* / MatchPrefixUncompressed behavior to support zero-copy compare and advance-on-match semantics. |
| db/seg/seg_auto_rw.go | Adds Reader.MatchCmp() + Reader.BinarySearch() dispatching across compression modes. |
| db/seg/decompress_test.go | Adds comprehensive unit tests for MatchCmp* and edge cases (nil/empty, binary keys). |
| db/seg/seg_paged_rw_test.go | Adds reader-level tests for uncompressed keys + compressed values, plus BinarySearch coverage. |
| db/seg/decompress_bench_test.go | Adds benchmarks comparing MatchCmp/MatchPrefix vs Next (compressed/uncompressed). |
| db/datastruct/btindex/bps_tree.go | Refactors B+tree Seek/Get/warmup to compare via Reader.MatchCmp(); removes custom keyCmp callback. |
| db/datastruct/btindex/btree_index.go | Updates index open path to new NewBpsTree* signatures; removes old keyCmp method. |
| db/datastruct/btindex/btree_index_test.go | Updates tests to new NewBpsTree signature. |
| db/datastruct/btindex/bpstree_bench_test.go | Adds/updates benchmarks for Get and binary-search behavior over warmed nodes. |
Comments suppressed due to low confidence (1)
db/datastruct/btindex/bps_tree.go:255
- BpsTree.WarmUp now calls dataLookupFunc(di, kv), which reads/decompresses both key and value. WarmUp only needs the key, so this likely adds avoidable I/O/decompression and allocations (especially if values are compressed and Next(nil) allocates). Consider adding a key-only lookup path (e.g., Reset+Next into a reusable buffer + Skip value, or a dedicated keyLookupFunc) so WarmUp doesn’t decode values.
var key []byte
for i := step; i < N; i += step {
di := i - 1
key, _, _, err = b.dataLookupFunc(di, kv)
if err != nil {
return err
}
b.mx = append(b.mx, Node{off: b.offt.Get(di), key: common.Copy(key), di: di})
cachedBytes += nsz + uint64(len(key))
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
db/datastruct/btindex/bps_tree.go:255
- WarmUp now uses dataLookupFunc(di, kv), which reads both key and value for each sampled pivot. This can significantly increase warm-up time and IO/CPU (especially when values are large/compressed) compared to the previous key-only lookup. Consider fetching only the key (e.g., Reset(offset) + Next into a reusable buffer + Skip value) or splitting dataLookup into key-only vs key+value paths for warm-up.
var key []byte
for i := step; i < N; i += step {
di := i - 1
key, _, _, err = b.dataLookupFunc(di, kv)
if err != nil {
return err
}
b.mx = append(b.mx, Node{off: b.offt.Get(di), key: common.Copy(key), di: di})
cachedBytes += nsz + uint64(len(key))
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Remove Reader.BinarySearch override so calls fall back to Getter.BinarySearch via embedding. This isolates whether the debug_accountRange result difference is caused by the Reader.BinarySearch change (used in history_stream.go) vs the BpsTree.Seek compareKey change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert the previous removal — keep Reader.BinarySearch with MatchCmp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
65b7ab1 to
8fdcf14
Compare
MatchCmp() unit-tests and zero-copyMatchCmp() unit-tests and zero-copy
|
@awskii cr plz |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for i := step; i < N; i += step { | ||
| di := i - 1 | ||
| _, key, err = b.keyCmpFunc(nil, di, kv, key[:0]) | ||
| key, _, _, err = b.dataLookupFunc(di, kv) | ||
| if err != nil { |
There was a problem hiding this comment.
WarmUp now calls dataLookupFunc which reads both key and value via Reader.Next(nil) (see BtIndex.dataLookup), causing unnecessary decompression and allocations during warmup. Since warmup only needs pivot keys, consider reading just the key with a reusable buffer (e.g., kv.Reset(offt.Get(di)); key,_=kv.Next(key[:0]); kv.Skip() to skip the value) instead of calling dataLookupFunc.
| b.Run("MatchCmp_hit", func(b *testing.B) { | ||
| b.ReportAllocs() | ||
| for b.Loop() { | ||
| i := b.N % len(words) | ||
| g.Reset(offsets[i]) | ||
| cmp := g.MatchCmp(words[i]) | ||
| if cmp != 0 { | ||
| b.Fatal("expected match") | ||
| } |
There was a problem hiding this comment.
In these b.Loop() benchmarks, i := b.N % len(...) uses b.N (the target iteration count) which is constant during the loop, so every iteration hits the same index/key. Use a separate incrementing counter (or switch back to for i := 0; i < b.N; i++ { ... }) to actually vary the accessed key across iterations.
| for b.Loop() { | ||
| p := b.N % len(lookupKeys) | ||
| bt.bs(lookupKeys[p]) |
There was a problem hiding this comment.
Same benchmark-iteration issue here: p := b.N % len(lookupKeys) inside for b.Loop() will always pick the same element because b.N is constant. Use a local counter (incremented per iteration) or an RNG to select different lookup keys.
| for b.Loop() { | |
| p := b.N % len(lookupKeys) | |
| bt.bs(lookupKeys[p]) | |
| p := 0 | |
| for b.Loop() { | |
| bt.bs(lookupKeys[p%len(lookupKeys)]) | |
| p++ |
| b.Run("MatchCmp_miss", func(b *testing.B) { | ||
| miss := make([]byte, keySize) | ||
| miss[0] = 0xff // greater than any key | ||
| b.ReportAllocs() | ||
| for b.Loop() { | ||
| i := b.N % len(offsets) | ||
| g.Reset(offsets[i]) | ||
| g.MatchCmp(miss) | ||
| } |
There was a problem hiding this comment.
i := b.N % len(offsets) inside for b.Loop() will be constant across iterations, so this sub-benchmark repeatedly probes the same offset. Use an incrementing counter or RNG to avoid measuring an unrealistically cache-hot single element.
Story: as part of
bt.Get()optimization. as part of blocks exec optimization. Now bt.Get does much allocs because calling.Next(nil). But it can call.MatchCmp().MatchCmpis zero-copy (even in compressed case) and allows partial-key-decompression.MatchCmp:
58ns 1 alloc -> 24ns 0 alloc