seg: `MatchCmp()` unit-tests and zero-copy by AskAlexSharov · Pull Request #19845 · erigontech/erigon

AskAlexSharov · 2026-03-12T05:38:41Z

Story: as part of bt.Get() optimization. as part of blocks exec optimization. Now bt.Get does much allocs because calling .Next(nil). But it can call .MatchCmp(). MatchCmp is zero-copy (even in compressed case) and allows partial-key-decompression.

MatchCmp: 58ns 1 alloc -> 24ns 0 alloc


  ┌──────────────────────┬──────────┬───────────┐
  │        Method        │ hit (ns) │ miss (ns) │
  ├──────────────────────┼──────────┼───────────┤
  │ MatchCmp (streaming) │ 43.5     │ 24.1      │
  ├──────────────────────┼──────────┼───────────┤
  │ MatchPrefix          │ 40.7     │ 7.9       │
  ├──────────────────────┼──────────┼───────────┤
  │ Next + buf reuse     │ 39.4     │ —         │
  └──────────────────────┴──────────┴───────────┘


  ┌─────────────────────────┬──────────┬───────────┐
  │         Method          │ hit (ns) │ miss (ns) │
  ├─────────────────────────┼──────────┼───────────┤
  │ MatchCmpUncompressed    │ 8.1      │ 7.3       │
  ├─────────────────────────┼──────────┼───────────┤
  │ MatchPrefixUncompressed │ 8.3      │ 7.1       │
  ├─────────────────────────┼──────────┼───────────┤
  │ NextUncompressed        │ 5.3      │ —         │
  └─────────────────────────┴──────────┴───────────┘

AskAlexSharov · 2026-03-15T01:44:45Z

@awskii look

Copilot

Pull request overview

This PR improves .kv/.seg key lookups by leveraging MatchCmp() as a zero-copy, early-exit comparator (including for compressed keys), with additional unit tests and benchmarks to validate semantics and measure alloc/latency improvements. It also refactors btindex’s B+tree lookup path to avoid key materialization via Next(nil) in favor of MatchCmp().

Changes:

Implement zero-copy, early-exit Getter.MatchCmp() and make MatchCmpUncompressed() advance on match (reset on mismatch).
Add Reader.MatchCmp() (handles compressed vs uncompressed keys) and Reader.BinarySearch().
Refactor btindex B+tree compare/search paths to use Reader.MatchCmp(); add extensive Match* tests/benches.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
db/seg/decompress.go	Reworks `MatchCmp*` / `MatchPrefixUncompressed` behavior to support zero-copy compare and advance-on-match semantics.
db/seg/seg_auto_rw.go	Adds `Reader.MatchCmp()` + `Reader.BinarySearch()` dispatching across compression modes.
db/seg/decompress_test.go	Adds comprehensive unit tests for `MatchCmp*` and edge cases (nil/empty, binary keys).
db/seg/seg_paged_rw_test.go	Adds reader-level tests for uncompressed keys + compressed values, plus BinarySearch coverage.
db/seg/decompress_bench_test.go	Adds benchmarks comparing `MatchCmp`/`MatchPrefix` vs `Next` (compressed/uncompressed).
db/datastruct/btindex/bps_tree.go	Refactors B+tree `Seek`/`Get`/warmup to compare via `Reader.MatchCmp()`; removes custom keyCmp callback.
db/datastruct/btindex/btree_index.go	Updates index open path to new `NewBpsTree*` signatures; removes old keyCmp method.
db/datastruct/btindex/btree_index_test.go	Updates tests to new `NewBpsTree` signature.
db/datastruct/btindex/bpstree_bench_test.go	Adds/updates benchmarks for `Get` and binary-search behavior over warmed nodes.

Comments suppressed due to low confidence (1)

db/datastruct/btindex/bps_tree.go:255

BpsTree.WarmUp now calls dataLookupFunc(di, kv), which reads/decompresses both key and value. WarmUp only needs the key, so this likely adds avoidable I/O/decompression and allocations (especially if values are compressed and Next(nil) allocates). Consider adding a key-only lookup path (e.g., Reset+Next into a reusable buffer + Skip value, or a dedicated keyLookupFunc) so WarmUp doesn’t decode values.

	var key []byte
	for i := step; i < N; i += step {
		di := i - 1
		key, _, _, err = b.dataLookupFunc(di, kv)
		if err != nil {
			return err
		}
		b.mx = append(b.mx, Node{off: b.offt.Get(di), key: common.Copy(key), di: di})
		cachedBytes += nsz + uint64(len(key))

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

db/datastruct/btindex/bps_tree.go:255

WarmUp now uses dataLookupFunc(di, kv), which reads both key and value for each sampled pivot. This can significantly increase warm-up time and IO/CPU (especially when values are large/compressed) compared to the previous key-only lookup. Consider fetching only the key (e.g., Reset(offset) + Next into a reusable buffer + Skip value) or splitting dataLookup into key-only vs key+value paths for warm-up.

	var key []byte
	for i := step; i < N; i += step {
		di := i - 1
		key, _, _, err = b.dataLookupFunc(di, kv)
		if err != nil {
			return err
		}
		b.mx = append(b.mx, Node{off: b.offt.Get(di), key: common.Copy(key), di: di})
		cachedBytes += nsz + uint64(len(key))

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Remove Reader.BinarySearch override so calls fall back to Getter.BinarySearch via embedding. This isolates whether the debug_accountRange result difference is caused by the Reader.BinarySearch change (used in history_stream.go) vs the BpsTree.Seek compareKey change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Revert the previous removal — keep Reader.BinarySearch with MatchCmp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

AskAlexSharov · 2026-04-10T04:17:50Z

@awskii cr plz

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-15T02:55:42Z

 	for i := step; i < N; i += step {
 		di := i - 1
-		_, key, err = b.keyCmpFunc(nil, di, kv, key[:0])
+		key, _, _, err = b.dataLookupFunc(di, kv)
 		if err != nil {


WarmUp now calls dataLookupFunc which reads both key and value via Reader.Next(nil) (see BtIndex.dataLookup), causing unnecessary decompression and allocations during warmup. Since warmup only needs pivot keys, consider reading just the key with a reusable buffer (e.g., kv.Reset(offt.Get(di)); key,_=kv.Next(key[:0]); kv.Skip() to skip the value) instead of calling dataLookupFunc.

Copilot · 2026-04-15T02:55:42Z

+	b.Run("MatchCmp_hit", func(b *testing.B) {
+		b.ReportAllocs()
+		for b.Loop() {
+			i := b.N % len(words)
+			g.Reset(offsets[i])
+			cmp := g.MatchCmp(words[i])
+			if cmp != 0 {
+				b.Fatal("expected match")
+			}


In these b.Loop() benchmarks, i := b.N % len(...) uses b.N (the target iteration count) which is constant during the loop, so every iteration hits the same index/key. Use a separate incrementing counter (or switch back to for i := 0; i < b.N; i++ { ... }) to actually vary the accessed key across iterations.

Copilot · 2026-04-15T02:55:42Z

+			for b.Loop() {
+				p := b.N % len(lookupKeys)
+				bt.bs(lookupKeys[p])


Same benchmark-iteration issue here: p := b.N % len(lookupKeys) inside for b.Loop() will always pick the same element because b.N is constant. Use a local counter (incremented per iteration) or an RNG to select different lookup keys.

Suggested change

for b.Loop() {

p := b.N % len(lookupKeys)

bt.bs(lookupKeys[p])

p := 0

for b.Loop() {

bt.bs(lookupKeys[p%len(lookupKeys)])

p++

Copilot · 2026-04-15T02:55:43Z

+	b.Run("MatchCmp_miss", func(b *testing.B) {
+		miss := make([]byte, keySize)
+		miss[0] = 0xff // greater than any key
+		b.ReportAllocs()
+		for b.Loop() {
+			i := b.N % len(offsets)
+			g.Reset(offsets[i])
+			g.MatchCmp(miss)
+		}


i := b.N % len(offsets) inside for b.Loop() will be constant across iterations, so this sub-benchmark repeatedly probes the same offset. Use an incrementing counter or RNG to avoid measuring an unrealistically cache-hot single element.

AskAlexSharov requested a review from sudeepdino008 as a code owner March 12, 2026 05:38

AskAlexSharov force-pushed the alex/match_cmp_34 branch from e13a138 to 63fe77c Compare March 13, 2026 05:48

AskAlexSharov requested review from awskii and Copilot April 5, 2026 13:18

Copilot started reviewing on behalf of AskAlexSharov April 6, 2026 01:01 View session

Copilot AI reviewed Apr 6, 2026

View reviewed changes

Comment thread db/seg/decompress_bench_test.go

Comment thread db/datastruct/btindex/bpstree_bench_test.go

Comment thread db/seg/seg_paged_rw_test.go Outdated

Comment thread db/seg/seg_paged_rw_test.go Outdated

AskAlexSharov requested a review from Copilot April 6, 2026 02:22

Copilot started reviewing on behalf of AskAlexSharov April 6, 2026 02:23 View session

Copilot AI reviewed Apr 6, 2026

View reviewed changes

Comment thread db/datastruct/btindex/bps_tree.go

AskAlexSharov and others added 20 commits April 8, 2026 16:09

save

0488e28

save

7e0e413

save

b07ee83

save

dc54f23

save

73dbb49

save

24efbc2

save

1693846

save

e3e4418

restore Reader.BinarySearch

e3fe837

Revert the previous removal — keep Reader.BinarySearch with MatchCmp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

save

f1d5bfb

save

951c0fd

save

a9616b4

save

bbc3dd3

save

30dcde6

save

b6aa7ba

save

57f7028

save

24d1510

save

cb9819f

save

e225bf3

AskAlexSharov force-pushed the alex/match_cmp_34 branch from 65b7ab1 to 8fdcf14 Compare April 8, 2026 09:11

AskAlexSharov requested review from Giulio2002, anacrolix, antonis19, bloxster, canepat, domiwei, lupin012, mh0lt, mriccobene, taratorio and yperbasis as code owners April 8, 2026 09:11

AskAlexSharov changed the base branch from release/3.4 to main April 8, 2026 09:11

AskAlexSharov added 2 commits April 9, 2026 08:25

Merge branch 'main' into alex/match_cmp_34

2afdcfe

Merge branch 'main' into alex/match_cmp_34

88d5984

AskAlexSharov enabled auto-merge April 9, 2026 09:24

yperbasis added the performance label Apr 9, 2026

Merge branch 'main' into alex/match_cmp_34

0da1e42

AskAlexSharov changed the title ~~[wip] seg: MatchCmp() unit-tests and zero-copy~~ seg: MatchCmp() unit-tests and zero-copy Apr 10, 2026

Merge branch 'main' into alex/match_cmp_34

5f36881

awskii approved these changes Apr 13, 2026

View reviewed changes

AskAlexSharov added this pull request to the merge queue Apr 13, 2026

Merged via the queue into main with commit d8471e0 Apr 13, 2026
36 checks passed

AskAlexSharov deleted the alex/match_cmp_34 branch April 13, 2026 12:54

AskAlexSharov requested a review from Copilot April 15, 2026 02:50

Copilot started reviewing on behalf of AskAlexSharov April 15, 2026 02:51 View session

Copilot AI reviewed Apr 15, 2026

View reviewed changes

AskAlexSharov mentioned this pull request Apr 15, 2026

btindex: WarmUp: read key-only, skip value to avoid unnecessary decompression #20567

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

seg: `MatchCmp()` unit-tests and zero-copy#19845

seg: `MatchCmp()` unit-tests and zero-copy#19845
AskAlexSharov merged 26 commits into
mainfrom
alex/match_cmp_34

AskAlexSharov commented Mar 12, 2026

Uh oh!

AskAlexSharov commented Mar 15, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

AskAlexSharov commented Apr 10, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 15, 2026

Uh oh!

Copilot AI Apr 15, 2026

Uh oh!

Copilot AI Apr 15, 2026

Uh oh!

Copilot AI Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

AskAlexSharov commented Mar 12, 2026

Uh oh!

AskAlexSharov commented Mar 15, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

AskAlexSharov commented Apr 10, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants