optimise FindBlockNum by sudeepdino008 · Pull Request #15398 · erigontech/erigon

sudeepdino008 · 2025-06-02T10:51:10Z

ReadTxNumFunc returns the maxTxNum for a given blockNumber; it was also being used to find the blockNumber for a given maxTxNum.
This PR adds TxBlockIndex in place of ReadTxNumFunc. The difference is that there is now a separate function, GetBlockNumber, for the reverse lookup.
Additionally:
Earlier, binary search was performed over a range of block numbers, and each search step queried MDBX and then fetched the block from a snapshot.
Now, the binary search is split into two stages: the first stage binary-searches the snapshots; the second stage binary-searches MDBX.
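
A minimal sketch of the two-stage idea, not the PR's actual code: the `blockIndex` type and its `frozen`/`db` slices are hypothetical stand-ins for "maxTxNum per block in snapshot files" and "maxTxNum per block in MDBX". The point is only that the snapshot tier is searched on its own first, and the DB tier is consulted only when the snapshot tier has no answer.

```go
package main

import (
	"fmt"
	"sort"
)

// blockIndex is a hypothetical stand-in for the two storage tiers:
// frozen holds maxTxNum for blocks [0, len(frozen)) (snapshot files),
// db holds maxTxNum for the blocks that follow (MDBX).
type blockIndex struct {
	frozen []uint64
	db     []uint64
}

// findBlockNum returns the first block whose maxTxNum >= txNum.
func (b blockIndex) findBlockNum(txNum uint64) (uint64, bool) {
	// Stage 1: binary search over the snapshot tier only.
	i := sort.Search(len(b.frozen), func(i int) bool { return b.frozen[i] >= txNum })
	if i < len(b.frozen) {
		return uint64(i), true
	}
	// Stage 2: binary search over the DB tier.
	j := sort.Search(len(b.db), func(j int) bool { return b.db[j] >= txNum })
	if j < len(b.db) {
		return uint64(len(b.frozen) + j), true
	}
	return 0, false
}

func main() {
	idx := blockIndex{frozen: []uint64{1, 5, 9}, db: []uint64{12, 20}}
	fmt.Println(idx.findBlockNum(4))  // 1 true
	fmt.Println(idx.findBlockNum(10)) // 3 true
	fmt.Println(idx.findBlockNum(21)) // 0 false
}
```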

sudeepdino008 · 2025-06-04T07:11:21Z

After the change, GetBlockNumber exec_time is 12% of getLogsV3, compared to around 50% previously - #15209

AskAlexSharov

Does it work on --prune.mode=minimal?
Do we need a special DefaultTxBlockIndex object? Maybe move all the txnumreader logic to BlockReader?
On bormainnet:

du -hsc /erigon-data/snapshots/*bod*seg
700mb
du -hsc /erigon-data/snapshots/*bod*idx
200mb

du -hsc /erigon-data/snapshots/*trans*seg
1.4T	
du -hsc /erigon-data/snapshots/*trans*idx
1.4T

I feel it's the wrong direction to move from bodies to txs. It may look OK from a single-read perspective, but from a "many parallel RPC" perspective it can make "all txs files warm".

Does Erigon already use the bodies.idx field? Maybe we can use it here?

sudeepdino008 · 2025-06-04T09:35:10Z

i feel it's wrong direction to move from bodies to txs.

interesting...I can't quite formalize this...but makes sense that this will significantly increase warming of txs files...

One idea is: figure out the exact blocks.seg where the block containing query_txnum might be (this needs to mmap touches); and then binary search query_txnum on that block.seg file alone..we should still get good savings from that.

sudeepdino008 · 2025-06-05T16:11:11Z

the current issue with this is that entries for smaller files don't get removed when they're merged. Will add a GC method for such entries..

AskAlexSharov · 2025-06-06T02:21:49Z

14% on FindBlockNumber now.. i guess it's because we don't have handmade rlp decoding for bodies (maybe i'm wrong).

AskAlexSharov · 2025-06-06T02:33:53Z

+
+// idea is to use lookup tables to implement uniform binary search
+type BlockTxNumLookupCache struct {
+	cache   map[snapshotsync.Range][]uint64


How is this structure better than an array of blockNums (1 blockNum per 1M txNums)?

you mean - not having a per-file map, but rather a "flat" array that stores 1blockNum per 1M txnum?

we have to create a per-file cache structure - it allows the first few layers of the binary search (on that file) to be common among any query txNum in it.
If we don't keep it per-file, and just do this structure from [0-last_snapshot_block_num), the path that the queries take can keep changing (when a new file is introduced)

If we can agree on the general idea, maybe it's possible to make it simpler.
I also need a lock to synchronize access to the cache.

a "flat" array that stores 1blockNum per 1M txnum? - yes. then search will look like arr[txNum/1M] (don't need binsearch).
Of-course we can have 1 array per file or 1 array per all files - append-only.
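
A rough sketch of the "flat" array suggestion, under assumptions that are not in the PR: `buildFlat`, `find`, and the `maxTxNums` slice (maxTxNum of block i, ascending) are hypothetical, and the bucket size is tiny instead of 1M so the demo stays readable. Each slot stores the first block that can contain the bucket's first txNum, so `flat[txNum/bucket]` gives a starting block directly; this sketch then narrows within the bucket with a short search, which the comment above does not spell out.

```go
package main

import (
	"fmt"
	"sort"
)

const bucket = 10 // 1M in the discussion; small here for illustration

// buildFlat records, for each bucket start k*bucket, the first block
// whose maxTxNum >= k*bucket.
func buildFlat(maxTxNums []uint64) []uint64 {
	if len(maxTxNums) == 0 {
		return nil
	}
	last := maxTxNums[len(maxTxNums)-1]
	flat := make([]uint64, last/bucket+1)
	blk := 0
	for k := range flat {
		for maxTxNums[blk] < uint64(k)*bucket {
			blk++
		}
		flat[k] = uint64(blk)
	}
	return flat
}

// find returns the first block whose maxTxNum >= txNum.
func find(flat, maxTxNums []uint64, txNum uint64) (uint64, bool) {
	k := txNum / bucket
	if k >= uint64(len(flat)) {
		return 0, false
	}
	lo, hi := int(flat[k]), len(maxTxNums)
	if k+1 < uint64(len(flat)) {
		hi = int(flat[k+1]) + 1 // the answer cannot lie past the next bucket's block
	}
	i := lo + sort.Search(hi-lo, func(i int) bool { return maxTxNums[lo+i] >= txNum })
	if i == len(maxTxNums) {
		return 0, false
	}
	return uint64(i), true
}

func main() {
	maxTxNums := []uint64{3, 8, 14, 25, 31}
	flat := buildFlat(maxTxNums)
	fmt.Println(flat)                      // [0 2 3 4]
	fmt.Println(find(flat, maxTxNums, 12)) // 2 true
	fmt.Println(find(flat, maxTxNums, 35)) // 0 false
}
```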

Maybe I also don't fully understand the current solution, but I'm slowly getting there: it allows storing big and small files, getting 1 array per file with 1 map get, and then doing a binsearch inside this array? (For an array of 10 uints, sequential search will probably be faster than binsearch.)

but i don't fully understand how do we fill current array

can't get through abstractions yet:

if !ok && !exhaust { query.SetValue(maxTxNum) }

where/what does it set? cache map[snapshotsync.Range][]uint64 storing txnums or blocknums?

map[snapshotsync.Range][]uint64

key: file range (basically file)
value: list of lookup values (let's call this per-file array lookup)

e.g.

file range: [10000-10500)

so the file has 500 blocks..the array lookup size (assuming cadence=64) will be 9.
the binary search in [10000-10500) is mirrored in binary search on the lookup array

lookup[0] -> maxTxNum of block=10000
lookup[8] -> maxTxNum of block = 10500-1=10499

lookup[(0+8)/2 = 4] -> maxTxNum of block = (10000+10499)/2 = 10249
lookup[(0+4)/2 = 2] -> maxTxNum of block = (10000+10249)/2
lookup[(4+8)/2 = 6] -> maxTxNum of block = (10249+10499)/2
etc.

essentially the lookup stores maxTxNum for blk= [10000, 10062, 10124, 10186, 10249,10311,10374,10436,10499]

so the binary search in lookup [0-9) is mirrored in binary search in "block number space" [10000, 10500).
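
To make the mirroring concrete, here is a small sketch that derives which block each lookup slot corresponds to by halving the index range and the block range in lockstep. The `mirroredBlocks` and `fill` helpers are hypothetical names for illustration and are not the PR's `CacheQuery` code; for the file range [10000-10500) with 9 slots, the sketch reproduces the block list given above.

```go
package main

import "fmt"

// fill assigns the midpoint block to the midpoint index, then recurses
// into both halves, so index space and block space are split identically.
func fill(blocks []uint64, loIdx, hiIdx int, loBlk, hiBlk uint64) {
	if hiIdx-loIdx < 2 {
		return // no index strictly between loIdx and hiIdx
	}
	midIdx := (loIdx + hiIdx) / 2
	midBlk := (loBlk + hiBlk) / 2
	blocks[midIdx] = midBlk
	fill(blocks, loIdx, midIdx, loBlk, midBlk)
	fill(blocks, midIdx, hiIdx, midBlk, hiBlk)
}

// mirroredBlocks returns, for a file covering blocks [from, to), the block
// number whose maxTxNum would be stored at each of the n lookup slots.
func mirroredBlocks(from, to uint64, n int) []uint64 {
	if n < 2 || to <= from {
		return nil
	}
	blocks := make([]uint64, n)
	blocks[0], blocks[n-1] = from, to-1
	fill(blocks, 0, n-1, from, to-1)
	return blocks
}

func main() {
	fmt.Println(mirroredBlocks(10000, 10500, 9))
	// [10000 10062 10124 10186 10249 10311 10374 10436 10499]
}
```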

But if we were to mirror all blocks, then lookup size = number of blocks in file, which we can't afford. So there's a depth beyond which we don't store maxTxNum for any block (and always query the file).

sort.Search in block_reader.go happens in the "blockNumber" space, while CacheQuery.Left() etc. traverse this tree in its "lookup array index" space (mirroring).

maybe it's easier to have lookup as map[block_number][maxTxnum], so the cache will be map[snapshotsync.Range]map[block_number][maxTxnum], but that needs extra space for storing blockNumber too...and max cache size is already 50MB or so.

it allows storing big and small files, getting 1 array per file with 1 map get, and then doing a binsearch inside this array? (For an array of 10 uints, sequential search will probably be faster than binsearch.)

yes. But the lookup size is number of blocks/cadence - approx 30000 for 500-step-blocks-file.

sudeepdino008 · 2025-06-06T04:44:31Z

14% on FindBlockNumber now.. i guess it's because we don't have handmade rlp decoding for bodies (maybe i'm wrong).

BodyForStorage has handmade decoding.
for cadence=8 (maxtxnum cached for every 8th block), FindBlockNumber disappeared entirely in the profile. (with file touches=2-3) - the max cache size will be 100MB (for bor).

sudeepdino008 · 2025-06-10T05:13:47Z

@AskAlexSharov you were right. Storing one lookup value every x blocks turns out to be much simpler than trying to exactly find the block number in "binary search tree".

In the current version, the cache stores one lookup value every 20 blocks, which gives a max cache size of 40MB (for 100M blocks).
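
A sketch of the "one value every x blocks" scheme and of the size arithmetic behind the 40MB figure (100M blocks / 20 × 8 bytes per uint64). Everything here is an assumption for illustration: `buildSampled`, `find`, `cacheBytes`, and the in-memory `maxTxNums` slice are hypothetical, and in the real code the final narrow window would be resolved by reading the snapshot file rather than a slice.

```go
package main

import (
	"fmt"
	"sort"
)

// cacheBytes estimates the cache size: one uint64 per `cadence` blocks.
func cacheBytes(blocks, cadence uint64) uint64 { return blocks / cadence * 8 }

// buildSampled keeps maxTxNum for blocks 0, cadence, 2*cadence, ...
func buildSampled(maxTxNums []uint64, cadence int) []uint64 {
	var s []uint64
	for i := 0; i < len(maxTxNums); i += cadence {
		s = append(s, maxTxNums[i])
	}
	return s
}

// find locates the first block whose maxTxNum >= txNum: a search over the
// sampled values picks a window of at most `cadence` blocks, and only that
// window is searched in the full data.
func find(sampled, maxTxNums []uint64, cadence int, txNum uint64) (uint64, bool) {
	j := sort.Search(len(sampled), func(j int) bool { return sampled[j] >= txNum })
	lo := 0
	if j > 0 {
		lo = (j-1)*cadence + 1 // block (j-1)*cadence is already known to be too small
	}
	hi := j*cadence + 1
	if hi > len(maxTxNums) {
		hi = len(maxTxNums)
	}
	i := lo + sort.Search(hi-lo, func(i int) bool { return maxTxNums[lo+i] >= txNum })
	if i == len(maxTxNums) {
		return 0, false
	}
	return uint64(i), true
}

func main() {
	fmt.Println(cacheBytes(100_000_000, 20)) // 40000000

	maxTxNums := []uint64{2, 5, 9, 10, 14, 20, 21, 30, 33, 40}
	sampled := buildSampled(maxTxNums, 4) // blocks 0, 4, 8
	fmt.Println(find(sampled, maxTxNums, 4, 15)) // 5 true
	fmt.Println(find(sampled, maxTxNums, 4, 41)) // 0 false
}
```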

In separate PR, we can do "GC/removal of smaller file entries when they get merged" - #15512

AskAlexSharov · 2025-06-10T13:19:22Z

+		if err != nil {
+			return true
+		}
+		txNum = atomic.LoadUint64(&lookup[i])


can use non-pointer atomic.Uint64 - it will not add ram overhead:

type Uint64 struct {
	_ noCopy
	_ align64
	v uint64
}
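
A short sketch of that suggestion, assuming the lookup is a plain slice: `[]atomic.Uint64` (from `sync/atomic`, Go 1.19+) replaces `[]uint64` plus `atomic.LoadUint64(&lookup[i])`, and each slot still occupies 8 bytes. The `newLookup` and `slotSize` helpers are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// newLookup allocates n atomically accessible slots, stored by value.
func newLookup(n int) []atomic.Uint64 { return make([]atomic.Uint64, n) }

// slotSize reports the in-memory size of one slot.
func slotSize() uintptr { return unsafe.Sizeof(atomic.Uint64{}) }

func main() {
	lookup := newLookup(4)
	lookup[2].Store(12345)        // instead of atomic.StoreUint64(&lookup[2], 12345)
	fmt.Println(lookup[2].Load()) // 12345
	fmt.Println(slotSize())       // 8
}
```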

AskAlexSharov · 2025-06-10T13:30:34Z

Looks good to me.

Does MapTxNum2BlockNumIter look ok? Does it need to call FindBlockNum/Min/Max? (maybe it can call less?)

AskAlexSharov · 2025-06-10T13:35:10Z

FYI: for all such caches - need to keep in mind that likely we will have 2 copies of them on server: in erigon and in external rpcd

AskAlexSharov · 2025-06-10T13:38:08Z

pic from ethmainnet on this branch - looks good:

AskAlexSharov · 2025-06-10T13:48:32Z

50% of BodyForStorageFromSnapshots is rlp decode of withdrawals. ok, some day we will get there.

sudeepdino008 · 2025-06-10T14:23:51Z

Looks good to me.

Does MapTxNum2BlockNumIter look ok? Does it need to call FindBlockNum/Min/Max? (maybe it can call less?)

I think it's fine.
One thing I noticed was that the queries were for consecutive txNums. This means creating a new MapTxNum2BlockNumIter and 5 file touches for each query (to FindBlockNum).

If consecutive txNum queries are common, maybe we can also add a "hottest block" (startTxNum, endTxNum) per snapshot file. This would make answering queries for consecutive txNums faster (at least invariantsEthGetLogs should fly through).
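
A rough sketch of the "hottest block" idea, under simplifying assumptions: a single remembered block instead of one per snapshot file, no synchronization, and a `slow` method over an in-memory slice standing in for FindBlockNum and its file touches. The `finder` and `hotBlock` types are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// hotBlock remembers the txNum span of the most recently resolved block.
type hotBlock struct {
	blockNum, startTxNum, endTxNum uint64
	valid                          bool
}

// finder wraps a slow lookup with a one-entry cache.
// Not safe for concurrent use; a real cache would need synchronization.
type finder struct {
	maxTxNums []uint64 // maxTxNums[i] = maxTxNum of block i, ascending
	hot       hotBlock
	slowCalls int
}

// slow stands in for the expensive lookup and also returns the block's span.
func (f *finder) slow(txNum uint64) (hotBlock, bool) {
	f.slowCalls++
	i := sort.Search(len(f.maxTxNums), func(i int) bool { return f.maxTxNums[i] >= txNum })
	if i == len(f.maxTxNums) {
		return hotBlock{}, false
	}
	start := uint64(0)
	if i > 0 {
		start = f.maxTxNums[i-1] + 1
	}
	return hotBlock{blockNum: uint64(i), startTxNum: start, endTxNum: f.maxTxNums[i], valid: true}, true
}

// find answers from the hot block when txNum falls inside its span.
func (f *finder) find(txNum uint64) (uint64, bool) {
	if f.hot.valid && txNum >= f.hot.startTxNum && txNum <= f.hot.endTxNum {
		return f.hot.blockNum, true
	}
	hb, ok := f.slow(txNum)
	if !ok {
		return 0, false
	}
	f.hot = hb
	return hb.blockNum, true
}

func main() {
	f := &finder{maxTxNums: []uint64{3, 8, 14, 25}}
	for _, tx := range []uint64{10, 11, 12, 20} {
		b, _ := f.find(tx)
		fmt.Println(tx, "->", b)
	}
	fmt.Println("slow lookups:", f.slowCalls) // slow lookups: 2
}
```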

ticket: #15209
- `ReadTxNumFunc` gets maxtxnum given a blockNumber...it was also used to find blockNumber given maxTxNum..
- add `TxBlockIndex` in-place of `ReadTxNumFunc` -- difference is that there's now a separate function to GetBlockNumber.

Additionally:
- earlier, binary search was performed over a range of block numbers, each search step queries MDBX and then fetches block from snapshot.
- now, binary search is separated into two stages: first stage will binary_search in snapshots; second stage will do binary_search on MDBX.

save

288e6a1

AskAlexSharov requested a review from JkLondon June 2, 2025 14:38

sudeepdino008 added 8 commits June 3, 2025 10:44

save

bc553bd

save

22c8f70

save

fafee96

save

8ab874d

save

950b52e

save

35259ca

save

2b3ec90

save

275f386

sudeepdino008 changed the title ~~optimise FindBlockNum~~ wip: optimise FindBlockNum [DO NOT MERGE] Jun 3, 2025

sudeepdino008 marked this pull request as ready for review June 3, 2025 11:22

sudeepdino008 changed the title ~~wip: optimise FindBlockNum [DO NOT MERGE]~~ wip: optimise FindBlockNum [DO NOT MERGE or REVIEW yet] Jun 3, 2025

sudeepdino008 added 8 commits June 3, 2025 16:53

save

5d58d6a

Merge remote-tracking branch 'origin/main' into e3/optimize_findblocknum

53d1be0

save

798de78

save

311896e

save

60268f8

save

0340a46

save

2af1441

Merge remote-tracking branch 'origin/main' into e3/optimize_findblocknum

614a748

sudeepdino008 changed the title ~~wip: optimise FindBlockNum [DO NOT MERGE or REVIEW yet]~~ optimise FindBlockNum Jun 4, 2025

sudeepdino008 requested a review from AskAlexSharov June 4, 2025 07:33

JkLondon approved these changes Jun 4, 2025

AskAlexSharov requested changes Jun 4, 2025

sudeepdino008 added 3 commits June 4, 2025 15:31

save

80b8757

save

98730cf

save

c5bb107

AskAlexSharov reviewed Jun 6, 2025

save

d28111c

sudeepdino008 marked this pull request as draft June 9, 2025 06:28

sudeepdino008 changed the title ~~optimise FindBlockNum~~ WIP: optimise FindBlockNum Jun 9, 2025

save

330459a

sudeepdino008 changed the title ~~WIP: optimise FindBlockNum~~ optimise FindBlockNum Jun 10, 2025

sudeepdino008 marked this pull request as ready for review June 10, 2025 05:09

Merge remote-tracking branch 'origin/main' into e3/optimize_findblocknum

0e88b34

sudeepdino008 requested a review from AskAlexSharov June 10, 2025 05:16

save

2c8f05b

AskAlexSharov reviewed Jun 10, 2025

AskAlexSharov approved these changes Jun 10, 2025

save

69c0e2d

sudeepdino008 enabled auto-merge (squash) June 10, 2025 14:24

sudeepdino008 mentioned this pull request Jun 10, 2025

optimize BodyForStorageFromSnapshots when only BaseTxNum and Count is needed. #15516

Closed

sudeepdino008 merged commit 1760f16 into main Jun 10, 2025
14 checks passed

sudeepdino008 deleted the e3/optimize_findblocknum branch June 10, 2025 15:04

sudeepdino008 mentioned this pull request Jun 16, 2025

rpc tests for remote (relying on private api) configuration #15608

Closed

blxdyx mentioned this pull request Jun 17, 2025

Upstream v3.0.6 node-real/bsc-erigon#713

Merged
