optimise FindBlockNum#15398
Conversation
|
after the change... |
There was a problem hiding this comment.
-
does it work on
--prune.mode=minimal? -
do we need special
DefaultTxBlockIndexobject? Maybe move all logic of txnumreader toBlockReader? -
On bormainnet:
du -hsc /erigon-data/snapshots/*bod*seg
700mb
du -hsc /erigon-data/snapshots/*bod*idx
200mb
du -hsc /erigon-data/snapshots/*trans*seg
1.4T
du -hsc /erigon-data/snapshots/*trans*idx
1.4T
i feel it's wrong direction to move from bodies to txs. it's maybe look ok from 1 read perspective - but if think on it from "many parallel RPC" perspective - it can make "all txs files warm".
- Does Erigon already using
bodies.idxfield? Mabye we can use it here?
interesting...I can't quite formalize this...but makes sense that this will significantly increase warming of txs files... One idea is: figure out the exact blocks.seg where the block containing query_txnum might be (this needs to mmap touches); and then binary search query_txnum on that block.seg file alone..we should still get good savings from that. |
|
the current issue with this is that entries for smaller files don't get removed when they're merged. Will add a GC method for such entries.. |
|
|
|
|
||
| // idea is to use lookup tables to implement uniform binary search | ||
| type BlockTxNumLookupCache struct { | ||
| cache map[snapshotsync.Range][]uint64 |
There was a problem hiding this comment.
how this structure better than array of blockNums (1 blockNum per 1M txnums)?
There was a problem hiding this comment.
you mean - not having a per-file map, but rather a "flat" array that stores 1blockNum per 1M txnum?
we have to create a per-file cache structure - it allows the first few layers of the binary search (on that file) to be common among any query txNum in it.
If we don't keep it per-file, and just do this structure from [0-last_snapshot_block_num), the path that the queries take can keep changing (when a new file is introduced)
There was a problem hiding this comment.
If we can agree on the general idea - maybe it's possible to make it simpler.
I also need to lock to synchronize access to the cache.
There was a problem hiding this comment.
a "flat" array that stores 1blockNum per 1M txnum? - yes. then search will look like arr[txNum/1M] (don't need binsearch).
Of-course we can have 1 array per file or 1 array per all files - append-only.
maybe i also not fully understand current solution. but i slowly understanding - it allow store big and small files and get 1 array per file by 1 map get and then do binsearch inside this array? (array of 10 uints - brobably sequential search will be faster than binsearch).
but i don't fully understand how do we fill current array
There was a problem hiding this comment.
can't get through abstractions yet:
if !ok && !exhaust {
query.SetValue(maxTxNum)
}
where/what does it set? cache map[snapshotsync.Range][]uint64 storing txnums or blocknums?
There was a problem hiding this comment.
map[snapshotsync.Range][]uint64
key: file range (basically file)
value: list of lookup values (let's call this per-file array lookup)
e.g.
file range: [10000-10500)
so the file has 500 blocks..the array lookup size (assuming cadence=64) will be 9.
the binary search in [10000-10500) is mirrored in binary search on the lookup array
lookup[0] -> maxTxNum of block=10000
lookup[8] -> maxTxNum of block = 10500-1=10499
lookup[(0+8)/2 = 4] -> maxTxNum of block = (10000+10499)/2 = 10249
lookup[(0+4)/2 = 2] -> maxTxNum of block = (10000+10249)/2
lookup[(4+8)/2 = 6] -> maxTxNum of block = (10249+10499)/2
etc.
essentially the lookup stores maxTxNum for blk= [10000, 10062, 10124, 10186, 10249,10311,10374,10436,10499]
so the binary search in lookup [0-9) is mirrored in binary search in "block number space" [10000, 10500).
But if we were to mirror all blocks, then lookup size = number of block in file, we can't afford this...so there's a depth beyond which, we don't store maxTxNum for any block (so always query the file)
sort.Search in block_reader.go happens on the "blockNumber" space. While CacheQuery.Left() etc. traversing this tree in it's "lookup array index" space (mirroring).
There was a problem hiding this comment.
maybe it's easier to have lookup as map[block_number][maxTxnum], so the cache will be map[snapshotsync.Range]map[block_number][maxTxnum], but that needs extra space for storing blockNumber too...and max cache size is already 50MB or so.
There was a problem hiding this comment.
it allow store big and small files and get 1 array per file by 1 map get and then do binsearch inside this array? (array of 10 uints - brobably sequential search will be faster than binsearch).
yes. But the lookup size is number of blocks/cadence - approx 30000 for 500-step-blocks-file.
|
|
@AskAlexSharov you were right. Storing one lookup value every x blocks turns out to be much simpler than trying to exactly find the block number in "binary search tree". In current case, the cache stores cache every 20 blocks, which gives max cache size of 40MB (for 100M blocks). In separate PR, we can do "GC/removal of smaller file entries when they get merged" - #15512 |
| if err != nil { | ||
| return true | ||
| } | ||
| txNum = atomic.LoadUint64(&lookup[i]) |
There was a problem hiding this comment.
can use non-pointer atomic.Uint64 - it will not add ram overhead:
type Uint64 struct {
_ noCopy
_ align64
v uint64
}
|
Looks good to me. Does |
|
FYI: for all such caches - need to keep in mind that likely we will have 2 copies of them on server: in erigon and in external rpcd |
|
50% of |
I think it's fine. If consecutive txNum queries are common, maybe we also add additional "hottest block" (startTxNum, endTxNum) per snapshot file. This will make answering queries for consecutive txNums faster (atleast |
ticket: #15209 - `ReadTxNumFunc` gets maxtxnum given a blockNumber...it was also used to find blockNumber given maxTxNum.. - add `TxBlockIndex` in-place of `ReadTxNumFunc` -- difference is that there's now a separate function to GetBlockNumber. Additionally: - earlier, binary search was performed over a range of block numbers, each search step queries MDBX and then fetches block from snapshot. - now, binary search is separated into two stages: first stage will binary_search in snapshots; second stage will do binary_search on MDBX.
ticket: #15209 - `ReadTxNumFunc` gets maxtxnum given a blockNumber...it was also used to find blockNumber given maxTxNum.. - add `TxBlockIndex` in-place of `ReadTxNumFunc` -- difference is that there's now a separate function to GetBlockNumber. Additionally: - earlier, binary search was performed over a range of block numbers, each search step queries MDBX and then fetches block from snapshot. - now, binary search is separated into two stages: first stage will binary_search in snapshots; second stage will do binary_search on MDBX.



ticket: #15209
ReadTxNumFuncgets maxtxnum given a blockNumber...it was also used to find blockNumber given maxTxNum..TxBlockIndexin-place ofReadTxNumFunc-- difference is that there's now a separate function to GetBlockNumber.Additionally: