Skip to content

memdb: use btree for storage#53

Merged
erikgrinaker merged 9 commits intomasterfrom
erik/memdb-btree
Mar 9, 2020
Merged

memdb: use btree for storage#53
erikgrinaker merged 9 commits intomasterfrom
erik/memdb-btree

Conversation

@erikgrinaker
Copy link
Contributor

Fixes #52 (see also tendermint/tendermint#4520). Uses a B-tree for storage, which should significantly improve range scan performance. No breaking changes.

Haven't had time to benchmark this yet, will do so shortly.

@erikgrinaker
Copy link
Contributor Author

Benchmarks, doing range scans of 10.000 keys on databases with 1 million and 10 million keys:

Benchmark Map (ns/op) B-tree (ns/op) Change
BenchmarkMemDBRangeScans1M-4 90 239 121 1 477 749 -98.36%
BenchmarkMemDBRangeScans10M-4 1 006 363 039 1 469 495 -99.86%
BenchmarkMemDBRandomReadsWrites-4 816 3 314 +306.17%

For these database sizes, B-trees have a speedup of 2-3 orders of magnitude. More importantly, and as expected, the old map-based database has a dependence on the number of keys in the database while B-trees do not.

The tradeoff is that read/write operations are three times more costly, but I think that's well worth it.

@erikgrinaker erikgrinaker removed the S:wip Status: Work In Progress label Mar 3, 2020
@erikgrinaker erikgrinaker marked this pull request as ready for review March 3, 2020 21:14
@melekes
Copy link
Contributor

melekes commented Mar 4, 2020

@erikgrinaker
Copy link
Contributor Author

I ran some Cosmos SDK benchmarks using the B-tree MemDB, via make test-sim-benchmark. These used SIM_NUM_BLOCKS=250, SIM_BLOCK_SIZE=200, and LevelDB with pruning strategies PruneNothing (persist every version to disk) and PruneSyncable (persist every 100 blocks to disk). I believe the SDK currently uses PruneSyncable by default.

tm-db PruneNothing PruneSyncable MemDB + PruneNothing
0.4.1 191 s 114 s 691 s
btree 189 s 117 s 138 s

It appears that IAVLs use of MemDB in PruneSyncable mode, as a temporary working set that's periodically flushed to disk, does not trigger the slow path in MemDB. It is unclear why, but appears to be related to cleanup of orphaned nodes, which possibly hits LevelDB instead of MemDB. In this case the new B-tree appears to have a slight slowdown, as expected. However, when we use MemDB as the main database we do hit the slow path, and the B-tree version is five times faster.

Overall memory usage is about the same in both cases.

@erikgrinaker erikgrinaker merged commit a2b9135 into master Mar 9, 2020
@erikgrinaker erikgrinaker deleted the erik/memdb-btree branch March 9, 2020 15:22
faddat pushed a commit to faddat/tm-db that referenced this pull request Feb 21, 2024
* Create SECURITY.md

* Update SECURITY.md

* Apply suggestions from code review

---------

Co-authored-by: Thane Thomson <connect@thanethomson.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

memdb: use ordered data structure

3 participants