Mutable/Immutable refactor and GetImmutable snapshots#92
Conversation
|
Are we still able to run the benchmark scripts with these changes? Could you do so and report how the performance / mem-consumption changed? (e.g., does moving the ref to the db from tree to node impact mem-consumption on large trees?) |
liamsi
left a comment
There was a problem hiding this comment.
Thanks a lot @jlandrews. I think this separates concerns and clarifies the API.
I left a few minor comments (mostly documentation nits).
It would be good to have someone's input who uses the API a lot (or is deeply familiar with the previous design decisions), too. /CC @silasdavis @jaekwon
| version int64 | ||
| } | ||
|
|
||
| // NewImmutableTree creates both in-memory and persistent instances |
There was a problem hiding this comment.
documentation nit: this sounds like NewImmutableTree always creates both. Could be changed to:
NewImmutableTree creates an ImmutableTree.
If a database is provided as an argument, it will be used for persistence, otherwise an in-memory instance will be created.
There was a problem hiding this comment.
I think we should just delete this function actually. If a user creates a new ImmutableTree with this method, they won't be able to set any data on it. It's currently not being called from anywhere in this repo or in the SDK code either.
| } | ||
| return t.root.height | ||
| } | ||
|
|
There was a problem hiding this comment.
Not sure if we should keep differently typed "getters" for the same data (Height8 / Height & Version Version64 & Get / Get64 & GetByIndex / GetByIndex64).
|
|
||
| // IterateRange makes a callback for all nodes with key between start and end non-inclusive. | ||
| // If either are nil, then it is open on that side (nil, nil is the same as Iterate) | ||
| func (t *ImmutableTree) IterateRange(start, end []byte, ascending bool, fn func(key []byte, value []byte) bool) (stopped bool) { |
There was a problem hiding this comment.
Can we add a sentence about the return value (currently stopped)? E.g.: "It returns true if ... " Also, we might remove the named return and just change it to:
IterateRange(start, end []byte, ascending bool, fn func(key []byte, value []byte) bool) bool
I think this is slightly more general.
|
|
||
| // IterateRangeInclusive makes a callback for all nodes with key between start and end inclusive. | ||
| // If either are nil, then it is open on that side (nil, nil is the same as Iterate) | ||
| func (t *ImmutableTree) IterateRangeInclusive(start, end []byte, ascending bool, fn func(key, value []byte, version int64) bool) (stopped bool) { |
There was a problem hiding this comment.
It's a bit confusing that IterateRangeInclusive takes a version while IterateRange does not.
mutable_tree.go
Outdated
| // TODO: optimize balance & rotate. | ||
| node = node.clone(version) | ||
| l := node.getLeftNode() | ||
| _l := l.clone(version) |
There was a problem hiding this comment.
Can we rename _l to newNode and l to orphaned? Maybe we can remove the named returns here, too (func (tree *MutableTree) rotateRight(node *Node) (*Node, *Node) instead)
mutable_tree.go
Outdated
| // TODO: optimize balance & rotate. | ||
| node = node.clone(version) | ||
| r := node.getRightNode() | ||
| _r := r.clone(version) |
|
|
||
| // DeleteVersion deletes a tree version from disk. The version can then no | ||
| // longer be accessed. | ||
| func (tree *MutableTree) DeleteVersion(version int64) error { |
There was a problem hiding this comment.
Just to be sure: this is what you called DeleteImmutable(version) in your proposal?
There was a problem hiding this comment.
Yes, I changed my mind about renaming it. Seemed like it made more sense to say we're deleting an old version of the mutable tree, rather than that we're deleting an immutable tree.
d1f292f to
c1f6d4e
Compare
|
So I've run the "fullbench" benchmarks (which took more than 32gb of ram, so we might want to trim them down a bit), and it seems like performance is quite similar to develop. That doesn't account for actual tree size differences though, and I think I was wrong to add the nodeDB reference to each node, since it's a waste of memory to have it on each one. I only did that because I thought it was nicer to have node.getLeftNode() and node.getRightNode() without arguments of the tree. That's not a necessary change though, so I've reverted the particular commit that did it. |
|
Don't have too much context here, but the changes and discussion look very clean and concise. Does any relevant documentation need to be updated (i.e. readme, docs, etc...)? Also, @jlandrews, do you think the benchmarks should also be run with |
c1f6d4e to
badf974
Compare
|
Reran with -benchmem and compared results. Overall it looks pretty similar, but there are some metrics that changed by 10-20% either direction. Not sure how to interpret it though, since it seems like these tests are non-deterministic and so there may be some variance anyway... Expand to see benchmark results |
|
Really neat work @jlandrews 👍 |
|
|
||
| records := make([]*record, 400) | ||
| tree := NewTree(nil, 0) | ||
| tree := NewMutableTree(db.NewMemDB(), 0) |
There was a problem hiding this comment.
Posting what I said in the call here for future reference:
Anywhere in the tests where we were using Tree mutably I had to change it to MutableTree, and if you pass nil as the db to MutableTree it errors when constructing the nodedb. Using MutableTree with a MemDB as nodedb is equivalent to the old code using Tree.
88aa2a9 to
332a462
Compare
332a462 to
8ad44f7
Compare
|
I added back the ability to load old versions into the |
mutable_tree.go
Outdated
| } | ||
|
|
||
| if !(targetVersion == 0 || latestVersion == targetVersion) { | ||
| return latestVersion, fmt.Errorf("Wanted to load target %v but only found up to %v", |
There was a problem hiding this comment.
nit: error messages are usually non-capitalized in golang: https://github.com/golang/go/wiki/CodeReviewComments#error-strings
| // Size returns the number of leaf nodes in the tree. | ||
| func (t *Tree) Size() int { | ||
| func (t *ImmutableTree) Size() int { | ||
| return int(t.Size64()) |
There was a problem hiding this comment.
I might have mentioned this already somewhere but won't all this casting to int cause funky behaviour on 32 bit machines? Unrelated to this PR but I think we should just keep the *64 version of all these methods.
|
Thanks for adding the orphan test and addressing the comments! This still looks good to me. What could be improved: Can we have a test that shows the change in behaviour of Can you remind me if there was sth. else then the orphan test that we agreed should be included into this PR? |
|
So in the current code, it's possible to load any version of the tree, but you cannot overwrite previously saved versions. To enforce this, the library loads all roots into memory even when trying to load a specific version. Then when saving, it checks if version n+1 already exists in memory, and if so it compares the hashes and only tells the caller it was a successful save if the hash of the existing tree matches the hash of the intended tree. The original refactor I made removed this functionality entirely, since I thought it would make more sense to only load the current mutable "head" of the tree and completely disallow loading old versions mutably. I also can't think of anything else we wanted to include in this PR... |
That would awesome and then let's merge this. |
|
Thanks @jlandrews! Great work! Let's see if @jaekwon has anything to add to this. I'll merge soon otherwise. |
|
|
||
| // GetImmutable loads an ImmutableTree at a given version for querying | ||
| func (tree *MutableTree) GetImmutable(version int64) (*ImmutableTree, error) { | ||
| rootHash := tree.ndb.getRoot(version) |
There was a problem hiding this comment.
And the leaks are gone, and we get a sensible global cache behaviour of the node cache
|
I think this is a great change - it opens up some flexible ways of using the library and tidies up various things. Storing tree versions as a set |
Thanks for chiming in @silasdavis, we'll merge this very soon. I'll put together a release shortly after. |
* dep: Change tendermint dep to be ^v0.22.0 (#91) * Mutable/Immutable refactor and GetImmutable snapshots (#92) * Release 0.10.0: Update Changelog and bump version (#99) See changelog: https://github.com/tendermint/iavl/blob/develop/CHANGELOG.md#0100
* 'develop' of github.com:danil-lashin/iavl: Prep Release 0.11.0 (cosmos#111) Remove architecture dependent getter functions (cosmos#96) Use 8 bytes to store int64 components of database keys (cosmos#107) Update to CircleCI 2.0 (cosmos#108) Release 0.10.0: Update Changelog and bump version (cosmos#99) delete empty file (relict from merging master into develop) Mutable/Immutable refactor and GetImmutable snapshots (cosmos#92) Remove unreachable code Remove unused variable dep: Change tendermint dep to be ^v0.22.0 (cosmos#91) Fix benchmark scripts for current code (cosmos#89) release v0.9.2 (cosmos#82) Various lints (cosmos#80) Jae/rangeprooffix (cosmos#75)
* Move orphaningTree logic into VersionedTree * Move Tree.Set and Node.set to VersionedTree, fix tests * Move Tree.Remove and node.remove to VersionedTree * Rename VersionedTree/Tree to MutableTree/ImmutableTree * Rename files to match type names * Add GetImmutable and lazy loading of MutableTree * Move balance and rotate from Node to MutableTree * Add benchmarks with -benchmem, remove incomplete benchmark * Rename variables in rotation functions * Add test to check for consistent tracking of orphans * Add back ability to load old versions with idempotent saves * Add idempotent save test case
* dep: Change tendermint dep to be ^v0.22.0 (cosmos#91) * Mutable/Immutable refactor and GetImmutable snapshots (cosmos#92) * Release 0.10.0: Update Changelog and bump version (cosmos#99) See changelog: https://github.com/tendermint/iavl/blob/develop/CHANGELOG.md#0100
This fixes #88 and #68, with a side effect of resolving #50 and #87.