Conversation
|
This pull request introduces 5 alerts when merging f3168eb into 7204160 - view on LGTM.com new alerts:
|
|
This pull request fixes 1 alert when merging 35c73c5 into 7204160 - view on LGTM.com fixed alerts:
|
|
This pull request fixes 1 alert when merging 36a150e into 7204160 - view on LGTM.com fixed alerts:
|
|
before final merge could we get an adr describing these changes and then also the removal of orphan system. It would help with objective and delivery items. |
|
This pull request fixes 1 alert when merging 37fe66f into 7204160 - view on LGTM.com fixed alerts:
|
|
|
||
| It requires extra storage to store the node because it should keep `leftNodeKey` and `rightNodeKey` to iterate the tree. Instead, we can delete the `version` in the node and reduce the key size. | ||
|
|
||
| It can't delete the old nodes for the specific version due to removing orphans. But it makes `rollback` easier and it makes it possible to remove old nodes through `import` and `export` functionalities. The `export` will restruct the tree to make node IDs to a sequenced segment like (1 ... node_sieze). |
There was a problem hiding this comment.
If there 50 versions, are we still able to remove version 5-10?
37fe66f to
36a150e
Compare
|
Do you can have comparative benchmarks you can show for this @cool-develope? |
|
This pull request fixes 1 alert when merging 042b877 into 7204160 - view on LGTM.com fixed alerts:
|
|
its a bit odd how variable the benchmarks are. are we able to get some profiles on the changes? |
mutable_tree.go
Outdated
| *ImmutableTree // The current, working tree. | ||
| lastSaved *ImmutableTree // The most recently saved tree. | ||
| orphans map[string]int64 // Nodes removed by changes to working tree. | ||
| orphans map[int64]int64 // Nodes removed by changes to working tree. |
There was a problem hiding this comment.
Why do we still need orphans? Are there plans to remove them in a subsequent PR?
There was a problem hiding this comment.
the idea is to remove them, this wasn't done in this pr
|
This pull request fixes 1 alert when merging 43dcaff into 7204160 - view on LGTM.com fixed alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog. |
| return false | ||
| }) | ||
| return size | ||
| return int(t.root.size*2 - 1) |
There was a problem hiding this comment.
This is only correct if the tree is a full one?
There was a problem hiding this comment.
right, this API is only used for the test purpose right now
mutable_tree.go
Outdated
| // We don't need to orphan nodes that were never persisted. | ||
| continue | ||
| // getOrphans gets orphaned nodes by the changes of the working tree. | ||
| func (tree *MutableTree) getOrphans() []*NodeKey { |
There was a problem hiding this comment.
I don't see how you can track the orphans try traverse the prev root?
There was a problem hiding this comment.
it is related to node.Clone(), as you can see I added the ensure part to assign children in Clone function.
If the node of the lastSaved tree has children, it means it is orphaned in the current version
There was a problem hiding this comment.
I guess you also need to clear those reference to avoid memory leak?
And the nodes come from NodeDB's cache, could it be evicted by cache and load again from db?
Somehow it don't feels cleaner than previous solution to me.
If we can somehow track orphans by traversing new root and old root simultaneously, without "hacking" the old nodes, that'd be ideal.
There was a problem hiding this comment.
If you assume cloning a node means orphan it (let's assume the assumption is correct for now), how about simply mark it with a flag?
There was a problem hiding this comment.
it could be, I will check it
There was a problem hiding this comment.
OK, it works, I updated it.
|
here it the benchmarking result |
mutable_tree.go
Outdated
| // save new nodes | ||
| if tree.root != nil { | ||
| // ensure init the nonce | ||
| tree.nonce = 0 |
There was a problem hiding this comment.
It seems tree.nonce could be a local var?
There was a problem hiding this comment.
yeah, it sounds good!
|
This pull request fixes 1 alert when merging b7d631b into 7204160 - view on LGTM.com fixed alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog. |
|
This pull request fixes 1 alert when merging 6d95cfb into 7204160 - view on LGTM.com fixed alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog. |
There was a problem hiding this comment.
The nonce idea starting to make sense.
From the ADR, it seemed that nonces are assigned as we are building up a tree during set(). In that case, nonces would not follow lexicographic order, leading to superfluous merging and compactions overhead.
Since nonces are assigned retroactively during SaveVersion, that could work. However, currently, it seems to be broken due to pre-order traversal when assigning nonces.
I think that if we change pre-order traversal to in-order, the performance would significantly improve. Now, the lexicographic order would be followed. I would love to see the updated benchmarks with this change applied.
Additionally, I do not understand why we need to retrieve all nodes into a map with tree.getNewNodes() in SaveVersion(). We should simply do another in-order tree traversal to save the nodes.
Lastly, I'm not following the partial removal of orphans. I saw the comment: #597 (comment) about having plans to remove them. However, here it seems to have been done in part. I think we should either remove them completely or fully keep them in this PR. Partial removal doesn't seem to make a lot of sense and might interfere with benchmarks.
nodedb.go
Outdated
| if len(hash) == 0 { | ||
| return nil, ErrNodeMissingHash | ||
| if nk == nil { | ||
| return nil, nil |
There was a problem hiding this comment.
Why are we returning nil, nil and not nil, err here?
There was a problem hiding this comment.
We should probably return ErrNodeMissingNodeKey like in SaveNode and cover this branch with a test
| logger.Debug("BATCH SAVE %X %p\n", node.hash, node) | ||
| node.persisted = true | ||
|
|
||
| // resetBatch only working on generate a genesis block |
There was a problem hiding this comment.
What does it mean? Can you elaborate? Why do we even need this part of the code?
There was a problem hiding this comment.
this comes from the old function nodedb.SaveBranch, tbh I am not sure about this usecase.
| func (tree *MutableTree) balance(node *Node, orphans *[]*Node) (newSelf *Node, err error) { | ||
| if node.persisted { | ||
| func (tree *MutableTree) balance(node *Node) (newSelf *Node, err error) { | ||
| if node.nodeKey != nil { |
There was a problem hiding this comment.
We should have a check for node itself not being nil and cover with a test
There was a problem hiding this comment.
balance always comes after clone(), it ensures node is not nil
| } | ||
|
|
||
| lftBalance, err := leftNode.calcBalance(tree.ImmutableTree) | ||
| lftBalance, err := node.leftNode.calcBalance(tree.ImmutableTree) |
There was a problem hiding this comment.
What if node.leftNode is nil?
There was a problem hiding this comment.
same reason( call balance after clone()) , it makes sure node has children
|
|
||
| if lftBalance >= 0 { | ||
| // Left Left Case | ||
| newNode, orphaned, err := tree.rotateRight(node) |
There was a problem hiding this comment.
Why do we still have orphans as a field and track them in some other parts of the code but not here?
mutable_tree.go
Outdated
|
|
||
| // save new nodes | ||
| if tree.root != nil { | ||
| newNodes, err := tree.getNewNodes() |
There was a problem hiding this comment.
Why do we have to get all these nodes in memory and then iterate them as a map a second time only to save?
After we iterate to assign nonces, there must be another in-order tree iteration to make sure that we save nodes in lexicographic order.
There was a problem hiding this comment.
the purpose of in-order is to calculate the node hash and then to make sure of lexicographic order, it stores all nodes in the map and iterates again in the sort order of nonce. I think we can't do both hash calc and nonce assigning.
The other purpose is I'd like to keep the nonce of the root as 1, and keep the ASC order of nodes when visiting the leaf node.
There was a problem hiding this comment.
What if we do this?
recursiveSpread = func(node *Node) (*NodeKey, error) {
if node.nodeKey != nil {
return node.nodeKey, nil
}
var err error
if node.leftNode != nil {
node.leftNodeKey, err = recursiveSpread(node.leftNode)
if err != nil {
return nil, err
}
}
nonce++
node.nodeKey = &NodeKey{
version: version,
nonce: nonce,
}
if node.rightNode != nil {
node.rightNodeKey, err = recursiveSpread(node.rightNode)
if err != nil {
return nil, err
}
}
_, err = node._hash(version)
if err != nil {
return nil, err
}
node.leftNode = nil
node.rightNode = nil
newNodes[node.nodeKey.nonce] = node
return node.nodeKey, nil
}This way, the nonce assignment follows in-order traversal while hashing follows post-order traversal as desired.
There was a problem hiding this comment.
Then, we can do a second in-order traversal of the tree to write the nodes to disk, removing the need for the map
There was a problem hiding this comment.
the second traversal makes sense, but I think the second traversal should be pre-order.
| nonce++ | ||
| node.nodeKey = &NodeKey{ | ||
| version: version, | ||
| nonce: nonce, | ||
| } |
There was a problem hiding this comment.
This is a pre-order traversal. As a result, the nonces do not follow the lexicographic order. This leads to increased superfluous merges.
Please try changing to in-order traversal. I would expect the benchmarks to improve with that change
There was a problem hiding this comment.
you are right, this is why we use the map and iterate again.
There was a problem hiding this comment.
I am not sure why in-order is essential here, we are using the node key as version | nonce. it doesn't matter the order of the key. I think we have to focus on the tree iterate rather than it.
|
This pull request fixes 1 alert when merging dcc0d07 into 7204160 - view on LGTM.com fixed alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog. |
|
This pull request fixes 1 alert when merging 03baaf5 into 7204160 - view on LGTM.com fixed alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog. |
|
closing this in favour of the #650 |
ref: #592