Problem: Inefficient binary encoding of Merkle tree path (CRO-149)#558
Problem: Inefficient binary encoding of Merkle tree path (CRO-149)#558bors[bot] merged 1 commit intocrypto-com:masterfrom
Conversation
tomtau
left a comment
There was a problem hiding this comment.
@yihuang thanks and congrats on your first PR!
I forgot to point to this document (did you go through the PR checklist?): https://github.com/crypto-com/chain/blob/master/CONTRIBUTING.md#working-on-issues (and https://rfc.zeromq.org/spec:42/C4/#23-patch-requirements )
Looks OK -- I guess flattening out the path representation is the biggest win in terms of efficiency and simplicity (in common cases with short paths, as the tree depth is limited / bounded); for our own reference, could you do a simple example byte size comparison (e.g. a path with 4 nodes -- old_representation.encode().len() vs new_representation.encode().len())?
A couple of small things:
- the patch fails to compile: https://travis-ci.org/crypto-com/chain/jobs/608353067#L3540
--> chain-core/src/common/merkle_tree.rs:102:21
3542 |
3543102 | path
3544 | ^^^^ expected enum `std::option::Option`, found struct `common::merkle_tree::Path`
3545 |
3546 = note: expected type `std::option::Option<common::merkle_tree::Path>`
3547 found type `common::merkle_tree::Path`
3548
3549error: aborting due to previous error
3550
3551
- the patch isn't formatted with the default style convention we use (
cargo fmtshould fix it): https://travis-ci.org/crypto-com/chain/jobs/608353067#L3593
Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 76:
3594 {
3595 match self {
3596 Tree::Empty => None,
3597- Tree::Leaf { hash: node_hash, .. } => {
3598+ Tree::Leaf {
3599+ hash: node_hash, ..
3600+ } => {
3601 if &hash(value, NodeType::Leaf) == node_hash {
3602- Some(Path{leaf_hash: node_hash.clone(), nodes: vec![]})
3603+ Some(Path {
3604+ leaf_hash: node_hash.clone(),
3605+ nodes: vec![],
3606+ })
3607 } else {
3608 None
3609 }
3610Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 85:
3611 }
3612- Tree::Node { hash: node_hash, left, right } => {
3613- left.generate_path(value).map_or_else(|| {
3614+ Tree::Node {
3615+ hash: node_hash,
3616+ left,
3617+ right,
3618+ } => left.generate_path(value).map_or_else(
3619+ || {
3620 right.generate_path(value).map(|mut path| {
3621- path.nodes.push(PathNode{
3622+ path.nodes.push(PathNode {
3623 node_hash: node_hash.clone(),
3624 child_hash: left.hash(),
3625- hash_side: Side::Left
3626+ hash_side: Side::Left,
3627 });
3628 path
3629 })
3630Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 96:
3631- }, |mut path| {
3632- path.nodes.push(PathNode{
3633+ },
3634+ |mut path| {
3635+ path.nodes.push(PathNode {
3636 node_hash: node_hash.clone(),
3637 child_hash: right.hash(),
3638- hash_side: Side::Right
3639+ hash_side: Side::Right,
3640 });
3641 path
3642- })
3643- }
3644+ },
3645+ ),
3646 }
3647 }
3648 }
3649Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 111:
3650 #[derive(Debug, PartialEq, Eq, Clone, Encode, Decode)]
3651 pub enum Side {
3652 Left,
3653- Right
3654+ Right,
3655 }
3656
3657 #[derive(Debug, PartialEq, Eq, Clone, Encode, Decode)]
3658Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 118:
3659 pub struct PathNode {
3660 node_hash: H256,
3661- child_hash: H256, // hash of left child unless reversed
3662+ child_hash: H256, // hash of left child unless reversed
3663 hash_side: Side, // the side of hash child
3664 }
3665
3666Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 124:
3667 #[derive(Debug, PartialEq, Eq, Clone, Encode, Decode)]
3668 pub struct Path {
3669 leaf_hash: H256,
3670- nodes: Vec<PathNode>, // order from inner node to outer node
3671+ nodes: Vec<PathNode>, // order from inner node to outer node
3672 }
3673
3674 impl Path {
3675Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 142:
3676 let ref_hash = if i == 0 {
3677 self.leaf_hash
3678 } else {
3679- self.nodes[i-1].node_hash
3680+ self.nodes[i - 1].node_hash
3681 };
3682
3683 let calced_hash = if Side::Left == node.hash_side {
3684Diff in /home/travis/build/crypto-com/chain/chain-core/src/common/merkle_tree.rs at line 300:
3685 {
3686 self.generate_path(&value).map(|path| {
3687 assert_eq!(&self.root_hash(), path.hash());
3688- Proof {
3689- path,
3690- value,
3691- }
3692+ Proof { path, value }
3693 })
3694 }
- the patch commit message doesn't follow point 7: https://rfc.zeromq.org/spec:42/C4/#23-patch-requirements
A patch commit message MUST consist of a single short (less than 50 characters) line stating the problem ("Problem: ...") being solved, followed by a blank line and then the proposed solution ("Solution: ...").
|
Sorry about the compile error and format, just fixed them, and made a force push to keep it clean. |
Codecov Report
@@ Coverage Diff @@
## master #558 +/- ##
==========================================
- Coverage 67.53% 67.53% -0.01%
==========================================
Files 124 124
Lines 14319 14321 +2
==========================================
+ Hits 9671 9672 +1
- Misses 4648 4649 +1
|
|
I've compared the serialization length between new and old representation, the numbers are below:
With the new representation, the extra bytes without the hashes is The test code: |
|
bors r+ |
538: Problem:(CRO-521) unbonded from custom time is ignored in genesis initconfig r=tomtau a=linfeng-crypto Solution: change the parameters of `new_init`: change `genesis_time` from `Timespec` into `Option<Timespec>`, remove the `bool` type parameter `bonded`, add a `&StakedStateDestination` type parameter. 558: Problem: Inefficient binary encoding of Merkle tree path (CRO-149) r=tomtau a=yihuang I've refactored the ``Path`` representation with ``Vec``, which makes it both more correct and more compact serialization (Old representation allows some illegal path state). With the current implementation, the gain of further optimizing the serialization with ``BitVec`` is small, I guess it doesn't worth the trouble anymore. Co-authored-by: linfeng <linfeng@crypto.com> Co-authored-by: yihuang <huang@crypto.com>
Timed out (retrying...) |
|
lgtm, |
yeah, there are a bunch of tests from the old implementation: https://github.com/crypto-com/chain/blob/master/chain-core/src/common/merkle_tree.rs#L381 -- one different thing is that the previous representation could have represented some invalid states, e.g. no Sibling but with subpath or something like that |
|
bors r+ |
538: Problem:(CRO-521) unbonded from custom time is ignored in genesis initconfig r=tomtau a=linfeng-crypto Solution: change the parameters of `new_init`: change `genesis_time` from `Timespec` into `Option<Timespec>`, remove the `bool` type parameter `bonded`, add a `&StakedStateDestination` type parameter. 556: Problem: (CRO-494) not possible to create unjail transaction in client-cli r=tomtau a=leejw51 Solution: call unjail in cli increase coverage fix sync fix save public, private key 558: Problem: Inefficient binary encoding of Merkle tree path (CRO-149) r=tomtau a=yihuang I've refactored the ``Path`` representation with ``Vec``, which makes it both more correct and more compact serialization (Old representation allows some illegal path state). With the current implementation, the gain of further optimizing the serialization with ``BitVec`` is small, I guess it doesn't worth the trouble anymore. Co-authored-by: linfeng <linfeng@crypto.com> Co-authored-by: Jongwhan Lee <jonghwan@crypto.com> Co-authored-by: yihuang <huang@crypto.com>
Canceled |
|
done @tomtau |
tomtau
left a comment
There was a problem hiding this comment.
a few clippy suggestions:
error: using `clone` on a `Copy` type
1596 --> chain-core/src/common/merkle_tree.rs:84:36
1597 |
159884 | leaf_hash: node_hash.clone(),
1599 | ^^^^^^^^^^^^^^^^^ help: try dereferencing it: `*node_hash`
1600 |
1601 = note: `-D clippy::clone-on-copy` implied by `-D warnings`
1602 = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_copy
1603
1604error: using `clone` on a `Copy` type
1605 --> chain-core/src/common/merkle_tree.rs:99:40
1606 |
160799 | node_hash: node_hash.clone(),
1608 | ^^^^^^^^^^^^^^^^^ help: try dereferencing it: `*node_hash`
1609 |
1610 = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_copy
1611
1612error: using `clone` on a `Copy` type
1613 --> chain-core/src/common/merkle_tree.rs:108:36
1614 |
1615108 | node_hash: node_hash.clone(),
1616 | ^^^^^^^^^^^^^^^^^ help: try dereferencing it: `*node_hash`
1617 |
1618 = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_copy
1619
1620error: aborting due to 3 previous errors
|
and test compilation fails: https://travis-ci.org/crypto-com/chain/jobs/608661250#L1426 |
|
Clippy and doctest problem fixed. |
represent merkle tree path using Vec, more efficient, more correct and simpler. ascii art to explain merkle path fix clippy and doctest errors fix doctest
558: Problem: Inefficient binary encoding of Merkle tree path (CRO-149) r=tomtau a=yihuang I've refactored the ``Path`` representation with ``Vec``, which makes it both more correct and more compact serialization (Old representation allows some illegal path state). With the current implementation, the gain of further optimizing the serialization with ``BitVec`` is small, I guess it doesn't worth the trouble anymore. Co-authored-by: yihuang <huang@crypto.com>
Build failed |
|
bors retry |
558: Problem: Inefficient binary encoding of Merkle tree path (CRO-149) r=tomtau a=yihuang I've refactored the ``Path`` representation with ``Vec``, which makes it both more correct and more compact serialization (Old representation allows some illegal path state). With the current implementation, the gain of further optimizing the serialization with ``BitVec`` is small, I guess it doesn't worth the trouble anymore. Co-authored-by: yihuang <huang@crypto.com>
I've refactored the
Pathrepresentation withVec, which makes it both more correct and more compact serialization (Old representation allows some illegal path state).With the current implementation, the gain of further optimizing the serialization with
BitVecis small, I guess it doesn't worth the trouble anymore.