dont length prefix hashes during concatenation#211
Conversation
|
|
Yeh the hashes are harcoded - need to update them. Just waiting for agreement first |
|
I do not think length prefixes are safer / neccessary either unless they encode the depth of where we are (which level) in the tree (which doesn't seem to be the case here but maybe I'm overlooking sth). To prevent second pre-image attacks it is quite common to add a prefix depending on if we are hashing a leaf node, or and interior node. Is this intentional, that this doesn't seem to happen here? |
Not intentional, don't think we thought about it. So the idea with a 2nd pre-image attack is to basically make a new merkle tree where the leaves are say the internal nodes from some layer in the original tree - so they have the same root but technically different leaves. Semantically this shouldn't be a problem since the inner nodes will be meaningless from a tx/validator perspective, but its worth fixing this in the data structure so no one has to think about it. Let's copy CT then ? ie. prepend a 0x00 to the leaf hashes and a 0x1 to the inner hashes ? |
Oh, this won't go great in the implementation right now without some changes if we want to avoid using |
|
Looks like we can't really avoid append anyways. we can use Need to be careful with this to make sure its done right and the same everywhere, especially since we're copying the merkle tree impl in a couple places in Tendermint as an optimization (to avoid allocations from the generic funcs in merkle pkg) |
|
I'm rethinking this prefix thing because maybe we don't need it after all. We could add, but it'd be nicer on many fronts not to. I don't think these attacks will actually be an issue because the merkle roots are always found in a header, and the header contains the num_txs, so we know how deep the tree should be. |
Codecov Report
@@ Coverage Diff @@
## bucky/tmhash #211 +/- ##
================================================
- Coverage 58.99% 58.53% -0.46%
================================================
Files 59 54 -5
Lines 3687 3504 -183
================================================
- Hits 2175 2051 -124
+ Misses 1360 1308 -52
+ Partials 152 145 -7
Continue to review full report at Codecov.
|
|
Moved to tendermint/go-crypto#105 , specifically commit tendermint/go-crypto@a72addd |
While adding the TX number makes such attacks more difficult (if the number of TX is always checked), I'm not sure if this is sufficient. It also depends on if one can add "empty leafs", I guess (the we could have the same number of TXs and could find another pre-image; in practice this might not work as an empty TX is not allowed). Maybe some more "serious cryptographers" have sth to say here, too: |
|
Generally the safest approach to mitigate the exploitation of collision in a tree is to have a distinct hash per node, i.e. encode the depth and index of the node. It also helps mitigate multi-target attacks. For example in some hash-based signatures you may attack multiple nodes of multiples instances at the same time. But here we're talking of something slightly different, an encoding to avoid trivial collisions when hashing pairs of variable size values. Not sure to get all the context here, but my short answer would be that if you always hash values of the same length then encoding is useless, whereas if you hash pairs of arbitrary length data then encoding is mandatory. |
Trying to simplify the tree here.
I don't see what the length prefixing buys us, other than another dependency/thing to specify.
Of course, we still need it for hashing kv pairs, otherwise two distinct pairs could hash to the same thing.
Can we enforce that these FromXxxHashes functions only work with hashes of the same length (ie so theyre safe to concatenate without length prefixing ) ?
@zmanian @jaekwon @liamsi
Ref tendermint/tendermint#1547