Skip to content
This repository was archived by the owner on Jul 15, 2018. It is now read-only.

dont length prefix hashes during concatenation#211

Closed
ebuchman wants to merge 4 commits intodevelopfrom
bucky/merkle-hash
Closed

dont length prefix hashes during concatenation#211
ebuchman wants to merge 4 commits intodevelopfrom
bucky/merkle-hash

Conversation

@ebuchman
Copy link
Contributor

@ebuchman ebuchman commented May 22, 2018

Trying to simplify the tree here.

I don't see what the length prefixing buys us, other than another dependency/thing to specify.

Of course, we still need it for hashing kv pairs, otherwise two distinct pairs could hash to the same thing.

Can we enforce that these FromXxxHashes functions only work with hashes of the same length (ie so theyre safe to concatenate without length prefixing ) ?

@zmanian @jaekwon @liamsi

Ref tendermint/tendermint#1547

@ebuchman ebuchman requested a review from melekes as a code owner May 22, 2018 01:01
@melekes
Copy link
Contributor

melekes commented May 22, 2018

--- FAIL: TestSimpleMap (0.00s)
	Error Trace:	simple_map_test.go:31
	Error:      	Not equal: 
	            	expected: "acc3971eab8513171cc90ce8b74f368c38f9657d"
	            	actual  : "2db281ef2b0da1c20e16c9597b7cd529a9ce3c54"
	Test:       	TestSimpleMap
	Messages:   	Hash didn't match
	Error Trace:	simple_map_test.go:37
	Error:      	Not equal: 
	            	expected: "acc3971eab8513171cc90ce8b74f368c38f9657d"
	            	actual  : "2db281ef2b0da1c20e16c9597b7cd529a9ce3c54"
	Test:       	TestSimpleMap
	Messages:   	Hash didn't match
	Error Trace:	simple_map_test.go:44
	Error:      	Not equal: 
	            	expected: "0cd117ad14e6cd22edcd9aa0d84d7063b54b862f"
	            	actual  : "6c5244cc930cbbbe7ff83069e9d2a2ee3faf24ba"
	Test:       	TestSimpleMap
	Messages:   	Hash didn't match
	Error Trace:	simple_map_test.go:51
	Error:      	Not equal: 
	            	expected: "0cd117ad14e6cd22edcd9aa0d84d7063b54b862f"
	            	actual  : "6c5244cc930cbbbe7ff83069e9d2a2ee3faf24ba"
	Test:       	TestSimpleMap
	Messages:   	Hash didn't match

@ebuchman
Copy link
Contributor Author

Yeh the hashes are harcoded - need to update them. Just waiting for agreement first

@liamsi
Copy link

liamsi commented May 22, 2018

I do not think length prefixes are safer / neccessary either unless they encode the depth of where we are (which level) in the tree (which doesn't seem to be the case here but maybe I'm overlooking sth).

To prevent second pre-image attacks it is quite common to add a prefix depending on if we are hashing a leaf node, or and interior node. Is this intentional, that this doesn't seem to happen here?
https://en.wikipedia.org/wiki/Merkle_tree#Second_preimage_attack

@ebuchman
Copy link
Contributor Author

Is this intentional, that this doesn't seem to happen here?

Not intentional, don't think we thought about it.

So the idea with a 2nd pre-image attack is to basically make a new merkle tree where the leaves are say the internal nodes from some layer in the original tree - so they have the same root but technically different leaves.

Semantically this shouldn't be a problem since the inner nodes will be meaningless from a tx/validator perspective, but its worth fixing this in the data structure so no one has to think about it.

Let's copy CT then ? ie. prepend a 0x00 to the leaf hashes and a 0x1 to the inner hashes ?

@ebuchman
Copy link
Contributor Author

Let's copy CT then ? ie. prepend a 0x00 to the leaf hashes and a 0x1 to the inner hashes ?

Oh, this won't go great in the implementation right now without some changes if we want to avoid using append (ie. performance reasons). The recursive impl doesnt distinguish leaves/nodes

@ebuchman
Copy link
Contributor Author

Looks like we can't really avoid append anyways. we can use .Sum(prefix) but that just uses append under the hood too.

Need to be careful with this to make sure its done right and the same everywhere, especially since we're copying the merkle tree impl in a couple places in Tendermint as an optimization (to avoid allocations from the generic funcs in merkle pkg)

@ebuchman
Copy link
Contributor Author

I'm rethinking this prefix thing because maybe we don't need it after all. We could add, but it'd be nicer on many fronts not to. I don't think these attacks will actually be an issue because the merkle roots are always found in a header, and the header contains the num_txs, so we know how deep the tree should be.

@liamsi @zmanian @veorq would you agree ?

@codecov-io
Copy link

Codecov Report

Merging #211 into bucky/tmhash will decrease coverage by 0.45%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff                @@
##           bucky/tmhash     #211      +/-   ##
================================================
- Coverage         58.99%   58.53%   -0.46%     
================================================
  Files                59       54       -5     
  Lines              3687     3504     -183     
================================================
- Hits               2175     2051     -124     
+ Misses             1360     1308      -52     
+ Partials            152      145       -7
Impacted Files Coverage Δ
merkle/simple_map.go
merkle/tmhash/hash.go
merkle/simple_proof.go
merkle/types.go

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d72de8b...8aa9e11. Read the comment docs.

@ebuchman ebuchman changed the base branch from bucky/tmhash to develop May 22, 2018 16:10
@ebuchman
Copy link
Contributor Author

Moved to tendermint/go-crypto#105 , specifically commit tendermint/go-crypto@a72addd

@liamsi
Copy link

liamsi commented May 22, 2018

I'm rethinking this prefix thing because maybe we don't need it after all. We could add, but it'd be nicer on many fronts not to. I don't think these attacks will actually be an issue because the merkle roots are always found in a header, and the header contains the num_txs, so we know how deep the tree should be.

While adding the TX number makes such attacks more difficult (if the number of TX is always checked), I'm not sure if this is sufficient. It also depends on if one can add "empty leafs", I guess (the we could have the same number of TXs and could find another pre-image; in practice this might not work as an empty TX is not allowed).
I would at least add a warning in the readme that this lib is not secure against 2nd pre-image per se and what checks the user should additionally do be safe(r).

Maybe some more "serious cryptographers" have sth to say here, too:

@zmanian @veorq would you agree ?

@veorq
Copy link

veorq commented May 22, 2018

Generally the safest approach to mitigate the exploitation of collision in a tree is to have a distinct hash per node, i.e. encode the depth and index of the node. It also helps mitigate multi-target attacks. For example in some hash-based signatures you may attack multiple nodes of multiples instances at the same time.

But here we're talking of something slightly different, an encoding to avoid trivial collisions when hashing pairs of variable size values. Not sure to get all the context here, but my short answer would be that if you always hash values of the same length then encoding is useless, whereas if you hash pairs of arbitrary length data then encoding is mandatory.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants