docs(ADR): #113 Modular transaction hashing#2241
Conversation
|
@itsdevbear @fedekunze @JayT106 super simple ADR, please have a look. |
adizere
left a comment
There was a problem hiding this comment.
Concept ACK, thanks Anton!
I pinged a couple of users to gather some feedback on this, let's wait on their input before merging.
This comment was marked as resolved.
This comment was marked as resolved.
cason
left a comment
There was a problem hiding this comment.
Nice write up.
I agree with the approach, but I think we should really restrict the scope of the proposed changes. I know, for instance, that the RPC endpoints use some weird way to "print" (i.e., representing as strings) transactions and their IDs/hashes. But this is a problem of String() method. That I am afraid that is not solved using this approach.
| } | ||
| ``` | ||
|
|
||
| And then use it to calculate a transaction's hash. |
There was a problem hiding this comment.
Can you be more specific regarding that?
In the case of types/Tx, calculating the hash is hard-coded. How can we parameterize this?
There was a problem hiding this comment.
In the case we want this change to apply to the RPC endpoints and, possibly, indexer is one thing. In this case we can use any customized hash function.
But if we want to adopt customized hash functions for our types, this is a completely different discussion.
There was a problem hiding this comment.
But if we want to adopt customized hash functions for our types, this is a completely different discussion.
I'd like for us to have both discussions.
This comment was marked as resolved.
This comment was marked as resolved.
183a263 to
cf8ffe1
Compare
+1 neutral consequence
andynog
left a comment
There was a problem hiding this comment.
The overall approach seems OK, but my concern is related to the scope and implications of this change.
For example, in types/tx.go the code below might affect the Merkle root hash
func (txs Txs) Hash() []byte {
hl := txs.hashList()
return merkle.HashFromByteSlices(hl)
}
because in txs.hashList() the Hash() is invoked
func (txs Txs) hashList() [][]byte {
hl := make([][]byte, len(txs))
for i := 0; i < len(txs); i++ {
hl[i] = txs[i].Hash()
}
return hl
}
which in merkle.HashFromByteSlices(hl) will make the Merkle root hash different
func HashFromByteSlices(items [][]byte) []byte {
return hashFromByteSlices(sha256.New(), items)
}
maybe this will be OK, but will this have implications later on ? Like during an upgrade, could the hash algorithm be changed ? what are the implications on verifying and validating if the Merkle is not valid.
Also, is this OK with IBC and proofs ? Like what if one chain decides to switch the hash algorithm? what are the validity of proof guarantees ?
Anyway, the scope and implications of this change seems to be a lot broader and it is a good idea to think this through. Maybe my concerns are not valid but just bringing this up 😉
Fully agree with that, much better explained by @andynog. Hashing transactions is something useful for the mempool reactor and for RPC endpoints. Other uses of this library includes block, merkle tree, evidence production and verification. For this case, being compatible is essential. A suggestion here, for instance, is to adopt |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
NOTE: We should be able to customize the Merkle Tree hashing as well. But this has to be done in the IAVL level. |
|
|
||
| ## Detailed Design | ||
|
|
||
| Add `hashFn HashFn` option to `NewNode` in `node.go`. |
There was a problem hiding this comment.
Should we be specific and add Transaction or Tx to the name and description of the methods?
There was a problem hiding this comment.
If we decide to restrict the scope to just txs, yes.
|
|
||
| ```json | ||
| { | ||
| "hash": 5, // crypto.SHA256 (https://pkg.go.dev/crypto#Hash) |
There was a problem hiding this comment.
What does "5" mean here?
There was a problem hiding this comment.
Go's stdlib Hash https://pkg.go.dev/crypto#Hash. We can also use strings like SHA256 if that's more practical
| Hash = crypto.SHA256 | ||
|
|
||
| stringFunc = func(bz []byte) string { | ||
| return fmt.Sprintf("%X", bz) | ||
| } |
There was a problem hiding this comment.
I see things differently. A core dev will be less likely to call FmtHash instead of calling the String() func. Moreover the %s operand calls String() so loggers and print statements will be less verbose than calling the FmtHash function
using the Stringer interface also requires fewer code changes. Devs only would have to set the hasher struct once and then forget about the rest
…ng.md Co-authored-by: Sergio Mena <sergio@informal.systems>
|
@fedekunze If we only allowed changing the hash function for transactions (how they are hashed in mempool and displayed, plus the Merkle root hash of transactions— |
cason
left a comment
There was a problem hiding this comment.
A further review round.
Should we consider a prototype implementation to test the idea?
|
|
||
| ## Context | ||
|
|
||
| Hashing in CometBFT is currently implemented using `crypto/tmhash` |
There was a problem hiding this comment.
Although we directly import crypto/sha256 in many places:
$ grep crypto/sha256 . -R 2> /dev/null
./statesync/snapshots.go: "crypto/sha256"
./crypto/merkle/tree.go: "crypto/sha256"
./crypto/merkle/bench_test.go: "crypto/sha256"
./crypto/bls12381/key_bls12381.go: "crypto/sha256"
./crypto/secp256k1/secp256k1.go: "crypto/sha256"
./crypto/tmhash/hash_test.go: "crypto/sha256"
./crypto/tmhash/hash.go: "crypto/sha256"
./crypto/tmhash/bench_test.go: "crypto/sha256"
./crypto/hash.go: "crypto/sha256"
./types/tx.go: "crypto/sha256"
./test/e2e/app/state.go: "crypto/sha256"
./p2p/conn/secret_connection.go: "crypto/sha256"
./mempool/cache_test.go: "crypto/sha256"
./mempool/mempool.go: "crypto/sha256"
| ## Alternative Approaches | ||
|
|
||
| 1. Do nothing => not flexible. | ||
| 2. Add `HashFn` argument to `NewNode` and pass this function down the stack => |
There was a problem hiding this comment.
HashFn is a transaction hasher implementation?
There was a problem hiding this comment.
It's an option that we pass to NewNode and forward to mempool, tx indexer, and tx RPC endpoints. However, the use case is rare, so I chose to opt for global variables instead.
|
|
||
| Give app developers a way to provide their own hash function. | ||
|
|
||
| ## Detailed Design |
There was a problem hiding this comment.
One solution I see in several places in the code base is:
- Introduce an interface
- Produce a standard implementation for the interface
- Enable changing the implementation for the interface, which is essentially only used for testing
So I wonder why this wouldn't be a better solution.
There was a problem hiding this comment.
My main argument against passing a hasher everywhere is that it's rare when somebody wants a different hashing function, so changing the interface of mempool, tx indexer and RPC endpoints is unjustified. But maybe I'm wrong. Please let me know what you think.
docs/references/architecture/adr-113-modular-transaction-hashing.md
Outdated
Show resolved
Hide resolved
|
I think we should also add a short paragraph mentioning that it is probably not possible/feasible to implement this via configuration parameter or command-line option, am I right? Because this could be seen as a trivial solution, but... unfeasible. |
|
Did we mention tendermint/tendermint#6539 and tendermint/tendermint#6773 here? It would be nice, also to see the discussions back then. |
It's possible but I'm against this because a) we're not sure this will be a final design/solution |
RENDERED
Closes #342