docs(ADR): #113 Modular transaction hashing by melekes · Pull Request #2241 · cometbft/cometbft

melekes · 2024-02-05T12:30:49Z

RENDERED

Closes #342

adizere · 2024-02-05T13:28:53Z

@itsdevbear @fedekunze @JayT106 super simple ADR, please have a look.

adizere

Concept ACK, thanks Anton!

I pinged a couple of users to gather some feedback on this, let's wait on their input before merging.

docs/references/architecture/adr-112-txi.md

cason

Nice write up.

I agree with the approach, but I think we should really restrict the scope of the proposed changes. I know, for instance, that the RPC endpoints use some weird way to "print" (i.e., represent as strings) transactions and their IDs/hashes. But this is a problem with the String() method, which I am afraid is not solved by this approach.

docs/references/architecture/adr-112-txi.md

cason · 2024-02-07T13:21:35Z

docs/references/architecture/adr-112-txi.md

+}
+```
+
+And then use it to calculate a transaction's hash.


Can you be more specific regarding that?

In the case of types/Tx, calculating the hash is hard-coded. How can we parameterize this?

If we want this change to apply to the RPC endpoints and, possibly, the indexer, that is one thing. In this case we can use any customized hash function.

But if we want to adopt customized hash functions for our types, this is a completely different discussion.

> But if we want to adopt customized hash functions for our types, this is a completely different discussion.

I'd like for us to have both discussions.

docs/references/architecture/adr-112-txi.md

andynog

The overall approach seems OK, but my concern is related to the scope and implications of this change.

For example, in types/tx.go, the code below might affect the Merkle root hash:

func (txs Txs) Hash() []byte {
	hl := txs.hashList()
	return merkle.HashFromByteSlices(hl)
}

because txs.hashList() invokes Hash() on every transaction:

func (txs Txs) hashList() [][]byte {
	hl := make([][]byte, len(txs))
	for i := 0; i < len(txs); i++ {
		hl[i] = txs[i].Hash()
	}
	return hl
}

which means merkle.HashFromByteSlices(hl) will produce a different Merkle root hash:

func HashFromByteSlices(items [][]byte) []byte {
	return hashFromByteSlices(sha256.New(), items)
}

Maybe this will be OK, but will it have implications later on? For example, could the hash algorithm be changed during an upgrade? What are the implications for verification and validation if the Merkle root is no longer valid?

Also, is this OK with IBC and proofs? What if one chain decides to switch the hash algorithm? What guarantees remain about the validity of proofs?

Anyway, the scope and implications of this change seem to be a lot broader, and it is a good idea to think this through. Maybe my concerns are not valid, but I'm just bringing this up 😉
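To make the concern above concrete, here is a minimal sketch of how the per-transaction hash feeds into the root. It is not the CometBFT implementation: `HashFn`, `txsRoot`, and `merkleRoot` are hypothetical names, and the tree is a simplified RFC 6962-style construction that keeps SHA-256 for its internal nodes. Only the leaf inputs change, yet the root changes with them.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"fmt"
)

// HashFn is a hypothetical pluggable transaction hash function.
type HashFn func([]byte) []byte

func sha256Sum(b []byte) []byte { s := sha256.Sum256(b); return s[:] }
func sha512Sum(b []byte) []byte { s := sha512.Sum512(b); return s[:] }

// splitPoint returns the largest power of two strictly less than n (n > 1).
func splitPoint(n int) int {
	k := 1
	for k*2 < n {
		k *= 2
	}
	return k
}

// merkleRoot builds a simplified RFC 6962-style tree. The tree itself always
// uses SHA-256, with 0x00 and 0x01 prefixes for leaf and inner nodes.
func merkleRoot(items [][]byte) []byte {
	switch len(items) {
	case 0:
		return sha256Sum(nil)
	case 1:
		return sha256Sum(append([]byte{0x00}, items[0]...))
	default:
		k := splitPoint(len(items))
		inner := append([]byte{0x01}, merkleRoot(items[:k])...)
		inner = append(inner, merkleRoot(items[k:])...)
		return sha256Sum(inner)
	}
}

// txsRoot mirrors the shape of Txs.Hash(): hash every transaction first,
// then build the tree over the resulting hash list.
func txsRoot(txs [][]byte, hashTx HashFn) []byte {
	hl := make([][]byte, len(txs))
	for i, tx := range txs {
		hl[i] = hashTx(tx)
	}
	return merkleRoot(hl)
}

func main() {
	txs := [][]byte{[]byte("tx1"), []byte("tx2"), []byte("tx3")}
	fmt.Printf("root with SHA-256 tx hashes: %X\n", txsRoot(txs, sha256Sum))
	fmt.Printf("root with SHA-512 tx hashes: %X\n", txsRoot(txs, sha512Sum))
}
```

Two nodes that disagree on the transaction hash function would therefore compute different roots for the same list of transactions, which is why the choice has to be agreed on chain-wide.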

cason · 2024-02-12T08:12:07Z

> The overall approach seems OK, but my concern is related to the scope and implications of this change.

Fully agree with that, much better explained by @andynog. Hashing transactions is something useful for the mempool reactor and for RPC endpoints.

Other uses of this library include block, Merkle tree, and evidence production and verification. For these cases, being compatible is essential. A suggestion here, for instance, is to adopt sha256 directly, so it is clear to everyone that this is the standard for blockchain hashes.

github-actions · 2024-02-23T00:12:38Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

docs/references/architecture/adr-113-txi.md

robert-zaremba · 2024-03-04T22:00:47Z

NOTE: We should be able to customize the Merkle Tree hashing as well. But this has to be done at the IAVL level.

docs/references/architecture/adr-113-txi.md

cason · 2024-03-05T08:52:50Z

docs/references/architecture/adr-113-txi.md

+
+## Detailed Design
+
+Add `hashFn HashFn` option to `NewNode` in `node.go`.


Should we be specific and add Transaction or Tx to the name and description of the methods?

If we decide to restrict the scope to just txs, yes.
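For readers unfamiliar with the pattern, "an option to `NewNode`" could look roughly like the functional-options sketch below. Everything here is an assumption made for illustration: the real `NewNode` takes many more arguments, and `Option`, `WithTxHashFn`, `TxHash`, and the `Node` fields are hypothetical names, chosen with the Tx prefix suggested above.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// HashFn is the hypothetical transaction hash function type.
type HashFn func([]byte) []byte

// Node is a stand-in for the real node type; only the hash function is modeled.
type Node struct {
	txHashFn HashFn
}

// Option customizes a Node at construction time.
type Option func(*Node)

// WithTxHashFn is a hypothetical option that overrides the transaction hash.
func WithTxHashFn(fn HashFn) Option {
	return func(n *Node) { n.txHashFn = fn }
}

// defaultHash is the default: plain SHA-256.
func defaultHash(b []byte) []byte { s := sha256.Sum256(b); return s[:] }

// doubleSHA256 is an example of a custom hash an application might supply.
func doubleSHA256(b []byte) []byte { return defaultHash(defaultHash(b)) }

// NewNode applies the options on top of the defaults.
func NewNode(opts ...Option) *Node {
	n := &Node{txHashFn: defaultHash}
	for _, opt := range opts {
		opt(n)
	}
	return n
}

// TxHash hashes a transaction with whatever function the node was built with.
func (n *Node) TxHash(tx []byte) []byte { return n.txHashFn(tx) }

func main() {
	tx := []byte("tx")
	fmt.Printf("default: %X\n", NewNode().TxHash(tx))
	fmt.Printf("custom:  %X\n", NewNode(WithTxHashFn(doubleSHA256)).TxHash(tx))
}
```

Callers who do not pass the option keep the default behavior, so existing call sites would not need to change.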

docs/references/architecture/adr-113-txi.md

sergio-mena · 2024-06-05T13:43:03Z

docs/references/architecture/adr-113-modular-transaction-hashing.md

+
+```json
+{
+    "hash": 5, // crypto.SHA256 (https://pkg.go.dev/crypto#Hash)


What does "5" mean here?

It's the numeric value of Go's stdlib Hash type (https://pkg.go.dev/crypto#Hash). We can also use strings like SHA256 if that's more practical.
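As a quick illustration of what the number stands for: `crypto.Hash` is an unsigned integer type whose constants start at 1, and `crypto.SHA256` has the value 5. The sketch below decodes a genesis-like fragment and resolves the number to a usable hash. The `genesisHashParams` struct and `parseHash` helper are hypothetical, and the fragment omits the `//` comment from the ADR snippet because plain JSON does not allow comments.

```go
package main

import (
	"crypto"
	_ "crypto/sha256" // registers SHA-256 so that crypto.SHA256.New() works
	"encoding/json"
	"fmt"
)

// genesisHashParams is a hypothetical fragment of the genesis file.
type genesisHashParams struct {
	Hash crypto.Hash `json:"hash"`
}

// parseHash decodes the fragment and checks that the requested hash
// function is actually linked into the binary.
func parseHash(raw []byte) (crypto.Hash, error) {
	var p genesisHashParams
	if err := json.Unmarshal(raw, &p); err != nil {
		return 0, err
	}
	if !p.Hash.Available() {
		return 0, fmt.Errorf("hash function %d is not available", p.Hash)
	}
	return p.Hash, nil
}

func main() {
	h, err := parseHash([]byte(`{"hash": 5}`))
	if err != nil {
		panic(err)
	}
	hasher := h.New()
	hasher.Write([]byte("tx"))
	fmt.Printf("%v (%d bytes): %X\n", h, h.Size(), hasher.Sum(nil))
}
```

A string such as `"SHA256"` would be easier to read in the genesis file, at the cost of a small name-to-constant lookup table.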

docs/references/architecture/adr-113-modular-transaction-hashing.md

sergio-mena

🚀

docs/references/architecture/adr-113-modular-transaction-hashing.md

fedekunze · 2024-06-05T09:47:38Z

docs/references/architecture/adr-113-modular-transaction-hashing.md

+    Hash = crypto.SHA256
+
+	stringFunc = func(bz []byte) string {
+		return fmt.Sprintf("%X", bz)
+	}


I see things differently. A core dev is more likely to call the String() func than FmtHash. Moreover, the %s verb calls String(), so loggers and print statements will be less verbose than with explicit calls to FmtHash.

Using the Stringer interface also requires fewer code changes. Devs would only have to set the hasher struct once and then forget about the rest.

melekes · 2024-06-07T09:02:36Z

@fedekunze If we only allowed changing the hash function for transactions (how they are hashed in mempool and displayed, plus the Merkle root hash of transactions—DataHash in the block's header), would it suffice for your use case?

cason

A further review round.

Should we consider a prototype implementation to test the idea?

cason · 2024-06-10T13:03:44Z

docs/references/architecture/adr-113-modular-transaction-hashing.md

+
+## Context
+
+Hashing in CometBFT is currently implemented using `crypto/tmhash`


Although we directly import crypto/sha256 in many places:

$ grep crypto/sha256 . -R 2> /dev/null
./statesync/snapshots.go: "crypto/sha256"
./crypto/merkle/tree.go: "crypto/sha256"
./crypto/merkle/bench_test.go: "crypto/sha256"
./crypto/bls12381/key_bls12381.go: "crypto/sha256"
./crypto/secp256k1/secp256k1.go: "crypto/sha256"
./crypto/tmhash/hash_test.go: "crypto/sha256"
./crypto/tmhash/hash.go: "crypto/sha256"
./crypto/tmhash/bench_test.go: "crypto/sha256"
./crypto/hash.go: "crypto/sha256"
./types/tx.go: "crypto/sha256"
./test/e2e/app/state.go: "crypto/sha256"
./p2p/conn/secret_connection.go: "crypto/sha256"
./mempool/cache_test.go: "crypto/sha256"
./mempool/mempool.go: "crypto/sha256"

docs/references/architecture/adr-113-modular-transaction-hashing.md

cason · 2024-06-10T13:07:53Z

docs/references/architecture/adr-113-modular-transaction-hashing.md

+## Alternative Approaches
+
+1. Do nothing => not flexible.
+2. Add `HashFn` argument to `NewNode` and pass this function down the stack =>


HashFn is a transaction hasher implementation?

It's an option that we pass to NewNode and forward to the mempool, tx indexer, and tx RPC endpoints. However, the use case is rare, so I opted for global variables instead.

cason · 2024-06-10T13:09:36Z

docs/references/architecture/adr-113-modular-transaction-hashing.md

+
+Give app developers a way to provide their own hash function.
+
+## Detailed Design


One solution I see in several places in the code base is:

1. Introduce an interface
2. Produce a standard implementation for the interface
3. Enable changing the implementation for the interface, which is essentially only used for testing

So I wonder why this wouldn't be a better solution.

#2241 (comment)

My main argument against passing a hasher everywhere is that it's rare when somebody wants a different hashing function, so changing the interface of mempool, tx indexer and RPC endpoints is unjustified. But maybe I'm wrong. Please let me know what you think.
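For comparison, the interface-based alternative discussed in this thread might look like the sketch below. All names (`TxHasher`, `sha256Hasher`, `fakeHasher`, `Mempool`, `NewMempool`, `TxKey`) are hypothetical and not taken from the code base. The sketch also shows the cost raised in the reply: each component's constructor gains a hasher parameter.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// TxHasher is the hypothetical interface (step 1).
type TxHasher interface {
	Hash(tx []byte) []byte
}

// sha256Hasher is the standard implementation (step 2).
type sha256Hasher struct{}

func (sha256Hasher) Hash(tx []byte) []byte { s := sha256.Sum256(tx); return s[:] }

// fakeHasher is a replacement implementation, the kind typically used in
// tests (step 3). It ignores its input and returns a fixed byte.
type fakeHasher struct{}

func (fakeHasher) Hash([]byte) []byte { return []byte{0xAB} }

// Mempool stands in for any component that needs transaction hashes.
type Mempool struct {
	hasher TxHasher
}

// NewMempool falls back to the standard implementation when h is nil.
func NewMempool(h TxHasher) *Mempool {
	if h == nil {
		h = sha256Hasher{}
	}
	return &Mempool{hasher: h}
}

// TxKey returns the hex-encoded hash used to identify a transaction.
func (m *Mempool) TxKey(tx []byte) string {
	return fmt.Sprintf("%X", m.hasher.Hash(tx))
}

func main() {
	tx := []byte("tx")
	fmt.Println("default:", NewMempool(nil).TxKey(tx))
	fmt.Println("fake:   ", NewMempool(fakeHasher{}).TxKey(tx))
}
```

The trade-off is explicit wiring through every constructor in exchange for avoiding process-wide mutable state, which is the point the two reviewers weigh differently.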

docs/references/architecture/adr-113-modular-transaction-hashing.md

docs/references/architecture/README.md

cason · 2024-06-10T14:05:24Z

I think we should also add a short paragraph mentioning that it is probably not possible/feasible to implement this via configuration parameter or command-line option, am I right? Because this could be seen as a trivial solution, but... unfeasible.

cason · 2024-06-19T11:40:39Z

Did we mention tendermint/tendermint#6539 and tendermint/tendermint#6773 here? It would also be nice to see the discussions from back then.

melekes · 2024-06-19T15:53:17Z

> I think we should also add a short paragraph mentioning that it is probably not possible/feasible to implement this via configuration parameter or command-line option, am I right?

It's possible but I'm against this because

a) we're not sure this will be a final design/solution
b) we're optimizing for the default hashing function (by analogy, we may support different consensus algorithms, but we're optimizing for the default one)

comments have been addressed

melekes requested a review from a team as a code owner February 5, 2024 12:30

melekes requested a review from a team February 5, 2024 12:30

melekes changed the title ~~feat(ADR): ADR-112: Modular transaction hashing~~ docs(ADR): #112: Modular transaction hashing Feb 5, 2024

melekes changed the title ~~docs(ADR): #112: Modular transaction hashing~~ docs(ADR): #112 Modular transaction hashing Feb 5, 2024

melekes self-assigned this Feb 5, 2024

melekes added backport-to-v1.x labels Feb 5, 2024

adizere added this to the 2024-Q1 milestone Feb 5, 2024

adizere approved these changes Feb 5, 2024

cason reviewed Feb 7, 2024

docs/references/architecture/adr-112-txi.md Outdated

cason reviewed Feb 7, 2024

cason mentioned this pull request Feb 8, 2024

fix(mempool/tests): Reduce tests duration #2263

Merged

melekes changed the title ~~docs(ADR): #112 Modular transaction hashing~~ docs(ADR): #113 Modular transaction hashing Feb 8, 2024

feat(ADR): ADR-112: Modular transaction hashing

cf8ffe1

Refs #342

melekes force-pushed the anton/342-txi branch from 183a263 to cf8ffe1 Compare February 8, 2024 14:12

melekes added 2 commits February 8, 2024 18:26

option instead of argument

d02b25e

+1 neutral consequence

Merge branch 'main' into anton/342-txi

548a8b8

andynog reviewed Feb 9, 2024

github-actions bot added the stale label Feb 23, 2024

Merge branch 'main' into anton/342-txi

13b9cdb

github-actions bot removed the stale label Feb 27, 2024

lasarojc reviewed Mar 4, 2024

docs/references/architecture/adr-113-txi.md Outdated

Merge branch 'main' into anton/342-txi

4724d84

cason reviewed Mar 5, 2024

add hash to genesis instead of header

03cf7e2

sergio-mena reviewed Jun 5, 2024

docs/references/architecture/adr-113-modular-transaction-hashing.md

sergio-mena approved these changes Jun 5, 2024

fedekunze reviewed Jun 6, 2024

melekes changed the title ~~docs(ADR): #113 Modular transaction hashing~~ docs(ADR): #113 Modular hashing Jun 6, 2024

melekes and others added 3 commits June 6, 2024 13:07

Update docs/references/architecture/adr-113-modular-transaction-hashi…

6251317

…ng.md Co-authored-by: Sergio Mena <sergio@informal.systems>

save commit

fb42488

add a point re IBC

b69cf8d

limit the scope of changes

7fa6a5d

melekes changed the title ~~docs(ADR): #113 Modular hashing~~ docs(ADR): #113 Modular transaction hashing Jun 10, 2024

cason reviewed Jun 10, 2024

address comments

ba6e244

melekes added 2 commits June 19, 2024 19:23

Merge branch 'main' into anton/342-txi

1943e2b

explain why we don't expose CLI

6bf6e5d

Merge branch 'main' into anton/342-txi

0b7ca7a

melekes requested a review from adizere June 19, 2024 16:04

Merge branch 'main' into anton/342-txi

bc95102

melekes enabled auto-merge June 20, 2024 05:53

melekes removed the wip and backport-to-v1.x labels Jun 20, 2024

melekes added this pull request to the merge queue Jun 20, 2024

Merged via the queue into main with commit 31220bf Jun 20, 2024

melekes deleted the anton/342-txi branch June 20, 2024 06:20


