
RFC-024: block structure consolidation#8457

Merged
mergify[bot] merged 7 commits into main from wb/rfc-block-structure
Aug 16, 2022
Conversation

@williambanfield (Contributor) commented May 3, 2022

This RFC lays out a proposed set of changes for consolidating the set of information that may be included in the block structure.

{{ [rendered](https://github.com/tendermint/tendermint/blob/wb/rfc-block-structure/docs/rfc/rfc-020-block-structure-consolidation.md) }}

original block and therefore transitively included in the Merkle root hash of
every block. The redundant field is a candidate for removal from the root hash
of each block. However, aesthetically, it's somewhat nice to include in each
block, as if the block was 'stamped' with the ID. Additionally, re-validating

Debuggability seems to be the main argument in favour of keeping this one.

Leaving the field will allow for transaction execution issues that are not
correctly reflected in the `AppHash` to be more completely diagnosed.

Take the case of mismatch of `LastResultsHash` between two nodes, A and B, where both

It seems like the more likely scenario is that A and B report different AppHashes for the same cumulative transaction hash? Not that it couldn't be the opposite too, but if I saw that in a trace I'd assume the node had computed the wrong values rather than the app diverging.

contains the same data but for the next height.

This data is effectively redundant. Having both values present in the block
structure is helpful for light client verification. The light client is able to

But if the light client is going to go on to verify the next block, won't it need to fetch both headers anyway?

I suppose if the light client only wants to verify one block it's a savings, but I'm not sure optimizing for that case makes much sense. It seems like the normal case is the light client is verifying a bunch of blocks, where this only affects the fencepost.

Contributor Author

I suppose if the light client only wants to verify one block it's a savings, but I'm not sure optimizing for that case makes much sense.

In the bisection algorithm, the light client will verify a header without verifying the adjacent header. It is attempting to bootstrap trust by looking for validators that behaved well at a previous height with the intention of using those newly trusted validators to verify a particular candidate header.


#### ProposerAddress

The section below details a change to allow the `ProposerAddress` to be calculated

I agree, it probably doesn't make sense to remove this. Issues like #8326 suggest that if anything, we might want to consider making sure we compute the value "correctly". I think the value we store here now doesn't necessarily reflect changes in a height with multiple rounds.

@williambanfield williambanfield marked this pull request as ready for review May 6, 2022 14:01
@williambanfield williambanfield changed the title from "(wip) RFC-019: block structure consolidation" to "RFC-020: block structure consolidation" May 6, 2022
@cmwaters (Contributor) left a comment

Haven't gone through the entire RFC yet but posting my comments now and will return to this tomorrow


A block is the structure produced as the result of an instance of the Tendermint
consensus algorithm. At its simplest, the 'block' can be represented as a Merkle
root hash of all of the data used to construct and produce the hash. Our current

Contributor

Which "hash" are you referring to? Do you mind being a little more explicit

Contributor

Also your definition of a "block" seems more apt to the definition of the "header". A block at the end of the day must include the set of state transitions to be applied against application state

Contributor Author

Hmm, the point here is a bit abstract, but, broadly: not every piece of data needed to execute the state transition is included in the 'block' structure. Most concretely, the application state is not stored in the 'block' structure. A hash of it is gossiped in the block, and validators use this to ensure they have reached the same state as each other. We have a protocol, statesync, for disseminating and verifying the data referenced by the AppHash. We could just as easily have a txsync protocol, where validators gossip transaction sets separately instead of storing them in a data structure we call the block. A version of Tendermint, albeit an inefficient one, could just gossip the single hash during consensus and wait for other protocols to gather the data that constitutes the root hash. I'm not really saying we need to do this, just that the two presentations of the data - i.e. what we include in the block data structure and what we include as part of the Merkle root hash - are separable.

consensus algorithm can use proof-of-stake consensus to validate all of the
properties of the chain that we would like?
1. How can we create views of the data that can be easily retrieved, stored, and
verified by the relevant protocols?

Contributor

Easily retrieved and stored isn't really a concern of the structure of the block per se, only verification.

Contributor Author

The block structure is a view of the underlying data. There is a consistent refrain percolating around the Tendermint ecosystem that removing fields from the block will save storage. This is largely because the storage and presentation of the data are, at the moment, 1:1 mapped. This digression exists to clarify that this link is not necessary. Different protocols - light client, consensus, statesync - need to see different slices of the data, with accompanying proofs, to function correctly. We can slice the data into many different pieces as long as the different protocols can retrieve and store it in the requisite ways. A LightBlock is the clearest example of this re-slicing. Consensus may be able to re-slice the data as well, as long as we're able to ensure that it can verify and execute the correct data.

Comment on lines +124 to +126
1. What values need to be included as part of the Merkle tree so that the
consensus algorithm can use proof-of-stake consensus to validate all of the
properties of the chain that we would like?

Contributor

As a side note: We use a merkle tree for aggregating the hashes in the header but this might actually not be necessary. A regular sha256 hash could be sufficient as it's rare that we need to prove that a hash exists in a header (the header itself is so small you might as well verify the entire header).

Contributor Author

Yeah, definitely. We have a total of 14 fields included in the root hash.

return merkle.HashFromByteSlices([][]byte{

Recalculating a hash over the concatenated data is quick and trivial, so the Merkle nature of the header may ultimately not be needed.
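To illustrate the alternative being discussed, a single SHA-256 over a length-prefixed concatenation of the field bytes is just as deterministic as a Merkle root. This is a minimal sketch only; it is not Tendermint's actual header encoding, and `flatHeaderHash` is a made-up name:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// flatHeaderHash sketches the alternative discussed above: instead of a
// Merkle tree over the ~14 header fields, take one SHA-256 over a
// length-prefixed concatenation of the fields. Illustrative only.
func flatHeaderHash(fields [][]byte) [32]byte {
	var buf bytes.Buffer
	for _, f := range fields {
		var lenPrefix [8]byte
		binary.BigEndian.PutUint64(lenPrefix[:], uint64(len(f)))
		buf.Write(lenPrefix[:]) // length prefix keeps the encoding unambiguous
		buf.Write(f)
	}
	return sha256.Sum256(buf.Bytes())
}

func main() {
	fields := [][]byte{[]byte("ChainID"), []byte("Height"), []byte("AppHash")}
	fmt.Printf("%x\n", flatHeaderHash(fields))
}
```

The length prefix matters: without it, `{"ab", ""}` and `{"a", "b"}` would concatenate to the same bytes and collide.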

Comment on lines +175 to +186
The [BlockID][block-id] field comprises the [PartSetHeader][part-set-header] and the hash of the block.
The `PartSetHeader` is used by nodes to gossip the block by dividing it into
parts. Nodes receive the `PartSetHeader` from their peers, informing them of
what pieces of the block to gather. There is no strong reason to include this
value in the block. Validators will still be able to gossip and validate the
blocks that they received from their peers using this mechanism even if it is
not written into the block. The `BlockID` can therefore be consolidated into
just the hash of the block. This is by far the most uncontroversial change
and there appears to be no good reason _not_ to do it. Further evidence that
the field is not meaningful can be found in the fact that the field is not
actually validated to ensure it is correct during block validation. Validation
only checks that the [field is well formed][psh-check].

Contributor

We would still need to gossip the PartSetHeader in the proposal msg

Contributor Author

Yes definitely, it is key to our block propagation mechanism at the moment, but it isn't directly a concern of the block data.

Comment on lines +213 to +214
I would advocate for keeping this field. This field provides an additional check
for determinism across nodes. Logic to update the application hash is more

Contributor

Is there any actual value in providing additional checks for non-determinism? Put another way, is there anything wrong with relying solely on the AppHash as the single value we are trying to replicate?

You mention debuggability, which I think is fair, but if applications are using Merkle trees then they should be able to include a way of working out which leaf in the tree differs and thus better deduce where the divergence occurred.

Contributor Author

Put another way, is there anything wrong with relying solely on the AppHash as the single value we are trying to replicate?

No, there definitely is nothing wrong with that. The AppHash is intended to encode this information. The mild value of this field is just in catching chain issues faster, there is nothing that this field encodes that AppHash does not also encode.

@cmwaters (Contributor) left a comment

Nice write up.

A similar question to what goes into the block is what is important to have in the header and what can be relegated to the rest of the block.

Hashes in headers are somewhat first-class citizens in that they make it faster to verify other data. As an example, having the consensus hash in the header makes it faster to verify the consensus parameters than if the hash were in Data, whereby you'd need to get the Merkle path that verifies the consensus hash was present in Data and then that it correlated to the hash of the consensus params.

A suggestion from me would be to consolidate DataHash, LastCommitHash, ConsensusHash and EvidenceHash into a single merkle root hash.

Comment on lines +250 to +252
This creates a recursive problem. To verify the validator set that signed the
block at height `H`, what information do we need? We could fetch the
`NextValidatorsHash` from height `H-1`, but how do we verify that that hash is correct?

Contributor

I remember thinking over this myself and drawing the conclusion that it was possible to not require the ValidatorsHash for the light client.

We do have this check here:

if !bytes.Equal(untrustedHeader.ValidatorsHash, trustedHeader.NextValidatorsHash) {
but I don't think it serves any extra security. For sequential verification we always start with a trusted header. We get the light block from the next height. We check that the hash of the untrusted validator set is equal to the NextValidatorsHash of the trusted header. We then have verified the validator set. To verify the corresponding commit we check that 2/3+ of the validator set correctly signed for a single non-nil blockID. If that's true, the commit is valid. Finally we check that the hash of the untrusted header is equal to the blockID.Hash that the commit is for. If that's true then we have verified the new header.

For verifyAdjacent, we don't even use the ValidatorsHash because a valid validator set is any that has at least "TrustLevel" (in most cases 1/3) overlap in voting power.

Interestingly, we don't even verify that the validator set we received and used was actually the one that correlates with the untrustedHeader.ValidatorsHash

@williambanfield (Contributor Author) May 10, 2022

In the verify non-adjacent case, we're trying to verify a block of height H. To do so, we try to build a chain of validators that we trust to reach from the trusted header to the untrusted header at height H. When we need to perform non-adjacent verification, we fetch a block header along with the commit that supposedly corresponds to that header. Once we verify this header was signed by enough of the validators that we trust, we update the trusted set of validators to include the other signers of that header.

If we remove the ValidatorsHash, it seems that we open ourselves to the following issue:

  • We have a known good validator set, vs1, and attempt to use it to verify a non-adjacent header.
  • We fetch the non-adjacent header along with the untrusted validator set, vs2, and the commit, c, that justifies this header.
  • We check that TrustLevel of vs1 has signatures in c for the header and that vs2 also has signatures in c for the header.
  • We now trust vs2. However, nothing in vs1 actually signed off on the veracity of vs2.

A malicious node could compose a c and vs2 that comports with vs1 so that the client now trusts phony validators in vs2. With ValidatorsHash in the header, we have reason to trust vs2: enough of vs1 signed off on its hash in the header. Definitely possible I've made a mistake here since my understanding of the light client algorithm is far from perfect.

Contributor

Do we do this currently? i.e. check hash(vs2) = h2.ValidatorsHash

Contributor Author

Yes:

if !bytes.Equal(untrustedHeader.ValidatorsHash, untrustedVals.Hash()) {

Called From:

if err := verifyNewHeaderAndVals(

Contributor

Ok yeah that makes sense. Perhaps then we do need to keep ValidatorsHash around. Thanks for investigating

@williambanfield (Contributor Author) May 11, 2022

I'll take a closer look at the light client algorithm definition to be certain and add a more detailed description to this doc.

Contributor

I'd be interested in a whiteboard session during the retreat on the subject of how the validator set evolves, and how it's used for (light-client) verification

Comment on lines +261 to +265
I would advocate against this. Any consumer of the chain that wanted to
know which validator proposed a block would have to run the proposer selection
algorithm. This algorithm is not widely implemented, meaning that consumers
in other languages would need to implement the algorithm to determine a piece
of basic information about the chain.

Contributor

The main consumer of the knowledge of the proposer is usually the application itself, which might have some logic rewarding the validator. Any other use, I believe, is just statistical, i.e. block explorers say which validator proposed each block.

Rather than removing it completely, you could instead have the index of the validator in the validator set.

Contributor Author

If we're going to keep 4 bytes to represent the validator index to allow a consumer of the chain to reconstruct the data without implementing the full proposer selection algorithm, is there a reason to not just keep the additional 16 bytes to identify the proposer directly?

Contributor

Well we're trying to reduce the size of the header. If there's not something we think truly deserves to be in the header we should cut it

proposed the height. Instead, the validation logic simply checks that the value
is an address of one of the known validators.

Currently, we include this value in the `Commit` for height `H` which is

Contributor

Is the Round in Commit the proof of lock round or just the final round that received majority?

Contributor Author

Good question, it's the round in which the proposer committed in the previous height:

cs.LastCommit = cs.Votes.Precommits(cs.CommitRound)

I.e. the second of the options you listed. Added language into this document to clarify.

Comment on lines +292 to +296
Currently we store [every piece of each block][save-block] in the `BlockStore`.
I suspect that this has led to some mistakes in reasoning around the merits of
consolidating fields in the block. We could update the storage scheme we use to
store only some pieces of each block and still achieve a space savings without having
to change the block structure at all.

Contributor

I think @creachadair also alluded to having separate data structures for storage and networking

Yes, that is what I'm planning to do as part of the database work in v0.37 et seq. But that's separate from the question of which parts of the block we might want to eliminate/rework. Both must be projections of the same abstract schema.

Contributor Author

Yeah, and to clarify, this section is more hand-wavy ideas to maybe chew on in the future than concrete and actionable changes.

My general sense is that we should switch to storing data only in cases where the values change, versus what we do at the moment: store everything all the time.
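The "store only when values change" idea could look something like the sketch below. This is a hand-wavy illustration to match the comment, not a proposed Tendermint API; `deltaStore` and its methods are made-up names:

```go
package main

import (
	"fmt"
	"sort"
)

// deltaStore sketches storing a per-height value (say, the consensus
// parameters) only at heights where it changes, instead of at every
// height. Heights are assumed to be recorded in increasing order.
type deltaStore struct {
	heights []int64          // sorted heights at which the value changed
	values  map[int64][]byte // change-height -> value recorded there
}

func newDeltaStore() *deltaStore {
	return &deltaStore{values: make(map[int64][]byte)}
}

// Record stores the value only if it differs from the latest recorded one.
func (s *deltaStore) Record(h int64, v []byte) {
	if n := len(s.heights); n > 0 && string(s.values[s.heights[n-1]]) == string(v) {
		return // unchanged: nothing new to store
	}
	s.heights = append(s.heights, h)
	s.values[h] = v
}

// Get returns the value in effect at height h: the one recorded at the
// greatest change-height <= h.
func (s *deltaStore) Get(h int64) ([]byte, bool) {
	// index of first change-height strictly greater than h
	i := sort.Search(len(s.heights), func(i int) bool { return s.heights[i] > h })
	if i == 0 {
		return nil, false // nothing recorded at or before h
	}
	return s.values[s.heights[i-1]], true
}

func main() {
	s := newDeltaStore()
	s.Record(1, []byte("paramsA"))
	s.Record(2, []byte("paramsA")) // unchanged, not stored again
	s.Record(5, []byte("paramsB"))
	v, _ := s.Get(4)
	fmt.Println(string(v)) // paramsA
}
```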

Comment on lines +318 to +320
previous height. We could achieve a speed-up in many cases by communicating the
hashes _first_ and letting peers request additional information when they do not
recognize the communicated hash.

Contributor

👍 really like this idea although it would be quite an overhaul to the gossip protocol.

I had similar ideas towards transactions (#7932) i.e. propagate blocks containing the hashes of txs and rely on the mempool to retrieve any missing transactions.

Contributor Author

Yeah, I think it would make the architecture much cleaner if we had a content addressing layer. To clarify though, I have no concrete plans to actually do this in the next release or two. I think it's something we can / should consider post 1.0. I think there are more low hanging fruit in the gossip layer to hit before implementing something like content addressing into Tendermint.

of each block. However, aesthetically, it's somewhat nice to include in each
block, as if the block was 'stamped' with the ID. Additionally, re-validating
the value from genesis would be painful and require reconstituting potentially
large chains. I'm therefore mildly in favor of maintaining this redundant

Contributor

We could remove it from the block but still make it part of the header hashing. Then, in order to validate, you don't need to go back to genesis if you have the value.

Comment on lines +213 to +214
I would advocate for keeping this field. This field provides an additional check
for determinism across nodes. Logic to update the application hash is more

Contributor

I totally agree with this. This could be crucial for troubleshooting non-determinism in some cases, and it's not big in terms of size.

block at height `H`, what information do we need? We could fetch the
`NextValidatorsHash` from height `H-1`, but how do we verify that that hash is correct?

#### ProposerAddress

Contributor

I know what I'm going to say is controversial 😄
ProposerAddress and Round (in LastCommit) should not be exposed to the Application. These are details of Tendermint's operation. If you replaced Tendermint with another consensus algorithm offering the same set of properties, the App should be able to work with it. Yes, I know this is utopian.
But:

  • All uses of ProposerAddress I've seen so far (all rewards-related) are controversial, but I may be unaware of others
  • Is there any use of round other than stats and monitoring? (which can be done without showing it to the App via ABCI)

Back to reality, as ProposerAddress is being used now (e.g., SDK), I don't think we can easily remove it.

to propose a height. Because of this, our [validation logic][proposer-check] does not check that
the `ProposerAddress` included in the block corresponds to the validator that
proposed the height. Instead, the validation logic simply checks that the value
is an address of one of the known validators.

Contributor

One idea. If the App is so keen on knowing, recording, or proving who the proposer is at each block, why don't we leave it up to the App?
I mean, with ABCI++, the proposer's app can add a special tx with an app-specific proof that it is the proposer (in PrepareProposal), which other validators could validate in ProcessProposal.

was updated and construct each block using the data that existed at the time.

This document does not make any specific recommendations around storage since
that is likely to change with upcoming improvements to to the database infrastructure.

Contributor

Suggested change:
- that is likely to change with upcoming improvements to to the database infrastructure.
+ that is likely to change with upcoming improvements to the database infrastructure.

Either a soft-upgrades scheme will need to be implemented or a hard fork will
be required for chains and light clients to function with the new hashes.
This means that all of the additions and deletions from the Merkle root hash
proposed by this document will be light client breaking.

Contributor

I think the actions proposed in this document would be an excellent first use-case for soft-upgrades, because we don't change any functionality, but only the way fields are organized, hashes are calculated and blocks are verified.


## Discussion

### Data to consider removing

Contributor

I went through the data structures myself and I think that the following fields in Commit

  • height
  • round
  • block_id

are redundant, and could therefore be removed.

They are redundant in that they can be "gleaned" from the rest of the fields in the block.

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale for use by stalebot label May 29, 2022
@creachadair creachadair removed the stale for use by stalebot label May 29, 2022
@github-actions github-actions bot added the stale for use by stalebot label Jun 9, 2022
@cmwaters cmwaters removed the stale for use by stalebot label Jun 10, 2022
@github-actions github-actions bot added the stale for use by stalebot label Jun 21, 2022
@cmwaters cmwaters removed the stale for use by stalebot label Jun 24, 2022
@github-actions github-actions bot added the stale for use by stalebot label Jul 5, 2022
@github-actions github-actions bot closed this Jul 10, 2022
@cmwaters cmwaters removed the stale for use by stalebot label Jul 11, 2022
@cmwaters cmwaters reopened this Jul 11, 2022
@github-actions github-actions bot added the stale for use by stalebot label Jul 22, 2022
@creachadair creachadair removed the stale for use by stalebot label Jul 22, 2022
@creachadair creachadair requested a review from samricotta as a code owner July 22, 2022 00:31
@github-actions github-actions bot added the stale for use by stalebot label Aug 2, 2022
@cmwaters cmwaters removed the stale for use by stalebot label Aug 3, 2022
@github-actions github-actions bot added the stale for use by stalebot label Aug 14, 2022
@williambanfield williambanfield changed the base branch from master to main August 16, 2022 13:33
@williambanfield williambanfield requested review from a team August 16, 2022 13:33
@williambanfield williambanfield changed the title from "RFC-020: block structure consolidation" to "RFC-024: block structure consolidation" Aug 16, 2022
@williambanfield williambanfield added the S:automerge Automatically merge PR when requirements pass label Aug 16, 2022
@mergify mergify bot merged commit c322b89 into main Aug 16, 2022
@mergify mergify bot deleted the wb/rfc-block-structure branch August 16, 2022 13:51
cmwaters pushed a commit that referenced this pull request Aug 19, 2022
This RFC lays out a proposed set of changes for consolidating the set of information that may be included in the block structure.

{{ [rendered](https://github.com/tendermint/tendermint/blob/wb/rfc-block-structure/docs/rfc/rfc-020-block-structure-consolidation.md) }}
@williambanfield williambanfield restored the wb/rfc-block-structure branch September 9, 2022 16:12