RFC-024: block structure consolidation #8457
Conversation
> original block and therefore transitively included in the Merkle root hash of
> every block. The redundant field is a candidate for removal from the root hash
> of each block. However, aesthetically, it's somewhat nice to include in each
> block, as if the block was 'stamped' with the ID. Additionally, re-validating
Debuggability seems to be the main argument in favour of keeping this one.
> Leaving the field will allow transaction execution issues that are not
> correctly reflected in the `AppHash` to be diagnosed more completely.
> Take the case of a mismatch of `LastResultsHash` between two nodes, A and B, where both
It seems like the more likely scenario is that A and B report different AppHashes for the same cumulative transaction hash? Not that it couldn't be the opposite too, but if I saw that in a trace I'd assume the node had computed the wrong values rather than the app diverging.
> contain the same data but for the next height.
> This data is effectively redundant. Having both values present in the block
> structure is helpful for light client verification. The light client is able to
But if the light client is going to go on to verify the next block, won't it need to fetch both headers anyway?
I suppose if the light client only wants to verify one block it's a savings, but I'm not sure optimizing for that case makes much sense. It seems like the normal case is the light client is verifying a bunch of blocks, where this only affects the fencepost.
> I suppose if the light client only wants to verify one block it's a savings, but I'm not sure optimizing for that case makes much sense.
In the bisection algorithm, the light client will verify a header without verifying the adjacent header. It is attempting to bootstrap trust by looking for validators that behaved well at a previous height with the intention of using those newly trusted validators to verify a particular candidate header.
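To make the bisection idea above concrete, here is a heavily simplified sketch in Go. The `header` type, the 1/3 `trustLevelMet` rule, and `bisect` are hypothetical stand-ins for the real light-client types and API, not Tendermint code:

```go
package main

import "fmt"

// Simplified model: a header is just a height plus the set of validators
// that signed it (stand-ins for the real tendermint types).
type header struct {
	height  int64
	signers map[string]int64 // validator address -> voting power
}

// trustLevelMet reports whether validators trusted at the trusted height
// contributed more than 1/3 of their total power to the untrusted header's
// commit. This mirrors the skipping-verification rule, heavily simplified.
func trustLevelMet(trusted, untrusted header) bool {
	var total, signed int64
	for addr, power := range trusted.signers {
		total += power
		if _, ok := untrusted.signers[addr]; ok {
			signed += power
		}
	}
	return signed*3 > total
}

// bisect attempts to extend trust from headers[from] to headers[to],
// splitting the range whenever skipping verification fails.
func bisect(headers map[int64]header, from, to int64) bool {
	if from >= to {
		return true
	}
	if trustLevelMet(headers[from], headers[to]) {
		return true
	}
	mid := (from + to) / 2
	if mid == from {
		return false // adjacent and still untrusted: give up in this sketch
	}
	return bisect(headers, from, mid) && bisect(headers, mid, to)
}

func main() {
	// The validator set rotates gradually, so skipping from 1 to 3
	// directly fails the 1/3 rule, but bisecting through 2 succeeds.
	headers := map[int64]header{
		1: {1, map[string]int64{"a": 1, "b": 1, "c": 1}},
		2: {2, map[string]int64{"b": 1, "c": 1, "d": 1}},
		3: {3, map[string]int64{"c": 1, "d": 1, "e": 1}},
	}
	fmt.Println(bisect(headers, 1, 3))
}
```

In the example, the client bootstraps trust for height 3 by first trusting the overlapping validators at height 2, which is exactly the "trust validators that behaved well at a previous height" pattern described above.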
> #### ProposerAddress
> The section below details a change to allow the `ProposerAddress` to be calculated
I agree, it probably doesn't make sense to remove this. Issues like #8326 suggest that if anything, we might want to consider making sure we compute the value "correctly". I think the value we store here now doesn't necessarily reflect changes in a height with multiple rounds.
cmwaters left a comment
Haven't gone through the entire RFC yet, but posting my comments now; will return to this tomorrow.
> A block is the structure produced as the result of an instance of the Tendermint
> consensus algorithm. At its simplest, the 'block' can be represented as a Merkle
> root hash of all of the data used to construct and produce the hash. Our current
Which "hash" are you referring to? Do you mind being a little more explicit?
Also, your definition of a "block" seems more like the definition of the "header". A block, at the end of the day, must include the set of state transitions to be applied against application state.
Hmm, the point here is a bit abstract, but, broadly: every piece of data needed to execute the state transition is not included in the 'block' structure. Most concretely, the application state is not stored in the 'block' structure. A hash of it is gossiped in the block, and validators use this to ensure they have reached the same state as each other.

We have a protocol, statesync, for disseminating and verifying the data referenced by the AppHash. We could just as easily have a txsync protocol, where validators gossip transaction sets separately instead of storing them in a data structure we call the block. A version of Tendermint, albeit an inefficient one, could gossip just the single hash during consensus and wait for other protocols to gather the data that constitutes the root hash.

I'm not really saying we need to do this, just that the two presentations of the data - i.e. what we include in the block data structure and what we include as part of the Merkle root hash - are separable.
> consensus algorithm can use proof-of-stake consensus to validate all of the
> properties of the chain that we would like?
> 1. How can we create views of the data that can be easily retrieved, stored, and
>    verified by the relevant protocols?
Being easily retrieved and stored isn't really a concern of the structure of the block per se; only verification is.
The block structure is a view of the underlying data. There is a consistent refrain that seems to percolate around the Tendermint ecosystem that removing fields from the block will save storage. This is largely because the storage and presentation of the data are, at the moment, 1:1 mapped. This digression exists to clarify that this link is not necessary.

Different protocols - light client, consensus, statesync - need to see different slices of the data, with accompanying proofs, to function correctly. We can slice the data into many different pieces as long as the different protocols can retrieve and store it in the requisite ways. A LightBlock is the clearest example of this re-slicing. Consensus may be able to re-slice the data as well, as long as we're able to ensure that it can verify and execute the correct data.
> 1. What values need to be included as part of the Merkle tree so that the
>    consensus algorithm can use proof-of-stake consensus to validate all of the
>    properties of the chain that we would like?
As a side note: We use a merkle tree for aggregating the hashes in the header but this might actually not be necessary. A regular sha256 hash could be sufficient as it's rare that we need to prove that a hash exists in a header (the header itself is so small you might as well verify the entire header).
Yeah, definitely. We have a total of 14 fields included in the root hash (Line 467 in dfa5e21). Recalculating a hash over the concatenated data is quick and trivial, so the Merkle nature of the header may ultimately not be needed.
> The [BlockID][block-id] field comprises the [PartSetHeader][part-set-header] and the hash of the block.
> The `PartSetHeader` is used by nodes to gossip the block by dividing it into
> parts. Nodes receive the `PartSetHeader` from their peers, informing them of
> what pieces of the block to gather. There is no strong reason to include this
> value in the block. Validators will still be able to gossip and validate the
> blocks that they receive from their peers using this mechanism even if it is
> not written into the block. The `BlockID` can therefore be consolidated into
> just the hash of the block. This is by far the most uncontroversial change,
> and there appears to be no good reason _not_ to do it. Further evidence that
> the field is not meaningful can be found in the fact that the field is not
> actually validated to ensure it is correct during block validation. Validation
> only checks that the [field is well formed][psh-check].
We would still need to gossip the PartSetHeader in the proposal msg
Yes, definitely; it is key to our block propagation mechanism at the moment, but it isn't directly a concern of the block data.
> I would advocate for keeping this field. This field provides an additional check
> for determinism across nodes. Logic to update the application hash is more
Is there any actual value in providing additional checks for non-determinism? Put another way, is there anything wrong with relying solely on the AppHash as the single value we are trying to replicate?
You mention debuggability, which I think is fair, but if applications are using merkle trees then it should be possible for them to include a way of working out which leaf in the tree is different and thus better deduce where the divergence occurred.
> Put in another way, is there anything wrong with relying solely on the AppHash as the single value we are trying to replicate?
No, there definitely is nothing wrong with that. The AppHash is intended to encode this information. The mild value of this field is just in catching chain issues faster, there is nothing that this field encodes that AppHash does not also encode.
cmwaters left a comment
Nice write up.
A similar question to what goes into the block is what is important to have in the header and what can be relegated to the rest of the block.
Hashes in headers are somewhat first-class citizens in that they make it faster to verify other data. As an example, having the consensus hash in the header makes it faster to verify the consensus parameters than if the hash were in Data, whereby you'd need to get the merkle path verifying that the consensus hash was present in Data and then check that it correlated to the hash of the consensus params.
A suggestion from me would be to consolidate DataHash, LastCommitHash, ConsensusHash and EvidenceHash into a single merkle root hash.
> This creates a recursive problem. To verify the validator set that signed the
> block at height `H`, what information do we need? We could fetch the
> `NextValidatorsHash` from height `H-1`, but how do we verify that that hash is correct?
I remember thinking over this myself and drawing the conclusion that it was possible to not require the ValidatorsHash for the light client.
We do have this check here: Line 140 in 4b36fea.
For verifyAdjacent, we don't even use the ValidatorsHash because a valid validator set is any that has at least "TrustLevel" (in most cases 1/3) overlap in voting power.
Interestingly, we don't even verify that the validator set we received and used was actually the one that correlates with `untrustedHeader.ValidatorsHash`.
In the verify non-adjacent case, we're trying to verify a block of height H. To do so, we try to build a chain of validators that we trust to reach from the trusted header to the untrusted header at height H. When we need to perform non-adjacent verification, we fetch a block header along with the commit that supposedly corresponds to that header. Once we verify this header was signed by enough of the validators that we trust, we update the trusted set of validators to include the other signers of that header.
If we remove the ValidatorsHash, it seems that we open ourselves to the following issue:

- We have a known good validator set, `vs1`, and attempt to use it to verify a non-adjacent header.
- We fetch the non-adjacent header along with the untrusted validator set, `vs2`, and the commit, `c`, that justifies this header.
- We check that `TrustLevel` of `vs1` has signatures in `c` for the header and that `vs2` also has signatures in `c` for the header.
- We now trust `vs2`. However, nothing in `vs1` actually signed off on the veracity of `vs2`.

A malicious node could compose a `c` and `vs2` that comport with `vs1` so that the client now trusts phony validators in `vs2`. With `ValidatorsHash` in the header, we have reason to trust `vs2`: enough of `vs1` signed off on its hash in the header. It's definitely possible I've made a mistake here, since my understanding of the light client algorithm is far from perfect.
Do we do this currently? I.e., do we check `hash(vs2) == h2.ValidatorsHash`?
Ok yeah that makes sense. Perhaps then we do need to keep ValidatorsHash around. Thanks for investigating
I'll take a closer look at the light client algorithm definition to be certain and add a more detailed description to this doc.
I'd be interested in a whiteboard session during the retreat on the subject of how the validator set evolves, and how it's used for (light-client) verification
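The guard discussed in this thread can be sketched as a single check before adopting `vs2`. `validatorSetHash` below is a stand-in for the real Merkle-based `ValidatorsHash` computation:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// validatorSetHash derives a deterministic hash of a validator set by
// hashing the sorted addresses. A stand-in for the real Merkle-based
// ValidatorsHash computation.
func validatorSetHash(addrs []string) []byte {
	sorted := append([]string(nil), addrs...)
	sort.Strings(sorted)
	h := sha256.New()
	for _, a := range sorted {
		h.Write([]byte(a))
	}
	return h.Sum(nil)
}

// checkValidatorsHash is the guard discussed above: before trusting the
// fetched set vs2, confirm it matches the ValidatorsHash committed in
// the untrusted header.
func checkValidatorsHash(headerValidatorsHash []byte, vs2 []string) bool {
	return bytes.Equal(headerValidatorsHash, validatorSetHash(vs2))
}

func main() {
	genuine := []string{"val-a", "val-b", "val-c"}
	headerHash := validatorSetHash(genuine) // committed in the header at H

	phony := []string{"mallory-1", "mallory-2"}
	fmt.Println(checkValidatorsHash(headerHash, genuine)) // true
	fmt.Println(checkValidatorsHash(headerHash, phony))   // false
}
```

With this check in place, a malicious node cannot substitute a phony `vs2`: the set it serves must hash to the value the trusted signers committed to.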
> I would advocate against this. Any consumer of the chain that wanted to
> know which validator proposed a block would have to run the proposer selection
> algorithm. This algorithm is not widely implemented, meaning that consumers
> in other languages would need to implement the algorithm to determine a piece
> of basic information about the chain.
The main consumer of the knowledge of the proposer is usually the application itself, which might have some logic rewarding the validator. Any other use, I believe, is just statistical, i.e. block explorers saying which validator proposed each block.
Rather than removing it completely, you could instead include the index of the validator in the validator set.
If we're going to keep 4 bytes to represent the validator index to allow a consumer of the chain to reconstruct the data without implementing the full proposer selection algorithm, is there a reason to not just keep the additional 16 bytes to identify the proposer directly?
Well, we're trying to reduce the size of the header. If something doesn't truly deserve to be in the header, we should cut it.
> proposed the height. Instead, the validation logic simply checks that the value
> is an address of one of the known validators.
> Currently, we include this value in the `Commit` for height `H`, which is
Is the `Round` in `Commit` the proof-of-lock round or just the final round that received a majority?
Good question. It's the round in which the proposer committed in the previous height (tendermint/internal/consensus/state.go, Line 774 in dfa5e21), i.e. the second of the options you listed. I've added language to this document to clarify.
> Currently we store [every piece of each block][save-block] in the `BlockStore`.
> I suspect that this has led to some mistakes in reasoning around the merits of
> consolidating fields in the block. We could update the storage scheme we use to
> store only some pieces of each block and still achieve a space savings without having
> to change the block structure at all.
I think @creachadair also alluded to having separate data structures for storage and networking
Yes, that is what I'm planning to do as part of the database work in v0.37 et seq. But that's separate from the question of which parts of the block we might want to eliminate/rework. Both must be projections of the same abstract schema.
Yeah, and to clarify, this section is more hand-wavy ideas to maybe chew on in the future than concrete and actionable changes.
My general sense is that we should switch to storing data only in cases where the values change, versus what we do at the moment: store everything all the time.
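A sketch of that change-only storage idea, assuming a hypothetical `changeLog` keyed by the heights at which a value last changed:

```go
package main

import "fmt"

// changeLog stores a value only at heights where it differs from the
// previously stored value, instead of at every height.
type changeLog struct {
	heights []int64          // ascending heights at which the value changed
	values  map[int64]string // height -> value first seen at that height
}

func newChangeLog() *changeLog {
	return &changeLog{values: make(map[int64]string)}
}

// put records v at height h only if it differs from the latest stored value.
func (c *changeLog) put(h int64, v string) {
	n := len(c.heights)
	if n > 0 && c.values[c.heights[n-1]] == v {
		return // unchanged: store nothing for this height
	}
	c.heights = append(c.heights, h)
	c.values[h] = v
}

// get returns the value in effect at height h: the entry with the
// greatest recorded height <= h.
func (c *changeLog) get(h int64) (string, bool) {
	for i := len(c.heights) - 1; i >= 0; i-- {
		if c.heights[i] <= h {
			return c.values[c.heights[i]], true
		}
	}
	return "", false
}

func main() {
	cl := newChangeLog()
	cl.put(1, "paramsV1")
	cl.put(2, "paramsV1") // no new row stored
	cl.put(3, "paramsV2")
	v, _ := cl.get(2)
	fmt.Println(v, len(cl.heights))
}
```

For rarely changing values such as consensus params or the validator set, this trades a per-height write for a short backwards scan (or, in a real database, a reverse-range seek) at read time.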
> previous height. We could achieve a speed-up in many cases by communicating the
> hashes _first_ and letting peers request additional information when they do not
> recognize the communicated hash.
👍 really like this idea although it would be quite an overhaul to the gossip protocol.
I had similar ideas towards transactions (#7932) i.e. propagate blocks containing the hashes of txs and rely on the mempool to retrieve any missing transactions.
Yeah, I think it would make the architecture much cleaner if we had a content addressing layer. To clarify though, I have no concrete plans to actually do this in the next release or two. I think it's something we can / should consider post 1.0. I think there are more low hanging fruit in the gossip layer to hit before implementing something like content addressing into Tendermint.
> of each block. However, aesthetically, it's somewhat nice to include in each
> block, as if the block was 'stamped' with the ID. Additionally, re-validating
> the value from genesis would be painful and require reconstituting potentially
> large chains. I'm therefore mildly in favor of maintaining this redundant
We could remove it from the block but still make it part of the header hashing. Then, in order to validate, you don't need to go back to genesis if you have the value.
> I would advocate for keeping this field. This field provides an additional check
> for determinism across nodes. Logic to update the application hash is more
I totally agree with this. This could be crucial for troubleshooting non-determinism in some cases. And it's not big in terms of size
> block at height `H`, what information do we need? We could fetch the
> `NextValidatorsHash` from height `H-1`, but how do we verify that that hash is correct?

> #### ProposerAddress
I know what I'm going to say is controversial 😄
ProposerAddress and Round (in LastCommit) should not be exposed to the Application. These are details of Tendermint's operation. If you replaced Tendermint by another consensus algorithm offering the same set of properties, the App should be able to work with it. Yes, I know this is utopian.
But:
- All uses of `ProposerAddress` I've seen so far (all rewards-related) are controversial, but I may be unaware of others.
- Is there any use of `round` other than stats and monitoring? (This can be done without showing it to the App via ABCI.)
Back to reality: as `ProposerAddress` is being used now (e.g., by the SDK), I don't think we can easily remove it.
> to propose a height. Because of this, our [validation logic][proposer-check] does not check that
> the `ProposerAddress` included in the block corresponds to the validator that
> proposed the height. Instead, the validation logic simply checks that the value
> is an address of one of the known validators.
One idea: if the App is so keen on knowing, recording, or proving who the proposer is at each block, why don't we leave it up to the App?
I mean, with ABCI++, the proposer app can add a special tx with an app-specific proof that it is the proposer (in PrepareProposal), which other validators could validate in ProcessProposal.
> was updated and construct each block using the data that existed at the time.
> This document does not make any specific recommendations around storage since
> that is likely to change with upcoming improvements to to the database infrastructure.
Suggested change: "improvements to to the database infrastructure" → "improvements to the database infrastructure".
> Either a soft-upgrades scheme will need to be implemented or a hard fork will
> be required for chains and light clients to function with the new hashes.
> This means that all of the additions and deletions from the Merkle root hash
> proposed by this document will be light client breaking.
I think the actions proposed in this document would be an excellent first use-case for soft-upgrades, because we don't change any functionality, but only the way fields are organized, hashes are calculated and blocks are verified.
> ## Discussion

> ### Data to consider removing
I went through the data structures myself and I think that the following fields in `Commit` are redundant and could therefore be removed:

- `height`
- `round`
- `block_id`

They are redundant in that they can be "gleaned" from the rest of the fields in the block.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
Force-pushed from f88f75c to 5e2768e.
This RFC lays out a proposed set of changes for consolidating the set of information that may be included in the block structure.
[rendered](https://github.com/tendermint/tendermint/blob/wb/rfc-block-structure/docs/rfc/rfc-020-block-structure-consolidation.md)