Strengthen CometBFT's internal checks on vote extension enablement#546
Conversation
| } | ||
|
|
||
| added, err = hvs.AddVote(vote1001_0, "peer2") | ||
| added, err = hvs.AddVote(vote1001_0, "peer2", true) |
There was a problem hiding this comment.
Could you add a test for the new condition in addVote?
| if extEnabled { | ||
| if voteType != cmtproto.PrecommitType { | ||
| panic(fmt.Errorf("vote type is not precommit but extensions enabled")) | ||
| } |
There was a problem hiding this comment.
the following test does not need voteType -- cmtproto.PrecommitType since it was already tested. Is the recheck a Go idiom?
There was a problem hiding this comment.
It's an oversight :-)
Fixing it
| if !cs.state.ConsensusParams.ABCI.VoteExtensionsEnabled(vote.Height) { | ||
| // The signer will sign the extension, make sure to remove the data on the way out | ||
| vote.StripExtension() | ||
| hasExt := len(vote.ExtensionSignature) > 0 |
There was a problem hiding this comment.
given the checks here, do we need the checks upstream, for example in signAddVote?
There was a problem hiding this comment.
I would leave them here for the time being, I prefer to be over-exhaustive... until this code has been used to some extent.
Just in case there is another path we didn't think of. Remember that the initial point of this code was ruling out possible inconsistencies with extension data when troubleshooting a real problem.
Contributes to #10
While troubleshooting #539, I had to navigate the production code dealing with vote extensions quite a bit. I realized we are not doing enough validation at key points: our checks tend to focus on making sure the vote extensions are there when they are needed, but less so on making sure we don't get vote extensions while they are disabled.
I started progressively adding the missing checks while investigating #539 to exclude various hypothesis. The end result is this PR.
How I tested it:
./build/generator -g 5 -d networks/nightly/ && ./run-multiple.sh networks/nightly/*-group*-*.toml) on that version of the code. The idea being that the introduced panics should not be triggered as we have no byzantine nodes in our e2e tests ATM.PR checklist
.changelog(we use unclog to manage our changelog)docs/orspec/) and code comments