consensus: Wait for proposal or timeout before prevote#2540
consensus: Wait for proposal or timeout before prevote#2540
Conversation
751cb37 to
1a5bbc4
Compare
1a5bbc4 to
e2cd6b3
Compare
e2cd6b3 to
ba336b5
Compare
Codecov Report
@@ Coverage Diff @@
## develop #2540 +/- ##
===========================================
- Coverage 61.38% 61.24% -0.15%
===========================================
Files 203 203
Lines 16776 16816 +40
===========================================
+ Hits 10298 10299 +1
- Misses 5609 5642 +33
- Partials 869 875 +6
|
ba336b5 to
2dfd595
Compare
2dfd595 to
8284693
Compare
consensus/common_test.go
Outdated
| func ensureNoNewStep(stepCh <-chan interface{}) { | ||
| func ensureNoNewEventOnChannel(ch <-chan interface{}) { | ||
| ensureNoNewEvent(ch, ensureTimeout, "We should be stuck waiting, "+ | ||
| "not receiving new event on the channel") |
There was a problem hiding this comment.
This breaking is a bit clunky, it would be better to use this pattern:
ensureNoNewEvent(
ch,
ensureTimeout,
"We should be stuck waiting, not receiving new event on the channel",
)
CHANGELOG_PENDING.md
Outdated
| - [node] \#2434 Make node respond to signal interrupts while sleeping for genesis time | ||
| - [consensus] [\#1690](https://github.com/tendermint/tendermint/issues/1690) wait for | ||
| timeoutPrecommit before starting next round | ||
| - [consensus] [\#1745] (https://github.com/tendermint/tendermint/issues/1745) wait for |
There was a problem hiding this comment.
The space between the ] and ( will not render the link correctly.
consensus/common_test.go
Outdated
| func ensureNoNewRoundStep(stepCh <-chan interface{}) { | ||
| ensureNoNewEvent(stepCh, ensureTimeout, "We should be stuck waiting, "+ | ||
| "not moving to the next step") | ||
| "not receiving NewRoundStep event") |
consensus/common_test.go
Outdated
|
|
||
| func ensureNoNewUnlock(unlockCh <-chan interface{}) { | ||
| ensureNoNewEvent(unlockCh, ensureTimeout, "We should be stuck waiting, "+ | ||
| "not receiving Unlock event") |
consensus/common_test.go
Outdated
| timeoutDuration := time.Duration(timeout*5) * time.Nanosecond | ||
| ensureNoNewEvent(stepCh, timeoutDuration, "We should be stuck waiting, "+ | ||
| "not moving to the next step") | ||
| "not receiving NewTimeout event") |
consensus/common_test.go
Outdated
|
|
||
| func ensureNewEvent(ch <-chan interface{}, timeout time.Duration, errorMessage string) { | ||
| func ensureNewEvent(ch <-chan interface{}, height int64, round int, | ||
| timeout time.Duration, errorMessage string) { |
There was a problem hiding this comment.
Should be broken up by param per line with trailing comma and newline behind the last one.
consensus/common_test.go
Outdated
| if !ok { | ||
| panic(fmt.Sprintf("expected a *types.EventDataRoundState, "+ | ||
| "got %v. wrong subscription channel?", | ||
| reflect.TypeOf(rs))) |
82235ba to
23999ca
Compare
ebuchman
left a comment
There was a problem hiding this comment.
Initial review - let's chat about it
consensus/common_test.go
Outdated
| var config *cfg.Config // NOTE: must be reset for each _test.go file | ||
| var ensureTimeout = time.Second * 1 // must be in seconds because CreateEmptyBlocksInterval is | ||
| var config *cfg.Config // NOTE: must be reset for each _test.go file | ||
| var ensureTimeout = time.Millisecond * 100 // must be in seconds because CreateEmptyBlocksInterval is |
There was a problem hiding this comment.
Oh, what about the comment claiming this must be in seconds?
There was a problem hiding this comment.
I guess it was because of this line
tendermint/consensus/mempool_test.go
Line 41 in 69ecda1
tendermint/consensus/mempool_test.go
Line 41 in 0c9c329
| cs.enterPrecommit(ti.Height, ti.Round) | ||
| case cstypes.RoundStepPrecommitWait: | ||
| cs.eventBus.PublishEventTimeoutWait(cs.RoundStateEvent()) | ||
| cs.enterPrecommit(ti.Height, ti.Round) |
There was a problem hiding this comment.
Weird that we can get here without having done enterPrecommit already
There was a problem hiding this comment.
It's possible as you might receive 2/3+ Precommit any so you trigger timeout. In the meantime you don't receive Polka and timeoutPrevote hasn't expired, so you precommit nil before going to the next round.
consensus/state.go
Outdated
| func (cs *ConsensusState) defaultDoPrevote(height int64, round int) { | ||
| logger := cs.Logger.With("height", height, "round", round) | ||
|
|
||
| //TODO: Remove this so it is aligned with spec! |
There was a problem hiding this comment.
Let's open an issue for it? This isn't a bug, afawk, right? Just an optimization that would complicate the spec/proofs?
There was a problem hiding this comment.
There is already an issue open: #2529. It's a bug; we need to propose validBlock, not lockedBlock. I will remove TODO as there is issue open.
| // NOTE: our proposal block may be nil or not what received a polka.. | ||
| // TODO: we may want to still update the ValidBlock and obtain it via gossipping | ||
| if !blockID.IsZero() && | ||
| if len(blockID.Hash) != 0 && |
There was a problem hiding this comment.
Need to confirm these always match - a test was failing.
|
|
||
| blockID, ok := precommits.TwoThirdsMajority() | ||
| if ok && len(blockID.Hash) != 0 { | ||
| // Executed as TwoThirdsMajority could be from a higher round |
There was a problem hiding this comment.
Isn't this comment still true?
There was a problem hiding this comment.
We could maybe move comment before enterNewRound, L1656.
| } | ||
| } else if cs.Round <= vote.Round && precommits.HasTwoThirdsAny() { | ||
| cs.enterNewRound(height, vote.Round) | ||
| cs.enterPrecommit(height, vote.Round) |
There was a problem hiding this comment.
It's too early to Precommit at this point, as it would mean Precommiting nil. We wait either prevote based condition to be true or timeoutPrecommit to expire.
There was a problem hiding this comment.
Right. I think we need to update the enter condition on enterPrecommit then which currently says:
// Enter: any +2/3 precommits for next round.
consensus/common_test.go
Outdated
| panic(fmt.Sprintf("expected height %v, got %v", height, blockHeader.Header.Height)) | ||
| } | ||
| if !bytes.Equal(blockHeader.Header.Hash(), blockHash) { | ||
| panic(fmt.Sprintf("expected header %v, got %v", blockHash, blockHeader.Header.Hash())) |
There was a problem hiding this comment.
Shouldn't this be expected header's hash %X, got %X?
consensus/common_test.go
Outdated
| case <-time.After(ensureTimeout): | ||
| panic("Timeout expired while waiting for new activity on the channel") | ||
| case <-ch: | ||
| break |
consensus/state.go
Outdated
| if cs.Height != height || round < cs.Round || (cs.Round == round && cstypes.RoundStepPrecommitWait <= cs.Step) { | ||
| logger.Debug(fmt.Sprintf("enterPrecommitWait(%v/%v): Invalid args. Current step: %v/%v/%v", height, round, cs.Height, cs.Round, cs.Step)) | ||
| if cs.Height != height || round < cs.Round || (cs.Round == round && cs.triggeredTimeoutPrecommit) { | ||
| logger.Debug(fmt.Sprintf("enterPrecommitWait(%v/%v): Invalid args. Current step: %v/%v/%v", height, round, cs.Height, cs.Round, cs.triggeredTimeoutPrecommit)) |
There was a problem hiding this comment.
comment needs to be updated to reflect it's no longer about "step". maybe %v/%v (timeoutPrecommit: %v)
23999ca to
6ec7076
Compare
957c806 to
8094436
Compare
ebuchman
left a comment
There was a problem hiding this comment.
Looks good, thanks @milosevic . And thanks for the great update to the tests!
| logger.Debug( | ||
| fmt.Sprintf( | ||
| "enterPrecommitWait(%v/%v): Invalid args. "+ | ||
| "Current state is Height/Round: %v/%v/, triggeredTimeoutPrecommit:%v", |
There was a problem hiding this comment.
Maybe we should keep the Step in this output, for completeness/debugging.
| // If +2/3 prevotes for *anything* for future round: | ||
| if cs.Round < vote.Round && prevotes.HasTwoThirdsAny() { | ||
| // Round-skip if there is any 2/3+ of votes ahead of us | ||
| cs.enterNewRound(height, vote.Round) |
There was a problem hiding this comment.
So the reason we don't enterPrevote or enterPrecommit here anymore is because we have to see the entire propose step through before prevoting, right?
Either we will:
- receive the whole proposal, and then prevote block
- receive +2/3-prevotes for nil, and then prevote nil
- timeout propose, and then prevote nil
Whereas before, we just prevoted right away on seeing +2/3-prevote-any
There was a problem hiding this comment.
In the spec, the second condition (receiving +2/3-prevotes for nil) is conditioned by step=prevote (which means a process always wait for the other two conditions to be true) and we precommit nil at that point. It is probably fine also to prevote nil upon receiving +2/3-prevotes for nil even if step != prevote, but we need to check the proofs for this.
| if cs.Round < vote.Round && prevotes.HasTwoThirdsAny() { | ||
| // Round-skip if there is any 2/3+ of votes ahead of us | ||
| cs.enterNewRound(height, vote.Round) | ||
| } else if cs.Round == vote.Round && cstypes.RoundStepPrevote <= cs.Step { // current round |
There was a problem hiding this comment.
And the reason we add cstypes.RoundStepPrevote <= cs.Step and drop the enterPrevote from here is because
- if
cs.Step < RoundStepPrevote, we still have to wait out the full propose step, so +2/3-prevote-any can't trigger anything. Even if we receive +2/3-prevote-block, we still have to wait until addProposalBlock gets the last part and calls enterPrevote, or we finally timeoutPropose - if cs.Step >= RoundStepPrevote`, we've already prevoted
So now addVote will only call enterPrevote if it just received the last vote in a POL it was waiting for to complete a proposal
| } | ||
| } else if cs.Round <= vote.Round && precommits.HasTwoThirdsAny() { | ||
| cs.enterNewRound(height, vote.Round) | ||
| cs.enterPrecommit(height, vote.Round) |
There was a problem hiding this comment.
Right. I think we need to update the enter condition on enterPrecommit then which currently says:
// Enter: any +2/3 precommits for next round.
| // wait for precommit | ||
| ensureNewVote(voteCh) | ||
|
|
||
| ensureVote(voteCh, height, round, types.VoteTypePrecommit) |
There was a problem hiding this comment.
maybe these should just be ensurePrevote and ensurePrecommit and we can drop the last arg
| if prevotes.HasTwoThirdsMajority() { | ||
| cs.enterPrecommit(height, vote.Round) | ||
| } else { | ||
| cs.enterPrevote(height, vote.Round) // if the vote is ahead of us |
There was a problem hiding this comment.
@jaekwon was suggesting we call enterPropose here instead of enterPrevote. But since we already guarantee the round matches and the cs.Step >= PrevoteStep, we shouldn't need to.
| } | ||
| } else if cs.Round <= vote.Round && precommits.HasTwoThirdsAny() { | ||
| cs.enterNewRound(height, vote.Round) | ||
| cs.enterPrecommit(height, vote.Round) |
There was a problem hiding this comment.
Jae was suggesting we also call enterPropose and enterPrevoteWait here, instead of the enterPrecommit. Note enterNewRound calls enterPropose, so it shouldn't be needed. And we want to wait the whole proposal step before doing any prevoting, so I don't think we need the enterPrevoteWait here either (it will happen latter if it needs to).
But then calling enterPrecommitWait right away means we now only have TimeoutPrecommit to receive the proposal block. Is that right?
In this case, we could be in any step, and may even be jumping to a later round, but then we only give ourselves TimeoutPrecommit to receive the proposal block for that round.
There was a problem hiding this comment.
But then calling enterPrecommitWait right away means we now only have TimeoutPrecommit to receive the proposal block. Is that right?
In this case, we could be in any step, and may even be jumping to a later round, but then we only give ourselves TimeoutPrecommit to receive the proposal block for that round.
That's why in Lemma 5 we need to assume that timeoutPropose(r) > 2Delta + timeoutPrecommit(r-1), i.e., timeoutPropose need to be big enough so we will wait for a process that jump to a higher round to receive Proposal. So we don't wait timeoutPrecommit to receive the proposal block. We wait timeoutPrecommit to set validBlock in the current round. And we wait timeoutPropose to receive the proposal, but there is dependency on the duration of timeoutPropose on timeoutPrecommit. Does this answer your question?
| cs.enterNewRound(cs.Height, 0) | ||
| } | ||
| } else { | ||
| cs.enterPrecommitWait(height, vote.Round) |
There was a problem hiding this comment.
@milosevic @ebuchman
Hey guys. Late to follow up this code, but I have a question for line 1668.
Thanks in advance.
Why should we enter precommit wait? we already checked majority for nil so we don't have to wait because consensus already achieved for nil.
I think 1668 should be enterNewRound because it don't need to wait more precommit votes.
Uh oh!
There was an error while loading. Please reload this page.