Skip to content

consensus: Wait for proposal or timeout before prevote#2540

Merged
ebuchman merged 3 commits intodevelopfrom
zarko/1745-wait-for-proposal-or-timeout-before-prevote
Oct 12, 2018
Merged

consensus: Wait for proposal or timeout before prevote#2540
ebuchman merged 3 commits intodevelopfrom
zarko/1745-wait-for-proposal-or-timeout-before-prevote

Conversation

@milosevic
Copy link
Contributor

@milosevic milosevic commented Oct 4, 2018

@milosevic milosevic changed the base branch from master to develop October 4, 2018 12:59
@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from 751cb37 to 1a5bbc4 Compare October 4, 2018 13:26
@milosevic milosevic self-assigned this Oct 4, 2018
@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from 1a5bbc4 to e2cd6b3 Compare October 4, 2018 13:34
@ebuchman ebuchman mentioned this pull request Oct 5, 2018
4 tasks
@xla xla changed the title WIP: Wait for proposal or timeout before prevote [WIP] Wait for proposal or timeout before prevote Oct 8, 2018
@xla xla changed the title [WIP] Wait for proposal or timeout before prevote [WIP] consensus: Wait for proposal or timeout before prevote Oct 8, 2018
@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from e2cd6b3 to ba336b5 Compare October 9, 2018 11:30
@codecov-io
Copy link

codecov-io commented Oct 9, 2018

Codecov Report

Merging #2540 into develop will decrease coverage by 0.14%.
The diff coverage is 89.65%.

@@             Coverage Diff             @@
##           develop    #2540      +/-   ##
===========================================
- Coverage    61.38%   61.24%   -0.15%     
===========================================
  Files          203      203              
  Lines        16776    16816      +40     
===========================================
+ Hits         10298    10299       +1     
- Misses        5609     5642      +33     
- Partials       869      875       +6
Impacted Files Coverage Δ
state/metrics.go 25% <ø> (ø) ⬆️
p2p/metrics.go 17.94% <ø> (ø) ⬆️
node/node.go 66.05% <0%> (ø) ⬆️
config/config.go 69.28% <100%> (ø) ⬆️
state/execution.go 75.45% <100%> (ø) ⬆️
consensus/state.go 75.64% <92%> (-1.35%) ⬇️
consensus/reactor.go 71.48% <0%> (-1.04%) ⬇️
libs/db/util.go 71.42% <0%> (+0.69%) ⬆️
libs/db/remotedb/remotedb.go 41.52% <0%> (+4.68%) ⬆️

@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from ba336b5 to 2dfd595 Compare October 9, 2018 12:30
@milosevic milosevic changed the title [WIP] consensus: Wait for proposal or timeout before prevote Wait for proposal or timeout before prevote Oct 9, 2018
@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from 2dfd595 to 8284693 Compare October 9, 2018 12:51
@xla xla changed the title Wait for proposal or timeout before prevote consensus: Wait for proposal or timeout before prevote Oct 9, 2018
@xla xla removed the request for review from zramsay October 9, 2018 12:59
xla
xla previously requested changes Oct 9, 2018
Copy link
Contributor

@xla xla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some formatting nits.

func ensureNoNewStep(stepCh <-chan interface{}) {
func ensureNoNewEventOnChannel(ch <-chan interface{}) {
ensureNoNewEvent(ch, ensureTimeout, "We should be stuck waiting, "+
"not receiving new event on the channel")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaking is a bit clunky, it would be better to use this pattern:

	ensureNoNewEvent(
		ch,
		ensureTimeout,
		"We should be stuck waiting, not receiving new event on the channel",
	)

- [node] \#2434 Make node respond to signal interrupts while sleeping for genesis time
- [consensus] [\#1690](https://github.com/tendermint/tendermint/issues/1690) wait for
timeoutPrecommit before starting next round
- [consensus] [\#1745] (https://github.com/tendermint/tendermint/issues/1745) wait for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The space between the ] and ( will not render the link correctly.

func ensureNoNewRoundStep(stepCh <-chan interface{}) {
ensureNoNewEvent(stepCh, ensureTimeout, "We should be stuck waiting, "+
"not moving to the next step")
"not receiving NewRoundStep event")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.


func ensureNoNewUnlock(unlockCh <-chan interface{}) {
ensureNoNewEvent(unlockCh, ensureTimeout, "We should be stuck waiting, "+
"not receiving Unlock event")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

timeoutDuration := time.Duration(timeout*5) * time.Nanosecond
ensureNoNewEvent(stepCh, timeoutDuration, "We should be stuck waiting, "+
"not moving to the next step")
"not receiving NewTimeout event")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.


func ensureNewEvent(ch <-chan interface{}, timeout time.Duration, errorMessage string) {
func ensureNewEvent(ch <-chan interface{}, height int64, round int,
timeout time.Duration, errorMessage string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be broken up by param per line with trailing comma and newline behind the last one.

if !ok {
panic(fmt.Sprintf("expected a *types.EventDataRoundState, "+
"got %v. wrong subscription channel?",
reflect.TypeOf(rs)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from 82235ba to 23999ca Compare October 11, 2018 13:45
Copy link
Contributor

@ebuchman ebuchman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial review - let's chat about it

var config *cfg.Config // NOTE: must be reset for each _test.go file
var ensureTimeout = time.Second * 1 // must be in seconds because CreateEmptyBlocksInterval is
var config *cfg.Config // NOTE: must be reset for each _test.go file
var ensureTimeout = time.Millisecond * 100 // must be in seconds because CreateEmptyBlocksInterval is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, what about the comment claiming this must be in seconds?

Copy link
Contributor Author

@milosevic milosevic Oct 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it was because of this line

config.Consensus.CreateEmptyBlocksInterval = ensureTimeout
as it used to be
config.Consensus.CreateEmptyBlocksInterval = int(ensureTimeout.Seconds())
. Probably comment can be removed.

cs.enterPrecommit(ti.Height, ti.Round)
case cstypes.RoundStepPrecommitWait:
cs.eventBus.PublishEventTimeoutWait(cs.RoundStateEvent())
cs.enterPrecommit(ti.Height, ti.Round)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird that we can get here without having done enterPrecommit already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible as you might receive 2/3+ Precommit any so you trigger timeout. In the meantime you don't receive Polka and timeoutPrevote hasn't expired, so you precommit nil before going to the next round.

func (cs *ConsensusState) defaultDoPrevote(height int64, round int) {
logger := cs.Logger.With("height", height, "round", round)

//TODO: Remove this so it is aligned with spec!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's open an issue for it? This isn't a bug, afawk, right? Just an optimization that would complicate the spec/proofs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already an issue open: #2529. It's a bug; we need to propose validBlock, not lockedBlock. I will remove TODO as there is issue open.

// NOTE: our proposal block may be nil or not what received a polka..
// TODO: we may want to still update the ValidBlock and obtain it via gossipping
if !blockID.IsZero() &&
if len(blockID.Hash) != 0 &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to confirm these always match - a test was failing.


blockID, ok := precommits.TwoThirdsMajority()
if ok && len(blockID.Hash) != 0 {
// Executed as TwoThirdsMajority could be from a higher round
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this comment still true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could maybe move comment before enterNewRound, L1656.

}
} else if cs.Round <= vote.Round && precommits.HasTwoThirdsAny() {
cs.enterNewRound(height, vote.Round)
cs.enterPrecommit(height, vote.Round)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too early to Precommit at this point, as it would mean Precommiting nil. We wait either prevote based condition to be true or timeoutPrecommit to expire.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I think we need to update the enter condition on enterPrecommit then which currently says:

// Enter: any +2/3 precommits for next round.

Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work 🍑

panic(fmt.Sprintf("expected height %v, got %v", height, blockHeader.Header.Height))
}
if !bytes.Equal(blockHeader.Header.Hash(), blockHash) {
panic(fmt.Sprintf("expected header %v, got %v", blockHash, blockHeader.Header.Hash()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be expected header's hash %X, got %X?

case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for new activity on the channel")
case <-ch:
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for break

if cs.Height != height || round < cs.Round || (cs.Round == round && cstypes.RoundStepPrecommitWait <= cs.Step) {
logger.Debug(fmt.Sprintf("enterPrecommitWait(%v/%v): Invalid args. Current step: %v/%v/%v", height, round, cs.Height, cs.Round, cs.Step))
if cs.Height != height || round < cs.Round || (cs.Round == round && cs.triggeredTimeoutPrecommit) {
logger.Debug(fmt.Sprintf("enterPrecommitWait(%v/%v): Invalid args. Current step: %v/%v/%v", height, round, cs.Height, cs.Round, cs.triggeredTimeoutPrecommit))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment needs to be updated to reflect it's no longer about "step". maybe %v/%v (timeoutPrecommit: %v)

@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from 23999ca to 6ec7076 Compare October 12, 2018 11:55
@milosevic milosevic force-pushed the zarko/1745-wait-for-proposal-or-timeout-before-prevote branch from 957c806 to 8094436 Compare October 12, 2018 12:54
@jackzampolin jackzampolin mentioned this pull request Oct 12, 2018
7 tasks
Copy link
Contributor

@ebuchman ebuchman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @milosevic . And thanks for the great update to the tests!

logger.Debug(
fmt.Sprintf(
"enterPrecommitWait(%v/%v): Invalid args. "+
"Current state is Height/Round: %v/%v/, triggeredTimeoutPrecommit:%v",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should keep the Step in this output, for completeness/debugging.

// If +2/3 prevotes for *anything* for future round:
if cs.Round < vote.Round && prevotes.HasTwoThirdsAny() {
// Round-skip if there is any 2/3+ of votes ahead of us
cs.enterNewRound(height, vote.Round)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the reason we don't enterPrevote or enterPrecommit here anymore is because we have to see the entire propose step through before prevoting, right?

Either we will:

  • receive the whole proposal, and then prevote block
  • receive +2/3-prevotes for nil, and then prevote nil
  • timeout propose, and then prevote nil

Whereas before, we just prevoted right away on seeing +2/3-prevote-any

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spec, the second condition (receiving +2/3-prevotes for nil) is conditioned by step=prevote (which means a process always wait for the other two conditions to be true) and we precommit nil at that point. It is probably fine also to prevote nil upon receiving +2/3-prevotes for nil even if step != prevote, but we need to check the proofs for this.

if cs.Round < vote.Round && prevotes.HasTwoThirdsAny() {
// Round-skip if there is any 2/3+ of votes ahead of us
cs.enterNewRound(height, vote.Round)
} else if cs.Round == vote.Round && cstypes.RoundStepPrevote <= cs.Step { // current round
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the reason we add cstypes.RoundStepPrevote <= cs.Step and drop the enterPrevote from here is because

  • if cs.Step < RoundStepPrevote, we still have to wait out the full propose step, so +2/3-prevote-any can't trigger anything. Even if we receive +2/3-prevote-block, we still have to wait until addProposalBlock gets the last part and calls enterPrevote, or we finally timeoutPropose
  • if cs.Step >= RoundStepPrevote`, we've already prevoted

So now addVote will only call enterPrevote if it just received the last vote in a POL it was waiting for to complete a proposal

}
} else if cs.Round <= vote.Round && precommits.HasTwoThirdsAny() {
cs.enterNewRound(height, vote.Round)
cs.enterPrecommit(height, vote.Round)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I think we need to update the enter condition on enterPrecommit then which currently says:

// Enter: any +2/3 precommits for next round.

// wait for precommit
ensureNewVote(voteCh)

ensureVote(voteCh, height, round, types.VoteTypePrecommit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe these should just be ensurePrevote and ensurePrecommit and we can drop the last arg

Copy link
Contributor

@ebuchman ebuchman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more notes based on discussion from the issue #1745

if prevotes.HasTwoThirdsMajority() {
cs.enterPrecommit(height, vote.Round)
} else {
cs.enterPrevote(height, vote.Round) // if the vote is ahead of us
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaekwon was suggesting we call enterPropose here instead of enterPrevote. But since we already guarantee the round matches and the cs.Step >= PrevoteStep, we shouldn't need to.

}
} else if cs.Round <= vote.Round && precommits.HasTwoThirdsAny() {
cs.enterNewRound(height, vote.Round)
cs.enterPrecommit(height, vote.Round)
Copy link
Contributor

@ebuchman ebuchman Oct 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jae was suggesting we also call enterPropose and enterPrevoteWait here, instead of the enterPrecommit. Note enterNewRound calls enterPropose, so it shouldn't be needed. And we want to wait the whole proposal step before doing any prevoting, so I don't think we need the enterPrevoteWait here either (it will happen latter if it needs to).

But then calling enterPrecommitWait right away means we now only have TimeoutPrecommit to receive the proposal block. Is that right?

In this case, we could be in any step, and may even be jumping to a later round, but then we only give ourselves TimeoutPrecommit to receive the proposal block for that round.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then calling enterPrecommitWait right away means we now only have TimeoutPrecommit to receive the proposal block. Is that right?
In this case, we could be in any step, and may even be jumping to a later round, but then we only give ourselves TimeoutPrecommit to receive the proposal block for that round.

That's why in Lemma 5 we need to assume that timeoutPropose(r) > 2Delta + timeoutPrecommit(r-1), i.e., timeoutPropose need to be big enough so we will wait for a process that jump to a higher round to receive Proposal. So we don't wait timeoutPrecommit to receive the proposal block. We wait timeoutPrecommit to set validBlock in the current round. And we wait timeoutPropose to receive the proposal, but there is dependency on the duration of timeoutPropose on timeoutPrecommit. Does this answer your question?

cs.enterNewRound(cs.Height, 0)
}
} else {
cs.enterPrecommitWait(height, vote.Round)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@milosevic @ebuchman
Hey guys. Late to follow up this code, but I have a question for line 1668.
Thanks in advance.

Why should we enter precommit wait? we already checked majority for nil so we don't have to wait because consensus already achieved for nil.
I think 1668 should be enterNewRound because it don't need to wait more precommit votes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants