consensus: Wait timeout precommit before starting new round#2493
consensus: Wait timeout precommit before starting new round#2493
Conversation
|
Note that there is an open PR (#2132) that mentions this issue, but it's actually more related to #1745. I have pulled in ensureVote test function from #2132 (sorry for squashing it, needed to fix some stuff after merged #2132 here). #1745 will be addressed in the follow up PR so we can probably close #2132 for those two new PRs. |
Codecov Report
@@ Coverage Diff @@
## develop #2493 +/- ##
===========================================
+ Coverage 61.29% 61.31% +0.01%
===========================================
Files 202 202
Lines 16720 16658 -62
===========================================
- Hits 10249 10214 -35
+ Misses 5602 5582 -20
+ Partials 869 862 -7
|
Pull in ensureVote test function from #2132
6446be9 to
1d8f155
Compare
consensus/state_test.go
Outdated
| signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2, vs3, vs4) | ||
|
|
||
| <-timeoutWaitCh | ||
| <-newRoundCh |
There was a problem hiding this comment.
It's probably a good idea to wrap these channel reads with timeouts to prevent the tests from hanging accidently and ease in debugging.
There was a problem hiding this comment.
Good point. I will use the same pattern as in ensureX methods to read from round and timeout channels.
consensus/state.go
Outdated
| if cs.config.SkipTimeoutCommit && precommits.HasAll() { | ||
| // if we have all the votes now, | ||
| // go straight to new round (skip timeout commit) | ||
| // cs.scheduleTimeout(time.Duration(0), cs.Height, 0, cstypes.RoundStepNewHeight) |
There was a problem hiding this comment.
Do we need to keep this commented code line around?
consensus/state.go
Outdated
| // cs.scheduleTimeout(time.Duration(0), cs.Height, 0, cstypes.RoundStepNewHeight) | ||
| cs.enterNewRound(cs.Height, 0) | ||
| } | ||
|
|
| } | ||
| vote := edv.Vote | ||
| if vote.Height != height { | ||
| panic(fmt.Sprintf("expected height %v, got %v", height, vote.Height)) |
There was a problem hiding this comment.
Instead of panics this function should return errors, which makes it easier to report back in the tests.
2e9e654 to
7a60808
Compare
melekes
left a comment
There was a problem hiding this comment.
Good job! Would be good to test the type (edv, ok := v.(types.EventDataUnlock)) in functions like ensureNewUnlock, .. etc The same thing we do in ensureVote
consensus/common_test.go
Outdated
| errorMessage string) { | ||
| timer := time.NewTimer(timeout) | ||
| select { | ||
| case <-timer.C: |
consensus/common_test.go
Outdated
| voteType byte) { | ||
| timer := time.NewTimer(ensureTimeout) | ||
| select { | ||
| case <-timer.C: |
There was a problem hiding this comment.
<-time.After(ensureTimeout)
consensus/common_test.go
Outdated
| edv, ok := v.(types.EventDataVote) | ||
| if !ok { | ||
| panic(fmt.Sprintf("expected a *types.Vote, got %v. wrong subscription channel?", | ||
| t.Error(fmt.Sprintf("expected a *types.Vote, got %v. wrong subscription channel?", |
7a60808 to
300330a
Compare
298f60b to
bb8411b
Compare
| // cs.scheduleTimeout(time.Duration(0), cs.Height, 0, cstypes.RoundStepNewHeight) | ||
| cs.enterNewRound(cs.Height, 0) | ||
| } | ||
| if ok && len(blockID.Hash) != 0 { |
There was a problem hiding this comment.
If len(blockID.Hash) == 0, don't we need to explicitly enterPrecommitWait?
There was a problem hiding this comment.
We should always enterPrecommitWait upon receiving 2f+1 any precommits. We don't need to treat 2f+1 Precommit nil as a special case. We do it here:
Lines 1657 to 1660 in bb8411b
There was a problem hiding this comment.
Ok right, it's taken care of by the else if. The first condition is only for +2/3 majority for a block hash. If there's a +2/3 majority for nil, it will be treated by the second condition. Thanks.
consensus/common_test.go
Outdated
| case <-stepCh: | ||
| panic("We should be stuck waiting, not moving to the next step") | ||
| case <-ch: | ||
| t.Error(errorMessage) |
There was a problem hiding this comment.
We probably want these to be panics for now until we build something more intelligent, otherwise we won't know which line in the actual test function caused the failure.
There was a problem hiding this comment.
Or alternatively to the error check in the actual test and t.Fatalf there. I know some may be concerned about readability but it's a test after all and the execution should be clear.
There was a problem hiding this comment.
For now, I would also say to leave panic. Those tests are mostly covering critical algorithm scenarios so it is good if they are readable. Tests will probably become shorter after consensus state refactor and then we can do it properly.
Pull in ensureVote test function from #2132
Fix #1690.