Skip to content

state: Use last height changed if validator set is empty#3560

Merged
xla merged 2 commits intodevelopfrom
anton/fix-load-validators
Apr 15, 2019
Merged

state: Use last height changed if validator set is empty#3560
xla merged 2 commits intodevelopfrom
anton/fix-load-validators

Conversation

@melekes
Copy link
Contributor

@melekes melekes commented Apr 14, 2019

What happened:

New code v0.31.4 was supposed to fall back to last height changed when/if it
fails to find validators at checkpoint height (to make release
non-breaking).

But because we did not check if validator set is empty (only checking if we have
an info is not enough, we save info for every block), the fall back
logic was never executed => resulting in LoadValidators returning an
empty validator set for cases where lastStoredHeight is checkpoint
height (i.e. almost all heights if the application does not change
validator set often).

How it was found:

one of our users - @sunboshan reported a bug here
#3537 (comment)

  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Updated CHANGELOG_PENDING.md

@liamsi
Copy link
Contributor

liamsi commented Apr 14, 2019

we are seeing #2723 again; restarting the build to be certain all other tests pass
full output: build_54473_step_105_container_3.txt.zip

t.Fatal(err)
}
if loadedVals.Size() == 0 {
t.Fatal("Expected validators to be non-empty")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code would never be reached, right? Because panic("empty validator set") would happen before? I like that these two lines make this more explicit! (keep them, I'm just asking).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove || valInfo2.ValidatorSet == nil check, it will reach this code and fail.

With || valInfo2.ValidatorSet == nil, the code does not panic and returns non-empty validator set.

Copy link
Contributor

@liamsi liamsi Apr 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, just tried it locally: I did not add the || valInfo2.ValidatorSet == nil check but added the changes in the test. The function will panic before the loadedVals.Size() == 0-check can be reached because LoadValidators calls IncrementProposerPriority anyways. Which will panic with empty validator set:

func (vals *ValidatorSet) IncrementProposerPriority(times int) {
if vals.IsNilOrEmpty() {
panic("empty validator set")
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's still fine to leave the check just in case.

Copy link
Contributor

@liamsi liamsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @melekes! LGTM 👍 I did not verify querying the rpc returns a non-empty validator set. Looking at the changes and the modifications in the test, my understanding is that it indeed fixes #3537 (comment)

@xla xla changed the title use last height changed if validator set is empty state: Use last height changed if validator set is empty Apr 15, 2019
@liamsi
Copy link
Contributor

liamsi commented Apr 15, 2019

Now failing:

=== RUN   TestSIGHUP
MustReadFile failed: open /tmp/sighup_test259331718: no such file or directory
FAIL	github.com/tendermint/tendermint/libs/autofile	0.052s
Exited with code 1

What happened:

New code was supposed to fall back to last height changed when/if it
failed to find validators at checkpoint height (to make release
non-breaking).

But because we did not check if validator set is empty, the fall back
logic was never executed => resulting in LoadValidators returning an
empty validator set for cases where `lastStoredHeight` is checkpoint
height (i.e. almost all heights if the application does not change
validator set often).

How it was found:

one of our users - @sunboshan reported a bug here
#3537 (comment)
@melekes melekes force-pushed the anton/fix-load-validators branch from 25807f3 to b292900 Compare April 15, 2019 12:29
@codecov-io
Copy link

codecov-io commented Apr 15, 2019

Codecov Report

Merging #3560 into develop will increase coverage by 0.02%.
The diff coverage is 50%.

@@            Coverage Diff             @@
##           develop   #3560      +/-   ##
==========================================
+ Coverage    64.17%   64.2%   +0.02%     
==========================================
  Files          213     213              
  Lines        17373   17356      -17     
==========================================
- Hits         11149   11143       -6     
+ Misses        5301    5291      -10     
+ Partials       923     922       -1
Impacted Files Coverage Δ
state/store.go 71.23% <50%> (+2.05%) ⬆️
libs/clist/clist.go 66.66% <0%> (-1.52%) ⬇️
consensus/reactor.go 72.13% <0%> (-0.83%) ⬇️
libs/db/remotedb/grpcdb/server.go 0% <0%> (ø) ⬆️
consensus/state.go 80% <0%> (+0.58%) ⬆️
p2p/pex/pex_reactor.go 79.44% <0%> (+0.61%) ⬆️
rpc/client/httpclient.go 66.51% <0%> (+0.89%) ⬆️
consensus/replay.go 70.47% <0%> (+2.05%) ⬆️

Copy link
Contributor

@liamsi liamsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding a changelog entry 👍

@xla xla merged commit 50b87c3 into develop Apr 15, 2019
@xla xla deleted the anton/fix-load-validators branch April 15, 2019 14:54
brapse pushed a commit to brapse/tendermint that referenced this pull request Jun 5, 2019
…3560)

What happened:

New code was supposed to fall back to last height changed when/if it
failed to find validators at checkpoint height (to make release
non-breaking).

But because we did not check if validator set is empty, the fall back
logic was never executed => resulting in LoadValidators returning an
empty validator set for cases where `lastStoredHeight` is checkpoint
height (i.e. almost all heights if the application does not change
validator set often).

How it was found:

one of our users - @sunboshan reported a bug here
tendermint#3537 (comment)

* use last height changed in validator set is empty
* add a changelog entry
Stumble pushed a commit to lino-network/tendermint that referenced this pull request Jul 25, 2019
…3560)

What happened:

New code was supposed to fall back to last height changed when/if it
failed to find validators at checkpoint height (to make release
non-breaking).

But because we did not check if validator set is empty, the fall back
logic was never executed => resulting in LoadValidators returning an
empty validator set for cases where `lastStoredHeight` is checkpoint
height (i.e. almost all heights if the application does not change
validator set often).

How it was found:

one of our users - @sunboshan reported a bug here
tendermint#3537 (comment)

* use last height changed in validator set is empty
* add a changelog entry
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants