localtestcluster: re-order setting of gossip descriptor #54224
Merged
craig[bot] merged 1 commit into cockroachdb:master on Sep 11, 2020
The heartbeat loop depends on gossip to retrieve the node ID. When stressing a few tests that make use of LocalTestCluster, I was seeing empty liveness records for empty node IDs being heartbeated. By re-ordering things as such we bring it closer to the Server initialization ordering.

Release note: None
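The ordering hazard described above can be illustrated with a small, self-contained sketch. This is not CockroachDB code: `fakeGossip`, `heartbeater`, `startHeartbeatFirst`, and `startGossipFirst` are hypothetical names invented for illustration, and the real heartbeat loop runs asynchronously rather than being called by hand. The sketch only assumes what the description states, namely that the heartbeat path reads the node ID from gossip.

```go
package main

// NodeID is a stand-in for a node identifier; zero means "not yet known".
type NodeID int32

// fakeGossip is a hypothetical stand-in for the gossip instance that the
// heartbeat path consults for the node ID.
type fakeGossip struct {
	nodeID NodeID
}

// SetNodeID models setting the node descriptor on gossip.
func (g *fakeGossip) SetNodeID(id NodeID) { g.nodeID = id }

// NodeID returns the ID known to gossip, or zero if none was set yet.
func (g *fakeGossip) NodeID() NodeID { return g.nodeID }

// heartbeater records the node ID used for each heartbeat it sends.
type heartbeater struct {
	gossip  *fakeGossip
	records []NodeID
}

// heartbeatOnce writes a liveness record for whatever ID gossip reports.
func (h *heartbeater) heartbeatOnce() {
	h.records = append(h.records, h.gossip.NodeID())
}

// startHeartbeatFirst models the problematic ordering: a heartbeat runs
// before gossip knows the node ID, so the first record has an empty ID.
func startHeartbeatFirst(id NodeID) []NodeID {
	g := &fakeGossip{}
	h := &heartbeater{gossip: g}
	h.heartbeatOnce() // gossip has no node ID yet
	g.SetNodeID(id)
	h.heartbeatOnce()
	return h.records
}

// startGossipFirst models the re-ordered setup: gossip learns the node ID
// before any heartbeat runs, so every record carries the real ID.
func startGossipFirst(id NodeID) []NodeID {
	g := &fakeGossip{}
	g.SetNodeID(id)
	h := &heartbeater{gossip: g}
	h.heartbeatOnce()
	h.heartbeatOnce()
	return h.records
}
```

Calling `startHeartbeatFirst(1)` yields a first record with node ID 0, the analogue of the empty liveness records mentioned above, while `startGossipFirst(1)` never does.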
knz (Member) approved these changes on Sep 11, 2020

knz (Contributor) left a comment:

Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
Contributor (Author):

Thanks for the review!

bors r+

Contributor:

Build succeeded
craig bot pushed a commit that referenced this pull request on Sep 11, 2020
54216: kvserver: address migration concern with node liveness r=irfansharif a=irfansharif

In #53842 we introduced a change to always persist a liveness record on start up. As part of that change, we refactored how the liveness heartbeat codepath dealt with missing liveness records: it knew to fetch it from KV given we were now maintaining the invariant that it would always be present. Except that wasn't necessarily true, as demonstrated by the following scenario:

```
// - v20.1 node gets added to v20.1 cluster, and is quickly removed
//   before being able to persist its liveness record.
// - The cluster is upgraded to v20.2.
// - The node from earlier is rolled into v20.2, and re-added to the
//   cluster.
// - It's never able to successfully heartbeat (it didn't join
//   through the join rpc, bootstrap, or gossip). Welp.
```

Though admittedly unlikely, we should handle it all the same instead of simply erroring out. We'll just fall back to creating the liveness record in-place as we did in v20.1 code. We can remove this fallback in 21.1 code.

---

First commit is from #54224.

Release note: None

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
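The fallback described in that commit message (create the liveness record in place when the expected record is missing, instead of erroring out) might look roughly like the sketch below. This is a simplified illustration, not the kvserver implementation: `Liveness`, `livenessStore`, `errNoRecord`, and `heartbeat` are hypothetical names, the store is an in-memory map rather than KV, and expiration is a plain integer timestamp.

```go
package main

import "errors"

// NodeID is a stand-in for a node identifier.
type NodeID int32

// Liveness is a simplified, hypothetical liveness record.
type Liveness struct {
	NodeID     NodeID
	Expiration int64 // logical timestamp until which the node counts as live
}

// errNoRecord signals that no liveness record exists for a node.
var errNoRecord = errors.New("liveness record not found")

// livenessStore is an in-memory stand-in for the persisted records.
type livenessStore map[NodeID]Liveness

// get returns the stored record, or errNoRecord if there is none.
func (s livenessStore) get(id NodeID) (Liveness, error) {
	rec, ok := s[id]
	if !ok {
		return Liveness{}, errNoRecord
	}
	return rec, nil
}

// heartbeat extends the node's liveness to now+ttl. If the record is
// missing, it falls back to creating one in place rather than failing.
func heartbeat(s livenessStore, id NodeID, now, ttl int64) (Liveness, error) {
	rec, err := s.get(id)
	if errors.Is(err, errNoRecord) {
		// Fallback path: the invariant "a record always exists" did not
		// hold for this node, so start from a fresh record.
		rec = Liveness{NodeID: id}
	} else if err != nil {
		return Liveness{}, err
	}
	rec.Expiration = now + ttl
	s[id] = rec
	return rec, nil
}
```

With an empty store, `heartbeat(store, 3, 100, 9)` creates the record for node 3 and returns it with expiration 109; a later heartbeat for the same node takes the normal path and only extends the expiration.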