kvserver: eagerly initialize Raft groups#104657
Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom Jul 18, 2023
Merged
kvserver: eagerly initialize Raft groups#104657craig[bot] merged 1 commit intocockroachdb:masterfrom
craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
Member
e02400c to
0362a7a
Compare
Contributor
Author
|
Seeing some worrying test failures here. Not sure this is worth the risk/effort right now. |
a950550 to
3dd5218
Compare
Contributor
Author
|
I cut the scope of this down from always enforcing a non-nil Raft group to simply initializing during replica creation, since there were a bunch of tests that relied on creating replicas with no Raft groups. |
tbg
approved these changes
Jun 26, 2023
3dd5218 to
0ee4c4e
Compare
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Jul 12, 2023
This one was possible hit. Interestingly, it would be "accidentally" fixed by cockroachdb#104657 as well, since after that PR we would never see a nil `status` here. Touches cockroachdb#106568
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Jul 14, 2023
This one was possible hit. Interestingly, it would be "accidentally" fixed by cockroachdb#104657 as well, since after that PR we would never see a nil `status` here. Touches cockroachdb#106568
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Jul 14, 2023
This one was possible hit. Interestingly, it would be "accidentally" fixed by cockroachdb#104657 as well, since after that PR we would never see a nil `status` here. Touches cockroachdb#106568
0ee4c4e to
f11dd84
Compare
This patch eagerly initializes Raft groups during replica construction, instead of deferring it until first use. The original motivation for lazy initialization was to avoid election storms, but since replicas start out quiesced they won't do anything until unquiesced anyway, and Raft pre-vote will prevent disturbing established leaders. This has a minor impact on startup time: node startup time with 50.000 replicas and warm OS caches increased from 12.6 to 13.6 seconds. That seems fine, and it's likely better to do that work before joining the cluster. This does not yet enforce the invariant that every replica has a non-nil Raft group: destroyed replicas remove their Raft group (releasing its memory for garbage collection), and many tests construct replicas without Raft groups. This may be addressed later. Epic: none Release note: None
f11dd84 to
a851fe9
Compare
Contributor
Author
|
bors r+ |
Contributor
|
Build succeeded: |
nvb
added a commit
to nvb/cockroach
that referenced
this pull request
Jun 14, 2024
This commit fixes a mistake in cockroachdb#104657, where we switched a call on the request path from `maybeInitializeRaftGroup` to `maybeUnquiesce`. As a result, we were unquiescing ranges whenver they saw any read or write traffic. We need to unquiesce on writes, but not on reads. This was probably not having much of an effect because the leader would usually evaluate this call and skip the "wake leader" and "may campaign" actions, but it would still start ticking until it decided to quiesce again, which is unnecessary. Even though this was a regression, I don't intend to backport this patch because it feels risky and the regression seems mostly harmless in practice. Release notes: None
craig bot
pushed a commit
that referenced
this pull request
Jun 14, 2024
125661: kv: don't unquiesce on read-only requests r=nvanbenschoten a=nvanbenschoten This commit fixes a mistake in #104657, where we switched a call on the request path from `maybeInitializeRaftGroup` to `maybeUnquiesce`. As a result, we were unquiescing ranges whenver they saw any read or write traffic. We need to unquiesce on writes, but not on reads. This was probably not having much of an effect because the leader would usually evaluate this call and skip the "wake leader" and "may campaign" actions, but it would still start ticking until it decided to quiesce again, which is unnecessary. Even though this was a regression, I don't intend to backport this patch because it feels risky and the regression seems mostly harmless in practice. Epic: None Release notes: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This patch eagerly initializes Raft groups during replica construction, instead of deferring it until first use. The original motivation for lazy initialization was to avoid election storms, but since replicas start out quiesced they won't do anything until unquiesced anyway, and Raft pre-vote will prevent disturbing established leaders.
This has a minor impact on startup time: node startup time with 50.000 replicas and warm OS caches increased from 12.6 to 13.6 seconds. That seems fine, and it's likely better to do that work before joining the cluster.
This does not yet enforce the invariant that every replica has a non-nil Raft group: destroyed replicas remove their Raft group (releasing its memory for garbage collection), and many tests construct replicas without Raft groups. This may be addressed later.
Resolves #73715.
Epic: none
Release note: none