Skip to content

kvserver: eagerly initialize Raft groups#104657

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
erikgrinaker:eager-raft-group
Jul 18, 2023
Merged

kvserver: eagerly initialize Raft groups#104657
craig[bot] merged 1 commit intocockroachdb:masterfrom
erikgrinaker:eager-raft-group

Conversation

@erikgrinaker
Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker commented Jun 9, 2023

This patch eagerly initializes Raft groups during replica construction, instead of deferring it until first use. The original motivation for lazy initialization was to avoid election storms, but since replicas start out quiesced they won't do anything until unquiesced anyway, and Raft pre-vote will prevent disturbing established leaders.

This has a minor impact on startup time: node startup time with 50.000 replicas and warm OS caches increased from 12.6 to 13.6 seconds. That seems fine, and it's likely better to do that work before joining the cluster.

This does not yet enforce the invariant that every replica has a non-nil Raft group: destroyed replicas remove their Raft group (releasing its memory for garbage collection), and many tests construct replicas without Raft groups. This may be addressed later.

Resolves #73715.

Epic: none
Release note: none

@erikgrinaker erikgrinaker self-assigned this Jun 9, 2023
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@erikgrinaker erikgrinaker force-pushed the eager-raft-group branch 2 times, most recently from e02400c to 0362a7a Compare June 15, 2023 18:28
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

Seeing some worrying test failures here. Not sure this is worth the risk/effort right now.

=== RUN   TestPause_alter_table_add_foreign_key/pause_stage_1_of_2
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1  slice[12,13) out of bound [11,11]
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !goroutine 27557 [running]:
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !runtime/debug.Stack()
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  GOROOT/src/runtime/debug/stack.go:24 +0x65
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0xc004adaf00, {{{0xc005515200, 0x24}, {0x4e80c77, 0x1}, {0x4e80c77, 0x1}, {0x4e80c77, 0x1}}, 0x1768ea260bbe72ee, ...})
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/util/log/clog.go:261 +0xb8
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.logfDepthInternal({0x616ac60, 0xc005c59bc0}, 0x3, 0x4, 0x0, 0x0?, {0x4e71ac3, 0x21}, {0xc004d00440, 0x4, ...})
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/util/log/channels.go:106 +0x645
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.logfDepth(...)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/util/log/channels.go:39
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.FatalfDepth(...)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/bazel-out/k8-fastbuild/bin/pkg/util/log/log_channels_generated.go:920
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftLogger).Panicf(0xc00bd6eb50, {0x4e71ac3, 0x21}, {0xc004d00440, 0x4, 0x4})
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/raft.go:118 +0xcd
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !go.etcd.io/raft/v3.(*raftLog).mustCheckOutOfBounds(0xc00bfa9220, 0xc, 0xd)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  go.etcd.io/raft/v3/external/io_etcd_go_raft_v3/log.go:554 +0x23d
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !go.etcd.io/raft/v3.(*raftLog).slice(0xc00bfa9220, 0xc, 0xd, 0x4000000)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  go.etcd.io/raft/v3/external/io_etcd_go_raft_v3/log.go:492 +0x4e
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !go.etcd.io/raft/v3.(*raftLog).nextCommittedEnts(0xc00bfa9220, 0x0?)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  go.etcd.io/raft/v3/external/io_etcd_go_raft_v3/log.go:235 +0x19f
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !go.etcd.io/raft/v3.(*RawNode).readyWithoutAccept(_)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  go.etcd.io/raft/v3/external/io_etcd_go_raft_v3/rawnode.go:146 +0xd7
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !go.etcd.io/raft/v3.(*RawNode).Ready(_)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  go.etcd.io/raft/v3/external/io_etcd_go_raft_v3/rawnode.go:134 +0x5e
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked.func2(0xc000b27900?)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:859 +0x115
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).withRaftGroupLocked.func1(...)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:2063
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).withRaftGroupLocked(0xc00a227200, 0x616ac60?)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:2064 +0x4e
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked(_, {_, _}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:851 +0x2d7
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReady(_, {_, _}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:807 +0x19b
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processReady(0xc00a0cca80, 0xc0059f9e60?)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:646 +0x145
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftSchedulerShard).worker(0xc003aa3650, {0x616ac60, 0xc006c6cfc0}, {0x6150df0, 0xc00a0cca80}, 0xc00a0cc000)
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:418 +0x1f2
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).Start.func2({0x616ac60?, 0xc006c6cfc0?})
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:321 +0x45
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:484 +0x146
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !  github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:475 +0x43b
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !
F230615 18:58:29.576938 27557 go.etcd.io/raft/v3/log.go:554  [T1,n1,s1,r50/1:/Table/{48-50}] 1 !For more context, check log files in: /artifacts/tmp/_tmp/2cb16150a1d62f4db8cc89cc951df360/logTestPause_alter_table_add_foreign_key4248197905

@erikgrinaker erikgrinaker force-pushed the eager-raft-group branch 7 times, most recently from a950550 to 3dd5218 Compare June 25, 2023 09:16
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

I cut the scope of this down from always enforcing a non-nil Raft group to simply initializing during replica creation, since there were a bunch of tests that relied on creating replicas with no Raft groups.

@erikgrinaker erikgrinaker marked this pull request as ready for review June 25, 2023 09:18
@erikgrinaker erikgrinaker requested a review from a team June 25, 2023 09:18
@erikgrinaker erikgrinaker requested a review from a team as a code owner June 25, 2023 09:18
@erikgrinaker erikgrinaker requested review from pav-kv and tbg June 25, 2023 09:18
tbg added a commit to tbg/cockroach that referenced this pull request Jul 12, 2023
This one was possible hit. Interestingly, it would be "accidentally" fixed by
cockroachdb#104657 as well, since after that
PR we would never see a nil `status` here.

Touches cockroachdb#106568
tbg added a commit to tbg/cockroach that referenced this pull request Jul 14, 2023
This one was possible hit. Interestingly, it would be "accidentally" fixed by
cockroachdb#104657 as well, since after that
PR we would never see a nil `status` here.

Touches cockroachdb#106568
tbg added a commit to tbg/cockroach that referenced this pull request Jul 14, 2023
This one was possible hit. Interestingly, it would be "accidentally" fixed by
cockroachdb#104657 as well, since after that
PR we would never see a nil `status` here.

Touches cockroachdb#106568
This patch eagerly initializes Raft groups during replica construction,
instead of deferring it until first use. The original motivation for
lazy initialization was to avoid election storms, but since replicas
start out quiesced they won't do anything until unquiesced anyway, and
Raft pre-vote will prevent disturbing established leaders.

This has a minor impact on startup time: node startup time with 50.000
replicas and warm OS caches increased from 12.6 to 13.6 seconds. That
seems fine, and it's likely better to do that work before joining the
cluster.

This does not yet enforce the invariant that every replica has a non-nil
Raft group: destroyed replicas remove their Raft group (releasing its
memory for garbage collection), and many tests construct replicas
without Raft groups. This may be addressed later.

Epic: none
Release note: None
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jul 18, 2023

Build succeeded:

@craig craig bot merged commit fd7fc2a into cockroachdb:master Jul 18, 2023
@erikgrinaker erikgrinaker deleted the eager-raft-group branch October 3, 2023 09:54
nvb added a commit to nvb/cockroach that referenced this pull request Jun 14, 2024
This commit fixes a mistake in cockroachdb#104657, where we switched a call on the
request path from `maybeInitializeRaftGroup` to `maybeUnquiesce`. As a
result, we were unquiescing ranges whenver they saw any read or write
traffic. We need to unquiesce on writes, but not on reads.

This was probably not having much of an effect because the leader would
usually evaluate this call and skip the "wake leader" and "may campaign"
actions, but it would still start ticking until it decided to quiesce
again, which is unnecessary.

Even though this was a regression, I don't intend to backport this patch
because it feels risky and the regression seems mostly harmless in practice.

Release notes: None
craig bot pushed a commit that referenced this pull request Jun 14, 2024
125661: kv: don't unquiesce on read-only requests r=nvanbenschoten a=nvanbenschoten

This commit fixes a mistake in #104657, where we switched a call on the request path from `maybeInitializeRaftGroup` to `maybeUnquiesce`. As a result, we were unquiescing ranges whenver they saw any read or write traffic. We need to unquiesce on writes, but not on reads.

This was probably not having much of an effect because the leader would usually evaluate this call and skip the "wake leader" and "may campaign" actions, but it would still start ticking until it decided to quiesce again, which is unnecessary.

Even though this was a regression, I don't intend to backport this patch because it feels risky and the regression seems mostly harmless in practice.

Epic: None
Release notes: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

replication: explore eliminating lazy internalRaftGroup initialization in favor of quiescence

3 participants