kvserver: don't transfer leases to draining nodes during scatters#79295
Conversation
fb62e18 to
c0ed057
Compare
tbg
left a comment
There was a problem hiding this comment.
Great to see this! A drive-by question I had is whether the backport of this will be problematic, and whether we're planning a surgical fix to backport instead.
c0ed057 to
02b63ef
Compare
|
Yeah, I was thinking that we'd just backport a small two-liner but fix it "properly" on |
28619eb to
5862228
Compare
Previously, it wasn't very clear how a call into `maybeTransferLeaseAway` would find a lease transfer target. This commit makes a small effort to make this obvious. Release note: None
30f1dbc to
d2c0c32
Compare
kvoli
left a comment
There was a problem hiding this comment.
LGTM!
Reviewed 5 of 5 files at r5, 3 of 3 files at r6, 4 of 4 files at r7, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
tbg
left a comment
There was a problem hiding this comment.
Nice! Always a pleasure to see shifty ad-hoc code gone.
Reviewed 5 of 5 files at r5, 3 of 3 files at r6, 4 of 4 files at r7, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 1502 at r6 (raw file):
candidates := make([]roachpb.ReplicaDescriptor, 0, len(existing)) for i := range existing { if existing[i].GetType() != roachpb.VOTER_FULL {
Is this correct? I thought we now possibly transferred leases to a VOTER_INCOMING. Is this only occurring on a separate ad-hoc path?
pkg/kv/kvserver/allocator.go, line 1570 at r6 (raw file):
} } if !leaseholderInExisting && !buildutil.CrdbTestBuild {
It's unusual to have an assertion that's only in production but not in tests. Could you add a comment on why that is necessary? It might be cleaner to use a testing knob here to selectively disable the assertion. I also think we may want to avoid fataling in prod for something like this, when there is the reasonable enough option to "do nothing". I wouldn't want to deal with the fatal error in prod, as it will be hard to contain and thus difficult to keep the cluster alive.
pkg/kv/kvserver/allocator.go, line 1620 at r6 (raw file):
opts transferLeaseOptions, ) roachpb.ReplicaDescriptor { checkTransferLeaseSource := opts.checkTransferLeaseSource
I wonder if it's worth to pull through and rename this parameter to allowLeaseRepl. checkTransferLeaseSource seems cryptic.
pkg/kv/kvserver/allocator.go, line 1640 at r6 (raw file):
} existing = a.ValidLeaseTargets(
😍
pkg/kv/kvserver/allocator.go, line 1865 at r6 (raw file):
stats *replicaStats, ) bool { if a.leaseholderViolatesPreferences(ctx, conf, leaseRepl, existing) {
Maybe this method would be more aptly named leaseholderShouldMoveDueToPreferences, since this makes it clear that the method checks more than just a violation of preferences (it also verifies that there's a better place to put the lease according to prefs)
pkg/kv/kvserver/replica_command.go, line 3331 at r7 (raw file):
targetStoreID := potentialLeaseTargets[newLeaseholderIdx].StoreID if targetStoreID != r.store.StoreID() { log.VEventf(ctx, 2, "randomly transferring lease to s%d", targetStoreID)
Does ctx have the repl in it? With these kinds of messages you always want the range we're talking about.
ff22388 to
32b2be2
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @kvoli, @nvanbenschoten, and @tbg)
pkg/kv/kvserver/allocator.go, line 1502 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Is this correct? I thought we now possibly transferred leases to a VOTER_INCOMING. Is this only occurring on a separate ad-hoc path?
We do transfer leases to incoming voters now, but that's ad-hoc code for replication changes. AdminChangeReplicas calling into the allocator was always a bit fraught, and we're no longer doing that now since #77841.
pkg/kv/kvserver/allocator.go, line 1570 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
It's unusual to have an assertion that's only in production but not in tests. Could you add a comment on why that is necessary? It might be cleaner to use a testing knob here to selectively disable the assertion. I also think we may want to avoid fataling in prod for something like this, when there is the reasonable enough option to "do nothing". I wouldn't want to deal with the fatal error in prod, as it will be hard to contain and thus difficult to keep the cluster alive.
All the current callers of this do include the leaseRepl in the slice of existing replicas, so I just wanted some regression insurance for this. Based on your comment, I've downgraded this to a log.Error for now.
pkg/kv/kvserver/allocator.go, line 1620 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I wonder if it's worth to pull through and rename this parameter to
allowLeaseRepl.checkTransferLeaseSourceseems cryptic.
Added it as the final commit in this PR.
pkg/kv/kvserver/allocator.go, line 1865 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Maybe this method would be more aptly named
leaseholderShouldMoveDueToPreferences, since this makes it clear that the method checks more than just a violation of preferences (it also verifies that there's a better place to put the lease according to prefs)
Done.
pkg/kv/kvserver/replica_command.go, line 3331 at r7 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Does
ctxhave the repl in it? With these kinds of messages you always want the range we're talking about.
Yep, It should be added here:
cockroach/pkg/kv/kvserver/replica_send.go
Lines 132 to 134 in 68566b6
b67f95f to
bec4261
Compare
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in order to make it more readable and, to abstract out a new exported `Allocator` method called `ValidLeaseTargets()`. The contract of `ValidLeaseTargets()` is as follows: ``` // ValidLeaseTargets returns a set of candidate stores that are suitable to be // transferred a lease for the given range. // // - It excludes stores that are dead, or marked draining or suspect. // - If the range has lease_preferences, and there are any non-draining, // non-suspect nodes that match those preferences, it excludes stores that don't // match those preferences. // - It excludes replicas that may need snapshots. If replica calling this // method is not the Raft leader (meaning that it doesn't know whether follower // replicas need a snapshot or not), produces no results. ``` Previously, there were multiple places where we were performing the logic that's encapsulated by `ValidLeaseTargets()`, which was a potential source of bugs. This is an attempt to unify this logic in one place that's relatively well-tested. This commit is only a refactor, and does not attempt to change any behavior. As such, no existing tests have been changed, with the exception of a subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over that subtest to understand why the behavior change made by this patch is desirable. The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691. Release note: none
Previously, `AdminScatter` called with the `RandomizeLeases` option could potentially transfer leases to nodes marked draining. This commit leverages the refactor from the last commit to fix this bug by first filtering the set of candidates down to a set of valid candidates that meet lease preferences and are not marked suspect or draining. Release note (bug fix): Fixes a bug where draining / drained nodes could re-acquire leases during an import or an index backfill.
This commit inverts this boolean and renames it to `excludeLeaseRepl`, which is a lot more intuitive for the callers of `Allocator.TransferLeaseTarget`. Release note: none
bec4261 to
3c7e3e4
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Optimistically merging based on the approvals. TFTRs!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @kvoli, @nvanbenschoten, and @tbg)
tbg
left a comment
There was a problem hiding this comment.
Reviewed 8 of 8 files at r11, 4 of 4 files at r12, 6 of 6 files at r13, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 1570 at r11 (raw file):
} if !leaseholderInExisting { log.Errorf(ctx, "programming error: expected leaseholder store to be in the slice of existing replicas")
But then don't you want to return?
pkg/kv/kvserver/allocator.go, line 1653 at r11 (raw file):
ctx, source, existing, stats, nil, candidateLeasesMean, ) if !excludeLeaseRepl {
Hadn't read this code before. I don't understand it. You're sure this is all correct, right? I know you're not changing anything in this PR, so just a drive by comment.
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/kv/kvserver/allocator.go, line 1570 at r11 (raw file):
Previously, tbg (Tobias Grieger) wrote…
But then don't you want to return?
Done.
pkg/kv/kvserver/allocator.go, line 1653 at r11 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Hadn't read this code before. I don't understand it. You're sure this is all correct, right? I know you're not changing anything in this PR, so just a drive by comment.
I don't really understand why it's written the way it is, and it could use some further clean-up. I'll try and do something about it soon™.
|
bors r+ |
|
Build succeeded: |
This patch is a more targetted fix for what cockroachdb#79295 does, for the purposes of backporting to release-21.2. Release note (bug fix): Draining nodes could previously re-acquire leases during an import or an index backfill. This bug is now fixed.
kvserver: introduce Allocator.ValidLeaseTargets()
This commit is a minor refactor of the
Allocator.TransferLeaseTargetlogic inorder to make it more readable and, to abstract out a new exported
Allocatormethod called
ValidLeaseTargets().The contract of
ValidLeaseTargets()is as follows:Previously, there were multiple places where we were performing the logic
that's encapsulated by
ValidLeaseTargets(), which was a potential source ofbugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside
TestAllocatorTransferLeaseTargetDraining. See the comment overthat subtest to understand why the behavior change made by this patch is
desirable.
The next commit in this PR uses this method to fix (at least part of) #74691.
Release note: none
kvserver: don't transfer leases to draining nodes during scatters
Previously,
AdminScattercalled with theRandomizeLeasesoption couldpotentially transfer leases to nodes marked draining. This commit leverages
the refactor from the last commit to fix this bug by first filtering the set of
candidates down to a set of valid candidates that meet lease preferences and
are not marked suspect or draining.
Relates to and fixes a part of #74691.
Release note (bug fix): Fixes a bug where draining / drained nodes could
re-acquire leases during an import or an index backfill.