Skip to content

kvserver: don't transfer leases to draining nodes during scatters#79295

Merged
craig[bot] merged 4 commits intocockroachdb:masterfrom
aayushshah15:20220401_betterRandomLeaseTransfers
Apr 7, 2022
Merged

kvserver: don't transfer leases to draining nodes during scatters#79295
craig[bot] merged 4 commits intocockroachdb:masterfrom
aayushshah15:20220401_betterRandomLeaseTransfers

Conversation

@aayushshah15
Copy link
Copy Markdown
Contributor

@aayushshah15 aayushshah15 commented Apr 3, 2022

kvserver: introduce Allocator.ValidLeaseTargets()

This commit is a minor refactor of the Allocator.TransferLeaseTarget logic in
order to make it more readable and, to abstract out a new exported Allocator
method called ValidLeaseTargets().

The contract of ValidLeaseTargets() is as follows:

// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

Previously, there were multiple places where we were performing the logic
that's encapsulated by ValidLeaseTargets(), which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside TestAllocatorTransferLeaseTargetDraining. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) #74691.

Release note: none

kvserver: don't transfer leases to draining nodes during scatters

Previously, AdminScatter called with the RandomizeLeases option could
potentially transfer leases to nodes marked draining. This commit leverages
the refactor from the last commit to fix this bug by first filtering the set of
candidates down to a set of valid candidates that meet lease preferences and
are not marked suspect or draining.

Relates to and fixes a part of #74691.

Release note (bug fix): Fixes a bug where draining / drained nodes could
re-acquire leases during an import or an index backfill.

@aayushshah15 aayushshah15 requested a review from a team as a code owner April 3, 2022 18:34
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@aayushshah15 aayushshah15 requested review from kvoli and nvb April 3, 2022 18:34
@aayushshah15 aayushshah15 force-pushed the 20220401_betterRandomLeaseTransfers branch from fb62e18 to c0ed057 Compare April 3, 2022 18:38
Copy link
Copy Markdown
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see this! A drive-by question I had is whether the backport of this will be problematic, and whether we're planning a surgical fix to backport instead.

@aayushshah15 aayushshah15 force-pushed the 20220401_betterRandomLeaseTransfers branch from c0ed057 to 02b63ef Compare April 3, 2022 18:45
@aayushshah15
Copy link
Copy Markdown
Contributor Author

Yeah, I was thinking that we'd just backport a small two-liner but fix it "properly" on master.

@aayushshah15 aayushshah15 force-pushed the 20220401_betterRandomLeaseTransfers branch 6 times, most recently from 28619eb to 5862228 Compare April 3, 2022 22:52
Previously, it wasn't very clear how a call into `maybeTransferLeaseAway` would
find a lease transfer target. This commit makes a small effort to make this
obvious.

Release note: None
@aayushshah15 aayushshah15 force-pushed the 20220401_betterRandomLeaseTransfers branch 2 times, most recently from 30f1dbc to d2c0c32 Compare April 3, 2022 22:59
Copy link
Copy Markdown
Contributor

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Reviewed 5 of 5 files at r5, 3 of 3 files at r6, 4 of 4 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

Copy link
Copy Markdown
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Always a pleasure to see shifty ad-hoc code gone.

Reviewed 5 of 5 files at r5, 3 of 3 files at r6, 4 of 4 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)


pkg/kv/kvserver/allocator.go, line 1502 at r6 (raw file):

	candidates := make([]roachpb.ReplicaDescriptor, 0, len(existing))
	for i := range existing {
		if existing[i].GetType() != roachpb.VOTER_FULL {

Is this correct? I thought we now possibly transferred leases to a VOTER_INCOMING. Is this only occurring on a separate ad-hoc path?


pkg/kv/kvserver/allocator.go, line 1570 at r6 (raw file):

		}
	}
	if !leaseholderInExisting && !buildutil.CrdbTestBuild {

It's unusual to have an assertion that's only in production but not in tests. Could you add a comment on why that is necessary? It might be cleaner to use a testing knob here to selectively disable the assertion. I also think we may want to avoid fataling in prod for something like this, when there is the reasonable enough option to "do nothing". I wouldn't want to deal with the fatal error in prod, as it will be hard to contain and thus difficult to keep the cluster alive.


pkg/kv/kvserver/allocator.go, line 1620 at r6 (raw file):

	opts transferLeaseOptions,
) roachpb.ReplicaDescriptor {
	checkTransferLeaseSource := opts.checkTransferLeaseSource

I wonder if it's worth to pull through and rename this parameter to allowLeaseRepl. checkTransferLeaseSource seems cryptic.


pkg/kv/kvserver/allocator.go, line 1640 at r6 (raw file):

	}

	existing = a.ValidLeaseTargets(

😍


pkg/kv/kvserver/allocator.go, line 1865 at r6 (raw file):

	stats *replicaStats,
) bool {
	if a.leaseholderViolatesPreferences(ctx, conf, leaseRepl, existing) {

Maybe this method would be more aptly named leaseholderShouldMoveDueToPreferences, since this makes it clear that the method checks more than just a violation of preferences (it also verifies that there's a better place to put the lease according to prefs)


pkg/kv/kvserver/replica_command.go, line 3331 at r7 (raw file):

			targetStoreID := potentialLeaseTargets[newLeaseholderIdx].StoreID
			if targetStoreID != r.store.StoreID() {
				log.VEventf(ctx, 2, "randomly transferring lease to s%d", targetStoreID)

Does ctx have the repl in it? With these kinds of messages you always want the range we're talking about.

@aayushshah15 aayushshah15 force-pushed the 20220401_betterRandomLeaseTransfers branch 2 times, most recently from ff22388 to 32b2be2 Compare April 6, 2022 00:31
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @kvoli, @nvanbenschoten, and @tbg)


pkg/kv/kvserver/allocator.go, line 1502 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Is this correct? I thought we now possibly transferred leases to a VOTER_INCOMING. Is this only occurring on a separate ad-hoc path?

We do transfer leases to incoming voters now, but that's ad-hoc code for replication changes. AdminChangeReplicas calling into the allocator was always a bit fraught, and we're no longer doing that now since #77841.


pkg/kv/kvserver/allocator.go, line 1570 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

It's unusual to have an assertion that's only in production but not in tests. Could you add a comment on why that is necessary? It might be cleaner to use a testing knob here to selectively disable the assertion. I also think we may want to avoid fataling in prod for something like this, when there is the reasonable enough option to "do nothing". I wouldn't want to deal with the fatal error in prod, as it will be hard to contain and thus difficult to keep the cluster alive.

All the current callers of this do include the leaseRepl in the slice of existing replicas, so I just wanted some regression insurance for this. Based on your comment, I've downgraded this to a log.Error for now.


pkg/kv/kvserver/allocator.go, line 1620 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I wonder if it's worth to pull through and rename this parameter to allowLeaseRepl. checkTransferLeaseSource seems cryptic.

Added it as the final commit in this PR.


pkg/kv/kvserver/allocator.go, line 1865 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Maybe this method would be more aptly named leaseholderShouldMoveDueToPreferences, since this makes it clear that the method checks more than just a violation of preferences (it also verifies that there's a better place to put the lease according to prefs)

Done.


pkg/kv/kvserver/replica_command.go, line 3331 at r7 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Does ctx have the repl in it? With these kinds of messages you always want the range we're talking about.

Yep, It should be added here:

// Add the range log tag.
ctx = r.AnnotateCtx(ctx)

@aayushshah15 aayushshah15 force-pushed the 20220401_betterRandomLeaseTransfers branch from b67f95f to bec4261 Compare April 6, 2022 02:59
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
Previously, `AdminScatter` called with the `RandomizeLeases` option could
potentially transfer leases to nodes marked draining.  This commit leverages
the refactor from the last commit to fix this bug by first filtering the set of
candidates down to a set of valid candidates that meet lease preferences and
are not marked suspect or draining.

Release note (bug fix): Fixes a bug where draining / drained nodes could
re-acquire leases during an import or an index backfill.
This commit inverts this boolean and renames it to `excludeLeaseRepl`, which is
a lot more intuitive for the callers of `Allocator.TransferLeaseTarget`.

Release note: none
@aayushshah15 aayushshah15 force-pushed the 20220401_betterRandomLeaseTransfers branch from bec4261 to 3c7e3e4 Compare April 6, 2022 03:06
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimistically merging based on the approvals. TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @kvoli, @nvanbenschoten, and @tbg)

Copy link
Copy Markdown
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r11, 4 of 4 files at r12, 6 of 6 files at r13, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)


pkg/kv/kvserver/allocator.go, line 1570 at r11 (raw file):

	}
	if !leaseholderInExisting {
		log.Errorf(ctx, "programming error: expected leaseholder store to be in the slice of existing replicas")

But then don't you want to return?


pkg/kv/kvserver/allocator.go, line 1653 at r11 (raw file):

			ctx, source, existing, stats, nil, candidateLeasesMean,
		)
		if !excludeLeaseRepl {

Hadn't read this code before. I don't understand it. You're sure this is all correct, right? I know you're not changing anything in this PR, so just a drive by comment.

Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)


pkg/kv/kvserver/allocator.go, line 1570 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

But then don't you want to return?

Done.


pkg/kv/kvserver/allocator.go, line 1653 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Hadn't read this code before. I don't understand it. You're sure this is all correct, right? I know you're not changing anything in this PR, so just a drive by comment.

I don't really understand why it's written the way it is, and it could use some further clean-up. I'll try and do something about it soon™.

@aayushshah15
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 7, 2022

Build succeeded:

@craig craig bot merged commit bb27f76 into cockroachdb:master Apr 7, 2022
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request May 1, 2022
This patch is a more targetted fix for what cockroachdb#79295 does, for the
purposes of backporting to release-21.2.

Release note (bug fix): Draining nodes could previously re-acquire
leases during an import or an index backfill. This bug is now fixed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants