Skip to content

kvserver: fix bug that could lead to invalid non-voter rebalance attempts#73670

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
aayushshah15:20211209_fixExemptedReplicasBug
Dec 10, 2021
Merged

kvserver: fix bug that could lead to invalid non-voter rebalance attempts#73670
craig[bot] merged 1 commit intocockroachdb:masterfrom
aayushshah15:20211209_fixExemptedReplicasBug

Conversation

@aayushshah15
Copy link
Copy Markdown
Contributor

A bug was introduced in #61239
which made it such that non-voters could, in certain edge case scenarios,
attempt to rebalance to stores that already had a voting replica for the same
range. However, this bug wouldn't actually materialize into user observable
symptoms because such rebalance decisions were rejected before the
replicateQueue ever tried to execute them. See:

case nonVoterTarget:
if found {
// Non-voters should not consider any of the range's existing stores as
// valid candidates. If we get here, we must have raced with another
// rebalancing decision.
return nil, false, errors.AssertionFailedf(
"invalid rebalancing decision: trying to"+
" move non-voter to a store that already has a replica %s for the range", rdesc,
)
}
chgs = []roachpb.ReplicationChange{
{ChangeType: roachpb.ADD_NON_VOTER, Target: addTarget},
{ChangeType: roachpb.REMOVE_NON_VOTER, Target: removeTarget},
}
}
.
This commit cleans this up and adds a test demonstrating one such edge case.

Release note: None

@aayushshah15 aayushshah15 requested a review from a team as a code owner December 10, 2021 03:23
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@aayushshah15 aayushshah15 requested a review from nvb December 10, 2021 03:23
require.Falsef(
t, ok, "expected no action; got rebalance from s%d to s%d", remove.StoreID, add.StoreID,
)
require.Regexp(t, "it already has a voter", finishAndGetRecording().String())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe add a message "rebalancing failed for a unexpected reason" as we do with ok condition.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: nice catch!

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)


pkg/kv/kvserver/allocator_scorer.go, line 891 at r1 (raw file):

			for _, replOnExemptedStore := range replicasOnExemptedStores {
				if store.StoreID == replOnExemptedStore.StoreID {
					log.VEventf(

Let's add a comment that indicates that replicasOnExemptedStores will be empty when considering voter targets, so this logic is only used when ranking candidates for the placement of a non-voting replica.

…mpts

A bug was introduced in cockroachdb#61239
which made it such that non-voters could, in certain edge case scenarios,
attempt to rebalance to stores that already had a voting replica for the same
range. However, this bug wouldn't actually materialize into user observable
symptoms because such rebalance decisions were rejected before the
`replicateQueue` ever tried to execute them. See:
https://github.com/cockroachdb/cockroach/blob/8bcd038ab6b76f6bf5894f2ffbbdbf10108b4c32/pkg/kv/kvserver/replicate_queue.go#L1257-L1271.
This commit cleans this up and adds a test demonstrating one such edge case.

Release note: None
@aayushshah15 aayushshah15 force-pushed the 20211209_fixExemptedReplicasBug branch from b71a9cb to 6c829c2 Compare December 10, 2021 18:24
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)


pkg/kv/kvserver/allocator_scorer.go, line 891 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Let's add a comment that indicates that replicasOnExemptedStores will be empty when considering voter targets, so this logic is only used when ranking candidates for the placement of a non-voting replica.

done.


pkg/kv/kvserver/allocator_test.go, line 4146 at r1 (raw file):

Previously, aliher1911 (Oleg) wrote…

nit: Maybe add a message "rebalancing failed for a unexpected reason" as we do with ok condition.

done.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 10, 2021

Build succeeded:

@craig craig bot merged commit b145aba into cockroachdb:master Dec 10, 2021
@aayushshah15 aayushshah15 deleted the 20211209_fixExemptedReplicasBug branch December 10, 2021 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants