Skip to content

kvclient: allow DistSender to consider non-voters when routing requests#58772

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
aayushshah15:distsender-consider-non-voters
Jan 29, 2021
Merged

kvclient: allow DistSender to consider non-voters when routing requests#58772
craig[bot] merged 3 commits intocockroachdb:masterfrom
aayushshah15:distsender-consider-non-voters

Conversation

@aayushshah15
Copy link
Copy Markdown
Contributor

Before this patch, the DistSender would only route BatchRequests to
replicas of types VOTER_FULL and VOTER_INCOMING. This patch changes
that to let the DistSender route requests (for instance, follower read
requests) to NON_VOTER replicas as well.

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch from f45a635 to 4a2cca8 Compare January 12, 2021 07:37
@lunevalex lunevalex added the A-multiregion Related to multi-region label Jan 12, 2021
@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch 2 times, most recently from ebc8c80 to 426bc7a Compare January 12, 2021 17:20
Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for instance, follower read requests

I think we want only "follower read requests" to be sent to NON_VOTERs. Can you confirm that that's what the patch does?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/replica_slice.go, line 46 at r1 (raw file):

// that are not gossiped are omitted from the result.
//
// Generally, only voting and non-voting replicas are returned. However, if a

nit: s/only voting and non-voting replicas are returned/learners are not returned.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15)


pkg/kv/kvserver/replica_follower_read.go, line 65 at r1 (raw file):

	}
	// TODO DURING REVIEW: Is there a reason we shouldn't be allowing
	// VOTER_INCOMINGs here? Since we use LEARNERs to upreplicate, VOTER_INCOMINGs

I thought either @andreimatei or @ajwerner had made that change. Maybe there's a PR sitting around to do so?

This is also related to @andreimatei's plans to allow VOTER_INCOMING replicas to acquire leases, in which case we'll certainly need to issue both read and write requests to them.

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch 5 times, most recently from 7902b8c to c39c54a Compare January 14, 2021 09:19
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm that that's what the patch does?

Just to make sure I understood you, we don't want to include NON_VOTERs in the ReplicaSlice in order to avoid a possible network hop in the case we have a request that cannot be served by a follower replica and we have a stale view of the leaseholder, right?

PTAL at what I've done there.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/replica_slice.go, line 46 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: s/only voting and non-voting replicas are returned/learners are not returned.

Done.


pkg/kv/kvserver/replica_follower_read.go, line 65 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I thought either @andreimatei or @ajwerner had made that change. Maybe there's a PR sitting around to do so?

This is also related to @andreimatei's plans to allow VOTER_INCOMING replicas to acquire leases, in which case we'll certainly need to issue both read and write requests to them.

I added a commit to this PR that let's VOTER_INCOMINGs serve follower reads as well.

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch from c39c54a to 35a7675 Compare January 14, 2021 12:51
Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/dist_sender.go, line 1786 at r2 (raw file):

	ba.RangeID = desc.RangeID
	leaseholder := routing.Leaseholder()
	canFollowerRead := (ds.clusterID != nil) && CanSendToFollower(ds.clusterID.Get(), ds.st, ba)

ugh this is not your mess, but do you mind checking if it's easy to have the cluster id be bound inside the CanSendToFollower() (i.e. in the ccl package)? Like, make the cluster id be provided to that package once and for all when a cluster id is known, instead of burdening callers like this with it.


pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 234 at r2 (raw file):

	// TODO(aayush): We should be able create RangeFeeds on non-voting replicas.
	// Do this in a focussed patch.
	replicas, err := NewReplicaSlice(ctx, ds.nodeDescs, desc, nil, false)

nit: "Do this in a focussed patch." is a weird TODO to merge.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 66 at r2 (raw file):

	desc *roachpb.RangeDescriptor,
	leaseholder *roachpb.ReplicaDescriptor,
	canUseFollowerRead bool,

Make it an enum pls.
And if you call the options, for example, Voters and VotersAndNonVoters, then there's no reason to say anything about leases in any comments. Let the caller figure out what it wants and way and be prescriptive.

Since I hate the ReplicaSlice anyway, I'd prefer to keep it as dumb as possible.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):

	// hop, everything would still work if we had `All` here.
	var replicas []roachpb.ReplicaDescriptor
	if canUseFollowerRead {

let's tie the condition to checkCanReceiveLease(). If you rebase on top of #58722, that function gets a usable interface. It might be time to move it out of batcheval into a more accessible package.


pkg/roachpb/api.go, line 56 at r2 (raw file):

// RequiresReadLease returns whether the ReadConsistencyType requires
// that a read-only request be performed on an active valid leaseholder.
// TODO DURING REVIEW: Not entirely related to the PR, but this confused me for

yeah; i'd probably just call it Inconistent


pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):

// replication change, incoming or outgoing voters. Notably, this set must
// encapsulate all replicas of a range for a range merge to proceed.
func (d ReplicaSet) VoterFullAndNonVoterDescriptors() []ReplicaDescriptor {

what's going on here? The implementation is the same as below. Two things with seemingly different name for the same effect?
If you wanna make one about merges, then put that in the name. But I'd still stick to one.

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

what's going on here? The implementation is the same as below. Two things with seemingly different name for the same effect?
If you wanna make one about merges, then put that in the name. But I'd still stick to one.

Like, either name them for what they return (my preference given where they live) or name them for what they're used for. If we name them but what they return, there should be only one.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 7 files at r2, 4 of 4 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)


pkg/kv/kvclient/kvcoord/dist_sender.go, line 1786 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

ugh this is not your mess, but do you mind checking if it's easy to have the cluster id be bound inside the CanSendToFollower() (i.e. in the ccl package)? Like, make the cluster id be provided to that package once and for all when a cluster id is known, instead of burdening callers like this with it.

Mind opening an issue for this? Seems pretty unrelated to this PR.


pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 232 at r2 (raw file):

		latencyFn = ds.rpcContext.RemoteClocks.Latency
	}
	// TODO(aayush): We should be able create RangeFeeds on non-voting replicas.

Yes, this will be a major reason to create non-voting replicas. We should support it. Doing so in a separate patch so we can include a good test sounds like the right approach to me.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 66 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Make it an enum pls.
And if you call the options, for example, Voters and VotersAndNonVoters, then there's no reason to say anything about leases in any comments. Let the caller figure out what it wants and way and be prescriptive.

Since I hate the ReplicaSlice anyway, I'd prefer to keep it as dumb as possible.

Either that or invert this to be onlyPossibleLeaseholders bool. I fear the inclusion of the leaseholder arg here already did away with keeping this dumb.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):

It might be time to move it out of batcheval into a more accessible package.

Agreed. Let's move it to roachpb. And make it return a bool instead of an error so it's cheap to call.


pkg/kv/kvserver/replica_learner_test.go, line 775 at r3 (raw file):

}

func TestLearnerAndJointConfigFollowerRead(t *testing.T) {

Should this test have been expanded in the first commit to include non-voting replicas? Or should we otherwise introduce a new test for follower reads off non-voting replicas? Actually, it doesn't look like we're actually testing that VOTER_INCOMING replicas can server follower reads now, just that they can't not serve them. So I think we need a test in each of these commits. Maybe there's an existing test that we can generalize?


pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Like, either name them for what they return (my preference given where they live) or name them for what they're used for. If we name them but what they return, there should be only one.

+1 to these points.

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch 6 times, most recently from a00c003 to b6c328f Compare January 18, 2021 01:55
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/dist_sender.go, line 1786 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Mind opening an issue for this? Seems pretty unrelated to this PR.

It seems like we generate it inside bootstrapCluster() and then plumb it through to where we create the DistSender. I can't immediately think of a way to do what you want Andrei, I'm adding a TODO for now.


pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 232 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Yes, this will be a major reason to create non-voting replicas. We should support it. Doing so in a separate patch so we can include a good test sounds like the right approach to me.

👍


pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 234 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: "Do this in a focussed patch." is a weird TODO to merge.

Fixed.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 66 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Either that or invert this to be onlyPossibleLeaseholders bool. I fear the inclusion of the leaseholder arg here already did away with keeping this dumb.

Turned it into an enum.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It might be time to move it out of batcheval into a more accessible package.

Agreed. Let's move it to roachpb. And make it return a bool instead of an error so it's cheap to call.

Done except for changing the return type of that method since it seems like a bunch of its callers might care about the kind of error being returned (i.e. if the ReplicaID being passed in does not exist). In case this is about avoiding potential allocations, would it make it cheap enough if the errors were consts at the top of the file?


pkg/kv/kvserver/replica_learner_test.go, line 775 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should this test have been expanded in the first commit to include non-voting replicas? Or should we otherwise introduce a new test for follower reads off non-voting replicas? Actually, it doesn't look like we're actually testing that VOTER_INCOMING replicas can server follower reads now, just that they can't not serve them. So I think we need a test in each of these commits. Maybe there's an existing test that we can generalize?

I extended an existing test for the first commit, and added a new one testing follower reads on VOTER_INCOMING.


pkg/roachpb/api.go, line 56 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

yeah; i'd probably just call it Inconistent

Added a TODO for it.


pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

+1 to these points.

I had messed up, I had intended to have it such that a predVoterFullOrNonVoter() filters for usage in range merges (doesn't include VOTER_INCOMING) and a predVoterOrNonVoter() filters for the DistSender to route follower read requests to (which does include VOTER_INCOMING). All fixed now.

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch from b6c328f to dd12430 Compare January 18, 2021 11:39
Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)


pkg/roachpb/metadata_replicas.go, line 497 at r9 (raw file):

}

// CheckCanReceiveLease checks whether `wouldbeLeaseholder` can receive a lease. Returns an error

wrapping

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: mod a few comments.

Reviewed 6 of 7 files at r7, 6 of 11 files at r8, 6 of 6 files at r9.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)


pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 232 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

👍

Mind opening an issue for this? Issues are better for tracking anything that has user-visible impact.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):

In case this is about avoiding potential allocations, would it make it cheap enough if the errors were consts at the top of the file?

Yes, that would make things cheap enough. We'll need to make that change in this commit if we want to keep this as an error. We don't want to introduce new allocations into this hot path if they can be easily avoided.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 42 at r9 (raw file):

type ReplicaSlice []ReplicaInfo

type ReplicaSliceFilter int

The linter will likely tell you this, but this will need a comment, as will each value.


pkg/kv/kvserver/replica_learner_test.go, line 791 at r9 (raw file):

	defer tc.Stopper().Stop(ctx)
	db := sqlutils.MakeSQLRunner(tc.ServerConn(0))
	db.Exec(t, fmt.Sprintf(`SET CLUSTER SETTING kv.closed_timestamp.target_duration = '%s'`,

Out of curiosity, why was this needed?


pkg/sql/physicalplan/replicaoracle/oracle.go, line 228 at r9 (raw file):

	}

	replicas, err := replicaSliceOrErr(ctx, o.nodeDescs, desc, kvcoord.OnlyPotentialLeaseholders)

How did you decide on these choices. It's not clear to me why some should use OnlyPotentialLeaseholders and others should use AllExtantReplicas.

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch 2 times, most recently from d935acf to 1481e6d Compare January 27, 2021 16:52
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

In case this is about avoiding potential allocations, would it make it cheap enough if the errors were consts at the top of the file?

Yes, that would make things cheap enough. We'll need to make that change in this commit if we want to keep this as an error. We don't want to introduce new allocations into this hot path if they can be easily avoided.

Done, but these pre-allocated errors cannot be parametrized right? LMK if you're not ok with what I've done here.


pkg/kv/kvclient/kvcoord/replica_slice.go, line 42 at r9 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

The linter will likely tell you this, but this will need a comment, as will each value.

Done.


pkg/kv/kvserver/replica_learner_test.go, line 791 at r9 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Out of curiosity, why was this needed?

Previous incantation was getting parsed as ~8333 hours by CRDB. Before I wrote a separate test for follower reads on VOTER_INCOMING, I tried to quickly check it by modifying this test & noticed the funkiness.


pkg/roachpb/metadata_replicas.go, line 497 at r9 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

wrapping

Looks like you fixed it in your PR.


pkg/sql/physicalplan/replicaoracle/oracle.go, line 228 at r9 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

How did you decide on these choices. It's not clear to me why some should use OnlyPotentialLeaseholders and others should use AllExtantReplicas.

We only use the closestOracle for follower-read queries see: https://github.com/cockroachdb/cockroach/blob/b44a0a0f184ad8437ee56b133dc6ecc7227e4393/pkg/ccl/kvccl/kvfollowerreadsccl/followerreads.go#L137, and binPackingOracle for the rest. We don't use the randomChoiceOracle for anything besides testing AFAICT.

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch from 1481e6d to c6f6723 Compare January 28, 2021 03:27
Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 17 files at r10, 6 of 11 files at r11, 6 of 6 files at r12.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)

@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch from c6f6723 to 88740cd Compare January 28, 2021 20:43
Before this patch, the DistSender would only route BatchRequests to
replicas of types `VOTER_FULL` and `VOTER_INCOMING`. This patch changes
that to let the DistSender route requests (for instance, follower read
requests) to `NON_VOTER` replicas as well.

Makes progress on cockroachdb#51943

Release note: None
@aayushshah15 aayushshah15 force-pushed the distsender-consider-non-voters branch from 88740cd to 13b2180 Compare January 28, 2021 22:31
@aayushshah15
Copy link
Copy Markdown
Contributor Author

TFTRs!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 29, 2021

Build succeeded:

@craig craig bot merged commit c41a6f6 into cockroachdb:master Jan 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-multiregion Related to multi-region

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants