kvclient: allow DistSender to consider non-voters when routing requests#58772
Conversation
f45a635 to
4a2cca8
Compare
ebc8c80 to
426bc7a
Compare
andreimatei
left a comment
There was a problem hiding this comment.
for instance, follower read requests
I think we want only "follower read requests" to be sent to NON_VOTERs. Can you confirm that that's what the patch does?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/replica_slice.go, line 46 at r1 (raw file):
// that are not gossiped are omitted from the result. // // Generally, only voting and non-voting replicas are returned. However, if a
nit: s/only voting and non-voting replicas are returned/learners are not returned.
nvb
left a comment
There was a problem hiding this comment.
Reviewed 3 of 3 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15)
pkg/kv/kvserver/replica_follower_read.go, line 65 at r1 (raw file):
} // TODO DURING REVIEW: Is there a reason we shouldn't be allowing // VOTER_INCOMINGs here? Since we use LEARNERs to upreplicate, VOTER_INCOMINGs
I thought either @andreimatei or @ajwerner had made that change. Maybe there's a PR sitting around to do so?
This is also related to @andreimatei's plans to allow VOTER_INCOMING replicas to acquire leases, in which case we'll certainly need to issue both read and write requests to them.
7902b8c to
c39c54a
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Can you confirm that that's what the patch does?
Just to make sure I understood you, we don't want to include NON_VOTERs in the ReplicaSlice in order to avoid a possible network hop in the case we have a request that cannot be served by a follower replica and we have a stale view of the leaseholder, right?
PTAL at what I've done there.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/replica_slice.go, line 46 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: s/only voting and non-voting replicas are returned/learners are not returned.
Done.
pkg/kv/kvserver/replica_follower_read.go, line 65 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I thought either @andreimatei or @ajwerner had made that change. Maybe there's a PR sitting around to do so?
This is also related to @andreimatei's plans to allow VOTER_INCOMING replicas to acquire leases, in which case we'll certainly need to issue both read and write requests to them.
I added a commit to this PR that let's VOTER_INCOMINGs serve follower reads as well.
c39c54a to
35a7675
Compare
andreimatei
left a comment
There was a problem hiding this comment.
LGTM
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/dist_sender.go, line 1786 at r2 (raw file):
ba.RangeID = desc.RangeID leaseholder := routing.Leaseholder() canFollowerRead := (ds.clusterID != nil) && CanSendToFollower(ds.clusterID.Get(), ds.st, ba)
ugh this is not your mess, but do you mind checking if it's easy to have the cluster id be bound inside the CanSendToFollower() (i.e. in the ccl package)? Like, make the cluster id be provided to that package once and for all when a cluster id is known, instead of burdening callers like this with it.
pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 234 at r2 (raw file):
// TODO(aayush): We should be able create RangeFeeds on non-voting replicas. // Do this in a focussed patch. replicas, err := NewReplicaSlice(ctx, ds.nodeDescs, desc, nil, false)
nit: "Do this in a focussed patch." is a weird TODO to merge.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 66 at r2 (raw file):
desc *roachpb.RangeDescriptor, leaseholder *roachpb.ReplicaDescriptor, canUseFollowerRead bool,
Make it an enum pls.
And if you call the options, for example, Voters and VotersAndNonVoters, then there's no reason to say anything about leases in any comments. Let the caller figure out what it wants and way and be prescriptive.
Since I hate the ReplicaSlice anyway, I'd prefer to keep it as dumb as possible.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):
// hop, everything would still work if we had `All` here. var replicas []roachpb.ReplicaDescriptor if canUseFollowerRead {
let's tie the condition to checkCanReceiveLease(). If you rebase on top of #58722, that function gets a usable interface. It might be time to move it out of batcheval into a more accessible package.
pkg/roachpb/api.go, line 56 at r2 (raw file):
// RequiresReadLease returns whether the ReadConsistencyType requires // that a read-only request be performed on an active valid leaseholder. // TODO DURING REVIEW: Not entirely related to the PR, but this confused me for
yeah; i'd probably just call it Inconistent
pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):
// replication change, incoming or outgoing voters. Notably, this set must // encapsulate all replicas of a range for a range merge to proceed. func (d ReplicaSet) VoterFullAndNonVoterDescriptors() []ReplicaDescriptor {
what's going on here? The implementation is the same as below. Two things with seemingly different name for the same effect?
If you wanna make one about merges, then put that in the name. But I'd still stick to one.
andreimatei
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)
pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
what's going on here? The implementation is the same as below. Two things with seemingly different name for the same effect?
If you wanna make one about merges, then put that in the name. But I'd still stick to one.
Like, either name them for what they return (my preference given where they live) or name them for what they're used for. If we name them but what they return, there should be only one.
nvb
left a comment
There was a problem hiding this comment.
Reviewed 7 of 7 files at r2, 4 of 4 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvclient/kvcoord/dist_sender.go, line 1786 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
ugh this is not your mess, but do you mind checking if it's easy to have the cluster id be bound inside the
CanSendToFollower()(i.e. in the ccl package)? Like, make the cluster id be provided to that package once and for all when a cluster id is known, instead of burdening callers like this with it.
Mind opening an issue for this? Seems pretty unrelated to this PR.
pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 232 at r2 (raw file):
latencyFn = ds.rpcContext.RemoteClocks.Latency } // TODO(aayush): We should be able create RangeFeeds on non-voting replicas.
Yes, this will be a major reason to create non-voting replicas. We should support it. Doing so in a separate patch so we can include a good test sounds like the right approach to me.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 66 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Make it an enum pls.
And if you call the options, for example,VotersandVotersAndNonVoters, then there's no reason to say anything about leases in any comments. Let the caller figure out what it wants and way and be prescriptive.Since I hate the
ReplicaSliceanyway, I'd prefer to keep it as dumb as possible.
Either that or invert this to be onlyPossibleLeaseholders bool. I fear the inclusion of the leaseholder arg here already did away with keeping this dumb.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):
It might be time to move it out of batcheval into a more accessible package.
Agreed. Let's move it to roachpb. And make it return a bool instead of an error so it's cheap to call.
pkg/kv/kvserver/replica_learner_test.go, line 775 at r3 (raw file):
} func TestLearnerAndJointConfigFollowerRead(t *testing.T) {
Should this test have been expanded in the first commit to include non-voting replicas? Or should we otherwise introduce a new test for follower reads off non-voting replicas? Actually, it doesn't look like we're actually testing that VOTER_INCOMING replicas can server follower reads now, just that they can't not serve them. So I think we need a test in each of these commits. Maybe there's an existing test that we can generalize?
pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Like, either name them for what they return (my preference given where they live) or name them for what they're used for. If we name them but what they return, there should be only one.
+1 to these points.
a00c003 to
b6c328f
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/dist_sender.go, line 1786 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Mind opening an issue for this? Seems pretty unrelated to this PR.
It seems like we generate it inside bootstrapCluster() and then plumb it through to where we create the DistSender. I can't immediately think of a way to do what you want Andrei, I'm adding a TODO for now.
pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 232 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Yes, this will be a major reason to create non-voting replicas. We should support it. Doing so in a separate patch so we can include a good test sounds like the right approach to me.
👍
pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 234 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: "Do this in a focussed patch." is a weird TODO to merge.
Fixed.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 66 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Either that or invert this to be
onlyPossibleLeaseholders bool. I fear the inclusion of theleaseholderarg here already did away with keeping this dumb.
Turned it into an enum.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It might be time to move it out of batcheval into a more accessible package.
Agreed. Let's move it to
roachpb. And make it return a bool instead of an error so it's cheap to call.
Done except for changing the return type of that method since it seems like a bunch of its callers might care about the kind of error being returned (i.e. if the ReplicaID being passed in does not exist). In case this is about avoiding potential allocations, would it make it cheap enough if the errors were consts at the top of the file?
pkg/kv/kvserver/replica_learner_test.go, line 775 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this test have been expanded in the first commit to include non-voting replicas? Or should we otherwise introduce a new test for follower reads off non-voting replicas? Actually, it doesn't look like we're actually testing that
VOTER_INCOMINGreplicas can server follower reads now, just that they can't not serve them. So I think we need a test in each of these commits. Maybe there's an existing test that we can generalize?
I extended an existing test for the first commit, and added a new one testing follower reads on VOTER_INCOMING.
pkg/roachpb/api.go, line 56 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
yeah; i'd probably just call it
Inconistent
Added a TODO for it.
pkg/roachpb/metadata_replicas.go, line 266 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
+1 to these points.
I had messed up, I had intended to have it such that a predVoterFullOrNonVoter() filters for usage in range merges (doesn't include VOTER_INCOMING) and a predVoterOrNonVoter() filters for the DistSender to route follower read requests to (which does include VOTER_INCOMING). All fixed now.
b6c328f to
dd12430
Compare
andreimatei
left a comment
There was a problem hiding this comment.
LGTM
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)
pkg/roachpb/metadata_replicas.go, line 497 at r9 (raw file):
} // CheckCanReceiveLease checks whether `wouldbeLeaseholder` can receive a lease. Returns an error
wrapping
nvb
left a comment
There was a problem hiding this comment.
Reviewed 6 of 7 files at r7, 6 of 11 files at r8, 6 of 6 files at r9.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvclient/kvcoord/dist_sender_rangefeed.go, line 232 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
👍
Mind opening an issue for this? Issues are better for tracking anything that has user-visible impact.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):
In case this is about avoiding potential allocations, would it make it cheap enough if the errors were consts at the top of the file?
Yes, that would make things cheap enough. We'll need to make that change in this commit if we want to keep this as an error. We don't want to introduce new allocations into this hot path if they can be easily avoided.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 42 at r9 (raw file):
type ReplicaSlice []ReplicaInfo type ReplicaSliceFilter int
The linter will likely tell you this, but this will need a comment, as will each value.
pkg/kv/kvserver/replica_learner_test.go, line 791 at r9 (raw file):
defer tc.Stopper().Stop(ctx) db := sqlutils.MakeSQLRunner(tc.ServerConn(0)) db.Exec(t, fmt.Sprintf(`SET CLUSTER SETTING kv.closed_timestamp.target_duration = '%s'`,
Out of curiosity, why was this needed?
pkg/sql/physicalplan/replicaoracle/oracle.go, line 228 at r9 (raw file):
} replicas, err := replicaSliceOrErr(ctx, o.nodeDescs, desc, kvcoord.OnlyPotentialLeaseholders)
How did you decide on these choices. It's not clear to me why some should use OnlyPotentialLeaseholders and others should use AllExtantReplicas.
d935acf to
1481e6d
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/replica_slice.go, line 78 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
In case this is about avoiding potential allocations, would it make it cheap enough if the errors were consts at the top of the file?
Yes, that would make things cheap enough. We'll need to make that change in this commit if we want to keep this as an
error. We don't want to introduce new allocations into this hot path if they can be easily avoided.
Done, but these pre-allocated errors cannot be parametrized right? LMK if you're not ok with what I've done here.
pkg/kv/kvclient/kvcoord/replica_slice.go, line 42 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The linter will likely tell you this, but this will need a comment, as will each value.
Done.
pkg/kv/kvserver/replica_learner_test.go, line 791 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Out of curiosity, why was this needed?
Previous incantation was getting parsed as ~8333 hours by CRDB. Before I wrote a separate test for follower reads on VOTER_INCOMING, I tried to quickly check it by modifying this test & noticed the funkiness.
pkg/roachpb/metadata_replicas.go, line 497 at r9 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
wrapping
Looks like you fixed it in your PR.
pkg/sql/physicalplan/replicaoracle/oracle.go, line 228 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
How did you decide on these choices. It's not clear to me why some should use
OnlyPotentialLeaseholdersand others should useAllExtantReplicas.
We only use the closestOracle for follower-read queries see: https://github.com/cockroachdb/cockroach/blob/b44a0a0f184ad8437ee56b133dc6ecc7227e4393/pkg/ccl/kvccl/kvfollowerreadsccl/followerreads.go#L137, and binPackingOracle for the rest. We don't use the randomChoiceOracle for anything besides testing AFAICT.
1481e6d to
c6f6723
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 5 of 17 files at r10, 6 of 11 files at r11, 6 of 6 files at r12.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)
c6f6723 to
88740cd
Compare
Release note: None
Before this patch, the DistSender would only route BatchRequests to replicas of types `VOTER_FULL` and `VOTER_INCOMING`. This patch changes that to let the DistSender route requests (for instance, follower read requests) to `NON_VOTER` replicas as well. Makes progress on cockroachdb#51943 Release note: None
Release note: None
88740cd to
13b2180
Compare
|
TFTRs! bors r+ |
|
Build succeeded: |
Before this patch, the DistSender would only route BatchRequests to
replicas of types
VOTER_FULLandVOTER_INCOMING. This patch changesthat to let the DistSender route requests (for instance, follower read
requests) to
NON_VOTERreplicas as well.Release note: None