kvserver: reset tscache after merge to closed ts #59854
kvserver: reset tscache after merge to closed ts #59854andreimatei wants to merge 2 commits intocockroachdb:masterfrom
Conversation
Export the replica's closed timestamp accessor, so that it can be used by the EvalContext interface in the next commits. Release note: None
This patch deals with what happens to the RHS's timestamp cache on a merge. Before this patch, we were either not touching the cache at all, when the leases of the LHS and RHS were collocated at merge time, or we were bumping the RHS's ts cache up to the freeze point otherwise (because, in this case, the RHS's ts cache info has been lost). This patch goes further: now we'll bump the RHS ts cache up to the RHS closed timestamp on the argument the the RHS's closed timestamp is lost. This patch is needed by the effort to move closed timestamp to be per-range instead of per-store, and also to have future-time closed timestamps. Today, the new ts cache bump is not necessary for a fairly subtle reason: if the pre-merge leases are collocated,, then the closed ts of the RHS is not "lost" because it's the same as the one of the LHS. If the leases are not collocated, the freeze time of the RHS is certainly above its closed ts. So, in either case, the current code doesn't lead to the possibility of accepting write post-merge that invalidate previous follower reads. The RHS' closed timestamp is plumbed from the freeze to the merge through subsume response. Release note: None
nvb
left a comment
There was a problem hiding this comment.
Reviewed 6 of 6 files at r1, 9 of 9 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/kv/kvserver/helpers_test.go, line 293 at r1 (raw file):
} // MaxClosed returns the maximum closed timestamp known to the Replica.
What's the point of this test method now?
pkg/kv/kvserver/replica.go, line 404 at r1 (raw file):
lastReplicaAdded roachpb.ReplicaID lastReplicaAddedTime time.Time // initialMaxClosed is the initial MaxClosedTimestamp timestamp for the replica as known
Did you mean to change this?
pkg/kv/kvserver/batcheval/cmd_subsume.go, line 171 at r2 (raw file):
reply.LeaseAppliedIndex = lai reply.FreezeStart = cArgs.EvalCtx.Clock().NowAsClockTimestamp() closedTS, ok := cArgs.EvalCtx.MaxClosedTimestamp(ctx)
This seems correct for when we have per-range closed timestamps, but it is correct with the current system? Is it possible that we have already proposed a higher closed timestamp on the current side-transport but haven't yet reflected that in MaxClosedTimestamp? I wonder if we should just return false for this method if we haven't started using per-range closed timestamps on this range.
pkg/roachpb/api.proto, line 1661 at r2 (raw file):
(gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/util/hlc.ClockTimestamp"]; // ClosedTS is the range's closed timestamp at the moment of the subsumption.
nit: name this closed_timestamp, like it's named in the RangeInfo and Entry structs.
pkg/roachpb/data.proto, line 161 at r2 (raw file):
// bump the timestamp cache for the keys previously owned by the subsumed // range. util.hlc.Timestamp right_closedTS = 6 [(gogoproto.nullable) = false];
Same point about naming. right_closed_timestamp.
andreimatei
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/helpers_test.go, line 293 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What's the point of this test method now?
removed
pkg/kv/kvserver/replica.go, line 404 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did you mean to change this?
no
pkg/kv/kvserver/batcheval/cmd_subsume.go, line 171 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This seems correct for when we have per-range closed timestamps, but it is correct with the current system? Is it possible that we have already proposed a higher closed timestamp on the current side-transport but haven't yet reflected that in
MaxClosedTimestamp? I wonder if we should just returnfalsefor this method if we haven't started using per-range closed timestamps on this range.
In light of this, I've moved this patch to #59854, and made this code use a new FrozenClosedTimestamp, different from MaxClosedTimestamp.
pkg/roachpb/api.proto, line 1661 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: name this
closed_timestamp, like it's named in theRangeInfoandEntrystructs.
done
pkg/roachpb/data.proto, line 161 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Same point about naming.
right_closed_timestamp.
done
|
moved to #59854 |
|
Sorry, moved to #59566 |
This patch deals with what happens to the RHS's timestamp cache on a
merge. Before this patch, we were either not touching the cache at all,
when the leases of the LHS and RHS were collocated at merge time, or we
were bumping the RHS's ts cache up to the freeze point otherwise
(because, in this case, the RHS's ts cache info has been lost). This
patch goes further: now we'll bump the RHS ts cache up to the RHS closed
timestamp on the argument the the RHS's closed timestamp is lost.
This patch is needed by the effort to move closed timestamp to be
per-range instead of per-store, and also to have future-time closed
timestamps. Today, the new ts cache bump is not necessary for a fairly
subtle reason: if the pre-merge leases are collocated,, then the closed
ts of the RHS is not "lost" because it's the same as the one of the LHS.
If the leases are not collocated, the freeze time of the RHS is
certainly above its closed ts. So, in either case, the current code
doesn't lead to the possibility of accepting write post-merge that
invalidate previous follower reads.
The RHS' closed timestamp is plumbed from the freeze to the merge
through subsume response.
Release note: None