Skip to content

kvserver: prevent follower reads while a range is subsumed#50265

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
aayushshah15:rangefeed_drop
Aug 4, 2020
Merged

kvserver: prevent follower reads while a range is subsumed#50265
craig[bot] merged 3 commits intocockroachdb:masterfrom
aayushshah15:rangefeed_drop

Conversation

@aayushshah15
Copy link
Copy Markdown
Contributor

@aayushshah15 aayushshah15 commented Jun 16, 2020

Before this commit, during a merge, the RHS leaseholder’s store
could continue broadcasting (actionable) closed timestamp updates
even after it had been subsumed. This allowed the followers to be
able to serve follower reads past the subsumption time of RHS.
Additionally, after the merge, if the LHS had a lower closed timestamp
than the RHS, it could allow writes to the keyspan owned by RHS
at timestamps lower than the RHS’s max closed timestamp.

This commit fixes this bug by requiring that the followers catch up
to a LeaseAppliedIndex that belongs to the entry succeeding the
Subsume request.

Fixes #44878

Release note (bug fix): Fixed a rare bug that could cause actionable
closed timestamps to effectively regress over a given keyspan. This
could in turn lead to a serializability violation when using follower
reads. This was due to ill-defined interactions between range merges
and the closed timestamp subsystem.

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 16, 2020

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Jun 16, 2020
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@aayushshah15
Copy link
Copy Markdown
Contributor Author

I feel uneasy about having store.cfg exported, please let me know if I missed something cleaner.

In a future commit, I will add a test that forces a lease transfer during the merge critical state.

@aayushshah15 aayushshah15 removed the request for review from danhhz June 16, 2020 10:05
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 16, 2020

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were there supposed to be one commits or two in here? I see what with a message looking like the squash of two.

Nit: there's a lot diffs created by the exporting of that Cfg guy. Moving that change to a dedicated commit would make reviewing easier.

Nit: the release note is too technical; users don't understand this. Also the tense is wrong (as per https://wiki.crdb.io/wiki/spaces/CRDB/pages/186548364/Release+notes) - say "Fixed a bug".

Release note (bug fix): Fixes a bug that allowed follower reads
during the critical state of a merge transaction.

I'd say

Fixed a rare bug causing transaction serializability violations when using follower reads.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, @nvanbenschoten, and @tbg)


pkg/kv/kvserver/closed_timestamp_test.go, line 710 at r1 (raw file):

// done.
func setupClusterForClosedTSTestingWithCustomKnobs(
	ctx context.Context, t *testing.T, targetDuration time.Duration, knobs base.TestingKnobs,

might as well take a TestClusterArgs.


pkg/kv/kvserver/closed_timestamp_test.go, line 801 at r1 (raw file):

	repls []*kvserver.Replica,
) {
	defaultTestingKnobs := base.TestingKnobs{

Consider keeping a single setupClusterForClosedTSTesting() function by extracting defaultTestingKnobs into a aggressiveResolvedTSPushKnobs that everybody passed in. That way it'll be easier for tests to compose these default knobs with other additional knobs.


pkg/kv/kvserver/replica.go, line 1330 at r1 (raw file):

// neighbor is in its critical phase and, if so, arranges to block all requests
// until the merge completes.
func (r *Replica) maybeWatchForMerge(ctx context.Context, untrack closedts.ReleaseFunc) error {

pls comment the new param


pkg/kv/kvserver/replica.go, line 1361 at r1 (raw file):

		// Nothing more to do.
		r.mu.Unlock()
		return nil

don't we need to call untrack() with LAI+1 in this case?


pkg/kv/kvserver/replica.go, line 1370 at r1 (raw file):

	// arrived and now, which is rare but entirely legal.
	r.unquiesceLocked()
	// We prevent replicas from being able to serve follower reads on timestamps

I think this comment needs to spell things out more: say that we block follower reads by publishing a MLAI that's never going to come in the happy case, and we do that because the closed timestamps that are published from now on will not be reflected in the LHS's tscache after the merge.


pkg/kv/kvserver/replica.go, line 1372 at r1 (raw file):

	// We prevent replicas from being able to serve follower reads on timestamps
	// that fall between the subsumption time (FreezeStart) and the timestamp at
	// which the merge transaction commits or aborts (i.e. when a merge is in its

s/when a merge/the timestamp window representing the range's critical state

Also I don't see the syntagm "critical state" used elsewhere around here. Maybe say something more descriptive like "subsumed state".


pkg/kv/kvserver/replica.go, line 1382 at r1 (raw file):

	// has been subsumed are lease requests, which do not bump the LAI.
	//
	// See https://github.com/cockroachdb/cockroach/issues/44878 for discussion.

nit: I personally consider most such issue links to be just click bait. The comments should usually be sufficient for explaining what's going on; when they're not sufficient because we failed to write them properly, there's git blame.


pkg/kv/kvserver/replica_read.go, line 127 at r1 (raw file):

	log.Event(ctx, "executing read-only batch")

	untrack := r.maybeTrackProposal(ctx, ba)

can this down, move next to res.Local.ReleaseFunc = untrack ?


pkg/kv/kvserver/replica_read.go, line 130 at r1 (raw file):

		}
		br, res, pErr = evaluateBatch(ctx, kvserverbase.CmdIDKey(""), rw, rec, nil, ba, true /* readOnly */)

spurious diff


pkg/kv/kvserver/replica_read.go, line 150 at r1 (raw file):

		return nil, res, pErr
	}
	res.Local.ReleaseFunc = untrack

Is this indirection really needed? Can't we simply untrack here?
I see that there are some early returns in maybeWatchForMerge that don't untrack with LAI+1, but I can't tell which ones are expected. Plus one of them seems wrong to me.

If the indirection is necessary, can we at least bind all the arguments to the closure we put in res.Local.ReleaseFunc? They shouldn't change in between here and maybeWatchForMerge, should they? The closure could take a single bool about whether we're watching for the merge or not (a false would translate to all zeros to untrack()).


pkg/kv/kvserver/replica_read.go, line 154 at r1 (raw file):

}

func (r *Replica) maybeTrackProposal(

This funky method has one caller. Inline it there and then I won't have to ask for a better name and comments.


pkg/kv/kvserver/replica_read.go, line 160 at r1 (raw file):

		// We start tracking SubsumeRequests as part of our guarantee to never
		// broadcast a closed timestamp entry between when we evaluate the Subsume
		// request and the time that we run maybeWatchForMerge for the first time.

I don't quite understand the "for the first time" part. Consider axing it.


pkg/kv/kvserver/replica_read.go, line 162 at r1 (raw file):

		// request and the time that we run maybeWatchForMerge for the first time.
		// See comment in maybeWatchForMerge() for details.
		_, untrack := r.store.Cfg.ClosedTimestamp.Tracker.Track(ctx)

I think there's something missing here - the other return value from Track is a minimum timestamp for the (Subsume) request. I think we need to reject the Subsume request if its timestamp is above it. I think the comment above should spell out the important of Subsume's ts.


pkg/kv/kvserver/batcheval/result/result.go, line 69 at r1 (raw file):

	// Call maybeWatchForMerge.
	MaybeWatchForMerge bool
	// ReleaseFunc, when it's not the empty function, stores the callback returned

I think this needs more words; the read-only SubsumeRequest doesn't have a LeaseAppliedIndex.

Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, @nvanbenschoten, and @tbg)


pkg/kv/kvserver/replica.go, line 1361 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

don't we need to call untrack() with LAI+1 in this case?

We only need to release this tracked proposal the first time we run maybeWatchForMerge (which should always be right after the subsume request evaluation unless there is a lease transfer). Any time we see that we've already started the mergeComplete channel, we should already have released this proposal then. This is what the comment below (in maybeTrackProposal) talks about, I'll make it clearer.


pkg/kv/kvserver/replica_read.go, line 127 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

can this down, move next to res.Local.ReleaseFunc = untrack ?

Starting to track the proposal after evaluation of the subsume request seemed problematic for the given scenario:
t1: RHS leaseholder begins evaluating the subsume request (i.e. t1 is our FreezeStart)
t2: The leaseholder's store broadcasts a closed timestamp that is above t1
t3: Now we begin tracking the proposal, but it's too late because we've already emitted a closed timestamp above FreezeStart

Does that make sense? Let me know if I'm missing something.

@nvb
Copy link
Copy Markdown
Contributor

nvb commented Jun 16, 2020

there's a lot diffs created by the exporting of that Cfg guy. Moving that change to a dedicated commit would make reviewing easier.

Big +1 on this. The main change here is small but subtle and difficult to prove the correctness of. It's a major burden for reviewers and future readers when such changes are mixed in with a refactor that touches 46 files.

// NB: The above statement relies on the invariant that the LAI that
// follows a Subsume request will be applied only after the merge aborts, as
// the RHS replicas will get destroyed if the merge successfully commits. This
// invariant is upheld because the only Raft proposals allowed after a range
// has been subsumed are lease requests, which do not bump the LAI.

Are we sure about this? I mentioned in #44878 (comment) that I'm not convinced this invariant is currently upheld. It relies on latching and that there are Requests that perform Raft proposals that do not acquire any latches that conflict with the SubsumeRequest request, such as ComputeChecksumRequests.

Given that this is so critical for correctness, I think it would be worth adding assertions around raft proposal logic to ensure that we uphold this invariant.

@aayushshah15
Copy link
Copy Markdown
Contributor Author

Sounds good, feel free to hold off on reviewing until I reorg it into multiple commits. I'll also add assertions around the invariant.

@tbg tbg removed their request for review June 25, 2020 09:55
@aayushshah15 aayushshah15 changed the title kvserver: prevent follower reads during range merge critical state kvserver: prevent follower reads while a range is subsumed Jun 25, 2020
@aayushshah15 aayushshah15 force-pushed the rangefeed_drop branch 4 times, most recently from f136ce0 to fd1c9fa Compare June 25, 2020 23:58
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reorganized a bunch of stuff and added a new test that forces a lease transfer on the RHS during the merge (& ensures that our closed timestamp guarantees are upheld).

With regards to ComputeChecksumRequest: It turns out that the ComputeChecksum request does "acquire latches" even though it doesn't declare any keys. However, checkExecutionCanProceed only cares about the latchGuard being non-nil on the leaseholder. It almost seems like an unintentional consequence. Should we be more explicit about this? Sorry for not investigating this properly earlier. I added an assertion that blows up if it sees a proposal that would bump the LAI while the range is subsumed, along with an accompanying test.

This is RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/closed_timestamp_test.go, line 710 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

might as well take a TestClusterArgs.

Done, let me know if I misunderstood you.


pkg/kv/kvserver/closed_timestamp_test.go, line 801 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Consider keeping a single setupClusterForClosedTSTesting() function by extracting defaultTestingKnobs into a aggressiveResolvedTSPushKnobs that everybody passed in. That way it'll be easier for tests to compose these default knobs with other additional knobs.

Done


pkg/kv/kvserver/replica.go, line 1330 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

pls comment the new param

Removed the indirection.


pkg/kv/kvserver/replica.go, line 1370 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think this comment needs to spell things out more: say that we block follower reads by publishing a MLAI that's never going to come in the happy case, and we do that because the closed timestamps that are published from now on will not be reflected in the LHS's tscache after the merge.

Done, let me know if you still find it lacking.


pkg/kv/kvserver/replica.go, line 1372 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

s/when a merge/the timestamp window representing the range's critical state

Also I don't see the syntagm "critical state" used elsewhere around here. Maybe say something more descriptive like "subsumed state".

Done.


pkg/kv/kvserver/replica.go, line 1382 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: I personally consider most such issue links to be just click bait. The comments should usually be sufficient for explaining what's going on; when they're not sufficient because we failed to write them properly, there's git blame.

Removed.


pkg/kv/kvserver/replica_read.go, line 150 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Is this indirection really needed? Can't we simply untrack here?
I see that there are some early returns in maybeWatchForMerge that don't untrack with LAI+1, but I can't tell which ones are expected. Plus one of them seems wrong to me.

If the indirection is necessary, can we at least bind all the arguments to the closure we put in res.Local.ReleaseFunc? They shouldn't change in between here and maybeWatchForMerge, should they? The closure could take a single bool about whether we're watching for the merge or not (a false would translate to all zeros to untrack()).

You're correct that the indirection isn't really necessary since we don't need to untrack() in the lease acquisition path (which is what the initial check in maybeWatchForMerge is for)


pkg/kv/kvserver/replica_read.go, line 154 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

This funky method has one caller. Inline it there and then I won't have to ask for a better name and comments.

Done


pkg/kv/kvserver/replica_read.go, line 160 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I don't quite understand the "for the first time" part. Consider axing it.

Done.


pkg/kv/kvserver/replica_read.go, line 162 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think there's something missing here - the other return value from Track is a minimum timestamp for the (Subsume) request. I think we need to reject the Subsume request if its timestamp is above it. I think the comment above should spell out the important of Subsume's ts.

Not applicable as discussed offline.


pkg/kv/kvserver/batcheval/result/result.go, line 69 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think this needs more words; the read-only SubsumeRequest doesn't have a LeaseAppliedIndex.

Removed this, see above.

@aayushshah15 aayushshah15 force-pushed the rangefeed_drop branch 2 times, most recently from b89c7c9 to 5f0b494 Compare June 26, 2020 13:32
@aayushshah15
Copy link
Copy Markdown
Contributor Author

The test that forces a lease transfer during the merge is flakey under contention because it relies on the merge txn not retrying 😞. Trying to fix that..

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

nit: I imagine I'm probably driving you crazy with this this stuff, but I think r4 and r5 should be squashed with r2. The testing really belongs with the commit fixing the issue. And the assertion is tiny, and also tied to the main thing this patch is doing. TestStoreRangeMergeCheckConsistencyAfterSubsumption is more random, that can be a separate commit indeed.

Btw, I also find it much better to rewrite all the commits every time I update a PR (like, rebase them all), so that reviewable shows the new commits as a contiguous block, instead of interspersing old commits and new commits.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/client_merge_test.go, line 1534 at r5 (raw file):

// 2. Once a merge is aborted, pending (and future) requests will be allowed to
// be proposed. This is meant as a sanity check for the assertion at the
// end of Replica.propose().

pls put more words here describing this assertion


pkg/kv/kvserver/client_merge_test.go, line 1599 at r5 (raw file):

	case <-checkConsistencyResp:
		t.Fatalf("expected the consistency check to wait until the merge was complete")
	case <-time.After(1 * time.Second):

you probably want more than a second here and below; tests run on loaded machines, particularly under testrace.
Consider testutils.SucceedsSoon() - that's the go to for waiting a while


pkg/kv/kvserver/closed_timestamp_test.go, line 469 at r4 (raw file):

	t.Run("without lease transfer", func(t *testing.T) {
		ctx := context.Background()
		blockMergeTrigger := make(chan interface{}, 10) // headroom in case the merge txn retries

why interface{} and not hlc.Timestamp ?


pkg/kv/kvserver/replica_raft.go, line 337 at r5 (raw file):

	if maxLeaseIndex != 0 {
		r.mu.RLock()
		defer r.mu.RUnlock()

nit: I'd just put the RUnlock() below, without a defer. You want to hold this lock for as little as possible, and tie it with the mergeInProgressRLocked() call. Like, if there's more code added to this method, you don't want it running under the lock.
If the log.Fatal() is hit, the lock won't be release anyway.


pkg/kv/kvserver/replica_read.go, line 127 at r1 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

Starting to track the proposal after evaluation of the subsume request seemed problematic for the given scenario:
t1: RHS leaseholder begins evaluating the subsume request (i.e. t1 is our FreezeStart)
t2: The leaseholder's store broadcasts a closed timestamp that is above t1
t3: Now we begin tracking the proposal, but it's too late because we've already emitted a closed timestamp above FreezeStart

Does that make sense? Let me know if I'm missing something.

ack


pkg/kv/kvserver/replica_read.go, line 126 at r2 (raw file):

	log.Event(ctx, "executing read-only batch")

	untrack := func(_ context.Context, _ ctpb.Epoch, _ roachpb.RangeID, _ ctpb.LAI) {}

I'd suggest simplifying this untrack() to only take a bool - whether or not the range has been subsumed, and closing over all the other arguments right here.

@aayushshah15
Copy link
Copy Markdown
Contributor Author

aayushshah15 commented Jul 20, 2020

Turns out @andreimatei's current in-flight PRs #51437 & #51395 fix exactly these sorts of issues (relating to the presence of "speculative" descriptors in the cache). I'll stress master & this commit a bit further with his fix patched in.

@aayushshah15
Copy link
Copy Markdown
Contributor Author

In addition to what I just posted in #51294, I also ran into some clock sync related Fatals while stressing kvnemesis.

  • I saw a run failing with this on server start.
clock synchronization error: this node is more than 500ms away from at least half of the known nodes (0 of 1 are within the offset)
  • And a run failing with this:
kvnemesis_test.go:67: error executing 'SET CLUSTER SETTING kv.rangefeed.enabled = true': pq: update-setting: remote wall time is too far ahead (970.559966ms) to be trustworthy

With this warning message in the traces:

 event:util/hlc/hlc.go:260 [n2,s2] backward time jump detected (-0.965141 seconds)

Here are the logs for these:
clock_out_of_sync_onboot.zip
backward_clock_jump.zip

The second one seems..really weird?

@aayushshah15
Copy link
Copy Markdown
Contributor Author

FWIW at this point, I believe I’m not going to find anything in these stress runs that falls under this PR. The exact issue this PR fixes is simulated by the targeted unit test. It’s easy to sink a lot of time looking into these nemeses failures so we should probably let it find stuff in the nightly runs & track them separately anyway.

Currently, RaftHeartbeatIntervalTicks is not configurable from a
TestCluster and we disallow RaftElectionTimeoutTicks to be less than or
equal to it. Thus, tests that use TestCluster and, for instance, sleep
for a node's liveness to expire take longer than they really need to.

Release note: None
Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewed 1 of 23 files at r14, 21 of 21 files at r15, 1 of 1 files at r16.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, and @andreimatei)


pkg/kv/kvserver/replica_closedts.go, line 37 at r6 (raw file):

because it gets interpreted as the range getting quiesced

This doesn't quite capture the conclusion of that conversation. Mind updating it to be a little more clear?


pkg/kv/kvserver/replica_write.go, line 156 at r12 (raw file):

Subsume() cmd_subsume.go

Is this missing an "in"?


pkg/kv/kvserver/batcheval/cmd_compute_checksum.go, line 39 at r16 (raw file):

a rare a closed timestamp

Delete the second "a".

Before this commit, during a merge, the RHS leaseholder’s store could
continue broadcasting (actionable) closed timestamp updates even after
it had been subsumed. This allowed the followers to be able to serve
follower reads past the subsumption time of RHS.  Additionally, after
the merge, if the LHS had a lower closed timestamp than the RHS, it
could allow writes to the keyspan owned by RHS at timestamps lower than
the RHS’s max closed timestamp.

This commit fixes this bug by requiring that the followers catch up to a
LeaseAppliedIndex that belongs to the entry succeeding the Subsume
request. It also adds necessary assertions around the invariant that
while a range is subsumed, nothing (except the merge txn being rolled
back) can bump its lease applied index.

Fixes cockroachdb#44878

Release note (bug fix): Fixed a rare bug that could cause actionable
closed timestamps to effectively regress over a given keyspan. This
could in turn lead to a serializability violation when using follower
reads. This was due to ill-defined interactions between range merges
and the closed timestamp subsystem.
This commit adds a "sanity check" test around the assertion at the
end of Replica.propose() that ensures that the lease applied index
of a subsumed range is never bumped.

Release note: None
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the extremely detailed reviews @nvanbenschoten and @andreimatei

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/replica_closedts.go, line 37 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
because it gets interpreted as the range getting quiesced

This doesn't quite capture the conclusion of that conversation. Mind updating it to be a little more clear?

Done.


pkg/kv/kvserver/replica_write.go, line 156 at r12 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Subsume() cmd_subsume.go

Is this missing an "in"?

Fixed.


pkg/kv/kvserver/batcheval/cmd_compute_checksum.go, line 39 at r16 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
a rare a closed timestamp

Delete the second "a".

Done.

@aayushshah15
Copy link
Copy Markdown
Contributor Author

bors r=nvanbenschoten,andreimatei

aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Aug 4, 2020
We had merges disabled because of the bugs tracked in cockroachdb#44878, but those have
since been fixed by cockroachdb#46085 and cockroachdb#50265.

Release note: None
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 4, 2020

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 4, 2020

Build succeeded:

@craig craig bot merged commit 6305432 into cockroachdb:master Aug 4, 2020
craig bot pushed a commit that referenced this pull request Aug 4, 2020
51305: geo,geoindex: use bounding box for geography coverings that are bad r=sumeerbhola a=sumeerbhola

This change uses a covererInterface implementation for geography
that notices when a covering is using top-level cells of all faces
and in that case uses the bounding box to compute the covering.

Also changed the bounding box calculation for geography shapes to
use only the points and not first construct s2.Regions. The latter
causes marginally bad shapes to continue to have bad coverings since
the bounding box also covers the whole earth.

Release note: None

51882: roachpb: panic when comparing a Lease to a non-lease r=andreimatei a=andreimatei

Release note: None

52146: sql: remove local execution of projectSetNode and implement ConstructProjectSet in the new factory r=yuzefovich a=yuzefovich

Depends on #52108.

**sql: remove local execution of projectSetNode**

We have project set processor which is always planned for
`projectSetNode`, so this commit removes the dead code of its local
execution. Additionally, it removes some unused fields and cleans up
cancellation check of the processor.

Release note: None

**sql: implement ConstructProjectSet in the new factory**

Addresses: #47473.

Release note: None

52320: kvserver: enable merges in kvnemesis r=aayushshah15 a=aayushshah15

We had merges disabled because of the bugs tracked in #44878, but those have
since been fixed by #46085 and #50265.

Release note: None

Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Aayush Shah <aayush.shah15@gmail.com>
@aayushshah15 aayushshah15 deleted the rangefeed_drop branch August 4, 2020 18:14
nvb added a commit to nvb/cockroach that referenced this pull request Feb 27, 2021
Needed for cockroachdb#61137.

This commit updates the manner through which lease transfers (through
`LeaseTransferRequest`) and range merges (through `SubsumeRequest`)
handle the "transfer of power" from their outgoing leaseholder to their
incoming leaseholder. Specifically, it updates the handling of these
requests in order to rationalize the interaction between their
evaluation and the closing of timestamps through the closed timestamp
side-transport. It is now clearer when and how these requests instruct
the outgoing leaseholder to relinquish its ability to advance the closed
timestamp and, as a result, now possible for the requests to query and
operate on the maximum closed timestamp published by the outgoing leaseholder.

For lease transfers, this commit begins by addressing an existing TODO
to push the revocation of the outgoing lease out of `AdminTransferLease`
and into the evaluation of `LeaseTransferRequest` through a new
`RevokeLease` method on the `EvalContext`. Once a lease is revoked, the
side-transport will no longer be able to advance the closed timestamp
under it. This was made possible by cockroachdb#59086 and was suggested by @tbg
during the code review. We generally like to keep replica state changes
out of "admin" requests themselves, which are intended to coordinate
changes through individual non-admin requests. Admin requests generally
don't even need to evaluate on a leaseholder (though they try to), so
having them perform any state changes is fragile.

For range merges, this commit removes the `MaybeWatchForMerge` flag from
the `LocalResult` returned by `SubsumeRequest` and replaces it with a
`WatchForMerge` method on the `EvalContext`. This allows the
`SubsumeRequest` to launch the range merge watcher goroutine during it
evaluation, which the side-transport checks for to see whether a
leaseholder can advance its closed timestamp. In doing so, the
`SubsumeRequest` is able to pause closed timestamps when it wants and is
also able to observe and return the maximum closed timestamp _after_ the
closed timestamp has stopped advancing. This is a stark improvement over
the approach with the original closed timestamp system, which required a
herculean effort in cockroachdb#50265 to make correct.

With these refactors complete, the closed timestamp side-transport
should have a much easier and safer time checking whether a given
leaseholder is able to advance its closed timestamp.

Release justification: Necessary for the safety of new functionality.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

O-community Originated from the community X-blathers-triaged blathers was able to find an owner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rangefeed: dropping events (in catch up scan?)

5 participants