Skip to content

kvserver: emit AddSSTable events via rangefeeds#73487

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
erikgrinaker:addsstable-rangefeed-propagate
Jan 27, 2022
Merged

kvserver: emit AddSSTable events via rangefeeds#73487
craig[bot] merged 3 commits intocockroachdb:masterfrom
erikgrinaker:addsstable-rangefeed-propagate

Conversation

@erikgrinaker
Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker commented Dec 5, 2021

testutils: add sstutil package

This patch adds a testutils/sstutil package with SST-related test
utilities.

Release note: None

kvserver: disconnect rangefeeds on MVCC history mutations

This patch adds a field MVCCHistoryMutation to ReplicatedEvalResult,
which disconnects any rangefeeds that overlap with the command span.
Callers are expected to ensure there are no rangefeeds over such spans,
but if they fail to do so it is better to error out rather than silently
pretend like nothing happened. The field is set for AddSSTable,
ClearRange, and RevertRange requests.

This error is exposed via the OnInternalError callback for
rangefeed.Factory-based rangefeed clients. However, if this callback
is not set, these clients will silently stop processing events. This is
unfortunate, but follows from the API design. Most callers do not appear
to set such a callback, and it is left for the owning teams to update
the usage of the library with appropriate error handling. Furthermore,
this error detection requires knowledge about the new
MVCCHistoryMutationError error type, so older nodes in mixed-version
clusters will simply treat this as a retryable error.

Touches #70434.

Release note: None

kvserver: emit AddSSTable events via rangefeeds

This patch emits AddSSTable events across rangefeeds when the
ingestion was done with WriteAtRequestTimestamp enabled (since this
ensures they respect the closed timestamp). This setting is introduced
with 22.1, and callers must check the MVCCAddSStable version gate
before using it, so by extension this event is only emitted once the
entire cluster runs 22.1.

Clients built via rangefeed.Factory have a new WithOnSSTable option
that can be used to register a callback for these events. If such a
callback is not set, the rangefeed will run a catchup scan that includes
the values written by the AddSSTable request.

The entire SST is emitted in binary form, regardless of how the
registration span overlaps it -- it is up to callers to prune the SST
contents as appropriate. Previous values are not included for keys
replaced by the SST.

Resolves #70434.

Release note: None


Initial draft. A few outstanding questions:

  1. Changefeeds will error on these SST events. We currently do not expect to ingest SSTs into online tables, and thus we do not expect changefeeds to see these events. Now that we have MVCC-compliant AddSSTable we could begin to use AddSSTable into online tables, and thus the changefeeds would need to handle this (note here that the SST events do not contain previous value diffs), but I figure we can cross that bridge when we get there. Does this make sense?

  2. Internal consumers that subscribe via rangefeed.Factory must register an OnSSTable handler, otherwise they will not be notified about such events. I'm not sure if we ever expect to ingest SSTs into e.g. online system spans and such, but we may want to consider sending such subscribers an error if they haven't registered OnSSTable rather than silently ignoring them.

  3. These events are likely to be much larger than regular key/value events, and thus there is a greater risk of excessive memory usage since the buffer size per replica is 4096 events. The SSTs are already in memory due to Raft application, but buffering them here can cause them to stick around and pile up. We should consider adding memory accounting here, and use a store-wide memory limit rather than a per-replica fixed-size queue. See: rangefeed: memory accounting and budgeting #73616.

@erikgrinaker erikgrinaker requested review from a team, ajwerner, dt and miretskiy December 5, 2021 19:47
@erikgrinaker erikgrinaker self-assigned this Dec 5, 2021
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 18 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @erikgrinaker)


-- commits, line 7 at r1:
I don't see anywhere in this commit that checks for MVCCAddSSTable setting.
What's the intended use case? Do callers (i.e. rangefeed library users) ought to check
for this setting before enabled OnSST event? Just trying to clarify correct usage.


pkg/kv/kvclient/rangefeed/config.go, line 115 at r1 (raw file):

}

// OnSST is called when a SSTable is ingested.

I was going to suggest that perhaps we should drop this option and SST handling since sst handling is not "trivial"...
But maybe just expand the comment to indicate how subtle OnSST needs to be. In particular, the fact that it may
include keys outside span boundaries.


pkg/kv/kvclient/rangefeed/rangefeed.go, line 367 at r1 (raw file):

				// FIXME(erikgrinaker): Callers who do not set onSSTable will be oblivious
				// to these events -- we may want to send such callers an error and force
				// them to do a catchup scan.

I like the idea of returning an error to cause catchup scan very much.
Or we need to provide a default impl that iterates sst, while ignoring keys outside span boundaries. But
forcing catchup scan sounds a lot better. the caller decides , if they know what they are doing, to ask for SSTs;
otherwise things continue to work.

Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @miretskiy)


-- commits, line 7 at r1:

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I don't see anywhere in this commit that checks for MVCCAddSSTable setting.
What's the intended use case? Do callers (i.e. rangefeed library users) ought to check
for this setting before enabled OnSST event? Just trying to clarify correct usage.

It is up to AddSSTable RPC callers to check MVCCAddSSTable before using WriteAtRequestTimestamp. The server-side logic needs to respect WriteAtRequestTimestamp regardless of the version gate in order to handle the non-atomic rollout of version gates across the cluster, as described here:

// At the same time, with requests/RPCs originating at other crdb nodes, the
// initiator of the request gets to decide what's supported. A node should
// not refuse functionality on the grounds that its view of the version gate
// is as yet inactive. Consider the sender:
//
// func invokeSomeRPC(req) {
// if (specific-version is active) {
// // Like mentioned above, this implies that all nodes in the
// // cluster are running binaries that can handle this new
// // feature. We may have learned about this fact before the
// // node on the other end. This is due to the fact that migration
// // manager informs each node about the specific-version being
// // activated active concurrently. See BumpClusterVersion for
// // where that happens. Still, it's safe for us to enable the new
// // feature flags as we trust the recipient to know how to deal
// // with it.
// req.NewFeatureFlag = true
// }
// send(req)
// }
//
// And consider the recipient:
//
// func someRPC(req) {
// if !req.NewFeatureFlag {
// // Legacy behavior...
// }
// // There's no need to even check if the specific-version is active.
// // If the flag is enabled, the specific-version must have been
// // activated, even if we haven't yet heard about it (we will pretty
// // soon).
// }
//
// See clusterversion.Handle.IsActive and usage of some existing versions
// below for more clues on the matter.

Once that gate starts being activated, all cluster nodes (including rangefeed consumers) will already be running 22.1, so even though they haven't observed the MVCCAddSSTable flip they should be able to consume these events correctly.


pkg/kv/kvclient/rangefeed/config.go, line 115 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I was going to suggest that perhaps we should drop this option and SST handling since sst handling is not "trivial"...
But maybe just expand the comment to indicate how subtle OnSST needs to be. In particular, the fact that it may
include keys outside span boundaries.

We need to pass SSTs across rangefeeds in 22.1 to prepare for streaming cluster-to-cluster replication. Aborting these and forcing catchup-scans on every AddSSTable would likely be too expensive and cause the replication stream to lag significantly (consider that e.g. index builds will be firing off these requests continually for several hours). So I think it'll have to be up to callers to decide if and how they want to handle them.


pkg/kv/kvclient/rangefeed/rangefeed.go, line 367 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I like the idea of returning an error to cause catchup scan very much.
Or we need to provide a default impl that iterates sst, while ignoring keys outside span boundaries. But
forcing catchup scan sounds a lot better. the caller decides , if they know what they are doing, to ask for SSTs;
otherwise things continue to work.

Yeah, it's probably better to error rather than silently ignore them -- I'll just have to audit all of the current internal callers (i.e. non-changefeed users) to make sure they'll actually run catchup-scans when they receive errors.

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @erikgrinaker, and @miretskiy)


pkg/kv/kvclient/rangefeed/config.go, line 115 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

We need to pass SSTs across rangefeeds in 22.1 to prepare for streaming cluster-to-cluster replication. Aborting these and forcing catchup-scans on every AddSSTable would likely be too expensive and cause the replication stream to lag significantly (consider that e.g. index builds will be firing off these requests continually for several hours). So I think it'll have to be up to callers to decide if and how they want to handle them.

well, I think the assumption here is that replication stream will use this library; it may very well do so, but that's not a given.
So, I just want to make sure that the existing users of this library behave correctly. That's why I sort of wanted to expand the comment. It's not a simple "kv" even like before.

@miretskiy miretskiy self-requested a review December 6, 2021 14:12
Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @miretskiy)


pkg/kv/kvclient/rangefeed/config.go, line 115 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

well, I think the assumption here is that replication stream will use this library; it may very well do so, but that's not a given.
So, I just want to make sure that the existing users of this library behave correctly. That's why I sort of wanted to expand the comment. It's not a simple "kv" even like before.

Yep, totally. But I think that if rangefeeds are to include SSTs, then those events should be accessible via this library as well. But yeah, we should look into how to best integrate with existing callers (probably error by default, as discussed elsewhere), and I'll make sure to expand the comment.

Data: sst,
CRC32: util.CRC32(sst),
Span: roachpb.Span{Key: start.Key, EndKey: end.Key},
EmitRangefeed: args.WriteAtRequestTimestamp,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it'd make sense to persist this bool as "AtWriteTimestamp" or something like that, and then below instead of skipping non-mvcc AddSSTables, go ahead and send them to the feed until we get to ev.SST != nil and then there, if the SST was not AtWriteTimestamp, fail the feed because we got an unexpected violation of the mvcc semantics that feed is relying on?

Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker Dec 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might make sense to call this AtRequestTimestamp and defer the rangefeed policy elsewhere, such that the data describes the SST rather than prescribing downstream actions. Even though it may be a bit unfortunate to bury this policy deep in the bowels of Raft state machine application.

However, given the size of these SSTs, I'd rather not ship all of them through the rangefeed machinery unnecessarily. Also, we only propagate the write timestamp into the rangefeed, with the assumption that all MVCC timestamps in the SST are equal to it:

func (r *Replica) handleSSTableRaftMuLocked(
ctx context.Context, sst []byte, sstSpan roachpb.Span, writeTS hlc.Timestamp,
) {

It seems unnecessary to add a boolean parameter here, and then assert that the passed parameter is always true. Shouldn't it be the caller's responsibility to only call the method when appropriate? Downstream consumers can (and probably should) also check the actual MVCC timestamps against the claimed write timestamp to assert that MVCC invariants are upheld, if it's cheap enough.

Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker Dec 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then again, maybe we could move the policy into Processor.ConsumeSSTable such that it ignores !AtWriteTimestamp. Seems more appropriate there than in the Raft core.

Copy link
Copy Markdown
Contributor

@dt dt Dec 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial suggestion was be to move it further beyond raft, all the way to *RangeFeed.processEvents, where it would be an error if an sst event arrived that was not marked as AtRequestTimestamp. If we were just ignoring them at that point I'd be more concerned about the cost of shipping them all the way through just to discard them, but I was imaging that if we kick out an assertion failure error or something when the very first one arrives, and that tears down the feed, then that cost isn't a big deal since it quickly, and importantly loudly, comes to a stop?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but I think I'd prefer to assert that on the server-side at least, e.g. inside the rangefeed processor at Processor.publishSSTable, rather than waiting for it to make its way out to the client and assert it there. In the latter case, we'd have to implement this in all clients (not just the Factory-constructed ones which aren't always used), along with the necessary plumbing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, that make sense to me. The main thing I was interested in is that an unexpected non-mvcc AddSSTable be an error, that shuts the feed down, rather than silently ignored.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe do this for all other non-MVCC operations such ClearRange and RevertRange too? I.e. signal to the rangefeed that we applied a non-MVCC operation, and disconnect it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds about right to me, but would want @miretskiy to confirm, and I don't know if I'd hold up this change for it? (whenever we do it, might be good to have an escape-hatch cluster setting to instead ignore the event as we would today in case I'm wrong)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good to me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, see separate commit which disconnects rangefeeds with a permanent MVCCHistoryMutationError for operations that mutate MVCC history -- specifically AddSSTable when not writing at the request timestamp, ClearRange, and RevertRange. This does not apply to MVCC-compliant AddSSTable requests that write at the request timestamp, which are emitted across rangefeeds.

Note that for rangefeed.Factory-based clients, this is surfaced via OnInternalError due to the current API design -- unfortunately, most callers do not set this callback, and will silently ignore the error even though the rangefeed stops. The owning teams should add appropriate error handling here.

@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch from f85b78a to b3a84a2 Compare December 7, 2021 11:01
@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch 3 times, most recently from 2a0693d to 8925146 Compare December 23, 2021 14:48
@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch 7 times, most recently from 546689c to 9c733d1 Compare December 30, 2021 18:18
Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @miretskiy)


pkg/kv/kvclient/rangefeed/config.go, line 115 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Yep, totally. But I think that if rangefeeds are to include SSTs, then those events should be accessible via this library as well. But yeah, we should look into how to best integrate with existing callers (probably error by default, as discussed elsewhere), and I'll make sure to expand the comment.

Expanded the comment with relevant details.


pkg/kv/kvclient/rangefeed/rangefeed.go, line 367 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Yeah, it's probably better to error rather than silently ignore them -- I'll just have to audit all of the current internal callers (i.e. non-changefeed users) to make sure they'll actually run catchup-scans when they receive errors.

Done, we run catchup scans unless callers set OnSSTable. This only applies to MVCC-compliant AddSSTable, i.e. with WriteAtRequestTimestamp set. For legacy AddSSTable, we emit a permanent MVCCHistoryMutationError that this client library exposes via OnInternalError (which unfortunately most callers do not currently handle).

@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch 3 times, most recently from 3e053bb to 37341d9 Compare January 5, 2022 21:11
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

I think this should be ready for review now, barring any test failures.

@erikgrinaker erikgrinaker marked this pull request as ready for review January 5, 2022 21:13
@erikgrinaker erikgrinaker requested review from a team as code owners January 5, 2022 21:13
@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch from 37341d9 to b08ca06 Compare January 6, 2022 09:27
@miretskiy miretskiy requested a review from dt January 6, 2022 16:33
Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very nice. I think @dt and @ajwerner should also do a pass.

Reviewed 3 of 23 files at r3, 9 of 20 files at r5, 1 of 2 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @erikgrinaker, and @miretskiy)


pkg/kv/kvclient/rangefeed/config.go, line 149 at r6 (raw file):

// Also note that AddSSTable requests that do not set the
// WriteAtRequestTimestamp flag, possibly writing below the closed timestamp,
// will cause affected rangefeeds to be disconnected with MVCCHistoryMutationError

nit: perhaps add a terminal before MVCCHIstoryMutationError?


pkg/kv/kvclient/rangefeed/rangefeed.go, line 283 at r6 (raw file):

		start := timeutil.Now()

		// Note that the below channel send will not block forever because

not even sure which channel this refers anymore. I suspect it's talking about producer/consumer blocking.


pkg/kv/kvclient/rangefeed/rangefeed.go, line 288 at r6 (raw file):

		// returning. Because of this, only processEvents may use the cancel
		// function generated below, since the caller must also consume the errCh
		// error.

This logic could probably be simplified somewhat by simply creating errCh w/ buffer for 1 error.
Probably something to do for another time.


pkg/kv/kvclient/rangefeed/rangefeed.go, line 293 at r6 (raw file):

			errCh <- f.client.RangeFeed(ctx, f.spans, ts, f.withDiff, eventCh)
		}); err != nil {
			rfCancel()

This actually makes me a bit sad. I have to jump through the code to verify that we're not leaking context here.
(usually, you always have to call defer rfCancel() immediately after WithCancel -- and actually, I think we we might fail to call rfCancel in some code paths.)
And even if you can convince me that processEvents will always call rfCancel() eventually, I'd say that it's still brittle. To be clear: it's not your code per-say, I think it's because of how this errCh communication was setup.

I wonder why we have to use stopper here at all. We already run this go routine under stopper (with cancel on quiesce). I wonder if another go routine could help us out w/ a more straightforward code.
Something like:

rangefeedTask := func(ctx context.Context) error {
   return f.client.RangeFeed(ctx, f.spans, ts, f.withDiff, eventCh)
}
processEventsTask := func(ctx2 context.Context) error {
   return f.processEvents(ctx, frontier, eventCh)
}

err := ctxgroup.GoAndWait(ctx, rangefeedTask, processEventsTask)

I don't think we'd need errCh at all in this case. Just return the error from any of the go routines. That cancels the other go routine, and you get the original error.


pkg/kv/kvclient/rangefeed/rangefeed.go, line 365 at r6 (raw file):

					rfCancel()
					<-errCh
					return errors.New("received unexpected rangefeed SST event with no OnSSTable handler")

do you think errors.AssertionFailedf might be a better match?


pkg/kv/kvserver/replica_application_state_machine.go, line 623 at r6 (raw file):

	// expected to ensure that no rangefeeds are currently active across such
	// spans, but as a safeguard we disconnect the overlapping rangefeeds
	// with a non-retriable error anyway.

❤️ the comment. Thank you.


pkg/kv/kvserver/replica_application_state_machine.go, line 660 at r6 (raw file):

			b.r.writeStats.recordCount(float64(added), 0)
		}
		if res.AddSSTable.AtWriteTimestamp {

either here or at handleSSTRaftMuLocked, do we need a TODO comment to add memory accounting/limits?


pkg/kv/kvclient/rangefeed/rangefeed_external_test.go, line 504 at r6 (raw file):

	select {
	case <-checkpointC:
	case <-time.After(time.Second):

isn't it a bit aggressive? under stress?

@miretskiy miretskiy self-requested a review January 6, 2022 16:33
@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch from b08ca06 to ca30a07 Compare January 6, 2022 22:20
Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @dt, and @miretskiy)


pkg/kv/kvclient/rangefeed/rangefeed.go, line 293 at r6 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

This actually makes me a bit sad. I have to jump through the code to verify that we're not leaking context here.
(usually, you always have to call defer rfCancel() immediately after WithCancel -- and actually, I think we we might fail to call rfCancel in some code paths.)
And even if you can convince me that processEvents will always call rfCancel() eventually, I'd say that it's still brittle. To be clear: it's not your code per-say, I think it's because of how this errCh communication was setup.

I wonder why we have to use stopper here at all. We already run this go routine under stopper (with cancel on quiesce). I wonder if another go routine could help us out w/ a more straightforward code.
Something like:

rangefeedTask := func(ctx context.Context) error {
   return f.client.RangeFeed(ctx, f.spans, ts, f.withDiff, eventCh)
}
processEventsTask := func(ctx2 context.Context) error {
   return f.processEvents(ctx, frontier, eventCh)
}

err := ctxgroup.GoAndWait(ctx, rangefeedTask, processEventsTask)

I don't think we'd need errCh at all in this case. Just return the error from any of the go routines. That cancels the other go routine, and you get the original error.

Nice, I quite like this -- updated the code. I definitely share your distaste for the initial attempt, and you're right that it would leak a context.

However, I believe RunAsyncTask does a lot of other stuff too, such as setting up tracing spans. But maybe that doesn't matter as much here. Wdyt @andreimatei?


pkg/kv/kvclient/rangefeed/rangefeed.go, line 365 at r6 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

do you think errors.AssertionFailedf might be a better match?

Yeah, I agree.


pkg/kv/kvserver/replica_application_state_machine.go, line 660 at r6 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

either here or at handleSSTRaftMuLocked, do we need a TODO comment to add memory accounting/limits?

I think it's probably a bit overkill, but added a comment on handleSSTableRaftMuLocked. I'd like to implement this in the near future anyway.


pkg/kv/kvclient/rangefeed/rangefeed_external_test.go, line 504 at r6 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

isn't it a bit aggressive? under stress?

Possibly. Bumped them to 3s.

@erikgrinaker
Copy link
Copy Markdown
Contributor Author

@dt @ajwerner Interested in doing a pass over this?

@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch 3 times, most recently from 974f108 to 8b03192 Compare January 27, 2022 08:58
This patch adds a `testutils/sstutil` package with SST-related test
utilities.

Release note: None
This patch adds a field `MVCCHistoryMutation` to `ReplicatedEvalResult`,
which disconnects any rangefeeds that overlap with the command span.
Callers are expected to ensure there are no rangefeeds over such spans,
but if they fail to do so it is better to error out rather than silently
pretend like nothing happened. The field is set for `AddSSTable`,
`ClearRange`, and `RevertRange` requests.

This error is exposed via the `OnInternalError` callback for
`rangefeed.Factory`-based rangefeed clients. However, if this callback
is not set, these clients will silently stop processing events. This is
unfortunate, but follows from the API design. Most callers do not appear
to set such a callback, and it is left for the owning teams to update
the usage of the library with appropriate error handling. Furthermore,
this error detection requires knowledge about the new
`MVCCHistoryMutationError` error type, so older nodes in mixed-version
clusters will simply treat this as a retryable error.

Release note: None
@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch from 8b03192 to af6a7eb Compare January 27, 2022 08:58
This patch emits `AddSSTable` events across rangefeeds when the
ingestion was done with `WriteAtRequestTimestamp` enabled (since this
ensures they respect the closed timestamp). This setting is introduced
with 22.1, and callers must check the `MVCCAddSStable` version gate
before using it, so by extension this event is only emitted once the
entire cluster runs 22.1.

Clients built via `rangefeed.Factory` have a new `WithOnSSTable` option
that can be used to register a callback for these events. If such a
callback is not set, the rangefeed will run a catchup scan that includes
the values written by the `AddSSTable` request.

The entire SST is emitted in binary form, regardless of how the
registration span overlaps it -- it is up to callers to prune the SST
contents as appropriate. Previous values are not included for keys
replaced by the SST.

Release note: None
@erikgrinaker erikgrinaker force-pushed the addsstable-rangefeed-propagate branch from af6a7eb to f048ab0 Compare January 27, 2022 09:34
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

TFTR! For posterity, @ajwerner and @dt deferred to @miretskiy.

bors r=miretskiy

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 27, 2022

Build succeeded:

@craig craig bot merged commit 11da0cf into cockroachdb:master Jan 27, 2022
@erikgrinaker erikgrinaker deleted the addsstable-rangefeed-propagate branch February 14, 2022 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

kvserver: rangefeed handling of AddSSTable

4 participants