sql/schemachanger: implement proper backfill checkpointing#73861
sql/schemachanger: implement proper backfill checkpointing#73861craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
0a66363 to
a936877
Compare
2b30d75 to
9aaef74
Compare
|
Adding Steven and Rui since they've been much closer to the backfiller code these days. |
postamar
left a comment
There was a problem hiding this comment.
This is the result of my first review pass. I'm going to give it all some deeper thought. Naturally, I can't comment on the bulk-io aspects of this change.
Reviewed 31 of 33 files at r1, 8 of 8 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @rhu713, and @stevendanna)
-- commits, line 7 at r1:
nit: Touches #59788 perhaps?
pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):
// TODO(ajwerner): There's been talk that SpansToDo is worse than DoneSpans // in terms of constructs. Discuss with @dt before merging this whether we // want to make moves on that.
I don't think that this is an appropriate TODO note to make. Any reader of this note is left wondering why this choice matters, why you think it might need to be revisited, and what motivated you to leave it for later instead of getting it right the first time around. Note also that references to specific people, in particular, never age well.
pkg/sql/index_backfiller.go, line 47 at r2 (raw file):
func (ib *IndexBackfillPlanner) MaybePrepareDestIndexesForBackfill( ctx context.Context, progress scexec.BackfillProgress, td catalog.TableDescriptor, ) (scexec.BackfillProgress, error) {
I find it odd that you don't pass progress by reference in this function, like you do for scanDestIndexesBeforeBackfill.
pkg/sql/schema_change_plan_node.go, line 166 at r2 (raw file):
execCfg.IndexBackfiller, scdeps.NewNoOpBackfillTracker(execCfg.Codec), scdeps.NewNoOpPeriodicBackfillProgressFlusher(),
Could you add a comment explaining that these no-op things are OK because we're not going to be doing backfills synchronously? This would help reduce potential surprise.
Code quote:
scdeps.NewNoOpBackfillTracker(execCfg.Codec),
scdeps.NewNoOpPeriodicBackfillProgressFlusher(),pkg/sql/schemachanger/scdeps/exec_deps.go, line 320 at r2 (raw file):
// PeriodicProgressWriter implements the scexec.Dependencies interface. func (d *execDeps) PeriodicProgressWriter() scexec.PeriodicBackfillProgressFlusher {
Can you use the same very throughout your PR please? Either call these "progress writers" or "progress flushers" but not both.
Personally I don't like the backfillFlushers I've been seeing, it's the progress that's being flushed, not the backfill. It's confusing.
pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):
// BackfillTrackerFactory constructs a new BackfillTracker. type BackfillTrackerFactory func() scexec.BackfillTracker
The local convention here is to suffix these as "Builder" and not "Factory". I'm not sure which one's correct, but I feel we should try to conform to a pattern at least.
pkg/sql/schemachanger/scdeps/exec_deps.go, line 360 at r2 (raw file):
// contains a full set of SpansToDo corresponding to the source index // span and an empty MinimumWriteTimestamp. func NewNoOpBackfillTracker(codec keys.SQLCodec) scexec.BackfillTracker {
I have no idea if it's worth doing but this implementation might conceivably be moved to scexec. It's fine here too, just putting the idea out there.
pkg/sql/schemachanger/scdeps/run_deps.go, line 34 at r2 (raw file):
internalExecutor sqlutil.InternalExecutor, backfiller scexec.Backfiller, backfillTracker BackfillTrackerFactory,
s/backfillTracker/backfillTrackerBuilder/ ?
same applies below also
pkg/sql/schemachanger/scdeps/sctestdeps/backfill.go, line 76 at r2 (raw file):
} type backfiller struct {
nit: can you rename this and all other sctestdeps implementations testBackfiller and the like? The only reason I'm asking is I checked out your branch and looked at the code in goland, and when I look at the interface's implementations I saw backfiller alongside the real implementation and so wondered for a second why there were two implementations.
pkg/sql/schemachanger/scdeps/sctestdeps/test_state.go, line 44 at r2 (raw file):
backfiller scexec.Backfiller indexSpanSplitter scexec.IndexSpanSplitter backfillTracker scexec.BackfillTracker
Can you add a comment here explaining why we're dropping interfaces in TestState? Again, it might be surprising at first glance.
Code quote:
backfiller scexec.Backfiller
indexSpanSplitter scexec.IndexSpanSplitter
backfillTracker scexec.BackfillTrackerpkg/sql/schemachanger/scexec/exec_backfill.go, line 67 at r2 (raw file):
SourceIndexID: table.GetPrimaryIndexID(), DestIndexIDs: []descpb.IndexID{op.IndexID}, })
Determining the SourceIndexID should happen at op-generation-time, not execution-time. After discussing this offline with you, you agreed to follow up in a subsequent commit in this PR.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 125 at r2 (raw file):
} func scanDestinationIndexes(
s/scanDestinationIndexes/maybeScanDestinationIndexes/ ?
pkg/sql/schemachanger/scexec/exec_backfill.go, line 131 at r2 (raw file):
tables map[descpb.ID]catalog.TableDescriptor, tracker BackfillProgressWriter, ) (bool, error) {
s/(bool, error)/(didScan bool, err error)/ ?
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):
// TODO(ajwerner): Deal with setting the running status to indicate scanning // and backfilling.
To be fair, the running status of the schema changer job itself could probably do with some love regardless of backfills. Consider moving this TODO somewhere more appropriate and useful.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 28 at r2 (raw file):
// and backfilling. // TODO(ajwerner): Deal with testing knobs to always flush details and // progress.
This doesn't seem like the right location for this TODO. Shouldn't this be located at the affected knobs and/or tests?
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 29 at r2 (raw file):
// TODO(ajwerner): Deal with testing knobs to always flush details and // progress. // TODO(ajwerner): Avoid writing checkpoints which don't contain updates.
Please clarify.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 52 at r2 (raw file):
} type backfillTrackerConfig struct {
Consider commenting that this thing exists in order to make unit-testing easier.
pkg/sql/schemachanger/backfill_test.go, line 27 at r2 (raw file):
) func TestBackfillCheckpoints(t *testing.T) {
Can you add a comment to describe what this test does?
pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):
require.NotNil(t, tab) fooID = tab.GetID() mut := tabledesc.NewBuilder(tab.TableDesc()).BuildExistingMutableTable()
fyi you can now do tab.NewBuilder() directly
pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 235 at r2 (raw file):
SpansToDo: []roachpb.Span{bar.IndexSpan(keys.SystemSQLCodec, 1)}, } log.Infof(ctx, "hi")
detritus
9aaef74 to
2833033
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @postamar, @rhu713, and @stevendanna)
Previously, postamar (Marius Posta) wrote…
nit:
Touches #59788perhaps?
Done.
pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):
Previously, postamar (Marius Posta) wrote…
I don't think that this is an appropriate TODO note to make. Any reader of this note is left wondering why this choice matters, why you think it might need to be revisited, and what motivated you to leave it for later instead of getting it right the first time around. Note also that references to specific people, in particular, never age well.
I left this note for me to sort out on this review. cc @dt
pkg/sql/index_backfiller.go, line 47 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
I find it odd that you don't pass
progressby reference in this function, like you do forscanDestIndexesBeforeBackfill.
I've inlined that function here now. There was no reason for it to exist.
pkg/sql/schema_change_plan_node.go, line 166 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Could you add a comment explaining that these no-op things are OK because we're not going to be doing backfills synchronously? This would help reduce potential surprise.
Done.
pkg/sql/schemachanger/scdeps/exec_deps.go, line 320 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Can you use the same very throughout your PR please? Either call these "progress writers" or "progress flushers" but not both.
Personally I don't like the
backfillFlushers I've been seeing, it's the progress that's being flushed, not the backfill. It's confusing.
Done.
pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
The local convention here is to suffix these as "Builder" and not "Factory". I'm not sure which one's correct, but I feel we should try to conform to a pattern at least.
My 🚲🏠 is that factories construct stateful object-y things and builders construct data things. 🤷
pkg/sql/schemachanger/scdeps/exec_deps.go, line 360 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
I have no idea if it's worth doing but this implementation might conceivably be moved to
scexec. It's fine here too, just putting the idea out there.
Slight preference for it here.
pkg/sql/schemachanger/scdeps/run_deps.go, line 34 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
s/backfillTracker/backfillTrackerBuilder/?same applies below also
done.
pkg/sql/schemachanger/scdeps/sctestdeps/backfill.go, line 76 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
nit: can you rename this and all other
sctestdepsimplementationstestBackfillerand the like? The only reason I'm asking is I checked out your branch and looked at the code in goland, and when I look at the interface's implementations I sawbackfilleralongside the real implementation and so wondered for a second why there were two implementations.
Done.
pkg/sql/schemachanger/scdeps/sctestdeps/test_state.go, line 44 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Can you add a comment here explaining why we're dropping interfaces in
TestState? Again, it might be surprising at first glance.
Done.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 67 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Determining the
SourceIndexIDshould happen at op-generation-time, not execution-time. After discussing this offline with you, you agreed to follow up in a subsequent commit in this PR.
Done.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 125 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
s/scanDestinationIndexes/maybeScanDestinationIndexes/?
Done.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 131 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
s/(bool, error)/(didScan bool, err error)/?
Done. Also, parallelized this and added commentary.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
To be fair, the running status of the schema changer job itself could probably do with some love regardless of backfills. Consider moving this TODO somewhere more appropriate and useful.
I'd like to rebase my change on yours. I think this comment belongs on your new job registry abstraction.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 28 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
This doesn't seem like the right location for this TODO. Shouldn't this be located at the affected knobs and/or tests?
I've removed this TODO. It's not really relevant. We use the referenced cluster settings in two old schema changer tests that we won't be porting.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 29 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Please clarify.
Did it.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 52 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Consider commenting that this thing exists in order to make unit-testing easier.
Done.
pkg/sql/schemachanger/backfill_test.go, line 27 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Can you add a comment to describe what this test does?
Done.
pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
fyi you can now do
tab.NewBuilder()directly
It'd have to be tab.NewBuilder().(tabledesc.TableDescriptorBuilder).BuildExistingMutableTable() which doesn't seem better.
pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 235 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
detritus
Done.
postamar
left a comment
There was a problem hiding this comment.
Thanks for addressing my concerns. I have nothing left to complain about at this time :)
LGTM if the bulk-io reviewers are happy with it too.
Reviewed 3 of 39 files at r3, 40 of 41 files at r4, 14 of 14 files at r5, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @postamar, @rhu713, and @stevendanna)
pkg/sql/schema_change_plan_node.go, line 165 at r5 (raw file):
execCfg.JobRegistry, execCfg.IndexBackfiller, // User a no-op tracker and flusher because while backfilling in a
nit: typo: s/User/Use
pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):
Previously, ajwerner wrote…
My 🚲🏠 is that factories construct stateful object-y things and builders construct data things. 🤷
I'll stick to that convention too henceforth, I like it.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):
Previously, ajwerner wrote…
I'd like to rebase my change on yours. I think this comment belongs on your new job registry abstraction.
Sounds good to me.
pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):
Previously, ajwerner wrote…
It'd have to be
tab.NewBuilder().(tabledesc.TableDescriptorBuilder).BuildExistingMutableTable()which doesn't seem better.
Oh, right.
rhu713
left a comment
There was a problem hiding this comment.
This PR looks good from my perspective, though I'd also want to get input from @stevendanna as well.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @postamar, and @stevendanna)
pkg/sql/schemachanger/scexec/dependencies.go, line 124 at r5 (raw file):
// MaybePrepareDestIndexesForBackfill to construct a properly initialized // progress. BackfillIndex(
This interface for the index backfiller looks good to me.
d752caa to
9ef4c93
Compare
postamar
left a comment
There was a problem hiding this comment.
I gave this another, deeper look. I'm happy with this change and my comments are all code hygiene concerns.
Reviewed 37 of 38 files at r9, 16 of 16 files at r10, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)
pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):
// BackfillTrackerFactory constructs a new BackfillTracker. type BackfillTrackerFactory func() scexec.BackfillTracker
I looked into this a bit and I don't see why this type needs to exist. Instead, it looks like the BackfillTracker instance can be created in the WithTxnInJob implementation. This requires moving stuff out of scjob and into scdeps but that doesn't feel wrong at all, quite the opposite in fact. There will be dependency import issue with backfillTrackerConfig but this can be moved to scsqldeps.
What's motivating my comment is that these dependency injection constructors like NewJobRunDependencies are heavy enough as they are, if we can avoid adding more args then why not.
While we're at it let's also make scsqldeps a child package of scdeps...
pkg/sql/schemachanger/scexec/dependencies.go, line 213 at r10 (raw file):
// GetBackfillProgress reads the backfill progress for the specified backfill. // If no such backfill has been stored previously, this call will return a // new BackfillProgress without
incomplete comment
pkg/sql/schemachanger/scexec/exec_backfill.go, line 31 at r10 (raw file):
func executeBackfillOps(ctx context.Context, deps Dependencies, execute []scop.Op) (err error) { backfillsToExecute := extractBackfillsFromOps(execute)
Is there any interest in having the ordering within this data structure be more deterministic? I'm pointing that out because the sort in mergeBackfillFromSameSource isn't stable. I'm guessing no, because everything meaningful appears to be done concurrently. This is perhaps worth pointing out in a comment.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 194 at r10 (raw file):
g, gCtx := errgroup.WithContext(ctx) for i := range progresses { p := progresses[i] // copy for closure
This pattern occurs a few times, consider introducing something like
func (bps []BackfillProgress) forEachProgressConcurrently(func (bp *BackfillProgress) error) error
?
pkg/sql/schemachanger/scexec/exec_backfill.go, line 216 at r10 (raw file):
} func executeBackfill(
nit: names: you have a runBackfills function and an executeBackfill function
pkg/sql/schemachanger/scexec/exec_backfill.go, line 225 at r10 (raw file):
) error { // Split off the index span prior to backfilling. // TODO(ajwerner): Consider parallelizing splits.
Just curious, what would a scenario look like in which which we'd be penalized for doing so sequentially? I guess what I'm really curious about is how high can the number of destination indexes realistically go.
8a32a0e to
5452e15
Compare
ajwerner
left a comment
There was a problem hiding this comment.
I added a commit to the end of the chain which addresses the SpansToDo->CompletedSpan transition. I left it as its own commit just so we could look at it in isolation, I'll squash it down before merging.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)
pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):
Previously, ajwerner wrote…
I left this note for me to sort out on this review. cc @dt
I spoke with @dt and confirmed his preference. I've since typed it that way, it's a little bit worse in my esti
pkg/sql/schema_change_plan_node.go, line 165 at r5 (raw file):
Previously, postamar (Marius Posta) wrote…
nit: typo:
s/User/Use
Done.
pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
I looked into this a bit and I don't see why this type needs to exist. Instead, it looks like the
BackfillTrackerinstance can be created in theWithTxnInJobimplementation. This requires moving stuff out ofscjoband intoscdepsbut that doesn't feel wrong at all, quite the opposite in fact. There will be dependency import issue withbackfillTrackerConfigbut this can be moved toscsqldeps.What's motivating my comment is that these dependency injection constructors like
NewJobRunDependenciesare heavy enough as they are, if we can avoid adding more args then why not.While we're at it let's also make
scsqldepsa child package ofscdeps...
I didn't think I could import sql in scdeps. I chose not to create a new package because this felt job-only. I'll do something here.
pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Sounds good to me.
Done.
4048094 to
55aa377
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Alright, I'm happy with where this ended up. I'm going to bors it when it builds. @stevendanna's comments can be applied as follow-up.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)
pkg/sql/schemachanger/scexec/dependencies.go, line 213 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
incomplete comment
Done.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 31 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
Is there any interest in having the ordering within this data structure be more deterministic? I'm pointing that out because the sort in
mergeBackfillFromSameSourceisn't stable. I'm guessing no, because everything meaningful appears to be done concurrently. This is perhaps worth pointing out in a comment.
The only lack of stability was in the dest index ID order. I've sorted that at the end. Made sure the stability is checked by shuffling in the test.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 194 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
This pattern occurs a few times, consider introducing something like
func (bps []BackfillProgress) forEachProgressConcurrently(func (bp *BackfillProgress) error) error?
Done.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 216 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
nit: names: you have a
runBackfillsfunction and anexecuteBackfillfunction
Done.
pkg/sql/schemachanger/scexec/exec_backfill.go, line 225 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
Just curious, what would a scenario look like in which which we'd be penalized for doing so sequentially? I guess what I'm really curious about is how high can the number of destination indexes realistically go.
Probably not that high. It's one per statement on unique column or unique constraint I believe.
ajwerner
left a comment
There was a problem hiding this comment.
Oh, whoops, forgot to move the implementations out of the job. Will do that now.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)
55aa377 to
2e4a6eb
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Okay, code has moved, I think it's in a happy place now. The job gains now very little code. Most of the logic is in scdeps. Thanks for the push.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)
ajwerner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)
pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):
Previously, ajwerner wrote…
I didn't think I could import
sqlinscdeps. I chose not to create a new package because this felt job-only. I'll do something here.
Done.
This commit overhauls how backfills are processed and their state managed in the new schema changer. It also enables running of backfills in parallel. Touches cockroachdb#59788. Release note: None
2e4a6eb to
5898a6a
Compare
|
'tis the season. Thanks for the reviews. bors r+ |
|
Build succeeded: |
postamar
left a comment
There was a problem hiding this comment.
Nicely done.
Reviewed 52 of 52 files at r11, 24 of 38 files at r12, 15 of 15 files at r13, 13 of 15 files at r14, 2 of 2 files at r15, 4 of 4 files at r16, 16 of 16 files at r18, 1 of 1 files at r19, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained
pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):
Previously, ajwerner wrote…
I didn't think I could import
sqlinscdeps. I chose not to create a new package because this felt job-only. I'll do something here.
Thanks! Understood.
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of cockroachdb#63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in cockroachdb#139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in cockroachdb#92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error.
143270: kvserver: better obs in TestTxnReadWithinUncertaintyIntervalAfterRangeMerge r=tbg a=tbg Closes #143260. This is a complex test with a rare failure mode. Trace all of the relevant operations so that we can meaningfully engage with it. Epic: none Release note: none 143451: sql,backfill: avoid holding a protected timestamp during the backfill r=rafiss a=rafiss ### sql,rowexec: use WriteTS as timestamp for AddSSTable in backfill PR #64023 made it so we use the backfill's read TS as the batch TS for the AddSSTable operation. Simultaneously with that change, #63945 was being worked on, which made it so that we use a fixed _write_ timestamp for the keys that are backfilled. This write timestamp is the actual minimum lower bound timestamp for the backfill operation, and the read timstamp is allowed to change, so it makes more sense to use the write timestamp for the AddSSTable operation. Looking at the diffs in the PRs makes it seem pretty likely that this was just missed as part of a trivial branch skew. This commit does not cause any behavior change, since at the time of writing, the read timestamp and write timestamp are always identical for the backfill. Release note: None ### sql,backfill: choose new read timestamp when backfill job is resumed When #73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of #63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in #139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in #92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error. ### sql, backfill: use a current timestamp to scan each chunk of the source index There's no need for the backfill to use a fixed read timestamp for the entire job. Instead, each chunk of the original index that needs to be scanned can use a current timestamp. This allows us to remove all the logic for adding protected timestamps for the backfill. Release note (performance improvement): Schema changes that require data to be backfilled no longer hold a protected timestamp for the entire duration of the backfill, which means there is less overhead caused by MVCC garbage collection after the backfill completes. ### sql: stop setting readAsOf in index backfiller As of the parent commit, this is unused by the index backfiller. Release note: None --- fixes #140629 informs #142339 informs #142117 informs #141773 Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of cockroachdb#63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in cockroachdb#139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in cockroachdb#92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error.
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of cockroachdb#63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in cockroachdb#139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in cockroachdb#92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error.
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of cockroachdb#63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in cockroachdb#139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in cockroachdb#92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error.
This commit overhauls how backfills are processed and their state managed
in the new schema changer. It also enables running of backfills in parallel.
Touches #59788.
Release note: None