sql/schemachanger: implement proper backfill checkpointing by ajwerner · Pull Request #73861 · cockroachdb/cockroach

ajwerner · 2021-12-15T19:16:26Z

This commit overhauls how backfills are processed and their state managed
in the new schema changer. It also enables running of backfills in parallel.

Touches #59788.

Release note: None

cockroach-teamcity · 2021-12-15T19:16:33Z

This change is

adityamaru · 2021-12-20T14:27:27Z

Adding Steven and Rui since they've been much closer to the backfiller code these days.

postamar

This is the result of my first review pass. I'm going to give it all some deeper thought. Naturally, I can't comment on the bulk-io aspects of this change.

Reviewed 31 of 33 files at r1, 8 of 8 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @rhu713, and @stevendanna)

-- commits, line 7 at r1:
nit: Touches #59788 perhaps?

pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):

  // TODO(ajwerner): There's been talk that SpansToDo is worse than DoneSpans
  // in terms of constructs. Discuss with @dt before merging this whether we
  // want to make moves on that.

I don't think that this is an appropriate TODO note to make. Any reader of this note is left wondering why this choice matters, why you think it might need to be revisited, and what motivated you to leave it for later instead of getting it right the first time around. Note also that references to specific people, in particular, never age well.

pkg/sql/index_backfiller.go, line 47 at r2 (raw file):

func (ib *IndexBackfillPlanner) MaybePrepareDestIndexesForBackfill(
	ctx context.Context, progress scexec.BackfillProgress, td catalog.TableDescriptor,
) (scexec.BackfillProgress, error) {

I find it odd that you don't pass progress by reference in this function, like you do for scanDestIndexesBeforeBackfill.

pkg/sql/schema_change_plan_node.go, line 166 at r2 (raw file):

		execCfg.IndexBackfiller,
		scdeps.NewNoOpBackfillTracker(execCfg.Codec),
		scdeps.NewNoOpPeriodicBackfillProgressFlusher(),

Could you add a comment explaining that these no-op things are OK because we're not going to be doing backfills synchronously? This would help reduce potential surprise.

Code quote:

		scdeps.NewNoOpBackfillTracker(execCfg.Codec),
		scdeps.NewNoOpPeriodicBackfillProgressFlusher(),

pkg/sql/schemachanger/scdeps/exec_deps.go, line 320 at r2 (raw file):

// PeriodicProgressWriter implements the scexec.Dependencies interface.
func (d *execDeps) PeriodicProgressWriter() scexec.PeriodicBackfillProgressFlusher {

Can you use the same very throughout your PR please? Either call these "progress writers" or "progress flushers" but not both.

Personally I don't like the backfillFlushers I've been seeing, it's the progress that's being flushed, not the backfill. It's confusing.

pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):

// BackfillTrackerFactory constructs a new BackfillTracker.
type BackfillTrackerFactory func() scexec.BackfillTracker

The local convention here is to suffix these as "Builder" and not "Factory". I'm not sure which one's correct, but I feel we should try to conform to a pattern at least.

pkg/sql/schemachanger/scdeps/exec_deps.go, line 360 at r2 (raw file):

// contains a full set of SpansToDo corresponding to the source index
// span and an empty MinimumWriteTimestamp.
func NewNoOpBackfillTracker(codec keys.SQLCodec) scexec.BackfillTracker {

I have no idea if it's worth doing but this implementation might conceivably be moved to scexec. It's fine here too, just putting the idea out there.

pkg/sql/schemachanger/scdeps/run_deps.go, line 34 at r2 (raw file):

	internalExecutor sqlutil.InternalExecutor,
	backfiller scexec.Backfiller,
	backfillTracker BackfillTrackerFactory,

s/backfillTracker/backfillTrackerBuilder/ ?

same applies below also

pkg/sql/schemachanger/scdeps/sctestdeps/backfill.go, line 76 at r2 (raw file):

}

type backfiller struct {

nit: can you rename this and all other sctestdeps implementations testBackfiller and the like? The only reason I'm asking is I checked out your branch and looked at the code in goland, and when I look at the interface's implementations I saw backfiller alongside the real implementation and so wondered for a second why there were two implementations.

pkg/sql/schemachanger/scdeps/sctestdeps/test_state.go, line 44 at r2 (raw file):

	backfiller                        scexec.Backfiller
	indexSpanSplitter                 scexec.IndexSpanSplitter
	backfillTracker                   scexec.BackfillTracker

Can you add a comment here explaining why we're dropping interfaces in TestState? Again, it might be surprising at first glance.

Code quote:

	backfiller                        scexec.Backfiller
	indexSpanSplitter                 scexec.IndexSpanSplitter
	backfillTracker                   scexec.BackfillTracker

pkg/sql/schemachanger/scexec/exec_backfill.go, line 67 at r2 (raw file):

				SourceIndexID: table.GetPrimaryIndexID(),
				DestIndexIDs:  []descpb.IndexID{op.IndexID},
			})

Determining the SourceIndexID should happen at op-generation-time, not execution-time. After discussing this offline with you, you agreed to follow up in a subsequent commit in this PR.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 125 at r2 (raw file):

}

func scanDestinationIndexes(

s/scanDestinationIndexes/maybeScanDestinationIndexes/ ?

pkg/sql/schemachanger/scexec/exec_backfill.go, line 131 at r2 (raw file):

	tables map[descpb.ID]catalog.TableDescriptor,
	tracker BackfillProgressWriter,
) (bool, error) {

s/(bool, error)/(didScan bool, err error)/ ?

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

// TODO(ajwerner): Deal with setting the running status to indicate scanning
// and backfilling.

To be fair, the running status of the schema changer job itself could probably do with some love regardless of backfills. Consider moving this TODO somewhere more appropriate and useful.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 28 at r2 (raw file):

// and backfilling.
// TODO(ajwerner): Deal with testing knobs to always flush details and
// progress.

This doesn't seem like the right location for this TODO. Shouldn't this be located at the affected knobs and/or tests?

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 29 at r2 (raw file):

// TODO(ajwerner): Deal with testing knobs to always flush details and
// progress.
// TODO(ajwerner): Avoid writing checkpoints which don't contain updates.

Please clarify.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 52 at r2 (raw file):

}

type backfillTrackerConfig struct {

Consider commenting that this thing exists in order to make unit-testing easier.

pkg/sql/schemachanger/backfill_test.go, line 27 at r2 (raw file):

)

func TestBackfillCheckpoints(t *testing.T) {

Can you add a comment to describe what this test does?

pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):

					require.NotNil(t, tab)
					fooID = tab.GetID()
					mut := tabledesc.NewBuilder(tab.TableDesc()).BuildExistingMutableTable()

fyi you can now do tab.NewBuilder() directly

pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 235 at r2 (raw file):

					SpansToDo:             []roachpb.Span{bar.IndexSpan(keys.SystemSQLCodec, 1)},
				}
				log.Infof(ctx, "hi")

detritus

ajwerner

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @postamar, @rhu713, and @stevendanna)

-- commits, line 7 at r1:

Previously, postamar (Marius Posta) wrote…

nit: Touches #59788 perhaps?

Done.

pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

I don't think that this is an appropriate TODO note to make. Any reader of this note is left wondering why this choice matters, why you think it might need to be revisited, and what motivated you to leave it for later instead of getting it right the first time around. Note also that references to specific people, in particular, never age well.

I left this note for me to sort out on this review. cc @dt

pkg/sql/index_backfiller.go, line 47 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

I find it odd that you don't pass progress by reference in this function, like you do for scanDestIndexesBeforeBackfill.

I've inlined that function here now. There was no reason for it to exist.

pkg/sql/schema_change_plan_node.go, line 166 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Could you add a comment explaining that these no-op things are OK because we're not going to be doing backfills synchronously? This would help reduce potential surprise.

Done.

pkg/sql/schemachanger/scdeps/exec_deps.go, line 320 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Can you use the same very throughout your PR please? Either call these "progress writers" or "progress flushers" but not both.

Personally I don't like the backfillFlushers I've been seeing, it's the progress that's being flushed, not the backfill. It's confusing.

Done.

pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

The local convention here is to suffix these as "Builder" and not "Factory". I'm not sure which one's correct, but I feel we should try to conform to a pattern at least.

My 🚲🏠 is that factories construct stateful object-y things and builders construct data things. 🤷

pkg/sql/schemachanger/scdeps/exec_deps.go, line 360 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

I have no idea if it's worth doing but this implementation might conceivably be moved to scexec. It's fine here too, just putting the idea out there.

Slight preference for it here.

pkg/sql/schemachanger/scdeps/run_deps.go, line 34 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

s/backfillTracker/backfillTrackerBuilder/ ?

same applies below also

done.

pkg/sql/schemachanger/scdeps/sctestdeps/backfill.go, line 76 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: can you rename this and all other sctestdeps implementations testBackfiller and the like? The only reason I'm asking is I checked out your branch and looked at the code in goland, and when I look at the interface's implementations I saw backfiller alongside the real implementation and so wondered for a second why there were two implementations.

Done.

pkg/sql/schemachanger/scdeps/sctestdeps/test_state.go, line 44 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Can you add a comment here explaining why we're dropping interfaces in TestState? Again, it might be surprising at first glance.

Done.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 67 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Determining the SourceIndexID should happen at op-generation-time, not execution-time. After discussing this offline with you, you agreed to follow up in a subsequent commit in this PR.

Done.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 125 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

s/scanDestinationIndexes/maybeScanDestinationIndexes/ ?

Done.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 131 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

s/(bool, error)/(didScan bool, err error)/ ?

Done. Also, parallelized this and added commentary.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

To be fair, the running status of the schema changer job itself could probably do with some love regardless of backfills. Consider moving this TODO somewhere more appropriate and useful.

I'd like to rebase my change on yours. I think this comment belongs on your new job registry abstraction.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 28 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

This doesn't seem like the right location for this TODO. Shouldn't this be located at the affected knobs and/or tests?

I've removed this TODO. It's not really relevant. We use the referenced cluster settings in two old schema changer tests that we won't be porting.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 29 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Please clarify.

Did it.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 52 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Consider commenting that this thing exists in order to make unit-testing easier.

Done.

pkg/sql/schemachanger/backfill_test.go, line 27 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Can you add a comment to describe what this test does?

Done.

pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

fyi you can now do tab.NewBuilder() directly

It'd have to be tab.NewBuilder().(tabledesc.TableDescriptorBuilder).BuildExistingMutableTable() which doesn't seem better.

pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 235 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

detritus

Done.

postamar

Thanks for addressing my concerns. I have nothing left to complain about at this time :)

LGTM if the bulk-io reviewers are happy with it too.

Reviewed 3 of 39 files at r3, 40 of 41 files at r4, 14 of 14 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @postamar, @rhu713, and @stevendanna)

pkg/sql/schema_change_plan_node.go, line 165 at r5 (raw file):

		execCfg.JobRegistry,
		execCfg.IndexBackfiller,
		// User a no-op tracker and flusher because while backfilling in a

nit: typo: s/User/Use

pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):

Previously, ajwerner wrote…

My 🚲🏠 is that factories construct stateful object-y things and builders construct data things. 🤷

I'll stick to that convention too henceforth, I like it.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

Previously, ajwerner wrote…

I'd like to rebase my change on yours. I think this comment belongs on your new job registry abstraction.

Sounds good to me.

pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):

Previously, ajwerner wrote…

It'd have to be tab.NewBuilder().(tabledesc.TableDescriptorBuilder).BuildExistingMutableTable() which doesn't seem better.

Oh, right.

rhu713

This PR looks good from my perspective, though I'd also want to get input from @stevendanna as well.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @postamar, and @stevendanna)

pkg/sql/schemachanger/scexec/dependencies.go, line 124 at r5 (raw file):

	// MaybePrepareDestIndexesForBackfill to construct a properly initialized
	// progress.
	BackfillIndex(

This interface for the index backfiller looks good to me.

postamar

I gave this another, deeper look. I'm happy with this change and my comments are all code hygiene concerns.

Reviewed 37 of 38 files at r9, 16 of 16 files at r10, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)

pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

// BackfillTrackerFactory constructs a new BackfillTracker.
type BackfillTrackerFactory func() scexec.BackfillTracker

I looked into this a bit and I don't see why this type needs to exist. Instead, it looks like the BackfillTracker instance can be created in the WithTxnInJob implementation. This requires moving stuff out of scjob and into scdeps but that doesn't feel wrong at all, quite the opposite in fact. There will be dependency import issue with backfillTrackerConfig but this can be moved to scsqldeps.

What's motivating my comment is that these dependency injection constructors like NewJobRunDependencies are heavy enough as they are, if we can avoid adding more args then why not.

While we're at it let's also make scsqldeps a child package of scdeps...

pkg/sql/schemachanger/scexec/dependencies.go, line 213 at r10 (raw file):

	// GetBackfillProgress reads the backfill progress for the specified backfill.
	// If no such backfill has been stored previously, this call will return a
	// new BackfillProgress without

incomplete comment

pkg/sql/schemachanger/scexec/exec_backfill.go, line 31 at r10 (raw file):

func executeBackfillOps(ctx context.Context, deps Dependencies, execute []scop.Op) (err error) {
	backfillsToExecute := extractBackfillsFromOps(execute)

Is there any interest in having the ordering within this data structure be more deterministic? I'm pointing that out because the sort in mergeBackfillFromSameSource isn't stable. I'm guessing no, because everything meaningful appears to be done concurrently. This is perhaps worth pointing out in a comment.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 194 at r10 (raw file):

	g, gCtx := errgroup.WithContext(ctx)
	for i := range progresses {
		p := progresses[i] // copy for closure

This pattern occurs a few times, consider introducing something like

func (bps []BackfillProgress) forEachProgressConcurrently(func (bp *BackfillProgress) error) error

?

pkg/sql/schemachanger/scexec/exec_backfill.go, line 216 at r10 (raw file):

}

func executeBackfill(

nit: names: you have a runBackfills function and an executeBackfill function

pkg/sql/schemachanger/scexec/exec_backfill.go, line 225 at r10 (raw file):

) error {
	// Split off the index span prior to backfilling.
	// TODO(ajwerner): Consider parallelizing splits.

Just curious, what would a scenario look like in which which we'd be penalized for doing so sequentially? I guess what I'm really curious about is how high can the number of destination indexes realistically go.

ajwerner

I added a commit to the end of the chain which addresses the SpansToDo->CompletedSpan transition. I left it as its own commit just so we could look at it in isolation, I'll squash it down before merging.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)

pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):

Previously, ajwerner wrote…

I left this note for me to sort out on this review. cc @dt

I spoke with @dt and confirmed his preference. I've since typed it that way, it's a little bit worse in my esti

pkg/sql/schema_change_plan_node.go, line 165 at r5 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: typo: s/User/Use

Done.

pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

I looked into this a bit and I don't see why this type needs to exist. Instead, it looks like the BackfillTracker instance can be created in the WithTxnInJob implementation. This requires moving stuff out of scjob and into scdeps but that doesn't feel wrong at all, quite the opposite in fact. There will be dependency import issue with backfillTrackerConfig but this can be moved to scsqldeps.

What's motivating my comment is that these dependency injection constructors like NewJobRunDependencies are heavy enough as they are, if we can avoid adding more args then why not.

While we're at it let's also make scsqldeps a child package of scdeps...

I didn't think I could import sql in scdeps. I chose not to create a new package because this felt job-only. I'll do something here.

pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Sounds good to me.

Done.

ajwerner

Alright, I'm happy with where this ended up. I'm going to bors it when it builds. @stevendanna's comments can be applied as follow-up.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)

pkg/sql/schemachanger/scexec/dependencies.go, line 213 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

incomplete comment

Done.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 31 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

Is there any interest in having the ordering within this data structure be more deterministic? I'm pointing that out because the sort in mergeBackfillFromSameSource isn't stable. I'm guessing no, because everything meaningful appears to be done concurrently. This is perhaps worth pointing out in a comment.

The only lack of stability was in the dest index ID order. I've sorted that at the end. Made sure the stability is checked by shuffling in the test.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 194 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

This pattern occurs a few times, consider introducing something like
func (bps []BackfillProgress) forEachProgressConcurrently(func (bp *BackfillProgress) error) error
?

Done.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 216 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: names: you have a runBackfills function and an executeBackfill function

Done.

pkg/sql/schemachanger/scexec/exec_backfill.go, line 225 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

Just curious, what would a scenario look like in which which we'd be penalized for doing so sequentially? I guess what I'm really curious about is how high can the number of destination indexes realistically go.

Probably not that high. It's one per statement on unique column or unique constraint I believe.

ajwerner

Oh, whoops, forgot to move the implementations out of the job. Will do that now.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)

ajwerner

Okay, code has moved, I think it's in a happy place now. The job gains now very little code. Most of the logic is in scdeps. Thanks for the push.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)

ajwerner

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)

pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

Previously, ajwerner wrote…

I didn't think I could import sql in scdeps. I chose not to create a new package because this felt job-only. I'll do something here.

Done.

This commit overhauls how backfills are processed and their state managed in the new schema changer. It also enables running of backfills in parallel. Touches cockroachdb#59788. Release note: None

ajwerner · 2021-12-23T02:48:54Z

'tis the season. Thanks for the reviews.

bors r+

craig · 2021-12-23T03:32:23Z

Build succeeded:

GitHub CI (Cockroach)

postamar

Nicely done.

Reviewed 52 of 52 files at r11, 24 of 38 files at r12, 15 of 15 files at r13, 13 of 15 files at r14, 2 of 2 files at r15, 4 of 4 files at r16, 16 of 16 files at r18, 1 of 1 files at r19, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained

pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

Previously, ajwerner wrote…

I didn't think I could import sql in scdeps. I chose not to create a new package because this felt job-only. I'll do something here.

Thanks! Understood.

When cockroachdb#73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of cockroachdb#63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in cockroachdb#139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in cockroachdb#92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error.

143270: kvserver: better obs in TestTxnReadWithinUncertaintyIntervalAfterRangeMerge r=tbg a=tbg Closes #143260. This is a complex test with a rare failure mode. Trace all of the relevant operations so that we can meaningfully engage with it. Epic: none Release note: none 143451: sql,backfill: avoid holding a protected timestamp during the backfill r=rafiss a=rafiss ### sql,rowexec: use WriteTS as timestamp for AddSSTable in backfill PR #64023 made it so we use the backfill's read TS as the batch TS for the AddSSTable operation. Simultaneously with that change, #63945 was being worked on, which made it so that we use a fixed _write_ timestamp for the keys that are backfilled. This write timestamp is the actual minimum lower bound timestamp for the backfill operation, and the read timstamp is allowed to change, so it makes more sense to use the write timestamp for the AddSSTable operation. Looking at the diffs in the PRs makes it seem pretty likely that this was just missed as part of a trivial branch skew. This commit does not cause any behavior change, since at the time of writing, the read timestamp and write timestamp are always identical for the backfill. Release note: None ### sql,backfill: choose new read timestamp when backfill job is resumed When #73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of #63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in #139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in #92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error. ### sql, backfill: use a current timestamp to scan each chunk of the source index There's no need for the backfill to use a fixed read timestamp for the entire job. Instead, each chunk of the original index that needs to be scanned can use a current timestamp. This allows us to remove all the logic for adding protected timestamps for the backfill. Release note (performance improvement): Schema changes that require data to be backfilled no longer hold a protected timestamp for the entire duration of the backfill, which means there is less overhead caused by MVCC garbage collection after the backfill completes. ### sql: stop setting readAsOf in index backfiller As of the parent commit, this is unused by the index backfiller. Release note: None --- fixes #140629 informs #142339 informs #142117 informs #141773 Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>

When cockroachdb#73861 got merged, we seem to have accidentally made it so the index backfiller will never select a new read timestamp. That is contrary to the goals of cockroachdb#63945, where we initially added the ability to advance the read timestamp forward. This mistake is why we were seeing behavior where a GC threshold error would cause the backfill to retry infinitely (which in turn led us to treat as a permanent error in cockroachdb#139203, which was something we never should have done). We never noticed the bug because when CREATE INDEX was added to the declarative schema changer in cockroachdb#92128, we turned off the declarative schema changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC). This commit makes it so we choose a current timestamp as the read timestamp when planning the backfill, and fixes up that test to show that this works for index backfills in the declarative schema changer. Release note (bug fix): Fixed a bug where a GC threshold error (which appears as "batch timestamp must be after replica GC threshold ...") could cause a schema change that backfills data to fail. Now, the error will cause the backfill to be retried at a higher timestamp to avoid the error.

ajwerner force-pushed the ajwerner/job-progress branch 3 times, most recently from 0a66363 to a936877 Compare December 20, 2021 13:26

ajwerner marked this pull request as ready for review December 20, 2021 13:26

ajwerner requested review from a team as code owners December 20, 2021 13:27

ajwerner requested review from a team, adityamaru, miretskiy and postamar and removed request for a team and miretskiy December 20, 2021 13:27

ajwerner force-pushed the ajwerner/job-progress branch 2 times, most recently from 2b30d75 to 9aaef74 Compare December 20, 2021 13:39

adityamaru requested review from rhu713 and stevendanna and removed request for adityamaru December 20, 2021 14:27

postamar reviewed Dec 20, 2021

View reviewed changes

ajwerner force-pushed the ajwerner/job-progress branch from 9aaef74 to 2833033 Compare December 20, 2021 20:16

ajwerner commented Dec 20, 2021

View reviewed changes

postamar reviewed Dec 20, 2021

View reviewed changes

rhu713 reviewed Dec 20, 2021

View reviewed changes

ajwerner force-pushed the ajwerner/job-progress branch 3 times, most recently from d752caa to 9ef4c93 Compare December 22, 2021 03:55

postamar reviewed Dec 22, 2021

View reviewed changes

ajwerner force-pushed the ajwerner/job-progress branch from 8a32a0e to 5452e15 Compare December 22, 2021 19:19

ajwerner commented Dec 22, 2021

View reviewed changes

ajwerner force-pushed the ajwerner/job-progress branch 3 times, most recently from 4048094 to 55aa377 Compare December 22, 2021 23:36

ajwerner commented Dec 22, 2021

View reviewed changes

ajwerner force-pushed the ajwerner/job-progress branch from 55aa377 to 2e4a6eb Compare December 23, 2021 00:39

ajwerner commented Dec 23, 2021

View reviewed changes

sql/schemachanger: implement proper backfill checkpointing

5898a6a

This commit overhauls how backfills are processed and their state managed in the new schema changer. It also enables running of backfills in parallel. Touches cockroachdb#59788. Release note: None

ajwerner force-pushed the ajwerner/job-progress branch from 2e4a6eb to 5898a6a Compare December 23, 2021 00:52

craig bot merged commit 8d4953e into cockroachdb:master Dec 23, 2021

postamar reviewed Dec 23, 2021

View reviewed changes

rafiss mentioned this pull request Mar 25, 2025

sql,backfill: avoid holding a protected timestamp during the backfill #143451

Merged

rafiss mentioned this pull request Mar 26, 2025

release-25.1: sql,backfill: avoid holding a protected timestamp during the backfill #143520

Merged

rafiss mentioned this pull request Mar 27, 2025

release-24.3: sql,backfill: avoid holding a protected timestamp during the backfill #143594

Merged

rafiss mentioned this pull request Mar 27, 2025

release-24.1: sql,backfill: avoid holding a protected timestamp during the backfill #143595

Merged

Conversation

ajwerner commented Dec 15, 2021

Uh oh!

cockroach-teamcity commented Dec 15, 2021

Uh oh!

adityamaru commented Dec 20, 2021

Uh oh!

postamar left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

postamar left a comment

Choose a reason for hiding this comment

Uh oh!

rhu713 left a comment

Choose a reason for hiding this comment

Uh oh!

postamar left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner commented Dec 23, 2021

Uh oh!

craig bot commented Dec 23, 2021

Uh oh!

postamar left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants