Skip to content

sql/schemachanger: implement proper backfill checkpointing#73861

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
ajwerner:ajwerner/job-progress
Dec 23, 2021
Merged

sql/schemachanger: implement proper backfill checkpointing#73861
craig[bot] merged 2 commits intocockroachdb:masterfrom
ajwerner:ajwerner/job-progress

Conversation

@ajwerner
Copy link
Copy Markdown
Contributor

This commit overhauls how backfills are processed and their state managed
in the new schema changer. It also enables running of backfills in parallel.

Touches #59788.

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@ajwerner ajwerner force-pushed the ajwerner/job-progress branch 3 times, most recently from 0a66363 to a936877 Compare December 20, 2021 13:26
@ajwerner ajwerner marked this pull request as ready for review December 20, 2021 13:26
@ajwerner ajwerner requested review from a team as code owners December 20, 2021 13:27
@ajwerner ajwerner requested review from a team, adityamaru, miretskiy and postamar and removed request for a team and miretskiy December 20, 2021 13:27
@ajwerner ajwerner force-pushed the ajwerner/job-progress branch 2 times, most recently from 2b30d75 to 9aaef74 Compare December 20, 2021 13:39
@adityamaru
Copy link
Copy Markdown
Contributor

Adding Steven and Rui since they've been much closer to the backfiller code these days.

@adityamaru adityamaru requested review from rhu713 and stevendanna and removed request for adityamaru December 20, 2021 14:27
Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the result of my first review pass. I'm going to give it all some deeper thought. Naturally, I can't comment on the bulk-io aspects of this change.

Reviewed 31 of 33 files at r1, 8 of 8 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @rhu713, and @stevendanna)


-- commits, line 7 at r1:
nit: Touches #59788 perhaps?


pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):

  // TODO(ajwerner): There's been talk that SpansToDo is worse than DoneSpans
  // in terms of constructs. Discuss with @dt before merging this whether we
  // want to make moves on that.

I don't think that this is an appropriate TODO note to make. Any reader of this note is left wondering why this choice matters, why you think it might need to be revisited, and what motivated you to leave it for later instead of getting it right the first time around. Note also that references to specific people, in particular, never age well.


pkg/sql/index_backfiller.go, line 47 at r2 (raw file):

func (ib *IndexBackfillPlanner) MaybePrepareDestIndexesForBackfill(
	ctx context.Context, progress scexec.BackfillProgress, td catalog.TableDescriptor,
) (scexec.BackfillProgress, error) {

I find it odd that you don't pass progress by reference in this function, like you do for scanDestIndexesBeforeBackfill.


pkg/sql/schema_change_plan_node.go, line 166 at r2 (raw file):

		execCfg.IndexBackfiller,
		scdeps.NewNoOpBackfillTracker(execCfg.Codec),
		scdeps.NewNoOpPeriodicBackfillProgressFlusher(),

Could you add a comment explaining that these no-op things are OK because we're not going to be doing backfills synchronously? This would help reduce potential surprise.

Code quote:

		scdeps.NewNoOpBackfillTracker(execCfg.Codec),
		scdeps.NewNoOpPeriodicBackfillProgressFlusher(),

pkg/sql/schemachanger/scdeps/exec_deps.go, line 320 at r2 (raw file):

// PeriodicProgressWriter implements the scexec.Dependencies interface.
func (d *execDeps) PeriodicProgressWriter() scexec.PeriodicBackfillProgressFlusher {

Can you use the same very throughout your PR please? Either call these "progress writers" or "progress flushers" but not both.

Personally I don't like the backfillFlushers I've been seeing, it's the progress that's being flushed, not the backfill. It's confusing.


pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):

// BackfillTrackerFactory constructs a new BackfillTracker.
type BackfillTrackerFactory func() scexec.BackfillTracker

The local convention here is to suffix these as "Builder" and not "Factory". I'm not sure which one's correct, but I feel we should try to conform to a pattern at least.


pkg/sql/schemachanger/scdeps/exec_deps.go, line 360 at r2 (raw file):

// contains a full set of SpansToDo corresponding to the source index
// span and an empty MinimumWriteTimestamp.
func NewNoOpBackfillTracker(codec keys.SQLCodec) scexec.BackfillTracker {

I have no idea if it's worth doing but this implementation might conceivably be moved to scexec. It's fine here too, just putting the idea out there.


pkg/sql/schemachanger/scdeps/run_deps.go, line 34 at r2 (raw file):

	internalExecutor sqlutil.InternalExecutor,
	backfiller scexec.Backfiller,
	backfillTracker BackfillTrackerFactory,

s/backfillTracker/backfillTrackerBuilder/ ?

same applies below also


pkg/sql/schemachanger/scdeps/sctestdeps/backfill.go, line 76 at r2 (raw file):

}

type backfiller struct {

nit: can you rename this and all other sctestdeps implementations testBackfiller and the like? The only reason I'm asking is I checked out your branch and looked at the code in goland, and when I look at the interface's implementations I saw backfiller alongside the real implementation and so wondered for a second why there were two implementations.


pkg/sql/schemachanger/scdeps/sctestdeps/test_state.go, line 44 at r2 (raw file):

	backfiller                        scexec.Backfiller
	indexSpanSplitter                 scexec.IndexSpanSplitter
	backfillTracker                   scexec.BackfillTracker

Can you add a comment here explaining why we're dropping interfaces in TestState? Again, it might be surprising at first glance.

Code quote:

	backfiller                        scexec.Backfiller
	indexSpanSplitter                 scexec.IndexSpanSplitter
	backfillTracker                   scexec.BackfillTracker

pkg/sql/schemachanger/scexec/exec_backfill.go, line 67 at r2 (raw file):

				SourceIndexID: table.GetPrimaryIndexID(),
				DestIndexIDs:  []descpb.IndexID{op.IndexID},
			})

Determining the SourceIndexID should happen at op-generation-time, not execution-time. After discussing this offline with you, you agreed to follow up in a subsequent commit in this PR.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 125 at r2 (raw file):

}

func scanDestinationIndexes(

s/scanDestinationIndexes/maybeScanDestinationIndexes/ ?


pkg/sql/schemachanger/scexec/exec_backfill.go, line 131 at r2 (raw file):

	tables map[descpb.ID]catalog.TableDescriptor,
	tracker BackfillProgressWriter,
) (bool, error) {

s/(bool, error)/(didScan bool, err error)/ ?


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

// TODO(ajwerner): Deal with setting the running status to indicate scanning
// and backfilling.

To be fair, the running status of the schema changer job itself could probably do with some love regardless of backfills. Consider moving this TODO somewhere more appropriate and useful.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 28 at r2 (raw file):

// and backfilling.
// TODO(ajwerner): Deal with testing knobs to always flush details and
// progress.

This doesn't seem like the right location for this TODO. Shouldn't this be located at the affected knobs and/or tests?


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 29 at r2 (raw file):

// TODO(ajwerner): Deal with testing knobs to always flush details and
// progress.
// TODO(ajwerner): Avoid writing checkpoints which don't contain updates.

Please clarify.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 52 at r2 (raw file):

}

type backfillTrackerConfig struct {

Consider commenting that this thing exists in order to make unit-testing easier.


pkg/sql/schemachanger/backfill_test.go, line 27 at r2 (raw file):

)

func TestBackfillCheckpoints(t *testing.T) {

Can you add a comment to describe what this test does?


pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):

					require.NotNil(t, tab)
					fooID = tab.GetID()
					mut := tabledesc.NewBuilder(tab.TableDesc()).BuildExistingMutableTable()

fyi you can now do tab.NewBuilder() directly


pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 235 at r2 (raw file):

					SpansToDo:             []roachpb.Span{bar.IndexSpan(keys.SystemSQLCodec, 1)},
				}
				log.Infof(ctx, "hi")

detritus

@ajwerner ajwerner force-pushed the ajwerner/job-progress branch from 9aaef74 to 2833033 Compare December 20, 2021 20:16
Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @postamar, @rhu713, and @stevendanna)


-- commits, line 7 at r1:

Previously, postamar (Marius Posta) wrote…

nit: Touches #59788 perhaps?

Done.


pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

I don't think that this is an appropriate TODO note to make. Any reader of this note is left wondering why this choice matters, why you think it might need to be revisited, and what motivated you to leave it for later instead of getting it right the first time around. Note also that references to specific people, in particular, never age well.

I left this note for me to sort out on this review. cc @dt


pkg/sql/index_backfiller.go, line 47 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

I find it odd that you don't pass progress by reference in this function, like you do for scanDestIndexesBeforeBackfill.

I've inlined that function here now. There was no reason for it to exist.


pkg/sql/schema_change_plan_node.go, line 166 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Could you add a comment explaining that these no-op things are OK because we're not going to be doing backfills synchronously? This would help reduce potential surprise.

Done.


pkg/sql/schemachanger/scdeps/exec_deps.go, line 320 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Can you use the same very throughout your PR please? Either call these "progress writers" or "progress flushers" but not both.

Personally I don't like the backfillFlushers I've been seeing, it's the progress that's being flushed, not the backfill. It's confusing.

Done.


pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

The local convention here is to suffix these as "Builder" and not "Factory". I'm not sure which one's correct, but I feel we should try to conform to a pattern at least.

My 🚲🏠 is that factories construct stateful object-y things and builders construct data things. 🤷


pkg/sql/schemachanger/scdeps/exec_deps.go, line 360 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

I have no idea if it's worth doing but this implementation might conceivably be moved to scexec. It's fine here too, just putting the idea out there.

Slight preference for it here.


pkg/sql/schemachanger/scdeps/run_deps.go, line 34 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

s/backfillTracker/backfillTrackerBuilder/ ?

same applies below also

done.


pkg/sql/schemachanger/scdeps/sctestdeps/backfill.go, line 76 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: can you rename this and all other sctestdeps implementations testBackfiller and the like? The only reason I'm asking is I checked out your branch and looked at the code in goland, and when I look at the interface's implementations I saw backfiller alongside the real implementation and so wondered for a second why there were two implementations.

Done.


pkg/sql/schemachanger/scdeps/sctestdeps/test_state.go, line 44 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Can you add a comment here explaining why we're dropping interfaces in TestState? Again, it might be surprising at first glance.

Done.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 67 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Determining the SourceIndexID should happen at op-generation-time, not execution-time. After discussing this offline with you, you agreed to follow up in a subsequent commit in this PR.

Done.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 125 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

s/scanDestinationIndexes/maybeScanDestinationIndexes/ ?

Done.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 131 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

s/(bool, error)/(didScan bool, err error)/ ?

Done. Also, parallelized this and added commentary.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

To be fair, the running status of the schema changer job itself could probably do with some love regardless of backfills. Consider moving this TODO somewhere more appropriate and useful.

I'd like to rebase my change on yours. I think this comment belongs on your new job registry abstraction.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 28 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

This doesn't seem like the right location for this TODO. Shouldn't this be located at the affected knobs and/or tests?

I've removed this TODO. It's not really relevant. We use the referenced cluster settings in two old schema changer tests that we won't be porting.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 29 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Please clarify.

Did it.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 52 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Consider commenting that this thing exists in order to make unit-testing easier.

Done.


pkg/sql/schemachanger/backfill_test.go, line 27 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Can you add a comment to describe what this test does?

Done.


pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

fyi you can now do tab.NewBuilder() directly

It'd have to be tab.NewBuilder().(tabledesc.TableDescriptorBuilder).BuildExistingMutableTable() which doesn't seem better.


pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 235 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

detritus

Done.

Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my concerns. I have nothing left to complain about at this time :)

LGTM if the bulk-io reviewers are happy with it too.

Reviewed 3 of 39 files at r3, 40 of 41 files at r4, 14 of 14 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @postamar, @rhu713, and @stevendanna)


pkg/sql/schema_change_plan_node.go, line 165 at r5 (raw file):

		execCfg.JobRegistry,
		execCfg.IndexBackfiller,
		// User a no-op tracker and flusher because while backfilling in a

nit: typo: s/User/Use


pkg/sql/schemachanger/scdeps/exec_deps.go, line 354 at r2 (raw file):

Previously, ajwerner wrote…

My 🚲🏠 is that factories construct stateful object-y things and builders construct data things. 🤷

I'll stick to that convention too henceforth, I like it.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

Previously, ajwerner wrote…

I'd like to rebase my change on yours. I think this comment belongs on your new job registry abstraction.

Sounds good to me.


pkg/sql/schemachanger/scexec/exec_backfill_test.go, line 188 at r2 (raw file):

Previously, ajwerner wrote…

It'd have to be tab.NewBuilder().(tabledesc.TableDescriptorBuilder).BuildExistingMutableTable() which doesn't seem better.

Oh, right.

Copy link
Copy Markdown
Contributor

@rhu713 rhu713 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good from my perspective, though I'd also want to get input from @stevendanna as well.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @postamar, and @stevendanna)


pkg/sql/schemachanger/scexec/dependencies.go, line 124 at r5 (raw file):

	// MaybePrepareDestIndexesForBackfill to construct a properly initialized
	// progress.
	BackfillIndex(

This interface for the index backfiller looks good to me.

@ajwerner ajwerner force-pushed the ajwerner/job-progress branch 3 times, most recently from d752caa to 9ef4c93 Compare December 22, 2021 03:55
Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this another, deeper look. I'm happy with this change and my comments are all code hygiene concerns.

Reviewed 37 of 38 files at r9, 16 of 16 files at r10, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)


pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

// BackfillTrackerFactory constructs a new BackfillTracker.
type BackfillTrackerFactory func() scexec.BackfillTracker

I looked into this a bit and I don't see why this type needs to exist. Instead, it looks like the BackfillTracker instance can be created in the WithTxnInJob implementation. This requires moving stuff out of scjob and into scdeps but that doesn't feel wrong at all, quite the opposite in fact. There will be dependency import issue with backfillTrackerConfig but this can be moved to scsqldeps.

What's motivating my comment is that these dependency injection constructors like NewJobRunDependencies are heavy enough as they are, if we can avoid adding more args then why not.

While we're at it let's also make scsqldeps a child package of scdeps...


pkg/sql/schemachanger/scexec/dependencies.go, line 213 at r10 (raw file):

	// GetBackfillProgress reads the backfill progress for the specified backfill.
	// If no such backfill has been stored previously, this call will return a
	// new BackfillProgress without

incomplete comment


pkg/sql/schemachanger/scexec/exec_backfill.go, line 31 at r10 (raw file):

func executeBackfillOps(ctx context.Context, deps Dependencies, execute []scop.Op) (err error) {
	backfillsToExecute := extractBackfillsFromOps(execute)

Is there any interest in having the ordering within this data structure be more deterministic? I'm pointing that out because the sort in mergeBackfillFromSameSource isn't stable. I'm guessing no, because everything meaningful appears to be done concurrently. This is perhaps worth pointing out in a comment.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 194 at r10 (raw file):

	g, gCtx := errgroup.WithContext(ctx)
	for i := range progresses {
		p := progresses[i] // copy for closure

This pattern occurs a few times, consider introducing something like

func (bps []BackfillProgress) forEachProgressConcurrently(func (bp *BackfillProgress) error) error

?


pkg/sql/schemachanger/scexec/exec_backfill.go, line 216 at r10 (raw file):

}

func executeBackfill(

nit: names: you have a runBackfills function and an executeBackfill function


pkg/sql/schemachanger/scexec/exec_backfill.go, line 225 at r10 (raw file):

) error {
	// Split off the index span prior to backfilling.
	// TODO(ajwerner): Consider parallelizing splits.

Just curious, what would a scenario look like in which which we'd be penalized for doing so sequentially? I guess what I'm really curious about is how high can the number of destination indexes realistically go.

@ajwerner ajwerner force-pushed the ajwerner/job-progress branch from 8a32a0e to 5452e15 Compare December 22, 2021 19:19
Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit to the end of the chain which addresses the SpansToDo->CompletedSpan transition. I left it as its own commit just so we could look at it in isolation, I'll squash it down before merging.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)


pkg/jobs/jobspb/jobs.proto, line 538 at r1 (raw file):

Previously, ajwerner wrote…

I left this note for me to sort out on this review. cc @dt

I spoke with @dt and confirmed his preference. I've since typed it that way, it's a little bit worse in my esti


pkg/sql/schema_change_plan_node.go, line 165 at r5 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: typo: s/User/Use

Done.


pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

I looked into this a bit and I don't see why this type needs to exist. Instead, it looks like the BackfillTracker instance can be created in the WithTxnInJob implementation. This requires moving stuff out of scjob and into scdeps but that doesn't feel wrong at all, quite the opposite in fact. There will be dependency import issue with backfillTrackerConfig but this can be moved to scsqldeps.

What's motivating my comment is that these dependency injection constructors like NewJobRunDependencies are heavy enough as they are, if we can avoid adding more args then why not.

While we're at it let's also make scsqldeps a child package of scdeps...

I didn't think I could import sql in scdeps. I chose not to create a new package because this felt job-only. I'll do something here.


pkg/sql/schemachanger/scjob/backfill_tracker.go, line 26 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Sounds good to me.

Done.

@ajwerner ajwerner force-pushed the ajwerner/job-progress branch 3 times, most recently from 4048094 to 55aa377 Compare December 22, 2021 23:36
Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'm happy with where this ended up. I'm going to bors it when it builds. @stevendanna's comments can be applied as follow-up.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)


pkg/sql/schemachanger/scexec/dependencies.go, line 213 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

incomplete comment

Done.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 31 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

Is there any interest in having the ordering within this data structure be more deterministic? I'm pointing that out because the sort in mergeBackfillFromSameSource isn't stable. I'm guessing no, because everything meaningful appears to be done concurrently. This is perhaps worth pointing out in a comment.

The only lack of stability was in the dest index ID order. I've sorted that at the end. Made sure the stability is checked by shuffling in the test.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 194 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

This pattern occurs a few times, consider introducing something like

func (bps []BackfillProgress) forEachProgressConcurrently(func (bp *BackfillProgress) error) error

?

Done.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 216 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: names: you have a runBackfills function and an executeBackfill function

Done.


pkg/sql/schemachanger/scexec/exec_backfill.go, line 225 at r10 (raw file):

Previously, postamar (Marius Posta) wrote…

Just curious, what would a scenario look like in which which we'd be penalized for doing so sequentially? I guess what I'm really curious about is how high can the number of destination indexes realistically go.

Probably not that high. It's one per statement on unique column or unique constraint I believe.

Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, whoops, forgot to move the implementations out of the job. Will do that now.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)

@ajwerner ajwerner force-pushed the ajwerner/job-progress branch from 55aa377 to 2e4a6eb Compare December 23, 2021 00:39
Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, code has moved, I think it's in a happy place now. The job gains now very little code. Most of the logic is in scdeps. Thanks for the push.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)

Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @stevendanna)


pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

Previously, ajwerner wrote…

I didn't think I could import sql in scdeps. I chose not to create a new package because this felt job-only. I'll do something here.

Done.

This commit overhauls how backfills are processed and their state managed
in the new schema changer. It also enables running of backfills in parallel.

Touches cockroachdb#59788.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/job-progress branch from 2e4a6eb to 5898a6a Compare December 23, 2021 00:52
@ajwerner
Copy link
Copy Markdown
Contributor Author

'tis the season. Thanks for the reviews.

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 23, 2021

Build succeeded:

@craig craig bot merged commit 8d4953e into cockroachdb:master Dec 23, 2021
Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done.

Reviewed 52 of 52 files at r11, 24 of 38 files at r12, 15 of 15 files at r13, 13 of 15 files at r14, 2 of 2 files at r15, 4 of 4 files at r16, 16 of 16 files at r18, 1 of 1 files at r19, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/schemachanger/scdeps/exec_deps.go, line 374 at r10 (raw file):

Previously, ajwerner wrote…

I didn't think I could import sql in scdeps. I chose not to create a new package because this felt job-only. I'll do something here.

Thanks! Understood.

rafiss added a commit to rafiss/cockroach that referenced this pull request Mar 26, 2025
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index
backfiller will never select a new read timestamp. That is contrary to
the goals of cockroachdb#63945, where we initially added the ability to advance the
read timestamp forward.

This mistake is why we were seeing behavior where a GC threshold error
would cause the backfill to retry infinitely (which in turn led us to treat
as a permanent error in cockroachdb#139203, which was something we never should have
done). We never noticed the bug because when CREATE INDEX was added to the
declarative schema changer in cockroachdb#92128, we turned off the declarative schema
changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC).

This commit makes it so we choose a current timestamp as the read
timestamp when planning the backfill, and fixes up that test to show
that this works for index backfills in the declarative schema changer.

Release note (bug fix): Fixed a bug where a GC threshold error (which
appears as "batch timestamp must be after replica GC threshold ...")
could cause a schema change that backfills data to fail. Now, the error
will cause the backfill to be retried at a higher timestamp to avoid the
error.
craig bot pushed a commit that referenced this pull request Mar 26, 2025
143270: kvserver: better obs in TestTxnReadWithinUncertaintyIntervalAfterRangeMerge r=tbg a=tbg

Closes #143260.

This is a complex test with a rare failure mode. Trace all of the relevant operations so that we can meaningfully engage with it.

Epic: none
Release note: none

143451: sql,backfill: avoid holding a protected timestamp during the backfill r=rafiss a=rafiss

### sql,rowexec: use WriteTS as timestamp for AddSSTable in backfill

PR #64023 made it so we use the backfill's read TS as the batch TS for
the AddSSTable operation. Simultaneously with that change, #63945 was
being worked on, which made it so that we use a fixed _write_ timestamp
for the keys that are backfilled. This write timestamp is the actual
minimum lower bound timestamp for the backfill operation, and the read
timstamp is allowed to change, so it makes more sense to use the write
timestamp for the AddSSTable operation. Looking at the diffs in the PRs
makes it seem pretty likely that this was just missed as part of a
trivial branch skew.

This commit does not cause any behavior change, since at the time of
writing, the read timestamp and write timestamp are always identical for
the backfill.

Release note: None

### sql,backfill: choose new read timestamp when backfill job is resumed

When #73861 got merged, we seem to have accidentally made it so the index
backfiller will never select a new read timestamp. That is contrary to
the goals of #63945, where we initially added the ability to advance the
read timestamp forward.

This mistake is why we were seeing behavior where a GC threshold error
would cause the backfill to retry infinitely (which in turn led us to treat
as a permanent error in #139203, which was something we never should have
done). We never noticed the bug because when CREATE INDEX was added to the
declarative schema changer in #92128, we turned off the declarative schema
changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC).

This commit makes it so we choose a current timestamp as the read
timestamp when planning the backfill, and fixes up that test to show
that this works for index backfills in the declarative schema changer.

Release note (bug fix): Fixed a bug where a GC threshold error (which
appears as "batch timestamp must be after replica GC threshold ...")
could cause a schema change that backfills data to fail. Now, the error
will cause the backfill to be retried at a higher timestamp to avoid the
error.

### sql, backfill: use a current timestamp to scan each chunk of the source index

There's no need for the backfill to use a fixed read timestamp for the
entire job. Instead, each chunk of the original index that needs to be
scanned can use a current timestamp. This allows us to remove all the
logic for adding protected timestamps for the backfill.

Release note (performance improvement): Schema changes that require
data to be backfilled no longer hold a protected timestamp for the
entire duration of the backfill, which means there is less overhead
caused by MVCC garbage collection after the backfill completes.

### sql: stop setting readAsOf in index backfiller

As of the parent commit, this is unused by the index backfiller.

Release note: None

---

fixes #140629
informs #142339
informs #142117
informs #141773

Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
rafiss added a commit to rafiss/cockroach that referenced this pull request Mar 26, 2025
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index
backfiller will never select a new read timestamp. That is contrary to
the goals of cockroachdb#63945, where we initially added the ability to advance the
read timestamp forward.

This mistake is why we were seeing behavior where a GC threshold error
would cause the backfill to retry infinitely (which in turn led us to treat
as a permanent error in cockroachdb#139203, which was something we never should have
done). We never noticed the bug because when CREATE INDEX was added to the
declarative schema changer in cockroachdb#92128, we turned off the declarative schema
changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC).

This commit makes it so we choose a current timestamp as the read
timestamp when planning the backfill, and fixes up that test to show
that this works for index backfills in the declarative schema changer.

Release note (bug fix): Fixed a bug where a GC threshold error (which
appears as "batch timestamp must be after replica GC threshold ...")
could cause a schema change that backfills data to fail. Now, the error
will cause the backfill to be retried at a higher timestamp to avoid the
error.
rafiss added a commit to rafiss/cockroach that referenced this pull request Mar 27, 2025
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index
backfiller will never select a new read timestamp. That is contrary to
the goals of cockroachdb#63945, where we initially added the ability to advance the
read timestamp forward.

This mistake is why we were seeing behavior where a GC threshold error
would cause the backfill to retry infinitely (which in turn led us to treat
as a permanent error in cockroachdb#139203, which was something we never should have
done). We never noticed the bug because when CREATE INDEX was added to the
declarative schema changer in cockroachdb#92128, we turned off the declarative schema
changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC).

This commit makes it so we choose a current timestamp as the read
timestamp when planning the backfill, and fixes up that test to show
that this works for index backfills in the declarative schema changer.

Release note (bug fix): Fixed a bug where a GC threshold error (which
appears as "batch timestamp must be after replica GC threshold ...")
could cause a schema change that backfills data to fail. Now, the error
will cause the backfill to be retried at a higher timestamp to avoid the
error.
rafiss added a commit to rafiss/cockroach that referenced this pull request Mar 27, 2025
When cockroachdb#73861 got merged, we seem to have accidentally made it so the index
backfiller will never select a new read timestamp. That is contrary to
the goals of cockroachdb#63945, where we initially added the ability to advance the
read timestamp forward.

This mistake is why we were seeing behavior where a GC threshold error
would cause the backfill to retry infinitely (which in turn led us to treat
as a permanent error in cockroachdb#139203, which was something we never should have
done). We never noticed the bug because when CREATE INDEX was added to the
declarative schema changer in cockroachdb#92128, we turned off the declarative schema
changer in the test that would have caught this (i.e. TestIndexBackfillAfterGC).

This commit makes it so we choose a current timestamp as the read
timestamp when planning the backfill, and fixes up that test to show
that this works for index backfills in the declarative schema changer.

Release note (bug fix): Fixed a bug where a GC threshold error (which
appears as "batch timestamp must be after replica GC threshold ...")
could cause a schema change that backfills data to fail. Now, the error
will cause the backfill to be retried at a higher timestamp to avoid the
error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants