sql_instance: migrate to rbr compatible index by jeffswenson · Pull Request #98445 · cockroachdb/cockroach

jeffswenson · 2023-03-11T23:39:24Z

Migrate sql_instance to a regional by row compatible index. The version
gates are intended to follow the protocol discussed in the comment at
the top of upgrades/system_rbr_indexes.go

The crdb_region column ID was changed from 5 to 6 in order to match the
logical order in which the sql_addr and crdb_region columns were added.
The exact ID doesn't really matter in this case since the sql_addr
column was added in v23.1.

Most of the rbr migration work is the same for sqlliveness, lease, and
sql_instances. The main exception to that is the migration cache used by
the sql instance reader. The cache is backed by a range feed and we need
to switch implementations when the version setting changes.

Part of #94843

Relase note: None

cockroach-teamcity · 2023-03-11T23:39:31Z

This change is

ajwerner

This is close. Some of the logic is quite subtle and deserves a bit more commentary and another pass to try to make the control flow clearer.

Reviewed 13 of 22 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson)

pkg/sql/sqlinstance/instancestorage/instancecache.go line 223 at r2 (raw file):

// from the old implementation to the new implementation when the version
// changes to V23_1_SystemRbrReadNew.
func newMigrationCache(

this function is a beast.

pkg/sql/sqlinstance/instancestorage/instancecache.go line 253 at r2 (raw file):

		// Use the background context because we don't want RPC cancellation to
		// prevent starting the new range feed.
		err := stopper.RunAsyncTask(context.Background(), "start-new-cache-implementation", func(ctx context.Context) {

nit: annotate the context with logtags. Probably cacheCtx := logtags.AddTags(context.Background(), logtags.FromContext(ctx))

pkg/sql/sqlinstance/instancestorage/instancecache.go line 256 at r2 (raw file):

			log.Ops.Info(ctx, "starting new system.sql_instance cache")

			cache, err := newCache(ctx)

I'd feel better if there were a defer which sent on this channel the return value of a function. In general, I feel like it'd be nicer if you made closures for two functions which send on the channels and lay out the invariants about who is going to send on which channel. After having scrutinized this a bunch, I don't see obvious bugs if there are no panics, but this thing is surely complex.

pkg/sql/sqlinstance/instancestorage/instancecache.go line 306 at r2 (raw file):

	select {
	case err = <-newReady:

I think there's a subtle race here. Imagine that the version change notification happens and then the stopper gets shut down. There's a case where the versionChanged field could get set to true, and the RunAsyncTask code can choose not to run. We'll log a warning but nobody will send on either channel. I think you can fix that by sending on the channel when unable to start the new cache.

All of this code is pretty subtle. My sense is that splitting out some of the logic into helpers can make the behavior a bit clearer.

pkg/sql/tests/system_table_test.go line 223 at r2 (raw file):

		require.NoError(t, desctestutils.TestingValidateSelf(gen))

		switch gen.GetID() {

Explain this in commentary

jeffswenson

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @yuzefovich)

pkg/sql/sqlinstance/instancestorage/instancecache.go line 223 at r2 (raw file):

Previously, ajwerner wrote…

this function is a beast.

I reworked the function and tried to clean up some of the state. Let me know what you think.

pkg/sql/sqlinstance/instancestorage/instancecache.go line 256 at r2 (raw file):

Previously, ajwerner wrote…

I'd feel better if there were a defer which sent on this channel the return value of a function. In general, I feel like it'd be nicer if you made closures for two functions which send on the channels and lay out the invariants about who is going to send on which channel. After having scrutinized this a bunch, I don't see obvious bugs if there are no panics, but this thing is surely complex.

I've reworked the implementation. Let me know if it is still difficult to follow and I will take another crack at it.

pkg/sql/sqlinstance/instancestorage/instancecache.go line 306 at r2 (raw file):

Previously, ajwerner wrote…

I think there's a subtle race here. Imagine that the version change notification happens and then the stopper gets shut down. There's a case where the versionChanged field could get set to true, and the RunAsyncTask code can choose not to run. We'll log a warning but nobody will send on either channel. I think you can fix that by sending on the channel when unable to start the new cache.

All of this code is pretty subtle. My sense is that splitting out some of the logic into helpers can make the behavior a bit clearer.

Done.

pkg/sql/tests/system_table_test.go line 223 at r2 (raw file):

Previously, ajwerner wrote…

Explain this in commentary

Done.

ajwerner

LGTM

pkg/sql/sqlinstance/instancestorage/row_codec_test.go

Migrate sql_instance to a regional by row compatible index. The version gates are intended to follow the protocol discussed in the comment at the top of upgrades/system_rbr_indexes.go The crdb_region column ID was changed from 5 to 6 in order to match the logical order in which the sql_addr and crdb_region columns were added. The exact ID doesn't really matter in this case since the sql_addr column was added in v23.1. Most of the rbr migration work is the same for sqlliveness, lease, and sql_instances. The main exception to that is the migration cache used by the sql instance reader. The cache is backed by a range feed and we need to switch implementations when the version setting changes. Part of cockroachdb#94843 Relase note: None

jeffswenson · 2023-03-14T21:17:29Z

bors r+

craig · 2023-03-14T22:44:16Z

Build failed (retrying...):

Bazel Essential CI (Cockroach)

craig · 2023-03-15T00:18:25Z

Build succeeded:

Bazel Essential CI (Cockroach)

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch 2 times, most recently from 6b69604 to 11a767b Compare March 12, 2023 18:59

jeffswenson changed the title ~~sqlinstance: migrate to rbr compatible table~~ sql_instance: migrate to rbr compatible table Mar 12, 2023

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch 2 times, most recently from f7bbe2e to 3ad8b79 Compare March 12, 2023 21:54

This was referenced Mar 13, 2023

upgradejob: initialize the kvdb on TenantDeps #98489

Merged

settingswatcher: version guard support for clusters bootstrapped at old versions #98495

Merged

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch 4 times, most recently from 63ecb4d to d681cbe Compare March 13, 2023 20:37

jeffswenson changed the title ~~sql_instance: migrate to rbr compatible table~~ sql_instance: migrate to rbr compatible index Mar 13, 2023

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch from d681cbe to df42bba Compare March 13, 2023 20:51

jeffswenson marked this pull request as ready for review March 13, 2023 20:52

jeffswenson requested a review from a team March 13, 2023 20:52

jeffswenson requested review from a team as code owners March 13, 2023 20:52

ajwerner mentioned this pull request Mar 13, 2023

sql: allow system tables to be regional by row tables #65536

Closed

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch from df42bba to 7ccf9e9 Compare March 13, 2023 21:18

ajwerner reviewed Mar 13, 2023

View reviewed changes

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch from 7ccf9e9 to ac50d3e Compare March 14, 2023 13:54

jeffswenson requested a review from a team as a code owner March 14, 2023 13:54

jeffswenson requested a review from yuzefovich March 14, 2023 13:54

jeffswenson commented Mar 14, 2023

View reviewed changes

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch 2 times, most recently from 04895a4 to 192893a Compare March 14, 2023 15:14

ajwerner approved these changes Mar 14, 2023

View reviewed changes

pkg/sql/sqlinstance/instancestorage/row_codec_test.go Outdated Show resolved Hide resolved

jeffswenson force-pushed the jeffswenson-sqlinstance-migration branch from 192893a to c9d2ad8 Compare March 14, 2023 19:43

craig bot merged commit d5ff210 into cockroachdb:master Mar 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql_instance: migrate to rbr compatible index#98445

sql_instance: migrate to rbr compatible index#98445
craig[bot] merged 1 commit intocockroachdb:masterfrom
jeffswenson:jeffswenson-sqlinstance-migration

jeffswenson commented Mar 11, 2023 •

edited

Loading

Uh oh!

cockroach-teamcity commented Mar 11, 2023

Uh oh!

ajwerner left a comment

Uh oh!

jeffswenson left a comment

Uh oh!

ajwerner left a comment

Uh oh!

Uh oh!

jeffswenson commented Mar 14, 2023

Uh oh!

craig bot commented Mar 14, 2023

Uh oh!

craig bot commented Mar 15, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jeffswenson commented Mar 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Mar 11, 2023

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

jeffswenson left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jeffswenson commented Mar 14, 2023

Uh oh!

craig bot commented Mar 14, 2023

Uh oh!

craig bot commented Mar 15, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jeffswenson commented Mar 11, 2023 •

edited

Loading