Skip to content

sql: migration for descriptors fixes#66495

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
RichardJCai:privileges_migration_06152021
Jul 2, 2021
Merged

sql: migration for descriptors fixes#66495
craig[bot] merged 2 commits intocockroachdb:masterfrom
RichardJCai:privileges_migration_06152021

Conversation

@RichardJCai
Copy link
Copy Markdown
Contributor

@RichardJCai RichardJCai commented Jun 15, 2021

This migration fixes corrupted privilege descriptors in the wild resulting
from the fallout of the ZONECONFIG/USAGE bug and the ALTER DATABASE CONVERT TO
SCHEMA bug. Additionally, all privilege descriptors will be updated to explicitly
have Owners. Privilege descriptors created before 20.2 do not have their owner field
set and instead owners are implicitly inferred.

Finally all privilege descriptors will be bumped up to Version21_2 after the migration
so we can strictly validate privileges.

Release note (bugfix): This migration will be run when users upgrade to
21.2 (Specifically 21.2.14).
This migration fixes any privilege descriptors that are corrupted from the
fallout of the ZONECONFIG/USAGE bug on tables and databases after upgrading
from 20.1 -> 20.2 (#65010) and those that are corrupted after converting a
database to a schema (#65697).

Fixes #65012

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@RichardJCai RichardJCai requested review from a team and ajwerner June 15, 2021 19:21
@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 41abe6c to 9772112 Compare June 16, 2021 14:31
Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few comments but I'd like Andrew to take a look at this as well at some point, just to make sure. I'm paranoid about migrations.

Somewhat related and I might be wrong, but the backup case could perhaps be handled by adding a new upgrade step to theRunPostDeserializationChanges method of catalog.DescriptorBuilder, in which case it'd be appropriate to include in this PR.

Copy link
Copy Markdown
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for the release note, i think we can make it something a bit more user-focused. It can just focus on what the end-user would notice, and the other details can stay just in the commit message.

Release note (bug fix): An internal upgrade was added that runs when a cluster is upgraded to v21.2, and which fixes the representation of database object privileges. This fixes a potential source of corruption that could leave the database in an unusable state.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @RichardJCai)


pkg/migration/migrations/fix_privileges_migration.go, line 84 at r1 (raw file):

		if privilegeDesc.Version >= descpb.Version21_2 {
			log.Infof(ctx, "privilege descriptor has already been fixed, skipping %d", upgrade)

nit: will this get logged every time a node starts? maybe we'd want to be less verbose.


pkg/migration/migrations/fix_privileges_migration_external_test.go, line 36 at r1 (raw file):

)

func TestFixPrivilegesMigration(t *testing.T) {

cool tests! nit on adjusting the whtiespace for the sql queries in the comments.

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 14 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RichardJCai)


pkg/migration/migrations/fix_privileges_migration.go, line 37 at r1 (raw file):

	var lastUpgradedID descpb.ID
	for {
		done, idToUpgrade, objectType, err := findNextDescriptorToFix(ctx, d.InternalExecutor, lastUpgradedID)

I find the control flow here to be surprising. It's mostly weird because we're scanning the descriptors table, decoding the thing, then throwing that away, getting it by ID etc.

Instead, what if we just literally scanned the descriptor table and then did the modification in place. I think it would simplify the code a bunch. You can still use descs.Txn to do the writing. catalogkv.NewBuilderWithMVCCTimestamp(&desc, ts) is your friend. Lastly, I think you'll benefit greatly from marking a bit if you did a repair -- I think you can avoid all of the specialized logic that occurs on a per-type basis, keeping all of the complexity in the builder.

@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 9772112 to 38eaed7 Compare June 21, 2021 18:28
Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, did too much cargo culting previously. Reworked so all descriptors are upgraded in one transaction and refactored a bunch of logic including some of the old MaybeFixPrivilege logic, we can hopefully remove it after 21.2

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @rafiss)


pkg/migration/migrations/fix_privileges_migration.go, line 37 at r1 (raw file):

Previously, ajwerner wrote…

I find the control flow here to be surprising. It's mostly weird because we're scanning the descriptors table, decoding the thing, then throwing that away, getting it by ID etc.

Instead, what if we just literally scanned the descriptor table and then did the modification in place. I think it would simplify the code a bunch. You can still use descs.Txn to do the writing. catalogkv.NewBuilderWithMVCCTimestamp(&desc, ts) is your friend. Lastly, I think you'll benefit greatly from marking a bit if you did a repair -- I think you can avoid all of the specialized logic that occurs on a per-type basis, keeping all of the complexity in the builder.

Agreed, too much cargo culting. I've changed a lot here.

Also, I think the Version field is akin to the bit for a repair (it determines whether we need to repair or not as descriptors created with version 21.2 are guaranteed to not have these issues)


pkg/migration/migrations/fix_privileges_migration.go, line 50 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

This might be out of scope but would it be possible to also apply descpb.MaybeFixPrivileges and perhaps descpb.MaybeFixUsagePrivForTablesAndDBs in this migration instead of having to rely solely on the RunPostDeserializationChanges method of catalog.DescriptorBuilder for this purpose?

Done (I believe). Changed it so MaybeFixPrivileges is used in this migration.


pkg/migration/migrations/fix_privileges_migration.go, line 81 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

desc, err := descriptors.GetMutableDescriptorByID(...)?

N/A


pkg/migration/migrations/fix_privileges_migration.go, line 84 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

nit: will this get logged every time a node starts? maybe we'd want to be less verbose.

Should only be called once for each ID during the migration, not during node start.


pkg/migration/migrations/fix_privileges_migration.go, line 102 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

I allude to this in another comment but wouldn't having one transaction per descriptor be wasteful?

Done.


pkg/migration/migrations/fix_privileges_migration.go, line 131 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

Have you considered calling catalogkv.NewBuilderWithMVCCTimestamp(&desc, ts) instead and building a catalog.Descriptor instance?

Not only would this simplify your code by removing a lot of switch cases and added descpb methods but you'd be able to apply the changes in fixPrivileges in the same transaction, which would be preferable.

Alternatively, couldn't this whole findNextDescriptorToFix be expressed as one big SQL query? That would also be preferable. I'm not sure you actually need to return objectType.

Done, changed it to "one big SQL query" or rather have all the descriptors updated in one txn


pkg/migration/migrations/fix_privileges_migration_external_test.go, line 36 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

cool tests! nit on adjusting the whtiespace for the sql queries in the comments.

Done.


pkg/migration/migrations/fix_privileges_migration_external_test.go, line 64 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

The formatting here is all weird (in github at least), is this expected?

Should be fixed.


pkg/migration/migrations/fix_privileges_migration_external_test.go, line 173 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

Seeing as GetAllDescriptors will validate those descriptors which will in turn validate the privilege descriptors, aren't some of the checks below redundant? I'm pretty sure the calls to privilegeDesc.Validate are. Speaking of which, shouldn't some checks be added to that method, perhaps version-gated behind privilegeDesc.Version?

Done, and slightly updated Validate.

@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch 3 times, most recently from 0cdc0ef to 333a7fa Compare June 21, 2021 22:49
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_privileges_migration.go, line 43 at r2 (raw file):

	for ; ok; ok, err = rows.Next(ctx) {
		row := rows.Cur()
		id := descpb.ID(*row[0].(*tree.DInt))

nit: refactor the row parsing into a helper.

I think with the various helpers, this should read:

ok, err := rows.Next(ctx)
for ; ok, ok, err = rows.Next(ctx) {
    desc, err := decodeRow(ctx, row)
    if err != nil {
        return err // wrap the error in decodeRow with which row
    }
    if desc.GetPrivileges.Version > descpb.Version21_2 {
        continue
    }
    if err := upgradePrivileges(ctx, desc.GetID()); err != nil {
        return err
    }
}

pkg/migration/migrations/fix_privileges_migration.go, line 72 at r2 (raw file):

Quoted 11 lines of code…

		switch b.DescriptorType() {
		case catalog.Database:
			objectType = privilege.Database
		case catalog.Schema:
			objectType = privilege.Schema
		case catalog.Table:
			objectType = privilege.Table
		case catalog.Type:
			objectType = privilege.Type
		}

nit: factor this out into a helper.


pkg/migration/migrations/fix_privileges_migration.go, line 76 at r2 (raw file):

		err = descs.Txn(ctx, d.Settings, d.LeaseManager, d.InternalExecutor, d.DB, func(
			ctx context.Context, txn *kv.Txn, descriptors *descs.Collection) error {
			if privilegeDesc.Version >= descpb.Version21_2 {

why do this down here? I think it'd be better if you did this check above with a continue.


pkg/migration/migrations/fix_privileges_migration.go, line 85 at r2 (raw file):

Quoted 13 lines of code…
		err = descs.Txn(ctx, d.Settings, d.LeaseManager, d.InternalExecutor, d.DB, func(
			ctx context.Context, txn *kv.Txn, descriptors *descs.Collection) error {
			if privilegeDesc.Version >= descpb.Version21_2 {
				log.Infof(ctx, "privilege descriptor has already been fixed, skipping %d", id)
				return nil
			}
			privs := mutableDesc.GetPrivileges()
			descpb.MaybeFixPrivileges(
				mutableDesc.GetID(), mutableDesc.GetParentID(),
				&privs, objectType,
			)
			return descriptors.WriteDesc(ctx, false /* kvTrace */, mutableDesc, txn)
		})

nit: factor this out into a helper

@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch 4 times, most recently from 7068d1c to 3fc4b38 Compare June 22, 2021 15:01
Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @rafiss)


pkg/migration/migrations/fix_privileges_migration.go, line 76 at r2 (raw file):

Previously, ajwerner wrote…

why do this down here? I think it'd be better if you did this check above with a continue.

Done

@RichardJCai
Copy link
Copy Markdown
Contributor Author

Somewhat related and I might be wrong, but the backup case could perhaps be handled by adding a new upgrade step to theRunPostDeserializationChanges method of catalog.DescriptorBuilder, in which case it'd be appropriate to include in this PR.

I added a call to MaybeFixPrivileges in RunPostDeserializationChanges for all objects which guarantees that restores will catch it too. My question here is when can we remove this? We'll have to keep it here until we have an API that allows us to sync migrations and restores.

@postamar
Copy link
Copy Markdown

I added a call to MaybeFixPrivileges in RunPostDeserializationChanges for all objects which guarantees that restores will catch it too. My question here is when can we remove this? We'll have to keep it here until we have an API that allows us to sync migrations and restores.

Sadly, not in a long while. Certainly not while we support backups from affected versions, which would be any version up until now, basically. My understanding is that this is a consequence of migrations simply not being a thing until quite recently.

@RichardJCai RichardJCai requested review from ajwerner and postamar June 22, 2021 22:35
@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 3fc4b38 to 6d13ec3 Compare June 23, 2021 15:47
Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment for @ajwerner @postamar
Now that this was added in #66599

This migration fails due to a validation error since we're injecting old descriptors with an older Primary Key version
pq: internal error: running migration for 21.1-14: relation "tb" (60): primary index "primary" has invalid version 0, expected 4

The Validation fails on fix_privileges_migration.go line 117 return descriptors.WriteDesc(ctx, false /* kvTrace */, desc, txn)
Since RunPostDeserialzation doesn't happen during the migration.

Does it make sense to add this fix into the migration as well?

changes.UpgradedIndexFormatVersion = maybeUpgradePrimaryIndexFormatVersion(desc)
	for i := range desc.Indexes {
		changes.UpgradedIndexFormatVersion = changes.UpgradedIndexFormatVersion || maybeUpgradeSecondaryIndexFormatVersion(&desc.Indexes[i])
	}
	for i := range desc.Mutations {
		if idx := desc.Mutations[i].GetIndex(); idx != nil {
			changes.UpgradedIndexFormatVersion = changes.UpgradedIndexFormatVersion || maybeUpgradeSecondaryIndexFormatVersion(idx)
		}
	}

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_privileges_migration.go, line 90 at r3 (raw file):

Previously, ajwerner wrote…

extreme nit: I'd put this function first. It's nice to see the main driver at the top and scroll down to unpack the helpers.

Done.


pkg/sql/doctor/doctor_test.go, line 294 at r3 (raw file):

Quoted 12 lines of code…
						// Skip `toBytes` to produce a descriptor with unset privileges field.
						// The purpose of this is to produce a nil dereference during validation
						// in order to test that doctor recovers from this.
						//
						// Note that it might be the case that validation aught to check that
						// this field is not nil in the first place, in which case this test case
						// will need to craft a corrupt descriptor serialization in a more
						// creative way. Ideally validation code should never cause runtime errors
						// but there's no way to guarantee that short of formally verifying it. We
						// therefore have to consider the possibility of runtime errors (sadly) and
						// doctor should absolutely make every possible effort to continue executing
						// in the face of these, considering its main use case!

Had to update this test case, since MaybeFixPrivileges is called in RunPostDeserializationChanges for types now, this no longer produces a nil dereference. I can't think of a way to do this off the top of my head, any suggestions?


pkg/sql/catalog/descpb/structured.go, line 393 at r3 (raw file):

Previously, ajwerner wrote…

What are these about?

Removed, thought I needed this before but stopped using it.

@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 6d13ec3 to 0d8ea85 Compare June 25, 2021 21:16
@RichardJCai RichardJCai changed the title sql: migration for privilege descriptors. sql: migration for descriptors fixes Jun 25, 2021
@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 0d8ea85 to be65d6e Compare June 26, 2021 02:06
@RichardJCai RichardJCai requested a review from a team June 26, 2021 02:06
@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 868b9f2 to 9061401 Compare June 29, 2021 16:41
Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, and @rafiss)


pkg/migration/migrations/fix_descriptor_migration.go, line 97 at r7 (raw file):

Previously, ajwerner wrote…

So what? Pass in a nil descgetter and it’s just an in-memory op.

Fair enough, don


pkg/migration/migrations/fix_descriptor_migration.go, line 82 at r8 (raw file):

Previously, ajwerner wrote…

super nit: switch the order of d and id in this method.

Done.

This migration calls RunPostDeserializationChanges on every descriptor
fixing any inconsistencies with the descriptor.

This migration fixes corrupted privilege descriptors in the wild resulting
from the fallout of the ZONECONFIG/USAGE bug and the ALTER DATABASE CONVERT TO
SCHEMA bug. Additionally, all privilege descriptors will be updated to explicitly
have Owners. Privilege descriptors created before 20.2 do not have their owner field
set and instead owners are implicitly inferred.

Finally all privilege descriptors will be bumped up to Version21_2 after the migration
so we can strictly validate privileges. Additionally the IsRepaired bit will
be set to true after the fix.

Release note (bug fix): Summary: upon upgrading to 21.2,
users will have all privileges fixed. This means
any invalid privileges on all objects will be removed.

Details:
This migration will be run when users upgrade to
21.2 (Specifically 21.2.14).
This migration fixes any privilege descriptors that are corrupted from the
fallout of the ZONECONFIG/USAGE bug on tables and databases after upgrading
from 20.1 -> 20.2 (cockroachdb#65010) and those that are corrupted after converting a
database to a schema (cockroachdb#65697).
@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 9061401 to 92ad363 Compare June 29, 2021 19:12
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 14 files at r1, 1 of 13 files at r4, 3 of 10 files at r6, 2 of 17 files at r8.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)


pkg/sql/catalog/catalog.go, line 49 at r10 (raw file):

	// GetChanged returns if the MutableDescriptor was changed after running
	// RunPostDeserializationChanges.
	GetChanged() bool

Can we call this something more descriptive. HasPostDeserializationChanges? Also, I was envisioning this method living on the builder rather than the descriptor, but this is fine.


pkg/migration/migrations/fix_descriptor_migration.go, line 46 at r8 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

Right now it's set so every PrivilegeDescriptor created before 21.2 will be updated, so theoretically every descriptor will be changed. The version field is only used for validation purposes. Should we remove this so we don't upgrade every descriptor.

	if p.Version < Version21_2 {
		p.SetVersion(Version21_2)
		changed = true
	}

What does that version mean? Are we getting value out of setting it?

Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_descriptor_migration.go, line 46 at r8 (raw file):

Previously, ajwerner wrote…

What does that version mean? Are we getting value out of setting it?

The version means we can check the validate the privileges strictly.

Validate is prior to and during the migration, in which we have less strict checks.

	// Validate can be called during the fix_privileges_migration introduced in
	// 21.2. It is possible for have invalid privileges prior to 21.2 in certain
	// cases due to bugs. We can strictly check privileges in 21.2 and onwards.
	if p.Version < Version21_2 {
		if objectType == privilege.Schema {
			// Prior to 21_2, it was possible for a schema to have some database
			// privileges on it. This was temporarily fixed by an upgrade on read
			// but in 21.2 onwards, it should be permanently fixed with a migration.
			allowedPrivilegesBits |= privilege.GetValidPrivilegesForObject(privilege.Database).ToBitField()
		}
		if objectType == privilege.Table || objectType == privilege.Database {
			// Prior to 21_2, it was possible for a table or database to have USAGE
			// privilege on it due to a bug when upgrading from 20.1 to 20.2.
			// In 21.2 onwards, it should be permanently fixed with a migration.
			allowedPrivilegesBits |= privilege.USAGE.Mask()
		}
	}

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that the code you're writing here is something we should find a way to package up for reuse. See how you feel about my batching suggestion.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_descriptor_migration.go, line 46 at r8 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

The version means we can check the validate the privileges strictly.

Validate is prior to and during the migration, in which we have less strict checks.

	// Validate can be called during the fix_privileges_migration introduced in
	// 21.2. It is possible for have invalid privileges prior to 21.2 in certain
	// cases due to bugs. We can strictly check privileges in 21.2 and onwards.
	if p.Version < Version21_2 {
		if objectType == privilege.Schema {
			// Prior to 21_2, it was possible for a schema to have some database
			// privileges on it. This was temporarily fixed by an upgrade on read
			// but in 21.2 onwards, it should be permanently fixed with a migration.
			allowedPrivilegesBits |= privilege.GetValidPrivilegesForObject(privilege.Database).ToBitField()
		}
		if objectType == privilege.Table || objectType == privilege.Database {
			// Prior to 21_2, it was possible for a table or database to have USAGE
			// privilege on it due to a bug when upgrading from 20.1 to 20.2.
			// In 21.2 onwards, it should be permanently fixed with a migration.
			allowedPrivilegesBits |= privilege.USAGE.Mask()
		}
	}

The fact that we're going to touch most descriptors makes me wonder if we should be accumulating a batch at a time (we probably should). What if we make a slice of lease.IDVersion and then pick some arbitrary batch size and go do the upgrade when the batch is full.

Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to pull out the descriptor getting / upgrade into a package.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_descriptor_migration.go, line 46 at r8 (raw file):

Previously, ajwerner wrote…

The fact that we're going to touch most descriptors makes me wonder if we should be accumulating a batch at a time (we probably should). What if we make a slice of lease.IDVersion and then pick some arbitrary batch size and go do the upgrade when the batch is full.

Seems reasonable to me, do you have an idea of what is a good arbitrary batch size? 100?

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_descriptor_migration.go, line 46 at r8 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

Seems reasonable to me, do you have an idea of what is a good arbitrary batch size? 100?

That seems like the right order of magnitude. What if we did it based on byte size rather than number of descriptors? desc.DescriptorProto().Size() should be a good proxy. I think 512 KiB would be a good batch size.

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to pull out the descriptor getting / upgrade into a package.

I think it's fine for it to be in the migrations package. Just think about some minor abstracting. We can further abstract it later.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)

@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch 5 times, most recently from 5e6af36 to 275028f Compare June 30, 2021 17:56
Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an abstraction for migrations that will go over all descriptors and fix them.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, and @rafiss)


pkg/sql/catalog/catalog.go, line 49 at r10 (raw file):

Previously, ajwerner wrote…

Can we call this something more descriptive. HasPostDeserializationChanges? Also, I was envisioning this method living on the builder rather than the descriptor, but this is fine.

Done


pkg/migration/migrations/fix_descriptor_migration.go, line 46 at r8 (raw file):

Previously, ajwerner wrote…

That seems like the right order of magnitude. What if we did it based on byte size rather than number of descriptors? desc.DescriptorProto().Size() should be a good proxy. I think 512 KiB would be a good batch size.

Done I used 512 KiB as a batch size.

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_descriptor_migration.go, line 65 at r13 (raw file):

	}

	return descriptorUpgradeMigration(ctx, rows, fixDescriptorFunc, 524288 /* batchSizeInBytes */)

1<<19 /* 512 KiB batch size */?


pkg/migration/migrations/fix_descriptor_migration.go, line 73 at r13 (raw file):

	ctx context.Context, d migration.TenantDeps, id descpb.ID, version descpb.DescriptorVersion,
) error {
	return descs.Txn(ctx, d.Settings, d.LeaseManager, d.InternalExecutor, d.DB, func(

The point of the batching is to batch the writes into a single write batch. Use descriptors.WriteDescToBatch


pkg/migration/migrations/fix_descriptor_migration.go, line 85 at r13 (raw file):

			return nil
		}
		log.Infof(ctx, "upgrading descriptor with id %d", desc.GetID())

nit: this is loud. I don't really want us logging 1M times if we fix 1M descriptors. Maybe lot out the batch?


pkg/migration/migrations/fix_descriptor_migration.go, line 124 at r13 (raw file):


		descs = append(descs, desc)
		timestamps = append(timestamps, ts)

I feel like I wasn't clear enough on the reason for the batching. We'd like to evaluate the predicate inside the loop here and then accumulate a batch to write to the store in a single shot.

@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch 2 times, most recently from 2fb39fc to 314f43b Compare June 30, 2021 19:02
Copy link
Copy Markdown
Contributor Author

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)


pkg/migration/migrations/fix_descriptor_migration.go, line 73 at r13 (raw file):

Previously, ajwerner wrote…

The point of the batching is to batch the writes into a single write batch. Use descriptors.WriteDescToBatch

Oops for some reason I thought I was doing that 🤦 done now


pkg/migration/migrations/fix_descriptor_migration.go, line 85 at r13 (raw file):

Previously, ajwerner wrote…

nit: this is loud. I don't really want us logging 1M times if we fix 1M descriptors. Maybe lot out the batch?

Logging per batch


pkg/migration/migrations/fix_descriptor_migration.go, line 124 at r13 (raw file):

Previously, ajwerner wrote…

		descs = append(descs, desc)
		timestamps = append(timestamps, ts)

I feel like I wasn't clear enough on the reason for the batching. We'd like to evaluate the predicate inside the loop here and then accumulate a batch to write to the store in a single shot.

Sorry, it was clear for some reason I just forgot how WriteDescToBatch worked.

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a make bazel-generate but :lgtm:

Reviewed 1 of 3 files at r9.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @nihalpednekar, @postamar, @rafiss, and @RichardJCai)

@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from 314f43b to d12716d Compare July 1, 2021 14:35
@RichardJCai RichardJCai force-pushed the privileges_migration_06152021 branch from d12716d to 988bd8d Compare July 1, 2021 16:22
@RichardJCai
Copy link
Copy Markdown
Contributor Author

Really appreciate the tight review cycles, thank you!

bors r=ajwerner

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jul 1, 2021

Build failed:

@RichardJCai
Copy link
Copy Markdown
Contributor Author

bors r=ajwerner

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jul 2, 2021

Build succeeded:

@craig craig bot merged commit 6263e75 into cockroachdb:master Jul 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: create long running migration to fix privilege descriptors for tables/db created on 20.1 or before

5 participants