backupccl: remap all restored tables by dt · Pull Request #85492 · cockroachdb/cockroach

dt · 2022-08-02T20:47:06Z

This PR has a few changes, broken down into separate commits:
a) stop restoring tmp tables and remove the special-case code to synthesize their special schemas; These were previously restored only to be dropped so that restored jobs that referenced them would not be broken, but we stopped restoring jobs.
b) synthesize type-change jobs during cluster restore; this goes with not restoring jobs.
c) fix some assumptions in tests/other code about what IDs restored tables have.
d) finally, always assign new IDs to all restored objects, even during cluster restore, removing the need to carefully move conflicting tables or other things around.

Commit-by-commit review recommended.

cockroach-teamcity · 2022-08-02T20:47:21Z

This change is

benbardin

LGTM, but consider also waiting for another review or two from the others :)

ajwerner

though the test failure seems plausibly legit. This was satisfying to read.

Reviewed 3 of 3 files at r1, 1 of 1 files at r2, 2 of 2 files at r3, 2 of 2 files at r4, 1 of 3 files at r5, 4 of 5 files at r6, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @dt)

-- commits line 2 at r1:
I'm so curious

pkg/ccl/backupccl/restore_job.go line 1667 at r3 (raw file):

// tempSystemDatabaseID returns the ID of the descriptor for the temporary
// system database used in full cluster restores.

Can you note that this returns descpb.InvalidID in the case that there are no system tables in the tables slice and that it relies on the DescriptorRewrites map having already been populated?

pkg/ccl/backupccl/restore_old_versions_test.go line 543 at r4 (raw file):

			{"root", "", "{admin}"},
		})
		sqlDB.CheckQueryResults(t, "SELECT comment FROM system.comments", [][]string{

technically, should this have an order by?

adityamaru

Can I bug you to flesh out the commit message in commit number 5. For anyone who comes along later it would be helpful to read why we made this switch etc. Feel free to ignore.

pkg/ccl/backupccl/testdata/backup-restore/temp-tables

pkg/ccl/backupccl/restore_job.go

pkg/ccl/backupccl/testdata/backup-restore/descriptor-conflicts

stevendanna

Overall, it is nice to see this code get deleted.

One question I have is why we don't need the custom restore funcs for comments, session vars, and zone configuration all of which reference descriptor IDs in their table. If we do need them, then perhaps the descriptor-conflicts tests isn't testing exactly what it used to be.

pkg/ccl/backupccl/testdata/backup-restore/descriptor-conflicts

dt · 2022-08-03T18:46:52Z

@stevendanna

One question I have is why we don't need the custom restore funcs for comments, session vars, and zone configuration all of which reference descriptor IDs in their table.

This is why https://github.com/cockroachdb/cockroach/pull/85098/files landed first

stevendanna · 2022-08-03T21:01:55Z

@dt Ah, thanks. missed that one.

dt · 2022-08-10T19:18:30Z

Can I bug you to flesh out the commit message in commit number 5. For anyone who comes along later it would be helpful to read why we made this switch etc. Feel free to ignore.

Added a lengthy commit message.

pkg/ccl/backupccl/system_schema.go

dt · 2022-08-11T15:03:05Z

TFTRs!

bors r+

dt · 2022-08-11T15:14:13Z

bors r-

gotta fix a merge conflict

craig · 2022-08-11T15:14:15Z

Canceled.

Release note: none.

Previously we updated one old ID to one new ID at a time. However this was incorrect: if a new ID can potentially be higher or lower and conflict with an old ID yet to be remapped, and thus subsequently be remapped again incorrectly as it then looks like the other old ID. Since IDs can be higher or lower there is no order that is sure to be free of such conflicts when doing each remapping separately. Instead, we now do all the remappings in one pass, using a CASE to switch from old ID to new ID as we go. Release note: none.

Previously we kept the IDs of all descriptors being restored unchanged during full-cluster restore. This had the advantage of meaning that all mentions of IDs in anything we restored, for example in system tables or jobs, were still valid as we put the same descriptor back at the same ID when we restored it. However, this approach proved very brittle to changes in what IDs we use for what over time; for example, in 22.1, new system tables were introduced with IDs in the lower 50s, while those IDs had previously been available to user-created tables which could appear in the backup being restored. This posed a problem: we needed to restore that backed up user table to ID 52, but there was a new system table in the way. A workaround in 22.1 is that, mid-restore, we can pick up and move an existing system table in the restoring cluster to a new ID, picked to be greater than any ID we are going to restore into, then carry on with the restore into the now conflict-free ID spans. However this approach is brittle, as the pick-up-and-move could break any number of assumptions or edges between descriptors. Fortunately for 22.1, no such edges exist and only a couple new system descriptors had found their way into the formerly user-owned IDs, however if left unchecked, this problem could easily grow in complexity, with some future descriptors being much harder to move. Instead, this changes how we restore during full cluster restore to match how we restore in other restores: each descriptor being restored is assigned a new descriptor ID, generated as the next available ID in the restoring cluster, and references to the old ID in other data being restored are updated. Descriptor ID rewriting within descriptors themselves has been in place as long as restore has existed, however during full-cluster restore, data in addition to descriptors is restored, particularly in system table rows. These rows can also contain references to descriptor IDs, which is why a prior change started updating these rows as they are restored to contain the new IDs. The 'old' and 'new' IDs were always the same in that change however, as cluster-restore did not actually generate and assign _new_ IDs. That last part is what this change does: instead of assigning the same ID, it assigns a new one, and the rewriting enabled in prior changes is now actually updating IDs. The special-case moving of conflicting descriptors is no longer needed now that all descriptors are assigned new IDs and is removed. Release note: none.

Release note: none.

dt · 2022-08-11T15:56:23Z

bors r+

craig · 2022-08-11T17:07:25Z

Build succeeded:

Bazel Essential CI (Cockroach)

dt requested review from a team, adityamaru and benbardin August 2, 2022 20:47

dt requested a review from a team as a code owner August 2, 2022 20:47

dt requested a review from ajwerner August 2, 2022 20:47

dt force-pushed the rekey-always branch 2 times, most recently from d2c0278 to 1e29ba6 Compare August 2, 2022 21:51

dt requested a review from a team August 2, 2022 22:30

dt force-pushed the rekey-always branch from 6d2014e to 5bba8c5 Compare August 2, 2022 22:31

msbutler mentioned this pull request Aug 3, 2022

backupccl: add RESTORE with schema_only #85231

Merged

benbardin approved these changes Aug 3, 2022

View reviewed changes

ajwerner approved these changes Aug 3, 2022

View reviewed changes

dt force-pushed the rekey-always branch from 5bba8c5 to 43b2ac0 Compare August 3, 2022 17:05

adityamaru suggested changes Aug 3, 2022

View reviewed changes

stevendanna reviewed Aug 3, 2022

View reviewed changes

pkg/ccl/backupccl/testdata/backup-restore/descriptor-conflicts Show resolved Hide resolved

dt force-pushed the rekey-always branch 2 times, most recently from 17fb6c0 to d992357 Compare August 10, 2022 19:17

adityamaru approved these changes Aug 10, 2022

View reviewed changes

pkg/ccl/backupccl/system_schema.go Outdated Show resolved Hide resolved

dt force-pushed the rekey-always branch from d992357 to f39c962 Compare August 11, 2022 15:02

dt added 4 commits August 11, 2022 15:21

backupccl: stop restoring temp tables

241557c

Release note: none.

backupccl: synthesize type-change jobs during cluster restore

68ca006

Release note: none.

backupccl: use less brittle method to find tmp system ID

e7b2b95

Release note: none.

backupccl: make some tests less specific on IDs

f5ecc4d

Release note: none.

dt added 3 commits August 11, 2022 15:21

catprivilege: remove special cases for cluster restore

b008287

Release note: none.

dt force-pushed the rekey-always branch from f39c962 to b008287 Compare August 11, 2022 15:21

craig bot merged commit 8e3ee57 into cockroachdb:master Aug 11, 2022

dt mentioned this pull request Aug 12, 2022

backupccl: cluster restores should also rekey before ingesting data #85019

Closed

4 tasks

dt deleted the rekey-always branch August 15, 2022 12:07

Conversation

dt commented Aug 2, 2022

Uh oh!

cockroach-teamcity commented Aug 2, 2022

Uh oh!

benbardin left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

adityamaru left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

stevendanna left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dt commented Aug 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stevendanna commented Aug 3, 2022

Uh oh!

dt commented Aug 10, 2022

Uh oh!

Uh oh!

dt commented Aug 11, 2022

Uh oh!

dt commented Aug 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

craig bot commented Aug 11, 2022

Uh oh!

dt commented Aug 11, 2022

Uh oh!

craig bot commented Aug 11, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

dt commented Aug 3, 2022 •

edited

Loading

dt commented Aug 11, 2022 •

edited

Loading