Skip to content

backupccl: remap all restored tables#85492

Merged
craig[bot] merged 7 commits intocockroachdb:masterfrom
dt:rekey-always
Aug 11, 2022
Merged

backupccl: remap all restored tables#85492
craig[bot] merged 7 commits intocockroachdb:masterfrom
dt:rekey-always

Conversation

@dt
Copy link
Copy Markdown
Contributor

@dt dt commented Aug 2, 2022

This PR has a few changes, broken down into separate commits:
a) stop restoring tmp tables and remove the special-case code to synthesize their special schemas; These were previously restored only to be dropped so that restored jobs that referenced them would not be broken, but we stopped restoring jobs.
b) synthesize type-change jobs during cluster restore; this goes with not restoring jobs.
c) fix some assumptions in tests/other code about what IDs restored tables have.
d) finally, always assign new IDs to all restored objects, even during cluster restore, removing the need to carefully move conflicting tables or other things around.

Commit-by-commit review recommended.

@dt dt requested review from a team, adityamaru and benbardin August 2, 2022 20:47
@dt dt requested a review from a team as a code owner August 2, 2022 20:47
@dt dt requested a review from ajwerner August 2, 2022 20:47
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@dt dt force-pushed the rekey-always branch 2 times, most recently from d2c0278 to 1e29ba6 Compare August 2, 2022 21:51
@dt dt requested a review from a team August 2, 2022 22:30
Copy link
Copy Markdown
Collaborator

@benbardin benbardin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but consider also waiting for another review or two from the others :)

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: though the test failure seems plausibly legit. This was satisfying to read.

Reviewed 3 of 3 files at r1, 1 of 1 files at r2, 2 of 2 files at r3, 2 of 2 files at r4, 1 of 3 files at r5, 4 of 5 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @dt)


-- commits line 2 at r1:
I'm so curious


pkg/ccl/backupccl/restore_job.go line 1667 at r3 (raw file):

// tempSystemDatabaseID returns the ID of the descriptor for the temporary
// system database used in full cluster restores.

Can you note that this returns descpb.InvalidID in the case that there are no system tables in the tables slice and that it relies on the DescriptorRewrites map having already been populated?


pkg/ccl/backupccl/restore_old_versions_test.go line 543 at r4 (raw file):

			{"root", "", "{admin}"},
		})
		sqlDB.CheckQueryResults(t, "SELECT comment FROM system.comments", [][]string{

technically, should this have an order by?

Copy link
Copy Markdown
Contributor

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I bug you to flesh out the commit message in commit number 5. For anyone who comes along later it would be helpful to read why we made this switch etc. Feel free to ignore.

Copy link
Copy Markdown
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it is nice to see this code get deleted.

One question I have is why we don't need the custom restore funcs for comments, session vars, and zone configuration all of which reference descriptor IDs in their table. If we do need them, then perhaps the descriptor-conflicts tests isn't testing exactly what it used to be.

@dt
Copy link
Copy Markdown
Contributor Author

dt commented Aug 3, 2022

@stevendanna

One question I have is why we don't need the custom restore funcs for comments, session vars, and zone configuration all of which reference descriptor IDs in their table.

This is why https://github.com/cockroachdb/cockroach/pull/85098/files landed first

@stevendanna
Copy link
Copy Markdown
Collaborator

@dt Ah, thanks. missed that one.

@dt dt force-pushed the rekey-always branch 2 times, most recently from 17fb6c0 to d992357 Compare August 10, 2022 19:17
@dt
Copy link
Copy Markdown
Contributor Author

dt commented Aug 10, 2022

Can I bug you to flesh out the commit message in commit number 5. For anyone who comes along later it would be helpful to read why we made this switch etc. Feel free to ignore.

Added a lengthy commit message.

@dt
Copy link
Copy Markdown
Contributor Author

dt commented Aug 11, 2022

TFTRs!

bors r+

@dt
Copy link
Copy Markdown
Contributor Author

dt commented Aug 11, 2022

bors r-

gotta fix a merge conflict

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 11, 2022

Canceled.

dt added 3 commits August 11, 2022 15:21
Previously we updated one old ID to one new ID at a time. However this was
incorrect: if a new ID can potentially be higher or lower and conflict with
an old ID yet to be remapped, and thus subsequently be remapped again incorrectly
as it then looks like the other old ID. Since IDs can be higher or lower there
is no order that is sure to be free of such conflicts when doing each remapping
separately. Instead, we now do all the remappings in one pass, using a CASE to
switch from old ID to new ID as we go.

Release note: none.
Previously we kept the IDs of all descriptors being restored unchanged
during full-cluster restore. This had the advantage of meaning that all
mentions of IDs in anything we restored, for example in system tables or
jobs, were still valid as we put the same descriptor back at the same ID
when we restored it. However, this approach proved very brittle to
changes in what IDs we use for what over time; for example, in 22.1, new
system tables were introduced with IDs in the lower 50s, while those IDs
had previously been available to user-created tables which could appear
in the backup being restored. This posed a problem: we needed to restore
that backed up user table to ID 52, but there was a new system table in
the way. A workaround in 22.1 is that, mid-restore, we can pick up and
move an existing system table in the restoring cluster to a new ID,
picked to be greater than any ID we are going to restore into, then
carry on with the restore into the now conflict-free ID spans. However
this approach is brittle, as the pick-up-and-move could break any number
of assumptions or edges between descriptors. Fortunately for 22.1, no
such edges exist and only a couple new system descriptors had found
their way into the formerly user-owned IDs, however if left unchecked,
this problem could easily grow in complexity, with some future
descriptors being much harder to move.

Instead, this changes how we restore during full cluster restore to
match how we restore in other restores: each descriptor being restored
is assigned a new descriptor ID, generated as the next available ID in
the restoring cluster, and references to the old ID in other data being
restored are updated. Descriptor ID rewriting within descriptors
themselves has been in place as long as restore has existed, however
during full-cluster restore, data in addition to descriptors is
restored, particularly in system table rows. These rows can also contain
references to descriptor IDs, which is why a prior change started
updating these rows as they are restored to contain the new IDs. The
'old' and 'new' IDs were always the same in that change however, as
cluster-restore did not actually generate and assign _new_ IDs. That
last part is what this change does: instead of assigning the same ID, it
assigns a new one, and the rewriting enabled in prior changes is now
actually updating IDs.

The special-case moving of conflicting descriptors is no longer needed
now that all descriptors are assigned new IDs and is removed.

Release note: none.
@dt
Copy link
Copy Markdown
Contributor Author

dt commented Aug 11, 2022

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 11, 2022

Build succeeded:

@craig craig bot merged commit 8e3ee57 into cockroachdb:master Aug 11, 2022
@dt dt deleted the rekey-always branch August 15, 2022 12:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants