backupccl: elide data from offline tables during RESTORE#85920

Merged
craig[bot] merged 1 commit into cockroachdb:master from msbutler:butler-backup-imports
Aug 30, 2022

Conversation

@msbutler
Collaborator

@msbutler msbutler commented Aug 10, 2022

A previous PR (#83915) enabled BACKUP to target offline descriptors
(including IMPORTing and RESTOREing tables), but didn't fully handle their
rollback on RESTORE. This PR ensures RESTORE properly handles these offline
tables:

  • An offline table created by RESTORE or IMPORT PGDUMP is fully discarded. The
    table does not exist in the restoring cluster.
  • An offline table created by an IMPORT INTO has all importing data elided in
    the restore processor and is restored online to its pre-import state.

If the RESTORE occurs AOST after the table returns online, the table is
RESTOREd to its post-offline state.
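The per-table rollback rules above can be summarized as a small decision function. This is an illustrative sketch only — the type and function names here (`offlineReason`, `restoreAction`) are hypothetical and do not appear in the real backupccl code, which spreads this logic across the restore planner and key rewriter.

```go
package main

import "fmt"

// offlineReason is a hypothetical enum summarizing why a table descriptor
// was offline at backup time.
type offlineReason int

const (
	notOffline        offlineReason = iota
	offlineRestore                  // table was being created by RESTORE
	offlinePGDump                   // table was being created by IMPORT PGDUMP
	offlineImportInto               // existing table with an in-progress IMPORT INTO
)

// restoreAction describes what RESTORE does with a table in each state,
// per the rules in the PR description.
func restoreAction(r offlineReason) string {
	switch r {
	case offlineRestore, offlinePGDump:
		// The table never existed as a usable object before going offline:
		// discard it entirely from the restoring cluster.
		return "discard table"
	case offlineImportInto:
		// Keep the table, but elide keys written by the in-progress import
		// so it comes back online in its pre-import state.
		return "elide importing keys"
	default:
		return "restore normally"
	}
}

func main() {
	fmt.Println(restoreAction(offlineImportInto))
}
```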

Currently, all tables that return online are fully re-backed up. For many
situations above, this is unnecessary. Future work should ensure that these
tables are not re-backed up when we know all operations within the
incremental backup interval on the table were MVCC.

This patch only applies to BACKUP and RESTORE flavors that implicitly target an
offline importing/restoring table. A future PR should allow a user to
explicitly target an existing offline table undergoing an IMPORT.

Fixes #86054

Release justification: fixes bug where offline tables were not properly handled
in restore.

@msbutler msbutler self-assigned this Aug 10, 2022
@cockroach-teamcity
Member

This change is Reviewable

@msbutler msbutler changed the title Butler backup imports backupccl: restore table with in-progress import to pre import state Aug 10, 2022
@msbutler msbutler force-pushed the butler-backup-imports branch 2 times, most recently from 840a15a to d70ccad Compare August 15, 2022 18:03
@msbutler msbutler requested a review from dt August 15, 2022 18:03
@msbutler msbutler marked this pull request as ready for review August 15, 2022 18:04
@msbutler msbutler requested a review from a team as a code owner August 15, 2022 18:04
@msbutler msbutler requested a review from a team August 15, 2022 18:04
@msbutler msbutler requested a review from a team as a code owner August 15, 2022 18:04
@msbutler msbutler requested review from ajwerner and removed request for a team August 15, 2022 18:04
@msbutler
Collaborator Author

RFAL! All test failures relate to the first commit, which relates to #85979, not this PR.

@msbutler msbutler removed the request for review from ajwerner August 15, 2022 18:06
@msbutler msbutler force-pushed the butler-backup-imports branch from d70ccad to 6ff17f2 Compare August 16, 2022 22:32
@msbutler msbutler changed the title backupccl: restore table with in-progress import to pre import state backupccl: elide data from in-progress imports during RESTORE Aug 16, 2022
@msbutler msbutler force-pushed the butler-backup-imports branch 3 times, most recently from 58c1887 to e4fa154 Compare August 17, 2022 12:15

// Elide Importing Keys
if importTime := desc.GetInProgressImportStartTime(); wallTime > 0 && importTime > 0 && wallTime >= importTime {
	return nil, false, ErrImportingKeyError
}
Contributor

do you need a non-nil err? isn't false enough?

Collaborator Author

In the public RewriteKey() function, when rewriteTableKey returns !ok and a nil err, the codepath below occurs. That seems unnecessary when eliding importing keys, but perhaps this codepath should be skipped whenever !ok is returned?

rekeyed, ok, err := kr.checkAndRewriteTableKey(noTenantPrefix, wallTime)
if err != nil {
	return nil, false, err
}
if kr.lastKeyTenant.id != oldTenantID {
	kr.lastKeyTenant.id = oldTenantID
	kr.lastKeyTenant.prefix = keys.MakeSQLCodec(oldTenantID).TenantPrefix()
}
newTenantPrefix := kr.codec.TenantPrefix()
if len(newTenantPrefix) == len(kr.lastKeyTenant.prefix) {
	keyTenantPrefix := key[:len(kr.lastKeyTenant.prefix)]
	copy(keyTenantPrefix, newTenantPrefix)
	rekeyed = append(keyTenantPrefix, rekeyed...)
} else {
	rekeyed = append(newTenantPrefix, rekeyed...)
}
return rekeyed, ok, err
}

@msbutler msbutler force-pushed the butler-backup-imports branch from e4fa154 to 9bd87ae Compare August 18, 2022 22:42
@msbutler msbutler changed the title backupccl: elide data from in-progress imports during RESTORE backupccl: elide data from offline tables during RESTORE Aug 18, 2022
@msbutler
Collaborator Author

msbutler commented Aug 18, 2022

RFAL. I've limited the scope of this PR to handling offline tables in restore.

Currently on the backup side, all tables that come back online get fully re-backed up. In a future PR, I will skip this re-backup step for tables that only see MVCC ops during the incremental backup interval.

@msbutler msbutler force-pushed the butler-backup-imports branch from 9bd87ae to 81232a8 Compare August 19, 2022 15:05
Contributor

@adityamaru adityamaru left a comment

Flushing out one round of comments. I need to look more carefully at the tests, but can we maybe write a unit test in key_rewriter_test.go that isolates the new KR logic? The restore/import tests are super valuable, but a unit test will give us more focused control over testing the timestamp comparisons.

@adityamaru adityamaru self-requested a review August 24, 2022 15:30
@msbutler msbutler force-pushed the butler-backup-imports branch from 42bcd88 to 5ccd368 Compare August 24, 2022 16:47
Collaborator Author

@msbutler msbutler left a comment

Thanks for the first pass! I think I got all your comments and added a key rewriter test as well.

@msbutler msbutler force-pushed the butler-backup-imports branch 2 times, most recently from e4f4c9b to 2bd82a3 Compare August 26, 2022 01:26
Contributor

@adityamaru adityamaru left a comment

Nice test coverage, and thanks for the comments!


func (sip *streamIngestionProcessor) rekey(key roachpb.Key) ([]byte, error) {
-	rekey, ok, err := sip.rekeyer.RewriteKey(key)
+	rekey, ok, err := sip.rekeyer.RewriteKey(key, 0 /* wallTime */)
Contributor

If we are not going the route of passing in a storage.MVCCKey, is there any reason to pass a 0 wallTime here? Can we always pass the key's wallTime in non-test code and get rid of the special meaning of the zero timestamp?

You had mentioned performance being an issue but I think we had settled on:
SIP is not production ready and so when we make it performant and if we see this show up in benchmarks and profile we will special case it or duplicate the method for better readability.

Collaborator Author

Currently, passing a 0 walltime signals RewriteKey to never filter on walltime. A SIP may encounter a descriptor with an in-progress import, and I don't want the SIP to do key filtering; passing a 0 walltime signals this. We could easily rename the var to walltimeFilter to make this slightly more readable.

Alternatively, the caller could always pass a non-zero walltime and also pass a filterbyWalltime bool to makeKeyRewriter that binds to the KeyRewriter struct which signals whether walltimes should be used for filtering in RewriteKey.

Again, I prefer what I've initially drafted, as it requires managing fewer internal vars and has no perf impact on SIP. It's up to the caller to know whether they want to filter on import start time via this low-level function that's optimized for performance.
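The zero-walltime convention being debated can be sketched as a single predicate. This is an assumption-laden illustration, not the real RewriteKey signature: a caller like the SIP that never wants import-time filtering passes wallTime == 0, while RESTORE passes the key's actual MVCC wall time.

```go
package main

import "fmt"

// shouldElide sketches the filtering rule discussed in this thread.
// wallTime == 0 disables filtering entirely (the SIP-style caller);
// otherwise a key is elided only when the table has an in-progress
// import (importStartTime > 0) and the key was written at or after
// the import began. Both arguments are nanosecond wall times.
func shouldElide(wallTime, importStartTime int64) bool {
	return wallTime > 0 && importStartTime > 0 && wallTime >= importStartTime
}

func main() {
	fmt.Println(shouldElide(0, 100))   // SIP-style caller: never elide
	fmt.Println(shouldElide(150, 100)) // key written during the import: elide
	fmt.Println(shouldElide(50, 100))  // key predates the import: keep
}
```

The alternative floated below — a filterByWalltime bool bound to the KeyRewriter at construction — would move this decision from each call site into the struct; the sketch above reflects the per-call convention the author prefers.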

Contributor

@adityamaru adityamaru Aug 26, 2022

I don't want the SIP do key filtering

Have we thought this through, though, as to why we don't want this behaviour to be uniform with the SIP? I am not very familiar with the implications of this, so @stevendanna can probably help make a decision here.

It's up to the caller to know whether they want to filter on import start time

I agree with this sentiment, but as you know, I don't think this code reflects that. The filtering should not be in this shared rewrite method if it's not to be used by anyone but restore, but given our offline discussions, I think a happy compromise would be to unconditionally pass in the ts of the key if the semantics are well understood for the SIP.

Renaming the variable to something other than timestamp makes it even more likely that we pass in a timestamp that is not the key's timestamp, which we should never do, IIUC?

Contributor

Anyway, if we think we need to consider the SIP a little more outside the scope of this patch, I think things are good the way they are. I just want to make sure we think about it at some point.

@@ -0,0 +1,469 @@
# This test ensures that database and cluster backups properly
Contributor

Meta note: we're starting 6 servers in this test, which could make it quite slow. I don't know if there's anything we can do about it given the lack of a "wipe everything" tool for cluster restore, but we should run this under stress a few times.

d foofoo table 3 incremental


# A restore on mixed version cluster should always fail on a backup that contains import times
Contributor

The backup to database_upgrade won't contain descriptors with the import time set, right, even though it's post-finalization, because the import was run in a pre-finalized state?

@adityamaru adityamaru self-requested a review August 26, 2022 12:13
@msbutler msbutler requested a review from a team August 30, 2022 00:41
Collaborator Author

@msbutler msbutler left a comment

you're welcome for all the new test coverage :D

exec-sql
CREATE DATABASE b;
USE b;
CREATE TABLE baz (i INT PRIMARY KEY, s STRING);
Collaborator Author

Done. Let me know if this change is what you had in mind!

----


# Ensure Database, and cluster full backups capture restoring rows.
Collaborator Author

Done. Added to bottom of test to make all the test results easier to reason about.

# ImportStartTime is not bound to the table's descriptor, therefore during
# RESTORE AOST in-progress IMPORT, these tables should get thrown out.

# TODO (msbutler): Currently, we still incrementally back up data from these
Collaborator Author

I removed this comment. This is more relevant to the upcoming reintroduced spans PR.

@msbutler
Collaborator Author

@adityamaru RFAL!

@msbutler msbutler force-pushed the butler-backup-imports branch 2 times, most recently from 13871cc to 49a6a70 Compare August 30, 2022 03:38
Contributor

@adityamaru adityamaru left a comment

LGTM with nits. Thanks for working with me through this. I think the test coverage is extensive, so we should get this in and let it bake. Can you ping SQL Schema once this diff is borsing so that they are well informed when they're fixing their bug?

job paused at pausepoint


# ensure user cannot override offline schema created by table restore into data database d
Contributor

Is this really checking this? Isn't this collision because of the successful cluster restore you did at L139 after you started server 3? In fact, a paused table restore should IMO never interfere with the creation of a schema in the target database, since we've not published the descriptors in the paused restore yet.

Collaborator Author

Ah, good catch -- though I'm not sure d in the backed-up cluster from L139 actually has the schema. Anyway, I added the following to clarify things, and this test still holds.

USE d;
DROP SCHEMA IF EXISTS me CASCADE;

Note that at the beginning of restore execution, the schema is published to an offline state. I'm not sure how to implement the UX you want: how is the user supposed to override an already published schema? Seems like an issue for another day.

@msbutler msbutler force-pushed the butler-backup-imports branch from bf43d03 to 0dbf2ea Compare August 30, 2022 14:27
@msbutler
Collaborator Author

TFTRS!!

bors r=adityamaru

@craig craig bot merged commit f4b491f into cockroachdb:master Aug 30, 2022
@craig
Contributor

craig bot commented Aug 30, 2022

Build succeeded:

@msbutler msbutler deleted the butler-backup-imports branch August 30, 2022 18:23


Development

Successfully merging this pull request may close these issues.

backupccl: incrementally backup in progress imports on existing tables and elide importing data in RESTORE