backup: re-enable fast incremental BACKUP via TBI #46108

Merged
craig[bot] merged 1 commit into cockroachdb:master from dt:TBI
Mar 16, 2020
Conversation

Contributor

@dt dt commented Mar 14, 2020

Within the current BACKUP/RESTORE feature offering, the way to reduce
RPO is to back up more often. For this to work at shorter backup
intervals, the cost of backing up what changed must scale with the size
of what changed, as opposed to the total data size. This is exactly what
the time-bound iterator (TBI) optimization is supposed to deliver -- by
recording the span of timestamps that appear in any given SSTable and
then installing a filter to only open SSTables that contain relevant
times, we can ignore irrelevant data very cheaply.
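
To illustrate the filtering idea described above, here is a minimal sketch in Go. It is not the engine's actual code: the `sstMeta` type, its fields, the file names, and the `(start, end]` window convention are all assumptions made for the example. It only shows how a recorded per-table timestamp span lets a scan skip tables without opening them.

```go
package main

import "fmt"

// sstMeta is a hypothetical stand-in for per-SSTable metadata: the smallest
// and largest timestamps of any key version stored in that table.
type sstMeta struct {
	name         string
	minTS, maxTS int64
}

// overlaps reports whether the table could contain a version whose timestamp
// falls in the window (start, end]. The half-open window is an assumption.
func (m sstMeta) overlaps(start, end int64) bool {
	return m.maxTS > start && m.minTS <= end
}

// tablesToOpen keeps only the tables whose timestamp span intersects the
// window; every other table is skipped without being read.
func tablesToOpen(tables []sstMeta, start, end int64) []sstMeta {
	var out []sstMeta
	for _, t := range tables {
		if t.overlaps(start, end) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	tables := []sstMeta{
		{"000001.sst", 10, 40},
		{"000002.sst", 35, 60},
		{"000003.sst", 90, 120},
	}
	// An incremental backup covering (50, 100] never needs 000001.sst.
	for _, t := range tablesToOpen(tables, 50, 100) {
		fmt.Println(t.name)
	}
}
```

With these made-up spans, only 000002.sst and 000003.sst are selected, so the work done tracks the amount of recently written data rather than the total data size.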

However, over 2017 and 2018, we encountered a few tricky correctness
issues in the interaction between the time-bound iterator and our MVCC
logic and intent handling, and generally lost confidence in this code,
ultimately disabling its use in the scans used by BACKUP. Since then,
#45785 has resolved those concerns by ensuring that every key emitted by
the IncrementalIterator is actually read by a normal, non-TBI iterator
even when TBI is used. Thus it is now believed to be safe to re-enable
TBI for incremental backups.
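
The safety argument above can be sketched roughly as follows. This is a toy model, not the IncrementalIterator from #45785: the `version` type and the `candidateKeys` and `incremental` functions are hypothetical names, and the slices stand in for real iterators. The point it tries to show is that the time-bound view is used only as a hint about which keys may have changed, while everything emitted is read from the full, unfiltered view.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a hypothetical MVCC key version.
type version struct {
	key   string
	ts    int64
	value string
}

// candidateKeys plays the role of the time-bound iterator: it is trusted only
// to suggest which keys might have changed, never for the data itself.
func candidateKeys(tbiView []version) []string {
	seen := map[string]bool{}
	var keys []string
	for _, v := range tbiView {
		if !seen[v.key] {
			seen[v.key] = true
			keys = append(keys, v.key)
		}
	}
	sort.Strings(keys)
	return keys
}

// incremental emits the versions in (start, end]. Each emitted version is
// read from full, which stands in for the normal, non-TBI iterator.
func incremental(full, tbiView []version, start, end int64) []version {
	var out []version
	for _, k := range candidateKeys(tbiView) {
		for _, v := range full {
			if v.key == k && v.ts > start && v.ts <= end {
				out = append(out, v)
			}
		}
	}
	return out
}

func main() {
	full := []version{
		{"a", 10, "a1"}, {"a", 70, "a2"}, {"b", 20, "b1"}, {"c", 95, "c1"},
	}
	// The filtered view may surface entries the window should not include;
	// the re-read against full is what decides what is emitted.
	tbiView := []version{{"a", 10, "a1"}, {"a", 70, "a2"}, {"c", 95, "c1"}}
	for _, v := range incremental(full, tbiView, 50, 100) {
		fmt.Printf("%s@%d=%s\n", v.key, v.ts, v.value)
	}
}
```

In this model the extra pass over the full view costs one lookup per candidate key, so the scan still scales with what changed rather than with the whole keyspace.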

Closes #43799.

Release note (enterprise change): Incremental BACKUP can quickly skip unchanged data, making frequent incremental BACKUPs 10-100x faster depending on data size and frequency.

Release justification: low-risk and high-impact.

@dt dt requested a review from pbardea March 14, 2020 04:05
@cockroach-teamcity
Member

This change is Reviewable

@dt
Contributor Author

dt commented Mar 16, 2020

bors r+

@craig
Contributor

craig bot commented Mar 16, 2020

Build succeeded

@craig craig bot merged commit 2fac30c into cockroachdb:master Mar 16, 2020
@dt dt deleted the TBI branch March 16, 2020 22:12
Development

Successfully merging this pull request may close these issues.

backupccl, engine: re-enable Time-Bound Iterator in ExportToSST

3 participants