Skip to content

prune: Stricter error checks#2674

Merged
fd0 merged 10 commits intorestic:masterfrom
MichaelEischer:prune-strict-checks
Aug 23, 2020
Merged

prune: Stricter error checks#2674
fd0 merged 10 commits intorestic:masterfrom
MichaelEischer:prune-strict-checks

Conversation

@MichaelEischer
Copy link
Copy Markdown
Member

@MichaelEischer MichaelEischer commented Mar 31, 2020

What is the purpose of this change? What does it change?

This PR tightens the sanity checks made by prune in the following three regards:

  • Prune now exactly checks that every single used blob can be found and no longer just approximately checks this condition. This is mostly an overcautious cleanup, but might help to prevent data loss if the backend temporarily returns broken packs.
  • Prune aborts repacking when it finds a different blob in a pack than expected. This usually indicates that the backup has been tampered with or hardware problems on the computer running restic.
  • Prune and rebuild-index fail now if an old index file cannot be removed. For rebuild-index this is problematic as that command is usually used to repair a broken index and shouldn't leave possibly invalid index files. For prune this behavior could lead to data loss, as prune would continue and remove old packs which might still be referenced by one of the old index files that were not deleted. A later backup run would the not re-upload any blobs that are listed in these index, but which might actually be missing from the repository. This could result in a snapshot that's missing data.

Most of the code in this PR implements tests and adds support for the integration tests to wrap a repository to simplify testing repository damages.

Was the change discussed in an issue or in the forum before?

This PRs attempts to solve possible problems in the prune implementation in the hope of possibly fixing some of the reported repository damages. The changes themselves were not discussed before.

Checklist

  • I have read the Contribution Guidelines
  • I have enabled maintainer edits for this PR
  • I have added tests for all changes in this PR
  • [ ] I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

@rawtaz
Copy link
Copy Markdown
Contributor

rawtaz commented Jul 17, 2020

@aawsome and @greatroar could you guys review this to make sure it looks sane and shouldn't carry any unexpected results?

Copy link
Copy Markdown
Contributor

@aawsome aawsome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes are due to using the restic.Repository interface in all places and added checks.

IMO this PR doesn't fully solve the index issue and #2839 should be considered instead - but merging this PR will improve the current situation.

Also this PR needs a rebase and modifications after #2773 has been merged (e.g. new SaveBlob syntax)

@MichaelEischer MichaelEischer force-pushed the prune-strict-checks branch 3 times, most recently from 66a4bee to 4c534e8 Compare July 25, 2020 17:54
Copy link
Copy Markdown
Contributor

@aawsome aawsome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MichaelEischer MichaelEischer force-pushed the prune-strict-checks branch 2 times, most recently from fe01136 to 60248d4 Compare August 2, 2020 08:36
The previous check only approximately verified whether all required
blobs were found. However, after forgetting a few snapshots the
repository contains lots of unused blobs whose number can be sufficient
to make up for missing packs.

When coupled with a malfunctioning backend that temporarily returns broken
data this could cause restic to regard the corresponding packs as
invalid and thereby delete data that's still in use. This change lets
restic play it safe and refuse to delete anything if data is missing.
The old behavior was problematic in the context of rebuild-index as it
could leave old, possibly invalid index files behind without returning a
fatal error.

Prune calls rebuildIndex before removing any data from the repository.
For this use case failing to delete an old index MUST be treated as a
fatal error. Otherwise the index could still contain an old index file
that refers to blobs/packs that were later on deleted by prune. Later
backup runs will assume that the affected blobs already exist in the
repository which results in a backup which misses data.
If a blob in a pack file can be decrypted successfully but contains data
that results in a different hash than stated in the header pack, then
abort repacking. As both the pack header and the blob are
cryptographically verified this either means than a malicious entity
tampered with the backup or indicates hardware problems on the client.
prune should fail with an error in both cases.
The slicing operator `slice[low:high]` default to 0 for the lower bound and
len(slice) for the upper bound when either or both are not specified.
Fix the code to use `cap(slice)` to check for the slice capacity.
Copy link
Copy Markdown
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome PR, thanks for taking the time to split this into small chunks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants