Interrupting prune during pack rewriting grows work #2112
Description
Output of restic version
restic 0.9.3 compiled with go1.10.3 on linux/amd64
How did you run restic exactly?
Environment:
```
RESTIC_REPOSITORY=s3:https://s3.amazonaws.com/restic-<redacted>
AWS_ACCESS_KEY=<redacted>
AWS_SECRET_ACCESS_KEY=<redacted>
```
Command:
```
$ restic prune
enter password for repository:
repository 1cf00096 opened successfully, password is correct
counting files in repo
building new index for repo
[1:26:09] 100.00% 174710 / 174710 packs
repository contains 174710 packs (2725111 blobs) with 825.341 GiB
processed 2725111 blobs: 299163 duplicate blobs, 103.380 GiB duplicate
load all snapshots
find data that is still in use for 43 snapshots
[1:48] 100.00% 43 / 43 snapshots
found 2323197 of 2725111 data blobs still in use, removing 401914 blobs
will remove 0 invalid files
will delete 4950 packs and rewrite 30929 packs, this frees 140.405 GiB
signal interrupt received, cleaning up

$ restic prune
enter password for repository:
repository 1cf00096 opened successfully, password is correct
counting files in repo
building new index for repo
[1:12:08] 100.00% 180600 / 180600 packs
repository contains 180600 packs (2805677 blobs) with 852.127 GiB
processed 2805677 blobs: 379729 duplicate blobs, 130.163 GiB duplicate
load all snapshots
find data that is still in use for 43 snapshots
[1:50] 100.00% 43 / 43 snapshots
found 2323197 of 2805677 data blobs still in use, removing 482480 blobs
will remove 0 invalid files
will delete 4950 packs and rewrite 36819 packs, this frees 167.188 GiB
signal interrupt received, cleaning up
```
What backend/server/service did you use to store the repository?
S3
Expected behavior
Interrupting a process should leave the repository in a state such that repeating the process requires the same amount of work or less. In particular, interrupting a prune during the pack-rewriting phase should leave fewer packs to rewrite on the next run, or at worst the same number.
Actual behavior
Every time restic is interrupted in this phase of a prune operation, the number of packs that need to be rewritten on the next run increases. The estimated amount of space to be freed increases as well, while the number of packs to be deleted stays constant. This means that on an intermittent connection you must let the prune finish (at least through this phase; I have no data on the other prune phases), or the next attempt will take even longer to complete. Incremental progress is impossible.
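For what it's worth, the deltas between the two runs quoted above line up exactly: the interrupted run left 5890 new packs (and 80566 new blobs) behind, and the second run wants to rewrite exactly 5890 more packs and reports exactly 80566 more duplicate blobs, while the delete count stays at 4950:

```python
# Figures copied from the two prune runs quoted above.
packs_run1, packs_run2 = 174710, 180600
rewrite_run1, rewrite_run2 = 30929, 36819
blobs_run1, blobs_run2 = 2725111, 2805677
dups_run1, dups_run2 = 299163, 379729

print(packs_run2 - packs_run1)      # 5890 packs added by the interrupted run
print(rewrite_run2 - rewrite_run1)  # 5890 extra packs to rewrite
print(blobs_run2 - blobs_run1)      # 80566 blobs added by the interrupted run
print(dups_run2 - dups_run1)        # 80566 extra duplicate blobs
```

So every pack the interrupted run managed to write appears to come back as an additional pack to rewrite.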
Steps to reproduce the behavior
1. Use restic and either interrupt a backup operation or perform a `forget` operation.
2. Run `restic prune` and make note of the number of packs reported to be rewritten.
3. Press Ctrl-C to interrupt once the prune reaches the rewriting packs phase.
4. Run `restic prune` again.
5. Observe that the amount of work to do in rewriting packs has increased.
Do you have any idea what may have caused this?
Based on the code for prune, I'd guess that it rewrites all the packs needing a rewrite and only then deletes the old packs as obsolete. If it never finishes rewriting, the old packs are still around and will be rewritten again. I'm not sure why that doesn't simply keep the number of packs requiring rewriting constant, unless the packs that were successfully rewritten are also being flagged for rewriting on subsequent runs. If I had to guess, it's because the newly rewritten packs aren't added to an index, which only happens after all packs have been rewritten, so on the next run their blobs count as duplicates.
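To illustrate the guess above, here's a toy model (not restic's actual code; the pack/blob structure and the triage rule are my assumptions from reading the prune command): a pack is deleted when none of its blobs are used, kept only when all of its blobs are used and unique, and rewritten otherwise. Under that rule, the new packs left behind by an interrupted repack contain duplicated blobs and so join the rewrite set instead of shrinking it:

```python
from collections import Counter

def plan(packs, used):
    """Hypothetical model of prune's pack triage: delete a pack with no
    used blobs; keep a pack whose blobs are all used and unduplicated;
    rewrite everything else (packs holding dead or duplicated blobs)."""
    copies = Counter(b for blobs in packs for b in blobs)
    rewrite, delete = [], []
    for i, blobs in enumerate(packs):
        if not any(b in used for b in blobs):
            delete.append(i)
        elif any(b not in used or copies[b] > 1 for b in blobs):
            rewrite.append(i)
    return rewrite, delete

packs = [{f"live{i}", f"dead{i}"} for i in range(100)]  # mixed packs
packs += [{f"gone{i}"} for i in range(10)]              # fully dead packs
used = {f"live{i}" for i in range(100)}

r0, d0 = plan(packs, used)
print(len(r0), len(d0))   # 100 packs to rewrite, 10 to delete

# Interrupt after repacking 40 packs: 40 new packs holding only the live
# blobs are uploaded, but the old packs are not deleted and no index
# update records the work, so those live blobs are now duplicated.
packs += [{f"live{i}"} for i in range(40)]
r1, d1 = plan(packs, used)
print(len(r1), len(d1))   # 140 packs to rewrite, still 10 to delete
```

In this model the rewrite set grows by exactly the number of packs the interrupted run wrote, while the delete count stays constant, which matches what I'm seeing.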
Do you have an idea how to solve the issue?
I'll leave this to those that know the codebase better than me.
Did restic help you or make you happy in any way?
Yes! Thanks for the great project and for being so responsive on these issues