Conversation
e28d96f to
2890f9c
Compare
|
This seemingly simple change has far reaching implications: Different restic versions will interpret the index differently. This could become a problem if obsolete indexes still exist. A new restic version would then ignore an obsolete index, but older versions will still read those as before. If these obsolete indexes refer to no longer existing blobs, we now have a recipe for data loss. I'm not sure I completely understand in what regard this PR treats the index files correcter than before. Is it just not-ignoring the "supersedes" field or also something else? The only scenario where the current behavior could causes data loss is when prune starts to delete pack files without successfully deleting old index files. But that case can be easily checked for during prune. Using the Another benefit of this PR is that as obsolete indexes are dropped while loading, there's no memory usage overhead due to duplicate indexes. When no using the |
In fact it is only about not-ignoring "supersedes" and in fact using and interpreting it (if given) correctly. IMO the only correct handling if a supersedes is given is to ignore this index file, if it is still present. As you also pointed out, NOT ignoring it during
It is during the rebuilding of the index which is run within
See above. There is also the problem, that identifying an error when deleting a file relies on the backend. When using a eventually consistent backend, a nil error may just mean that the delete command was accepted. In edge cases where the storage has to decide in an unclear situation whether to delete a file or not I do not think that any implementation may think that NOT deleting in this case may cause data loss...
You are right about the possibility that this PR may result in too less index entries. However, the conclusion is really simple: Too less index entries may cause problems with restoring (which can be repaired by rebuilding the index) while too much index entries may cause data loss during backup... |
Actually this can be implemented independently from this PR; I'll tackle this shortly. Let me please explain, why I'm proposing this PR in the scope of optimizing prune: The first place is when determining what to do in prune, this is still in #2718. Here superseded but still existing and not ignored index files would result in:
The second place is the index rebuilding, now in #2842. Here the "wrong duplicates" will be still written twice. So I agree that for the proposed prune changes, it is sufficient to skip exact duplicates while merging. |
As #2674 was merged, this is no longer a problem.
I don't think that way of error handling is a good idea as it introduces a new failure mode in restic: Currently when a backup client or the network connections fails, the worst case (unless you hit a bug) is that you have to
If we make this change I want to maintain backwards compatibility. When the
This is in my opinion the main problem which this PR solves. With the above requirements, for a strongly consistent backend (e.g. local filesystem) the only other issue is avoiding some unnecessary repack work due to duplicate blobs. And that is "only" an optimization. We could avoid the problem of an incomplete index by uploading a "tombstone" index as the last step of rewriting the index. That way old indexes are only ignored once the new index was uploaded. To fix the problem of reappearing deleted indexes this would be enough, as packs are only deleted after the tombstone index was uploaded. However I've just made the assumption, that the tombstone index is visible to the next restic run. In that regard, adding the I just noticed that this change is related to how a non-blocking prune would work. There unreferenced blobs are first marked for deletion at which point they may no longer be used in later backup runs, but are still accessible for retrieving data from the repository. Entries from superseded indexes are quite similar to the ones marked for deletion. It's probably a good idea to postpone this PR until after the main prune changes are completed. I also have to think a bit more about the assumptions regarding the required backend consistency guarantees. |
- don't use superseded index files in masterindex - add hint to check if superseded index files still exist
99f64f8 to
0aaf5b7
Compare
There was a problem hiding this comment.
I've rebased the PR to the current master branch. With the new prune implementation there are now three possible scenarios:
- Either there are no obsolete indexes in which case the PR is a noop
- The storage backend didn't (yet) delete some indexes. This case could cause problems for older restic versions and would be fixed by ignoring superseded indexes. I wonder whether we should let
backupreport an error in this case to warn users. pruneorrebuild-indexwas interrupted. In this case there are lots of obsolete indexes, which can be safely ignored. In this case it would probably be a good idea to not throw around lots of warnings. The main benefit of ignoring the obsolete indexes would be a lower memory usage. As this case should be more frequent than the previous one, the current error reporting in this PR should be sufficient.
That is right now it should be quite safe to merge this PR. There's just one caveat. Currently when saving the master index, an old index is marked as obsolete before all entries in it have been transferred. That is the following code lines must be moved after copying the index contents.
restic/internal/repository/master_index.go
Lines 324 to 340 in 66382b2
This means that for the upgrade path it might be necessary to run rebuild-index if a partially rewritten index exists after an interrupted prune or rebuild-index.
| } | ||
|
|
||
| // delete obsolete index files (index files that have already been superseded) | ||
| obsoleteIndexes := (repo.Index()).(*repository.MasterIndex).Obsolete() |
There was a problem hiding this comment.
I think we also need something similar for rebuild-index?
| } | ||
|
|
||
| // remove obsolete indexes | ||
| validIndex.Sub(r.idx.Obsolete()) |
There was a problem hiding this comment.
Wouldn't that cause the obsolete indexes to be downloaded over and over again? After all these are still stored in the repository.
|
By now I've spent some time on planning how index updates should work for a non-blocking prune implementation (there's nothing to show yet). What's becoming clear is however that marking old index data obsolete should not work on an index level, but rather at the granularity of pack files. That is far more flexible and also allows for rewriting only index files that contain too many deleted pack files. As currently the |
Fully agree. If you have something to show up, I'm looking forward to a discussion. As you surely know I claim that rustic already uses a non-blocking prune, the corresponding index files additions are here: https://github.com/rustic-rs/rustic_core/blob/main/crates/core/src/repofile/indexfile.rs#L26 (and L52ff) |
What is the purpose of this change? What does it change?
Treat superseded index files correctly, if they still exist.
This means, they are now ignored for most commands and check prints a hint if they exist.
Was the change discussed in an issue or in the forum before?
closes #2824
Is also used within #2718.
Checklist
changelog/unreleased/that describes the changes for our users (template here)gofmton the code in all commits