Merged
abc5ab3 to a031814
DarkKirb added a commit to DarkKirb/restic that referenced this pull request on Dec 29, 2021
Currently restic copy will copy each blob from every snapshot serially, which has performance implications on high-latency backends such as b2. This commit introduces 8x parallelism for blob downloads/uploads which can improve restic copy operations up to 8x for repositories with many small blobs on b2. This commit also addresses the TODO comment in the copyTree function. Related work: A more thorough improvement of the restic copy performance can be found in PR restic#3513
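The bounded parallelism described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the referenced commit: `blobStore`, `copyBlobsParallel`, and the in-memory maps are hypothetical stand-ins for the real repositories, and the worker count of 8 is taken from the message above.

```go
package main

import (
	"fmt"
	"sync"
)

// blobID and blobStore are hypothetical stand-ins for illustration only;
// they are not restic's types.
type blobID string

type blobStore struct {
	mu    sync.Mutex
	blobs map[blobID][]byte
}

func newBlobStore() *blobStore {
	return &blobStore{blobs: make(map[blobID][]byte)}
}

func (s *blobStore) load(id blobID) ([]byte, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	data, ok := s.blobs[id]
	if !ok {
		return nil, fmt.Errorf("blob %s not found", id)
	}
	return data, nil
}

func (s *blobStore) save(id blobID, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.blobs[id] = data
}

// copyBlobsParallel copies the given blobs from src to dst using at most
// `workers` concurrent transfers and returns the first error encountered.
func copyBlobsParallel(src, dst *blobStore, ids []blobID, workers int) error {
	jobs := make(chan blobID)
	var wg sync.WaitGroup
	var once sync.Once
	var firstErr error

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				data, err := src.load(id)
				if err != nil {
					// Remember only the first failure; keep draining jobs.
					once.Do(func() { firstErr = err })
					continue
				}
				dst.save(id, data)
			}
		}()
	}

	for _, id := range ids {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
	return firstErr
}

func main() {
	src, dst := newBlobStore(), newBlobStore()
	ids := []blobID{"a", "b", "c"}
	for _, id := range ids {
		src.save(id, []byte("data-"+string(id)))
	}
	if err := copyBlobsParallel(src, dst, ids, 8); err != nil {
		fmt.Println("copy failed:", err)
		return
	}
	fmt.Println("copied blobs:", len(dst.blobs))
}
```

With a fixed pool of workers, the latency of individual transfers overlaps, which is where the "up to 8x" figure for many small blobs would come from.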
MichaelEischer pushed a commit to greatroar/restic that referenced this pull request on Dec 30, 2021
a031814 to 672e10b
Member
I've taken the liberty of rebasing the branch on master after #3484 was merged.
fd0 approved these changes on Mar 28, 2022
Member fd0 left a comment:
It's a very elegant solution, I like it a lot!
mfrischknecht pushed a commit to mfrischknecht/restic that referenced this pull request on Jun 14, 2022
What does this PR change? What problem does it solve?
The copy command currently proceeds blob by blob, which can be very slow when backend accesses have noticeable latency.
This PR reuses the repack operation used by prune to implement copy:
The repack operation copies all selected blobs from a set of pack files into new pack files. For prune the source and destination repositories are identical. To implement copy, just use a different source and destination repository.
This way, the copy command gains all performance improvements made to the prune command, and its implementation becomes simpler. The main change of this PR is the last commit; all other commits are part of #3484. Although this PR could be implemented standalone, I've decided to use #3484 as the base because it accesses only the relevant parts of pack files instead of always downloading the full pack file.
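To make the idea concrete, here is a minimal sketch of a repack-style copy between two repositories. It is not the code from this PR: `repo`, `addPack`, `repack`, and the `packSize` limit are hypothetical names, and the repositories are plain in-memory maps. It only illustrates grouping the selected blobs by source pack file so that each pack is read once, then writing them into new packs in the destination.

```go
package main

import (
	"fmt"
	"sort"
)

// All types below are hypothetical and exist only for this illustration.
type blobID string
type packID string

// repo models a repository as pack files plus an index from blob to pack.
type repo struct {
	packs    map[packID]map[blobID][]byte
	index    map[blobID]packID
	nextPack int
}

func newRepo() *repo {
	return &repo{
		packs: make(map[packID]map[blobID][]byte),
		index: make(map[blobID]packID),
	}
}

// addPack stores the blobs as one new pack file and updates the index.
func (r *repo) addPack(blobs map[blobID][]byte) packID {
	id := packID(fmt.Sprintf("pack-%d", r.nextPack))
	r.nextPack++
	r.packs[id] = blobs
	for b := range blobs {
		r.index[b] = id
	}
	return id
}

// repack copies the selected blobs from src into new packs in dst.
// Each source pack is read once; new packs hold at most packSize blobs.
// It returns the number of source packs that were read.
func repack(src, dst *repo, keep []blobID, packSize int) (int, error) {
	// Group the wanted blobs by the source pack that contains them.
	byPack := make(map[packID][]blobID)
	for _, id := range keep {
		p, ok := src.index[id]
		if !ok {
			return 0, fmt.Errorf("blob %s missing from source index", id)
		}
		byPack[p] = append(byPack[p], id)
	}

	// Process packs in a stable order so the result is deterministic.
	packIDs := make([]packID, 0, len(byPack))
	for p := range byPack {
		packIDs = append(packIDs, p)
	}
	sort.Slice(packIDs, func(i, j int) bool { return packIDs[i] < packIDs[j] })

	pending := make(map[blobID][]byte)
	flush := func() {
		if len(pending) > 0 {
			dst.addPack(pending)
			pending = make(map[blobID][]byte)
		}
	}

	packsRead := 0
	for _, p := range packIDs {
		pack := src.packs[p] // one access per source pack
		packsRead++
		for _, id := range byPack[p] {
			pending[id] = pack[id]
			if len(pending) >= packSize {
				flush()
			}
		}
	}
	flush()
	return packsRead, nil
}

func main() {
	src, dst := newRepo(), newRepo()
	src.addPack(map[blobID][]byte{"a": []byte("1"), "b": []byte("2"), "c": []byte("3")})
	src.addPack(map[blobID][]byte{"d": []byte("4"), "e": []byte("5")})

	read, err := repack(src, dst, []blobID{"a", "c", "d"}, 2)
	if err != nil {
		fmt.Println("repack failed:", err)
		return
	}
	fmt.Printf("read %d source packs, wrote %d destination packs, %d blobs\n",
		read, len(dst.packs), len(dst.index))
}
```

In this sketch the number of backend accesses scales with the number of pack files touched rather than with the number of blobs, which is the property that matters on high-latency backends.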
The PR also adds a progress bar for the number of pack files copied for the current snapshot.
Was the change previously discussed in an issue or on the forum?
Fixes #2923.
Checklist
[ ] I have added documentation for relevant changes (in the manual).
There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
I have run gofmt on the code in all commits.