feat(backends/s3): add warmup support before repacks and restores #5173

MichaelEischer merged 9 commits into restic:master
Conversation
MichaelEischer
left a comment
The whole approach still feels somewhat experimental, so I think we should move the code behind an alpha feature flag.
In general, automatically warming up and waiting for the required pack files looks like a solid approach. I prefer that variant over having --warmup flags that require the user to guess when the warmup is complete (and which, in the case of prune, won't interact well with concurrent modifications of the repository). However, to make further progress on cold storage support, we'll need at least some feedback for the user to indicate that restic is actually waiting for files to be warmed up, as opposed to being stuck.
Having more supported commands would also be nice, but that can be changed later on while the feature is still in alpha state.
Thanks @MichaelEischer for the feedback. I'm happy that you find this approach reasonable and that we can make cold storage support move forward in restic.
So I'll rebase my branch and fix everything probably tomorrow, or by the end of next week.
IIRC, this is more tedious than it seems because the current progress bar implementations are "swallowing" everything. I'll see what I can do (probably in another branch if you don't mind, so as to not pollute this PR with UI-related stuff).
I think I addressed all the feedback. Regarding user feedback, there is some fragmentation in the current codebase. Repack can only use a …
I just removed the …
This commit introduces basic support for transitioning pack files stored in cold storage to hot storage on S3 and S3-compatible providers.

To prevent unexpected behavior for existing users, the feature is gated behind new flags:

- `s3.enable-restore`: opt-in flag (defaults to false)
- `s3.restore-days`: number of days for the restored objects to remain in hot storage (defaults to `7`)
- `s3.restore-timeout`: maximum time to wait for a single restoration (defaults to `1 day`)
- `s3.restore-tier`: retrieval tier at which the restore will be processed (defaults to `Standard`)

As restoration times can be lengthy, this implementation preemptively restores selected packs to prevent incessant restore delays during downloads. This is slightly sub-optimal, as we could process packs out of order (as soon as they're transitioned), but that would add too much complexity for a marginal gain in speed.

To maintain simplicity and prevent resource exhaustion with lots of packs, no new concurrency mechanisms or goroutines were added. This just hooks gracefully into the existing routines.

**Limitations:**

- Tests against the backend were not written due to the lack of cold storage class support in MinIO. Testing was done manually on Scaleway's S3-compatible object storage. If necessary, we could explore testing with LocalStack or mocks, though this requires further discussion.
- Currently, this feature only warms up before restores and repacks (prune/copy), as those are the two main use-cases I came across. Support for other commands may be added in future iterations, as long as the affected packs can be calculated in advance.
- The feature is gated behind a new alpha `s3-restore` feature flag to make it explicit that the feature is still wet behind the ears.
- There is no explicit user notification for ongoing pack restorations.
While I think it is not necessary because of the opt-in flag, showing some notice may improve usability (but would probably require major refactoring of the progress bar, which I didn't want to start). Another possibility would be to add a flag to send restore requests and fail early. See restic#3202
It's easier to handle multiple handles in the backend directly, and it may open the door to reducing the number of requests made to the backend in the future.
MichaelEischer
left a comment
Thanks for the changes so far. I have a few more comments below. Regarding the progress information, let's stick to the simple log messages for now. I don't see a simple way to add proper progress bars without a lot of refactoring beforehand.
dstRepo restic.Repository,
packs restic.IDSet,
keepBlobs repackBlobSet,
p *progress.Counter,
When using progress.Printer, it would be possible to also show a progress.Counter. For prune, that would pretty much work out of the box, but the copy command has to be converted first to use setupTermstatus().
But the restore command is a much tougher problem. I'm not yet sure what the proper approach would be there. So, the log message will have to be good enough for now.
I just want to point out that a progress bar for restores would not add much value, because all restores succeed at approximately the same time. A progress bar would remain at "0%" for hours and then suddenly wake up, which is not much different from this PR. AFAIK, AWS doesn't provide any ETA for restores either. So I don't think heavy work is justified for this feature.
True, then let's wait and see what feedback arrives. The progress bar also shows the elapsed duration of the operation, which might reassure users that restic isn't stuck entirely, but yeah, the value of that is questionable.
internal/backend/s3/s3.go
Outdated
if restore.OngoingRestore {
	return false, true
}
if !restore.ExpiryTime.IsZero() {
Should we consider files that expire soon as needing a new restore call? Otherwise, files could expire when performing multiple restores with a bit of time in between.
I wasn't sure whether calling RestoreObject on an already-restored object would extend the expiry or just leave the object untouched, so I didn't want to write any explicit logic around that. I need to check whether this is possible.
Also, even if it worked, I'm not sure how to define "expires soon". If it's < RestoreDays, then we might as well call RestoreObject every time and not bother. If it's less, how much less? < RestoreDays / 2? That sounds arbitrary. Letting the user configure this value sounds too specific.
Just tested it, and it seems to work: RestoreDays can be increased as well as reduced with subsequent calls to RestoreObject.
Given that the expiry date is rounded up to the next day (GMT) by AWS, I think it is safe to call RestoreObject again if the current expiry date falls before now + RestoreDays. This guarantees RestoreDays is always respected (at least on AWS), while still avoiding needless RestoreObject calls.
MichaelEischer
left a comment
LGTM. Thanks for the quick changes! I've added a tiny fixup commit to fix a typo and correctly use feature.TestSetFlag().
(restic#5173) Squashed commits:
* feat(backends/s3): add warmup support before repacks and restores
* ui: warn user when files are warming up from cold storage
* refactor: remove the PacksWarmer struct
I somehow didn't notice this was merged. Thank you very much for your time @MichaelEischer. I'll follow up with the missing commands in a few days or weeks.
Follow-up from restic#5173 See also restic#3202
What does this PR change? What problem does it solve?
This commit introduces basic support for transitioning pack files stored in cold storage to hot storage on S3 and S3-compatible providers.
To prevent unexpected behavior for existing users, the feature is gated behind new flags:
- `s3.enable-restore`: opt-in flag (defaults to false)
- `s3.restore-days`: number of days for the restored objects to remain in hot storage (defaults to `7`)
- `s3.restore-timeout`: maximum time to wait for a single restoration (defaults to `1 day`)
- `s3.restore-tier`: retrieval tier at which the restore will be processed (defaults to `Standard`)

As restoration times can be lengthy, this implementation preemptively restores selected packs to prevent incessant restore delays during downloads. This is slightly sub-optimal, as we could process packs out of order (as soon as they're transitioned), but that would add too much complexity for a marginal gain in speed.
To maintain simplicity and prevent resource exhaustion with lots of packs, no new concurrency mechanisms or goroutines were added. This just hooks gracefully into the existing routines.
Limitations:
- Tests against the backend were not written due to the lack of cold storage class support in MinIO. Testing was done manually on Scaleway's S3-compatible object storage.
- Currently, this feature only warms up before restores and repacks (prune/copy). Support for other commands may be added in future iterations.
- The feature is gated behind a new alpha `s3-restore` feature flag to make it explicit that the feature is still wet behind the ears.
- There is no explicit user notification for ongoing pack restorations.

Was the change previously discussed in an issue or on the forum?
Checklist
- [ ] Entry in `changelog/unreleased/` that describes the changes for our users (see template).