Add possibility to specify extra repo for "hot" data#3235

Closed
aawsome wants to merge 3 commits into restic:master from aawsome:add-repo-hot

@aawsome
Contributor

@aawsome aawsome commented Jan 23, 2021

What does this PR change? What problem does it solve?

Adds an option --repo-hot to specify a "hot repo" that is used for hot data, i.e. for all repository files except data pack files. The repo specified by --repo is still written in full and remains a complete restic repository, but it is only read when a data pack file is accessed. It can therefore be kept in "cold storage" or on storage with very high latency while many restic commands still work. Usually the "hot repo" takes up only a small fraction of the space used by the complete repository. See #3202 for the discussion. This PR implements the "cache approach".
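
To illustrate the read/write split described above, here is a minimal sketch in Python (not the Go code of this PR). The class name, the `DATA_PACK` label and the in-memory dictionaries are assumptions made purely for illustration; the actual file types and the exact hot/cold split in restic may differ.

```python
from dataclasses import dataclass, field

# Hypothetical label for data pack files; restic's real type names may differ.
DATA_PACK = "data-pack"


@dataclass
class HotColdRepo:
    """Toy model: two in-memory stores standing in for the hot and cold repo."""

    hot: dict = field(default_factory=dict)
    cold: dict = field(default_factory=dict)
    cold_reads: int = 0  # counts how often the cold repo had to be read

    def save(self, file_type: str, name: str, content: bytes) -> None:
        # The cold repo always receives the file, so it stays complete.
        self.cold[(file_type, name)] = content
        # Everything except data pack files is also stored in the hot repo.
        if file_type != DATA_PACK:
            self.hot[(file_type, name)] = content

    def load(self, file_type: str, name: str) -> bytes:
        # Only data pack files are read from the cold repo.
        if file_type == DATA_PACK:
            self.cold_reads += 1
            return self.cold[(file_type, name)]
        return self.hot[(file_type, name)]
```

In this model, listing snapshots or reading an index never touches the cold store, which is the property the hot repo is meant to provide.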

The "hot repo" is marked by an extra data field in the config file, and some additional tests are added to check that only correct combinations of "hot repo" and cold (i.e. standard) repo can be used. However, if a "hot repo" is opened by an older restic version, it will not produce a helpful error; the repo will simply look broken.
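
A rough sketch of what such a combination check could look like, again in Python rather than the PR's Go code. The field name `is_hot` and the rule that both configs must share the same `id` are assumptions for illustration only; the actual marker field and rules in this PR may be different.

```python
from typing import Optional


def check_repo_pair(cold_config: dict, hot_config: Optional[dict]) -> None:
    """Raise ValueError for an invalid hot/cold combination (hypothetical rules)."""
    # A hot repo must not be passed as the regular (cold) repo.
    if cold_config.get("is_hot", False):
        raise ValueError("repo given via --repo is marked as a hot repo")
    if hot_config is None:
        return  # plain repository without a hot repo
    # The repo given via --repo-hot must carry the hot marker.
    if not hot_config.get("is_hot", False):
        raise ValueError("repo given via --repo-hot is not marked as a hot repo")
    # Assumption: both repos belong together if their config ids match.
    if hot_config.get("id") != cold_config.get("id"):
        raise ValueError("hot repo and cold repo do not belong together")
```

For example, `check_repo_pair({"id": "abc"}, {"id": "abc", "is_hot": True})` passes silently, while swapping the two arguments raises a `ValueError`.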

Open points are:

  • depends on Simplify cache logic #2856 which is not reviewed yet
  • add checks to the check command that the same files exist in the hot and the "complete" repo
  • add checks to the check --read-data command that read and verify all files from the "complete" repo
  • add some test cases, especially a test that the "complete" repo is not read except for data pack files
  • add the ability to create a "hot repo" from an existing "complete" repo
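
As a sketch of the first open point (the consistency check between the two repos), the comparison could work on plain file listings, roughly as follows. This is illustrative Python, not part of the PR; the function names and the `datapack/` naming scheme are hypothetical, and the caller has to supply the rule for which files belong in the hot repo.

```python
from typing import Callable, Dict, Iterable, List


def compare_listings(
    hot_files: Iterable[str],
    cold_files: Iterable[str],
    is_hot_file: Callable[[str], bool],
) -> Dict[str, List[str]]:
    """Report files that one repo has and the other is missing."""
    hot, cold = set(hot_files), set(cold_files)
    # Every cold file that counts as "hot" should also exist in the hot repo.
    expected_hot = {name for name in cold if is_hot_file(name)}
    return {
        "missing_in_hot": sorted(expected_hot - hot),
        # The cold repo is complete, so it must contain every hot file.
        "missing_in_cold": sorted(hot - cold),
    }


def not_data_pack(name: str) -> bool:
    # Hypothetical naming scheme: data pack files live under "datapack/".
    return not name.startswith("datapack/")
```

Such a check only needs to list the cold repo, not read it, so it would fit the constraint that the cold repo is read for data pack files only.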

Usage is:

restic -r <cold-repo> --repo-hot <hot-repo> init
restic -r <cold-repo> --repo-hot <hot-repo> backup /path/to/backup
restic -r <cold-repo> --repo-hot <hot-repo> backup /path/to/backup  # follow-up backup
restic -r <cold-repo> --repo-hot <hot-repo> snapshots
restic -r <cold-repo> --repo-hot <hot-repo> check
restic -r <cold-repo> --repo-hot <hot-repo> forget --prune --repack-cacheable-only <policy>

None of the above commands read any file contents from <cold-repo>; they only list the files in it or remove them during prune. In <hot-repo>, only the metadata is saved.

Was the change discussed in an issue or in the forum before?

See #3202 and the references therein.
IMO this closes #2504, closes #2611, closes #2817

Checklist

  • I have read the Contribution Guidelines
  • I have enabled maintainer edits for this PR
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

@aawsome aawsome mentioned this pull request Jan 23, 2021
@aawsome aawsome force-pushed the add-repo-hot branch 2 times, most recently from d1c90a2 to 1429f44 Compare January 24, 2021 13:47
@aawsome aawsome force-pushed the add-repo-hot branch 2 times, most recently from 47cae72 to d464687 Compare January 25, 2021 15:37
@aawsome aawsome marked this pull request as ready for review January 25, 2021 15:37
@lenzls

lenzls commented Nov 1, 2021

Just wanted to mention that the PR in question is now merged. No stress though.

@sashokbg

Hello, and thank you for this great project! When can the PR mentioned here be merged? Can I help with it somehow?

@sashokbg

@aawsome I have tested your code with OVH Cloud Archive (cold storage) and both backup and restore worked well.
The only drawback was that the first restore failed, since cold storage files are not available immediately. It might be useful to add some kind of "unfreeze" feature. I think S3 behaves the same way: you first need to start a retrieval job and wait for it to finish.

@aawsome
Contributor Author

aawsome commented May 15, 2022

@sashokbg I made #3303 which would allow restore --dry-run --warm-up.
This should work with OVH cold storage and other cold backends that can be "warmed up" simply by trying to access a file. I am also using this PR in combination with OVH, but I have to admit that I have never actually had to restore a snapshot, as this is only my second backup in addition to one with a local backend...
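
To make the "warm up by trying to access a file" idea concrete, here is a small Python sketch. It is not the implementation from #3303; `warm_up` and `try_access` are hypothetical names, and `try_access` stands in for whatever backend call reports whether a file is readable yet.

```python
import time
from typing import Callable, Iterable, List


def warm_up(
    packs: Iterable[str],
    try_access: Callable[[str], bool],
    retries: int = 3,
    wait_seconds: float = 0.0,
) -> List[str]:
    """Touch every pack until it is readable; return the packs still pending."""
    pending = list(packs)
    for attempt in range(retries):
        # Accessing a pack is what triggers the retrieval on the backend;
        # keep only the packs that are not readable yet.
        pending = [pack for pack in pending if not try_access(pack)]
        if not pending:
            break
        if attempt < retries - 1:
            time.sleep(wait_seconds)  # give the backend time to unfreeze
    return pending
```

A restore would then only start once the returned list is empty; for real cold storage the waiting time would be minutes or hours rather than the default of zero used here.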

@sashokbg

@aawsome all of this looks awesome, good job! Isn't the current PR mergeable already?

@aawsome
Contributor Author

aawsome commented Jul 1, 2022

> @aawsome all of this looks awesome good job ! Isn't the current PR merge-able already ?

No, it is not. There have been changes in master in the meantime that introduced conflicts, which need to be resolved.
However, I won't work on this PR until there is a good chance that it will get merged in the near future.

So chances are that this feature will be implemented in rustic first.

@aawsome
Contributor Author

aawsome commented Jul 14, 2022

> So chances are that this feature will be implemented in rustic first.

This functionality is now available on rustic main, see rustic-rs/rustic#44, and will be part of the next rustic release.

@Kulgar

Kulgar commented Aug 1, 2022

Awesome job! I'd like to see this PR merged too, as I would also like to use restic with OVH Cloud Archive.

Is there any way we could help?

@labkode
Contributor

labkode commented Aug 30, 2022

Same here (CERN): this functionality is very important to us, as it would let us consolidate our use of restic for both hot data (disk-based Ceph S3) and cold data (internal tape library). What is missing to get this PR merged, and how can we contribute?
We run backups for a massive storage pool (tens of PB) and have previously contributed optimisations to the backup process in #2970.

@MichaelEischer
Member

What is mainly missing is probably the time to review it and to think about the design and what is necessary to properly support cold storage in restic. My backlog of PRs to review, bugs to fix and things to improve is unfortunately way too long :-/

@labkode
Contributor

labkode commented Sep 1, 2022

@MichaelEischer could you get in touch with me by mail at hugo.gonzalez.labrador@cern.ch? We may have a way to speed up the review process.

@aawsome
Contributor Author

aawsome commented Sep 1, 2022

@labkode Did you try out rustic, where I have already implemented this functionality?

@aawsome aawsome closed this Feb 24, 2024
@aawsome aawsome deleted the add-repo-hot branch February 24, 2024 22:06

Development

Successfully merging this pull request may close these issues.

  • Support Scaleway C14 Glacier Class?
  • S3 Glacier
  • restic with "cold" storage (here: OVH Cloud Archive)

6 participants