Add flag --cache-all #2516

Closed
aawsome wants to merge 1 commit into restic:master from aawsome:cache-all

@aawsome
Contributor

@aawsome aawsome commented Dec 15, 2019

What is the purpose of this change? What does it change?

Adds the flag --cache-all.
When set, all files (including key, config, and lock files) are cached.
To make this possible, when the flag is set the cache directory is named
after the repo string given by -r instead of the repo ID.

Note: To cache the config file correctly, PR #2505 is also required!

Was the change discussed in an issue or in the forum before?

See issue #2504.

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

@aawsome
Contributor Author

aawsome commented Apr 18, 2020

@MichaelEischer @greatroar Thank you for your comments.

Actually, I'm moving towards the following re-implementation of the Cache structure:

  • use a Backend to actually store the cached objects
  • make it possible to wrap multiple Caches into a CachedBackend, which is what restic finally uses.

This approach simplifies the caching code base and would allow many more options (and hence makes this PR obsolete), such as two layers of caches, where one layer (caching everything) could be on local disk or on remote storage.
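The layering idea described above could look roughly like the following minimal sketch. It is not the code from the linked branch: `Store`, `memStore`, and `CachedStore` are hypothetical names, and the interface is a heavily reduced stand-in for a real backend interface. Caches are consulted in order, and a hit in a later layer (or in the origin) fills the earlier layers.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical, heavily reduced stand-in for a backend interface.
type Store interface {
	Load(name string) ([]byte, error)
	Save(name string, data []byte) error
}

// ErrNotFound is returned when a file is not present in a store.
var ErrNotFound = errors.New("not found")

// memStore keeps files in a map; it stands in for a local or remote backend.
type memStore struct {
	files map[string][]byte
}

func newMemStore() *memStore { return &memStore{files: map[string][]byte{}} }

func (m *memStore) Load(name string) ([]byte, error) {
	data, ok := m.files[name]
	if !ok {
		return nil, ErrNotFound
	}
	return data, nil
}

func (m *memStore) Save(name string, data []byte) error {
	m.files[name] = data
	return nil
}

// CachedStore consults its cache layers in order before falling back to the
// origin. It implements Store itself, so it can be used wherever a Store is.
type CachedStore struct {
	caches []Store
	origin Store
}

// fill stores data in the first n cache layers. Errors are ignored because
// the caches are best effort in this sketch.
func (c *CachedStore) fill(name string, data []byte, n int) {
	for _, cache := range c.caches[:n] {
		_ = cache.Save(name, data)
	}
}

func (c *CachedStore) Load(name string) ([]byte, error) {
	for i, cache := range c.caches {
		// Any cache error is treated as a miss.
		if data, err := cache.Load(name); err == nil {
			c.fill(name, data, i) // populate the layers that missed
			return data, nil
		}
	}
	data, err := c.origin.Load(name)
	if err != nil {
		return nil, err
	}
	c.fill(name, data, len(c.caches))
	return data, nil
}

func (c *CachedStore) Save(name string, data []byte) error {
	if err := c.origin.Save(name, data); err != nil {
		return err
	}
	c.fill(name, data, len(c.caches))
	return nil
}

func main() {
	local, remote, origin := newMemStore(), newMemStore(), newMemStore()
	_ = origin.Save("index/abc", []byte("index data"))

	s := &CachedStore{caches: []Store{local, remote}, origin: origin}
	data, err := s.Load("index/abc")
	fmt.Println(string(data), err)

	_, cached := local.files["index/abc"]
	fmt.Println("cached locally:", cached)
}
```

Because `CachedStore` satisfies the same interface as the stores it wraps, a second cache layer is just another element in `caches`, whether it lives on local disk or on remote storage.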

It's still WIP; if you are interested, see https://github.com/aawsome/restic/tree/cache-as-repo

@MichaelEischer
Member

I haven't thought that much about which functionality a cache should provide. It's probably something like the normal backend functionality plus features for cache invalidation. Sooner or later restic will need a method to remove files from the cache that turned out to be damaged. Performance-wise, there is a difference between the cache implementation and the local backend: the former does not use fsync, while the latter does. And your cache backend will need some sort of size limiting, otherwise it will just end up as another copy of the repository, which wouldn't make that much sense.

@aawsome
Contributor Author

aawsome commented Apr 19, 2020

> Performance-wise, there is a difference between the cache implementation and the local backend: the former does not use fsync, while the latter does.

Thanks for that hint! I wasn't aware of this.

> And your cache backend will need some sort of size limiting, otherwise it will just end up as another copy of the repository, which wouldn't make that much sense.

There is already a very effective size limit: data blobs are not stored in the cache. In basically all real-life repos I've seen, data blobs made up more than 99% of the repo size (see e.g. #2543).
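The size argument can be sketched as a simple filter plus a back-of-the-envelope bound. Everything here is hypothetical: `FileKind`, `shouldCache`, and `maxCacheBytes` are illustrative names, and the assumption that snapshots, indexes, and tree packs are always cached is mine, not something stated in this thread. Only the special handling of key, config, and lock files under --cache-all and the exclusion of data blobs come from the discussion.

```go
package main

import "fmt"

// FileKind is a hypothetical classification of repository files.
type FileKind string

const (
	KindKey      FileKind = "key"
	KindConfig   FileKind = "config"
	KindLock     FileKind = "lock"
	KindSnapshot FileKind = "snapshot"
	KindIndex    FileKind = "index"
	KindTreePack FileKind = "tree pack"
	KindDataPack FileKind = "data pack"
)

// shouldCache decides whether a file of the given kind goes into the cache.
// Data packs are never cached; key, config, and lock files only with cacheAll.
func shouldCache(kind FileKind, cacheAll bool) bool {
	switch kind {
	case KindDataPack:
		return false
	case KindKey, KindConfig, KindLock:
		return cacheAll
	default:
		// Assumption: all remaining kinds are cached in either mode.
		return true
	}
}

// maxCacheBytes gives an upper bound for the cache size when dataPercent
// percent of the repository consists of data blobs.
func maxCacheBytes(repoBytes, dataPercent int64) int64 {
	return repoBytes * (100 - dataPercent) / 100
}

func main() {
	for _, k := range []FileKind{KindKey, KindIndex, KindDataPack} {
		fmt.Printf("%s: cached=%v\n", k, shouldCache(k, true))
	}
	// A 1 TiB repo with 99% data blobs leaves at most about 1% for the cache.
	fmt.Println(maxCacheBytes(1<<40, 99))
}
```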

@aawsome aawsome mentioned this pull request May 29, 2020
@aawsome aawsome marked this pull request as draft June 2, 2020 04:54
Adds the flag --cache-all
When set, all files (including key, config, lock) are cached.
To do so, when the flag is set, the cache directory does not use the
repo ID but the hash of the repo string given by -r.
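A minimal sketch of how such a directory name could be derived is shown below. `cacheDirName` is a hypothetical helper, not the code from this PR, and SHA-256 is an assumed choice since the commit message does not name the hash. The likely motivation is that the repo ID is only known once the config file has been read, so a cache that also holds the config file needs a name that is available beforehand.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheDirName is a hypothetical helper. Without cacheAll it returns the
// repo ID; with cacheAll it returns a hex-encoded hash of the repo string
// given by -r (SHA-256 is an assumption made for this sketch).
func cacheDirName(repoID, repoString string, cacheAll bool) string {
	if !cacheAll {
		return repoID
	}
	sum := sha256.Sum256([]byte(repoString))
	return hex.EncodeToString(sum[:])
}

func main() {
	id := "0123abcd" // placeholder, not a real repo ID

	fmt.Println(cacheDirName(id, "sftp:host-a:/srv/restic", false))
	fmt.Println(cacheDirName(id, "sftp:host-a:/srv/restic", true))
	// A different repo string yields a different directory name, even if it
	// points at the same repository.
	fmt.Println(cacheDirName(id, "sftp:host-b:/srv/restic", true))
}
```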
@fd0
Member

fd0 commented Nov 10, 2020

Is it a good idea to add this feature? What's the use case?

I was always very reluctant to cache the config file and the keys for security reasons...

@aawsome
Contributor Author

aawsome commented Nov 10, 2020

> Is it a good idea to add this feature? What's the use case?

I'm using it in combination with my cold storage repos. I wanted to open a new issue soon to discuss how to support cold storage with restic. Then we can openly discuss the best way to implement this and agree on a roadmap.

I just updated this because I have been using it quite successfully for over a year now. It's still marked as a draft because I think we should discuss the approach first.

> I was always very reluctant to cache the config file and the keys for security reasons...

Mhh... we have a zero-trust model for the storage backend where those files are actually stored, and we assume that we can fully trust the environment where restic runs. So from a security point of view, I see no issues 😃

@rawtaz
Contributor

rawtaz commented Nov 10, 2020

Using the repo string as the cache directory name can't be great in the long run. If the hostname or some other part of the repo URL changes, the cache directory name has to change as well. There's a reason the current design uses the non-changing repo ID as the name :)

@aawsome
Contributor Author

aawsome commented Jan 23, 2021

Closing this PR, as #3235 is IMO more general and solves the same problem.

@aawsome aawsome closed this Jan 23, 2021
@aawsome aawsome deleted the cache-all branch February 24, 2024 22:06