Smaller memory footprint for restic mount#2585
greatroar wants to merge 3 commits into restic:master
internal/repository.Repository.LookupBlobSize now allows concurrent access. The fuse mountpoint may be slightly slower, but the memory use of "restic mount" is down by >25% for a repo of 54M files.
Hello, could you please test the performance with large files (50 GB and up)? I have had problems listing large files before, which is why the file size cache was implemented.
I have a snapshot that contains a single 58GB file. When I ls that file, I get instant results. Is that enough of a test, or do you have a suggestion for further benchmarking?
See also the discussion in #1680. About your PR:
About the test setting: testing a single large file is not enough as the
...when writing my last comment I realized that I should have checked whether reading tree blobs is really the slow operation when listing files in FUSE. In fact, it was not... Thanks @greatroar for pointing me to the right part to look at.
I had seen #2523, but decided to do a more localized optimization for the short term. I'd missed #1680. Re: performance on lots of small files, it's hard to get consistent measurements because the kernel/FUSE seems to be caching the responses from restic. The first ls takes seconds for a dir of 10k small files, the second 0.06s, both with master and with this PR. That also suggests that an LRU cache, which I considered as a potential further optimization, would be useless.
@greatroar Could you run your test with 10k small files against #2587?
@aawsome Here's the test. Setup code: Then mount, and do That gives, for master: For this PR: For #2587: No big changes. Still, I'm closing in favor of #2587 because its changes are more localized.
What is the purpose of this change? What does it change?
restic mount used to cache the sizes of all blobs in the repo. By removing this cache, the memory footprint drops by 25%. To compensate for any loss of speed (which I didn't observe, but I only casually browsed through some old snapshots), the repo's index now allows concurrent access and its size lookup no longer needs to scan through all the packs.
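To illustrate the idea, here is a minimal sketch of a size lookup that is safe for concurrent use and does not scan all packs. It is not the code from this PR: `BlobID`, `Index`, `NewIndex` and `Store` are hypothetical names, the index is reduced to a plain map, and the real signature of `LookupBlobSize` may differ.

```go
package main

import "sync"

// BlobID is a hypothetical stand-in for a blob identifier (e.g. a SHA-256 hash).
type BlobID [32]byte

// Index is a hypothetical, simplified index mapping blob IDs to blob sizes.
// A read-write mutex lets many readers look up sizes at the same time.
type Index struct {
	mu    sync.RWMutex
	sizes map[BlobID]uint32
}

// NewIndex returns an empty index.
func NewIndex() *Index {
	return &Index{sizes: make(map[BlobID]uint32)}
}

// Store records the size of a blob. It takes the write lock.
func (idx *Index) Store(id BlobID, size uint32) {
	idx.mu.Lock()
	defer idx.mu.Unlock()
	idx.sizes[id] = size
}

// LookupBlobSize returns the size of a blob and whether it is known.
// It only takes the read lock and does a single map lookup, so it
// neither blocks other readers nor iterates over packs.
func (idx *Index) LookupBlobSize(id BlobID) (uint32, bool) {
	idx.mu.RLock()
	defer idx.mu.RUnlock()
	size, ok := idx.sizes[id]
	return size, ok
}
```

With a lookup like this, a FUSE handler can ask the index for a blob's size on demand instead of keeping a separate per-mount cache of all sizes, which is where the memory saving would come from.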
Was the change discussed in an issue or in the forum before?
Not that I know of.
Checklist
There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
I have run gofmt on the code in all commits