Make restic mount faster and consume less memory #2587
MichaelEischer merged 1 commit into restic:master from
|
It might be an option to implement the The main difference is the point in time at which the user is notified when some blobs are missing. The current implementation in restic probably fails with an error before providing any details on the file node; with the change, this is delayed until the first bytes are read. I think the new semantics are actually what a user would expect: reading a damaged file fails, but usually you are still able to see the metadata. |
|
@MichaelEischer Thanks for searching the manual and offering the NodeOpener option which I'm using now. Memory footprint is however clearly reduced. |
|
@aawsome Happy to help, do you have any specific tests you'd like me to perform? I have several hundred-gig repos I can test with. |
|
@ProactiveServices Thank you very much! I'm especially interested in response times and CPU usage and comparison between "plain" restic and a version including this PR. I used |
|
With the last commit, dir operations should also be faster (and consume a bit less memory)... |
|
rebased to master and added the changes suggested by @greatroar |
|
Rebased this PR after #2790 has been merged. |
|
Noticed that the new blob cache by @greatroar already implements locking; hence I removed the locks from |
|
rebased after #2787 has been merged. |
…emory
- Add Open() functionality to dir
- only access index for blobs when file is read
- Implement NodeOpener and put one-time file stuff there
- Add comment about locking as suggested by bazil.org/fuse
=> Thanks at Michael Eischer for suggesting the last two improvements
938d0ad to f831694
MichaelEischer left a comment
LGTM. I've force-pushed the branch to retrigger the CI. Seems like the flaky rclone test error has changed into a flaky rclone test crash, after #2855.
What is the purpose of this change? What does it change?
In the FUSE implementation, restic data structures were read (meaning blobs were loaded from the repository) and internal data structures were created whenever a FUSE "node" was created. This was the case, for instance, for every item within a dir when that dir was accessed. This led to considerable memory usage and made the FUSE filesystem "unresponsive".
This PR changes this behavior and only reads restic data structures when the information is really needed. Moreover, the internal data structures have been optimized. Concurrent operations within FUSE should now also work correctly. (The FUSE documentation says that access should be synchronized, which it wasn't.)
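The lazy-loading idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `lazyDir`, `entries` and `newCountingDir` are hypothetical names, and the `load` callback stands in for reading a tree blob from the repository. The mutex shows one way to serialize concurrent access to the cached listing.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyDir is a hypothetical stand-in for a FUSE directory node.
// It is not restic's actual type.
type lazyDir struct {
	mu     sync.Mutex
	loaded bool
	items  []string
	load   func() ([]string, error) // stands in for reading a tree blob
}

// entries loads the listing on first use and caches it.
// The mutex serializes concurrent callers.
func (d *lazyDir) entries() ([]string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.loaded {
		return d.items, nil
	}
	items, err := d.load()
	if err != nil {
		return nil, err // not cached, so a later call can retry
	}
	d.items, d.loaded = items, true
	return d.items, nil
}

// newCountingDir returns a dir whose loader counts how often it runs.
func newCountingDir(calls *int) *lazyDir {
	return &lazyDir{load: func() ([]string, error) {
		*calls++
		return []string{"a.txt", "b.txt"}, nil
	}}
}

func main() {
	calls := 0
	d := newCountingDir(&calls) // creating the node loads nothing
	fmt.Println("loads after creation:", calls)
	for i := 0; i < 3; i++ {
		if _, err := d.entries(); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("loads after three listings:", calls)
}
```

In this sketch, creating the node costs nothing beyond the struct itself, and the expensive load runs at most once per successfully listed directory.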
Was the change discussed in an issue or in the forum before?
closes #1680
Thanks @greatroar for issuing #2585, which made me think about whether the bottleneck really is only reading tree blobs.
Checklist
changelog/unreleased/ that describes the changes for our users (template here)
gofmt on the code in all commits