Comparison of approaches to tackle index memory usage #2523
Description
NOTE TO CODE REVIEWERS / CORE CONTRIBUTORS
Please read this comment for a summary of changes and structured information about each of those changes. This might be a better starting point than reading through all of the comments in this issue.
Describe the issue
As stated in #1988, the index is one of the biggest contributors to restic's memory consumption. The main cause is the current implementation of the in-memory index storage in internal/repository/index.go.
I would therefore like to start a new issue to discuss strategies that could remedy the high memory consumption.
As a starting point I opened a branch index-alternatives which implements the following strategies:
- Standard: the index as currently used in restic
- Reload: Don't keep anything in memory. For every index operation, reload the file and use the current implementation on the temporarily created index.
- Low-memory index: Store full index data only for tree blobs. For data blobs, use only an IDSet to record which data blobs are present. This supports most index operations. For the missing ones, do as in Reload (at least for data blobs).
- Bolt: Use a DB on disk via bbolt
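To make the low-memory strategy concrete, here is a minimal sketch of the idea, not the code from the branch. The names `LowMemIndex`, `Location`, and the `reload` callback are hypothetical; the callback stands in for re-reading index files from the repository. Tree blobs keep their full entry, data blobs only record presence, and a data blob lookup falls back to the reload path.

```go
package main

import "fmt"

// ID is a blob identifier (32 bytes, the size of a SHA-256 hash).
type ID [32]byte

// Location is a hypothetical record of where a blob is stored in a pack file.
type Location struct {
	PackID ID
	Offset uint32
	Length uint32
}

// LowMemIndex keeps full entries for tree blobs only. For data blobs it
// stores nothing but the ID, so presence checks stay cheap in memory.
type LowMemIndex struct {
	trees map[ID]Location
	data  map[ID]struct{}
	// reload is an assumed callback that re-reads index files to find a
	// data blob's location; it models the "Reload" fallback.
	reload func(ID) (Location, bool)
}

func NewLowMemIndex(reload func(ID) (Location, bool)) *LowMemIndex {
	return &LowMemIndex{
		trees:  make(map[ID]Location),
		data:   make(map[ID]struct{}),
		reload: reload,
	}
}

func (idx *LowMemIndex) AddTree(id ID, loc Location) { idx.trees[id] = loc }

func (idx *LowMemIndex) AddData(id ID) { idx.data[id] = struct{}{} }

// Has answers presence queries from memory for both blob types.
func (idx *LowMemIndex) Has(id ID) bool {
	if _, ok := idx.trees[id]; ok {
		return true
	}
	_, ok := idx.data[id]
	return ok
}

// Lookup returns the full location. Tree blobs are served from memory;
// known data blobs trigger the (slow) reload path; unknown IDs return false
// without touching the repository.
func (idx *LowMemIndex) Lookup(id ID) (Location, bool) {
	if loc, ok := idx.trees[id]; ok {
		return loc, true
	}
	if _, ok := idx.data[id]; !ok {
		return Location{}, false
	}
	return idx.reload(id)
}

func main() {
	onDisk := map[ID]Location{} // stands in for the index files on disk
	reloads := 0
	idx := NewLowMemIndex(func(id ID) (Location, bool) {
		reloads++
		loc, ok := onDisk[id]
		return loc, ok
	})

	treeID, dataID := ID{1}, ID{2}
	idx.AddTree(treeID, Location{PackID: ID{9}, Offset: 0, Length: 100})
	onDisk[dataID] = Location{PackID: ID{9}, Offset: 100, Length: 4096}
	idx.AddData(dataID)

	loc, ok := idx.Lookup(dataID)
	fmt.Println(idx.Has(dataID), ok, loc.Length, reloads)
}
```

The trade-off in this sketch is that presence checks for data blobs cost only one ID per blob, while every location lookup for a data blob pays for a reload.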
The index strategy can be chosen with
restic -i <default|reload|low-mem|bolt>
The branch is considered WIP. I added a general interface FileIndex which new index strategies should implement. Also, the current implementation just loads index files (file by file) into the standard Index struct and then reloads them into a new data structure, allowing the GC to clean up. Newly created index files are not affected, so the effect should only be visible for repositories that already contain data.
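The file-by-file loading described above can be sketched roughly as follows. This is an illustration under assumptions, not the branch's code: the method set of `FileIndex` shown here (`Has`, `Len`) is hypothetical, and `fullIndex`, `idSetIndex`, `decode`, and `loadAll` are made-up names. Each index file is decoded into a temporary full structure, copied into the compact target, and then dropped so that the GC can reclaim it before the next file is read.

```go
package main

import "fmt"

// ID stands in for a blob ID.
type ID [32]byte

// FileIndex is a hypothetical, minimal version of the interface; the real
// method set in the branch may differ.
type FileIndex interface {
	Has(id ID) bool
	Len() int
}

// fullIndex mimics the standard in-memory index for a single index file.
type fullIndex struct {
	entries map[ID]uint32 // blob ID -> length (illustrative payload)
}

// idSetIndex is a compact target structure that keeps only the IDs.
type idSetIndex struct {
	ids map[ID]struct{}
}

func (s *idSetIndex) Has(id ID) bool { _, ok := s.ids[id]; return ok }
func (s *idSetIndex) Len() int       { return len(s.ids) }

// decode stands in for parsing one index file into the standard structure.
func decode(file []ID) *fullIndex {
	idx := &fullIndex{entries: make(map[ID]uint32, len(file))}
	for _, id := range file {
		idx.entries[id] = 0
	}
	return idx
}

// loadAll handles one index file at a time, so at most one temporary
// fullIndex is alive at any moment.
func loadAll(files [][]ID) FileIndex {
	out := &idSetIndex{ids: make(map[ID]struct{})}
	for _, f := range files {
		tmp := decode(f)
		for id := range tmp.entries {
			out.ids[id] = struct{}{}
		}
		// tmp is unreachable after this iteration and can be collected.
	}
	return out
}

func main() {
	files := [][]ID{{{1}, {2}}, {{2}, {3}}}
	idx := loadAll(files)
	fmt.Println(idx.Len(), idx.Has(ID{3}), idx.Has(ID{4}))
}
```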
Please note that most implementations are quick-and-dirty and lack proper error handling, clean-up, etc.
More index strategies can easily be added by writing a new implementation of FileIndex and adding it in repository.go (where loading takes place) and cmd/restic/global.go (where the -i flag is interpreted).
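One way to picture how a strategy name maps to an implementation is a small registry of constructors, sketched below. This is an assumption for illustration only and is not how the branch wires things up: `strategies`, `newIndex`, `namedIndex`, and the single-method `FileIndex` here are hypothetical. Only the four strategy names are taken from the flag values listed above.

```go
package main

import (
	"fmt"
	"sort"
)

// FileIndex is reduced to a single hypothetical method for this sketch.
type FileIndex interface {
	Name() string
}

// namedIndex is a placeholder implementation that only knows its name.
type namedIndex struct{ name string }

func (n namedIndex) Name() string { return n.name }

// strategies maps each flag value to a constructor. Adding a new strategy
// would mean adding one entry here.
var strategies = map[string]func() FileIndex{
	"default": func() FileIndex { return namedIndex{"default"} },
	"reload":  func() FileIndex { return namedIndex{"reload"} },
	"low-mem": func() FileIndex { return namedIndex{"low-mem"} },
	"bolt":    func() FileIndex { return namedIndex{"bolt"} },
}

// newIndex resolves a strategy name and reports the valid names on error.
func newIndex(name string) (FileIndex, error) {
	ctor, ok := strategies[name]
	if !ok {
		names := make([]string, 0, len(strategies))
		for n := range strategies {
			names = append(names, n)
		}
		sort.Strings(names)
		return nil, fmt.Errorf("unknown index strategy %q (available: %v)", name, names)
	}
	return ctor(), nil
}

func main() {
	idx, err := newIndex("low-mem")
	fmt.Println(idx.Name(), err)

	_, err = newIndex("btree")
	fmt.Println(err)
}
```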
I would appreciate feedback on the proposed index strategies.
In this issue I would also like to collect good test settings in which the different index strategies can be compared to each other.