To accelerate operations, attic keeps some information in RAM:
- repository index (if a remote repo is used, this is allocated on the remote side)
- chunks cache (telling which chunks we already have in the repo, so the same data is not stored twice)
- files cache (telling which filenames, mtimes, etc. we already have in the repo, so attic can skip unmodified files)
This section (and also the paragraph above it) gives some numbers about memory usage that are not completely clear:
https://github.com/attic/merge/blob/merge/docs/internals.rst#indexes-memory-usage
So, if I understand correctly, this would be an estimate for the RAM usage (for a local repo):
chunk_count ~= total_file_size / 65536
repo_index_usage = chunk_count * 40
chunks_cache_usage = chunk_count * 44
files_cache_usage = total_file_count * 240 + chunk_count * 80
mem_usage ~= repo_index_usage + chunks_cache_usage + files_cache_usage
= total_file_count * 240 + total_file_size / 400
All units are bytes.
This assumes every chunk is referenced exactly once and a typical chunk size of 64KiB.
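Based on the per-entry sizes above, the estimate can be sketched as a small Python function (the constants are the figures quoted above; the function name and structure are my own, not attic code):

```python
# Rough RAM estimate for attic with a local repo, using the per-entry
# sizes from the docs. Assumes every chunk is referenced exactly once
# and a typical chunk size of 64KiB.

CHUNK_SIZE = 65536           # assumed typical chunk size (bytes)
REPO_INDEX_PER_CHUNK = 40    # bytes per chunk in the repository index
CHUNKS_CACHE_PER_CHUNK = 44  # bytes per chunk in the chunks cache
FILES_CACHE_PER_FILE = 240   # bytes per file in the files cache
FILES_CACHE_PER_CHUNK = 80   # bytes per chunk reference in the files cache

def estimate_mem_usage(total_file_size, total_file_count):
    """Return the estimated RAM usage in bytes."""
    chunk_count = total_file_size // CHUNK_SIZE
    repo_index = chunk_count * REPO_INDEX_PER_CHUNK
    chunks_cache = chunk_count * CHUNKS_CACHE_PER_CHUNK
    files_cache = (total_file_count * FILES_CACHE_PER_FILE
                   + chunk_count * FILES_CACHE_PER_CHUNK)
    return repo_index + chunks_cache + files_cache

# 1Mi files with a total size of 1TiB:
print(estimate_mem_usage(2**40, 2**20) / 2**30)  # ~2.8 (GiB)
```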
E.g. backing up a total count of 1Mi files with a total size of 1TiB:
mem_usage ~= 1 * 2**20 * 240 + 1 * 2**40 / 400 bytes ~= 2.8GiB
So, this will need about 3GiB RAM just for attic. If you run attic on a NAS device (or another device with limited RAM), this might already be beyond the RAM you have available, leading to paging (assuming you have enough swap space) and slowdown. If you don't have enough RAM+swap, attic will fail with "malloc failed" or get killed by the OOM killer.
For bigger servers, the problem will just appear a bit later:
- 10TiB of data in 10Mi files will eat about 28GiB of RAM
- 1TiB of data in 100Mi files will eat about 26GiB of RAM
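These scenarios can be checked directly with the simplified formula total_file_count * 240 + total_file_size / 400 (a quick sanity check in plain Python, not attic code):

```python
# Evaluate the simplified memory estimate for the three scenarios.
GiB = 2**30

def estimate(total_file_size, total_file_count):
    # All quantities in bytes.
    return total_file_count * 240 + total_file_size / 400

scenarios = [
    (2**40, 2**20),            # 1TiB in 1Mi files
    (10 * 2**40, 10 * 2**20),  # 10TiB in 10Mi files
    (2**40, 100 * 2**20),      # 1TiB in 100Mi files
]
for size, count in scenarios:
    print(f"{estimate(size, count) / GiB:.1f} GiB")
```

This prints roughly 2.8, 27.9, and 26.0 GiB for the three cases, which shows that beyond a few tens of millions of files, the per-file cost of the files cache dominates even when the total data size is small.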