I have a fairly frequent backup task, scheduled to run every 5 minutes, that uses a remote attic repository. The task consists of an attic list operation to get the time of the last backup, a create operation to take a new backup, and a prune operation to delete older archives.
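To make the three steps concrete, here is a minimal sketch of how such a task could assemble its attic invocations. The repository location, the backed-up path, and the retention values are hypothetical placeholders, not my actual settings; only the archive naming pattern matches the stats shown further down. The sketch only builds and prints the command lines, it does not run them.

```python
import shlex
from datetime import datetime

# Hypothetical repository location; substitute the real remote repo.
REPO = "user@backuphost:frequent.attic"


def archive_name(now):
    """Archive name in the same pattern as 'frequent-backup_2014-12-29-16:01'."""
    return now.strftime("frequent-backup_%Y-%m-%d-%H:%M")


def build_commands(repo, now, paths, keep_hourly=24, keep_daily=7):
    """Return the list, create and prune command lines as argument lists.

    The retention values are placeholders chosen for illustration only.
    """
    return [
        # 1. list existing archives (used to find the time of the last backup)
        ["attic", "list", repo],
        # 2. create the new archive and print statistics
        ["attic", "create", "--stats", f"{repo}::{archive_name(now)}", *paths],
        # 3. prune archives that fall outside the retention policy
        ["attic", "prune", f"--keep-hourly={keep_hourly}",
         f"--keep-daily={keep_daily}", repo],
    ]


if __name__ == "__main__":
    for cmd in build_commands(REPO, datetime.now(), ["/data"]):
        print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run` as-is; keeping the commands as argument lists avoids shell quoting problems with the `::` archive separator.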
The problem is that the traffic this task generates is increasing linearly with time. I am not saying linearly with the number of archives, because the prune command has already started to prune older backups, so each execution creates one new archive and deletes one old archive. What surprises me is that the traffic generated by the create command is negligible compared to the traffic generated by the list and prune commands. The create command takes only 2-3 seconds, while the list and prune commands take much longer (~2-3 minutes combined) and generate a lot of traffic.
The direction of that traffic is: remote repo -> target host
It looks like attic list and prune commands need to fetch a lot of data from the remote repository in order to do what they do, which does not make sense to me. The text for the listing of all archives (measured by piping the output of attic list through wc -c) is only ~125KB.
Here are some stats from a 'create' command:
Archive name: frequent-backup_2014-12-29-16:01
Archive fingerprint: f6b0794b37aa66dfe0f11abbf178057c5ed198afc5f1aabf117ec58a8f4f45d5
Start time: Mon Dec 29 16:01:33 2014
End time: Mon Dec 29 16:01:35 2014
Duration: 2.01 seconds
Number of files: 2862
                Original size    Compressed size    Deduplicated size
This archive:         2.98 GB          520.24 MB            316.50 kB
All archives:         5.57 TB          968.40 GB              2.76 GB
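For scale, here is a rough back-of-the-envelope calculation from the numbers above. It assumes decimal units (1 kB = 1000 bytes), that every archive has about the same original size as this one, and that every run adds about the same amount of deduplicated data; none of these assumptions is verified, so the results are estimates only.

```python
# Figures taken from the stats above; units assumed to be decimal.
KB, MB, GB, TB = 1e3, 1e6, 1e9, 1e12

this_original = 2.98 * GB      # "This archive", original size
all_original = 5.57 * TB       # "All archives", original size
this_dedup = 316.50 * KB       # "This archive", deduplicated size
list_output_bytes = 125 * KB   # approximate size of the 'attic list' text

# One run every 5 minutes.
runs_per_day = 24 * 60 // 5

# Estimated archive count, assuming all archives are about the same size.
estimated_archives = all_original / this_original

# New deduplicated data written per day, assuming this run is typical.
daily_new_mb = runs_per_day * this_dedup / MB

# Average listing text per archive under the same assumption.
bytes_per_listed_archive = list_output_bytes / estimated_archives

if __name__ == "__main__":
    print(f"runs per day:               {runs_per_day}")
    print(f"estimated archives:         {estimated_archives:.0f}")
    print(f"new dedup data per day:     {daily_new_mb:.1f} MB")
    print(f"listing bytes per archive:  {bytes_per_listed_archive:.0f}")
```

Under these assumptions the new data written per run is tiny, which is consistent with create being fast, and the listing itself is only a few dozen bytes per archive, so neither explains minutes of download traffic.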
Any thoughts?
If there is any extra information that would be helpful, let me know.