Globally limit concurrent access to files in the repo #1763
Description
Output of restic version
restic 0.8.3 (v0.8.3-221-g26757ae) compiled with go1.10.1 on windows/amd64
How did you run restic exactly?
restic -r i:\backup check --read-data
What backend/server/service did you use to store the repository?
External HDD connected via USB 3
Expected behavior
Reading/checking the data at the full speed of the HDD.
Actual behavior
The backup needs 2 hours.
Checking with the --read-data option took 15 hours!
Steps to reproduce the behavior
restic -r i:\backup check --read-data
Do you have any idea what may have caused this?
While restic runs, the Windows Task Manager shows an HDD workload of 100% and a read speed of only 7 MB/s.
I found the variable defaultParallelism = 40 in the file internal/checker.go.
So that means 40 threads reading from the disk, right?
Accessing 40 files at the same time is too much workload for an HDD.
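To illustrate the idea in the issue title, here is a minimal sketch of a global limit on concurrent reads, built from a buffered channel used as a semaphore. It is not restic code: the `limiter` type, `newLimiter`, `do`, and `maxConcurrent` are hypothetical names, and the one-millisecond sleep only stands in for reading a pack file.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// limiter caps how many goroutines may access the disk at the same time.
// Hypothetical helper for illustration, not part of restic.
type limiter struct {
	tokens chan struct{}
}

func newLimiter(n int) *limiter {
	return &limiter{tokens: make(chan struct{}, n)}
}

// do runs fn while holding one token and blocks until a token is free.
func (l *limiter) do(fn func()) {
	l.tokens <- struct{}{}
	defer func() { <-l.tokens }()
	fn()
}

// maxConcurrent starts `workers` goroutines that each perform one simulated
// read through a limiter of size `limit`, and returns the highest number of
// reads that were in flight at the same moment.
func maxConcurrent(workers, limit int) int64 {
	l := newLimiter(limit)
	var active, peak int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.do(func() {
				n := atomic.AddInt64(&active, 1)
				// Raise the recorded peak if this read pushed it higher.
				for {
					p := atomic.LoadInt64(&peak)
					if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
						break
					}
				}
				time.Sleep(time.Millisecond) // stand-in for reading a pack file
				atomic.AddInt64(&active, -1)
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak concurrent reads with limit 1:", maxConcurrent(40, 1))
	fmt.Println("peak concurrent reads with limit 4:", maxConcurrent(40, 4))
}
```

With a limit of 1, the 40 workers still exist, but only one of them touches the disk at a time, which is the access pattern an HDD handles best.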
Do you have an idea how to solve the issue?
If I set defaultParallelism=1 and compile it, restic needs much more time for the startup steps:
'load indexes'
'check all packs'
'check snapshots, trees and blobs'
But after that, reading/checking the data is much faster.
The Windows Task Manager shows an HDD workload of 50% and a read speed of 45 MB/s.
With defaultParallelism=2, the Windows Task Manager shows an HDD workload of 80-90% and a read speed of 35-50 MB/s.
So to protect the HDD, I would prefer to set defaultParallelism=1 in that case.
My solution for a faster start is to leave defaultParallelism=40 (or maybe 4?)
and add a defaultParallelism2=1 only for the ReadPacks function in that file.
Please excuse this description; I am not a Git user and my English is not very good.
I made this change for myself and it works very well, with a speed-up of factor 6.
Maybe you can change this in the master branch?
EDIT:
After testing it on an SSD: the CPU is now the bottleneck, and here more workers would be much faster.
Maybe a command-line switch would be better, like --read-worker 4 or --ssd
-> in this case, are more workers than CPU threads useless for hashing/checking?
But for a regular HDD I would stay with 1, because the health of the HDD is more important to me.
// run workers
const defaultParallelism2 = 1 // NEW
for i := 0; i < defaultParallelism2; i++ { // CHANGED
	g.Go(func() error {
		for {
			var id restic.ID
			var ok bool
			select {
			case <-ctx.Done():
				return nil
			case id, ok = <-ch:
				if !ok {
					return nil
				}
			}
			err := checkPack(ctx, c.repo, id)
			p.Report(restic.Stat{Blobs: 1})
			if err == nil {
				continue
			}
			select {
			case <-ctx.Done():
				return nil
			case errChan <- err:
			}
		}
	})
}
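The command-line switch suggested in the EDIT could look roughly like the sketch below. It is only an illustration under stated assumptions: `--read-worker` is the switch proposed in this issue and does not exist in restic, the standard library `flag` package is used here purely to keep the example self-contained, and `clampWorkers` and `parseReadWorker` are hypothetical helpers. Capping the value at the number of CPU threads follows the guess above that extra workers do not help with hashing; that guess is not verified here.

```go
package main

import (
	"flag"
	"fmt"
	"runtime"
)

// clampWorkers keeps the requested worker count between 1 and cpus.
// Hypothetical helper; the upper bound reflects the unverified assumption
// that more workers than CPU threads do not speed up hashing.
func clampWorkers(requested, cpus int) int {
	if requested < 1 {
		return 1
	}
	if requested > cpus {
		return cpus
	}
	return requested
}

// parseReadWorker reads the proposed --read-worker switch from args.
// The default of 1 is the HDD-friendly setting described in the issue.
func parseReadWorker(args []string) (int, error) {
	fs := flag.NewFlagSet("check", flag.ContinueOnError)
	n := fs.Int("read-worker", 1, "number of pack files to read in parallel")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return clampWorkers(*n, runtime.NumCPU()), nil
}

func main() {
	n, err := parseReadWorker([]string{"--read-worker", "4"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("read workers:", n)
}
```

The resulting value would then replace defaultParallelism2 in the worker loop, so HDD users keep the default of 1 and SSD users can raise it.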
Did restic help you or make you happy in any way?
restic is superb! :-)