Globally limit concurrent access to files in the repo #1763

@cd1188

Output of restic version

restic 0.8.3 (v0.8.3-221-g26757ae) compiled with go1.10.1 on windows/amd64

How did you run restic exactly?

restic -r i:\backup check --read-data

What backend/server/service did you use to store the repository?

external HDD-drive with usb3

Expected behavior

Reading/checking the data at the full speed of the HDD.

Actual behavior

The backup takes 2 h.
Checking with the --read-data option took 15 h!

Steps to reproduce the behavior

restic -r i:\backup check --read-data

Do you have any idea what may have caused this?

While restic is running, the Windows Task Manager shows an HDD workload of 100% while reading at only 7 MB/s.

I found the variable defaultParallelism = 40 in the file internal/checker.go.
So that means 40 threads reading from the disk, right?

Accessing 40 files at the same time is too much workload for an HDD.

Do you have an idea how to solve the issue?

If I set defaultParallelism=1 and recompile, restic needs much more time for the startup phases:
'load indexes'
'check all packs'
'check snapshots, trees and blobs'

After that, however, reading/checking the data is much faster.
The Windows Task Manager shows an HDD workload of 50% and a read speed of 45 MB/s.

With defaultParallelism=2, the Windows Task Manager shows an HDD workload of 80-90% and
a read speed of 35-50 MB/s.

So to protect the HDD, I would prefer defaultParallelism=1 in that case.
My solution for a faster start is to leave defaultParallelism=40 (or maybe 4?)
and add a separate defaultParallelism2=1 that is used only by the ReadPacks function in that file.

Please excuse this description; I am not a Git user and my English is not good.
I changed this code for myself and it works very well, with a speed-up of about a factor of 6.
Maybe you can change this in the master branch?

EDIT:
After testing on an SSD: the CPU is now the bottleneck, and more workers would be much faster there.
Maybe a command-line switch such as --read-worker 4 or --ssd would be better.
-> In that case, are more workers than CPU threads useless for hashing/checking?
For a regular HDD, though, I would stay with 1, because the health of the HDD is more important to me.

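// run workers
const defaultParallelism2 = 1                    // NEW: separate limit for reading packs
for i := 0; i < defaultParallelism2; i++ {     // CHANGED: was defaultParallelism
	g.Go(func() error {
		for {
			var id restic.ID
			var ok bool

			// wait for the next pack ID, or stop when cancelled or when ch is closed
			select {
			case <-ctx.Done():
				return nil
			case id, ok = <-ch:
				if !ok {
					return nil
				}
			}

			// read and verify one pack file, then update the progress counter
			err := checkPack(ctx, c.repo, id)
			p.Report(restic.Stat{Blobs: 1})
			if err == nil {
				continue
			}

			// forward the error unless the context has been cancelled
			select {
			case <-ctx.Done():
				return nil
			case errChan <- err:
			}
		}
	})
}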

Did restic help you or make you happy in any way?

restic is superb! :-)
