Adjust worker goroutines to number of backend connections #3611
MichaelEischer merged 2 commits into restic:master
Conversation
This looks sensible. The main downside is that crash reports become much longer with more goroutines. However, shouldn't more channels be buffered to ensure smooth throughput?
I guess that's a price worth paying if it helps with the throughput.
The idea was to compensate for that by adding as many goroutines as there are CPU cores in methods where substantial processing is necessary for further IO.
Use runtime.GOMAXPROCS(0) as the worker count for CPU-bound tasks, repo.Connections() for IO-bound tasks, and the sum of both if a task can be either. Streaming packs is treated as IO-bound, as adding more workers cannot provide a speedup. Typical IO-bound tasks are downloading, uploading, and deleting files. Decoding, encoding, and verifying are usually CPU-bound. Several tasks combine both, e.g. functions that download and decode; in that case, add both limits together. As the backends have their own concurrency limits, restic still won't download more than repo.Connections() files in parallel, but the additional workers can decode already downloaded data in parallel.
This PR completes the upload pipeline reworking started by #3489 and several other PRs. It (hopefully) addresses some of the scalability issues that have shown up regarding backends with high latency.
What does this PR change? What problem does it solve?
Adapt the worker count based on whether an operation is CPU- or IO-bound.
Use runtime.GOMAXPROCS(0) as the worker count for CPU-bound tasks, repo.Connections() for IO-bound tasks, and the sum of both if a task can be either.
Typical IO-bound tasks are downloading, uploading, and deleting files. Decoding, encoding, and verifying are usually CPU-bound. Several tasks combine both, e.g. functions that download and decode; in that case both limits are added together. As the backends have their own concurrency limits, restic still won't download more than repo.Connections() files in parallel, but the additional workers can decode already downloaded data in parallel.
This PR also includes the commits from #3489.
The last commit "Adapt concurrency to streaming check --read-data and prune" only makes sense together with #3484.
Alternatives:
The current construction still requires users to adjust one knob, namely the connections limit. restic could instead use e.g. the CPU count as a proxy for system performance and scale the number of backend connections from it; each operation could then use a different scaling factor. The downside is that CPU count and network capacity can be correlated (e.g. for servers in the cloud) but don't have to be, so that heuristic would only work well in some cases.
Finding the optimal number of goroutines automatically could also be possible by using a self-tuning control loop, but that would complicate things quite a bit.
Was the change previously discussed in an issue or on the forum?
Fixes #2162
Fixes #1467
Checklist
[ ] I have added tests for all code changes. (I don't see how I could test this.)
[ ] I have added documentation for relevant changes (in the manual). (The connections option should already be documented by Asynchronously upload pack files #3489.)
[ ] I have added an entry to changelog/unreleased/ that describes the changes for our users (see template).
[ ] I have run gofmt on the code in all commits.