
Adjust worker goroutines to number of backend connections#3611

Merged
MichaelEischer merged 2 commits into restic:master from MichaelEischer:auto-concurrency on Jul 3, 2022

Conversation

@MichaelEischer (Member) commented Dec 29, 2021

What does this PR change? What problem does it solve?

Adapt worker count based on whether an operation is CPU or IO-bound.

Use runtime.GOMAXPROCS(0) as the worker count for CPU-bound tasks, repo.Connections() for IO-bound tasks, and a combination if a task can be both.

Typical IO-bound tasks are downloading / uploading / deleting files. Decoding / encoding / verifying are usually CPU-bound. Several tasks are a combination of both, e.g. combined download-and-decode functions. In the latter case, both limits are added together. As the backends have their own concurrency limits, restic still won't download more than repo.Connections() files in parallel, but the additional workers can decode already downloaded data in parallel.

This PR also includes the commits from #3489.
The last commit "Adapt concurrency to streaming check --read-data and prune" only makes sense together with #3484.

Alternatives:

The current design still requires users to adjust one knob, namely the connections limit. restic could instead use e.g. the CPU count as a proxy for system performance and scale the number of backend connections from it, with a different scaling factor per operation. The downside is that CPU count and network connection can be correlated (e.g. for servers in the cloud) but don't have to be, so that heuristic would only work well in some cases.

Finding the optimal number of goroutines automatically might also be possible with a self-tuning control loop, but that would complicate things quite a bit.

Was the change previously discussed in an issue or on the forum?

Fixes #2162
Fixes #1467


@greatroar (Contributor) commented:

This looks sensible. The main downside is that crash reports become much longer with more goroutines. However, shouldn't more channels be buffered to ensure smooth throughput?

@MichaelEischer (Member, Author) commented:

The main downside is that crash reports become much longer with more goroutines.

I guess that's a price worth paying if it helps with the throughput.

However, shouldn't more channels be buffered to ensure smooth throughput?

The idea was to compensate for that by adding as many goroutines as there are CPU cores in methods where substantial processing is necessary before further IO.
In most places the IO tasks are also generated by submitting a preexisting list on a channel. My expectation is that the IO is "slow" enough that we shouldn't run into too much congestion on the channel.

Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks,
repo.Connections() for IO-bound task and a combination if a task can be
both. Streaming packs is treated as IO-bound, as adding more workers
cannot provide a speedup.

Typical IO-bound tasks are download / uploading / deleting files.
Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are
a combination of both, e.g. for combined download and decode functions.
In the latter case add both limits together. As the backends have their
own concurrency limits restic still won't download more than
repo.Connections() files in parallel, but the additional workers can
decode already downloaded data in parallel.
@MichaelEischer (Member, Author) commented:

This PR completes the upload pipeline reworking started by #3489 and several other PRs. It (hopefully) addresses some of the scalability issues that have shown up with high-latency backends.



Development

Successfully merging this pull request may close these issues.

handle large prune much more efficent
Make listPackWorkers a configuration option