Fix CoW performance bug in NIOThreadPool work queue #2669
Conversation
/// Since `WorkItems` are embedded in `State` enum, it's hard to reliably avoid CoW otherwise.
///
/// Also the closures are wrapped in a struct, to avoid the cost of allocation as discussed on https://bugs.swift.org/browse/SR-15872
final class WorkItems {
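The allocation-avoidance trick mentioned in the doc comment can be sketched roughly as follows (a hypothetical minimal version for illustration, not the actual NIOThreadPool code):

```swift
// Minimal sketch of the SR-15872 workaround: storing a bare closure in an
// enum payload or generic context can force an extra heap allocation, while
// wrapping the closure in a struct first avoids that cost.
// `WorkItem` and `State` here are illustrative names only.
struct WorkItem {
    let task: () -> Void
}

enum State {
    case running([WorkItem])   // closures travel inside the struct wrapper
    case stopped
}
```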
Instead of boxing the buffer here with a class, we should be able to avoid the CoW by introducing a new `modifying` state into the `State` enum: before mutating the queue we transition to that state, and then back to the `running` state afterwards. We have used that approach in most other places recently, and it avoids this class entirely.
The other upside of this pattern is that once the language gets mutating/inout switches in the future, we can easily transition over to that and remove the `modifying` case again.
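The `.modifying` pattern described above can be sketched like this (a hypothetical minimal state machine; the real NIOThreadPool has more states and a different queue type):

```swift
// Hypothetical sketch of the `.modifying` pattern. Moving the buffer out of
// the enum payload before mutating it leaves the local binding as the only
// reference, so `append` doesn't trigger a copy-on-write.
struct WorkItem { let task: () -> Void }

enum State {
    case running([WorkItem])   // pending work queue lives in the payload
    case modifying             // transient: queue has been moved out
    case shutDown
}

struct PoolStateMachine {
    private(set) var state: State = .running([])

    mutating func enqueue(_ item: WorkItem) {
        switch state {
        case .running(var queue):
            state = .modifying     // drop the enum's reference to `queue`
            queue.append(item)     // uniquely referenced: no CoW here
            state = .running(queue)
        case .modifying:
            preconditionFailure("concurrent mutation")
        case .shutDown:
            break                  // dropped; real code would surface an error
        }
    }

    var pendingCount: Int {
        if case .running(let queue) = state { return queue.count }
        return 0
    }
}
```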
good catch! shoot, missed the class, yes, let's do this properly
The `.modifying` solution is 60% slower than the class. @FranzBusch @weissi do you still want me to do that?
The 60% is for the benchmarks, going from ~140ms per run to ~230ms.
Where does the slowdown come from, and does sprinkling some `@inlinable` + `@usableFromInline` make it go faster?
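For context, the annotations mentioned work roughly like this (a hypothetical sketch, not the NIO code):

```swift
// Hypothetical sketch of @inlinable/@usableFromInline. Without these
// annotations, cross-module calls can't be inlined or specialised; with
// them the optimiser can inline `push` into the caller and see the storage
// directly, which can eliminate retain/release traffic and CoW checks.
public struct IntQueue {
    @usableFromInline
    internal var storage: [Int] = []

    public init() {}

    @inlinable
    public mutating func push(_ value: Int) {
        storage.append(value)   // body is visible to clients for inlining
    }

    @inlinable
    public var count: Int { storage.count }
}
```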
Also consider moving from CircularBuffer to Deque from swift-collections.
Ah, this confused me also.
Turns out I had a different testing environment. When I was developing the original solution and performance testing it, I was sitting on top of #2645, which also has an effect, as ELFs are used to notify completions.
In the "this is 60% slower" result, I was sitting on top of the now-rebased branch from this PR.
My bad. I'll update the results in the PR description to give the numbers for this branch alone (as they should have been). And then we'll see the 60% perf bump once #2645 goes in.
@dnadoba I checked `Deque` and it didn't show any noticeable performance difference. Is it supposed to? And/or are we attempting to move away from `CircularBuffer`? Pushing the `Deque` version, so that we have all the different variants on the PR :).
We are moving away from CircularBuffer. We are using Deque in all new code e.g. in our new custom AsyncSequences.
Both are heavily optimised but Deque is the way forward. This isn't urgent and we will keep CircularBuffer around but it isn't likely to be updated. All new performance optimisations, if any are possible, should be done on top of Deque as it is independent of NIO and therefore more generally useful.
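For reference, switching a work queue to `Deque` is mostly mechanical; a minimal sketch, assuming a dependency on the swift-collections package (the `WorkItem` type is illustrative only):

```swift
import DequeModule  // from apple/swift-collections; an external dependency

// Deque supports amortised O(1) insertion and removal at both ends, which
// fits a FIFO work queue.
struct WorkItem { let task: () -> Void }

var workQueue: Deque<WorkItem> = []
workQueue.append(WorkItem(task: { print("first") }))   // enqueue at the back
workQueue.append(WorkItem(task: { print("second") }))

// Dequeue from the front; `popFirst()` returns nil once the queue is empty.
while let item = workQueue.popFirst() {
    item.task()
}
```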
@dnadoba got it, thanks. Will keep this in mind for the future. This code could be trivially switched to Deque so it's already in.
…dPool changes reduced the allocations required.
Fix CoW performance bug in `NIOThreadPool` work queue.

Motivation:

`NIOThreadPool` uses an enum to track its state. If the pool is running, it'll be in `State.running(CircularBuffer<WorkItem>)`. However, this is hard to mutate in place reliably. The current code certainly doesn't achieve that, which leads to CoW on each enqueue and dequeue.

Modifications:

- Added `RunIfActiveBenchmark`. It's run in 1 & 8 thread variants, with 100k tasks.
- Wrapped `CircularBuffer<WorkItem>` in a class, thus providing a layer of indirection during mutations, which avoids CoW.
- Added `WorkItem`, which is a closure wrapped in a struct to avoid the cost of allocations discussed in: https://bugs.swift.org/browse/SR-15872

Result:

Performance is orders of magnitude greater. The "after" (measured for the `Deque` version, on my own laptop [given the current perf-testing CI infra breakage]) is:

The before takes way too long to run (hours).