Investigate worker overhead #2156

To reduce worker latency for [Actors](https://github.com/dask/distributed/pull/2133) (we found that operations that should take 1ms were taking 5ms), we added a thread that [statistically profiles the event loop](https://github.com/dask/distributed/pull/2144).  This showed overhead from a few surprising sources:

1.  `psutil` and the `SystemMonitor`
2.  Tornado's `write_to_fd`, which apparently isn't entirely non-blocking; see [this Stack Overflow question](https://stackoverflow.com/questions/51686171/why-does-tornado-spend-time-in-socket-senddata)
3.  Tornado's `add_callback` overhead; see [this Stack Overflow question](https://stackoverflow.com/questions/51582394/which-functions-are-free-when-profiling-tornado-asyncio/51595426?noredirect=1#comment90212561_51595426)
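For readers unfamiliar with the technique, here is a minimal sketch of statistical profiling from a watcher thread. It is not the code from the linked PR; the names `sample_thread`, `profile`, and `busy` are made up for illustration, and `busy` is only a stand-in for a running event loop. It relies on `sys._current_frames()`, which is a CPython implementation detail.

```python
import collections
import sys
import threading
import time


def sample_thread(thread_id, counts, stop, interval=0.001):
    """Periodically record which function the target thread is executing."""
    while not stop.is_set():
        # Maps thread id -> that thread's current top stack frame
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            code = frame.f_code
            counts[(code.co_filename, code.co_name)] += 1
        time.sleep(interval)


def profile(func, interval=0.001):
    """Run func in the calling thread while a watcher thread samples it."""
    counts = collections.Counter()
    stop = threading.Event()
    watcher = threading.Thread(
        target=sample_thread,
        args=(threading.get_ident(), counts, stop, interval),
        daemon=True,
    )
    watcher.start()
    try:
        func()
    finally:
        stop.set()
        watcher.join()
    return counts


def busy(duration=0.2):
    """Hypothetical stand-in for an event loop: spin for `duration` seconds."""
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        pass
```

Calling `profile(busy).most_common(3)` lists the functions seen most often at the top of the stack. Sample counts only approximate time spent, and the watcher itself competes for the GIL, so the sampling interval is a trade-off between resolution and added overhead.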

I'm not sure how best to address these.  There are probably a few approaches:

1.  Check that we're using psutil appropriately, and whether there is a better way to regularly poll system use at high-ish frequency (currently we poll every 500ms)
2.  Quantify the cost of `add_callback`, and see whether there are places where we can reduce our use of Tornado
3.  Investigate other concurrency frameworks, like asyncio + uvloop.  This sounds neat, but is likely expensive for many reasons.  I did try using uvloop + asyncio + tornado, but it wasn't very effective.  The overhead appears to be higher in this stack, so uvloop doesn't seem to do much good.
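As a rough starting point for quantifying callback cost, the sketch below times how long a bare asyncio loop takes to schedule and run no-op callbacks. This is an assumption-laden baseline, not a measurement of Tornado: it uses `loop.call_soon` as a stand-in for `add_callback`, and the helper names `time_callbacks` and `measure` are hypothetical.

```python
import asyncio
import time


async def time_callbacks(n=10_000):
    """Schedule n no-op callbacks and return mean seconds per callback."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    remaining = n

    def callback():
        # Count down; resolve the future once every callback has run
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            done.set_result(None)

    start = time.perf_counter()
    for _ in range(n):
        loop.call_soon(callback)
    await done  # yield to the loop so the queued callbacks can run
    return (time.perf_counter() - start) / n


def measure(n=10_000):
    """Run the benchmark on a fresh event loop."""
    return asyncio.run(time_callbacks(n))
```

Running the same kind of loop through Tornado's `IOLoop.add_callback`, and again with uvloop installed, would show how much each layer adds on top of this baseline.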
