Creating a LocalCluster on a multi-core machine can take a while #2450

@mrocklin

I'm playing with a machine that has 80 logical cores. In this case creating a LocalCluster or raw Client can take a while. I believe that this is because it tries creating 80 workers at once with forkserver.

In [1]: from dask.distributed import Client

In [2]: %time client = Client()
distributed.nanny - WARNING - Worker process still alive after 47 seconds, killing
distributed.nanny - WARNING - Worker process 15200 was killed by unknown signal
<things crash>

There are a few potential approaches to solve this:

  1. We can try to reduce the cost of creating a new process running a worker. My guess is that we're spending a long time importing libraries, but I'm not sure. Someone could investigate this.
  2. We can use the multiprocessing fork approach rather than forkserver (though dask breaks for other reasons here).
  3. We can change the mixture of threads and processes as we get to high numbers of cores. For example, choosing fewer processes with more threads each starts much faster:

In [1]: from dask.distributed import Client

In [2]: %time client = Client(n_workers=8, threads_per_worker=10)
CPU times: user 984 ms, sys: 252 ms, total: 1.24 s
Wall time: 6.43 s
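As a rough way to investigate approach 1, one could time a cold import in a fresh interpreter and multiply by the worker count. This is just a sketch; `import_cost` is a hypothetical helper, not part of distributed:

```python
import subprocess
import sys
import time

def import_cost(module: str) -> float:
    """Time a cold import of `module` in a brand-new interpreter (seconds).

    Running in a subprocess avoids the warm sys.modules cache of the
    current process, so this approximates what each forkserver-spawned
    worker pays on startup.
    """
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start
```

If `import_cost("distributed")` comes out at a second or more, then 80 workers importing concurrently could easily saturate the machine and explain the nanny timeouts above.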

Probably we should do some combination of 1 and 3. Any thoughts or suggestions?
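For approach 3, the selection could look something like the sketch below. The `suggest_workers` name and the cap of 8 processes are assumptions for illustration, not an existing API; the cap reproduces the `n_workers=8, threads_per_worker=10` split that worked well above:

```python
import os

def suggest_workers(n_cores=None, max_procs=8):
    """Hypothetical heuristic: cap the number of worker processes and
    put the remaining parallelism into threads per worker."""
    n_cores = n_cores or os.cpu_count()
    n_workers = min(max_procs, n_cores)
    threads_per_worker = max(1, n_cores // n_workers)
    return n_workers, threads_per_worker

# On the 80-core machine from this issue this yields (8, 10),
# i.e. Client(n_workers=8, threads_per_worker=10).
```

A real heuristic would probably also want to consider whether the workload is GIL-bound, since pure-Python tasks benefit from more processes rather than more threads.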
