dask-worker --memory-limit not working? #1015

@matyasselmeci

Description

Limiting the memory used by a dask-worker using the --memory-limit option seems to have no effect.

Using Python 2.7.5 on a Scientific Linux 7.3 host, with dask 0.14.1 and distributed 1.16.1, I set up a scheduler and a single worker as follows:

dask-scheduler --scheduler-file /tmp/schedfile &
sleep 5
dask-worker --no-nanny --no-bokeh --nprocs 1 --nthreads 1 --memory-limit=250e6 \
    --scheduler-file /tmp/schedfile &

I ran the following client code to test:

import dask, dask.distributed, dask.array
cli = dask.distributed.Client(scheduler_file='/tmp/schedfile')
x = dask.array.random.random((2000, 2000), chunks=(10, 10))
y = x.T * x
print(cli.gather(cli.compute(y.sum())))

and watched the memory usage via ps -e -o rss,command | grep python | grep -v grep.
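The same check can be scripted instead of eyeballing ps output; a minimal Linux-only sketch that reads a PID's resident set size from /proc (VmRSS is reported in kB, the same units as ps -o rss):

```python
import os

def rss_kb(pid):
    # VmRSS in /proc/<pid>/status is the resident set size in kB.
    with open('/proc/%d/status' % pid) as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1])
    raise RuntimeError('VmRSS not found for pid %d' % pid)

print(rss_kb(os.getpid()))  # RSS of the current process, in kB
```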

I expected the RSS of the dask-worker process to stay below roughly 250,000 KB (250 MB, matching the --memory-limit), or only slightly above, but ps reported around 450,000 KB (450 MB) for most of the calculation. With larger array sizes, the RSS rises even further.
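For scale, a quick back-of-the-envelope check shows the array's payload is small, so the extra RSS presumably comes from graph and scheduling overhead for the very small chunks rather than from the data itself:

```python
# 2000 x 2000 float64 array: raw data size in bytes
nbytes = 2000 * 2000 * 8
print(nbytes)  # 32000000 -> only ~32 MB of actual data

# number of (10, 10) chunks, i.e. tasks per array in the graph
nchunks = (2000 // 10) * (2000 // 10)
print(nchunks)  # 40000
```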

Is this a bug? If not, what is the best way to guarantee that a dask-worker does not go over a hard limit, memory-wise?
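For comparison, a true hard cap independent of dask can be set on POSIX systems with the stdlib resource module: allocations beyond the cap then fail with MemoryError instead of growing the RSS. This is a sketch, not dask's mechanism; the 250e6 figure just mirrors the flag above:

```python
import resource

def cap_address_space(max_bytes):
    """Cap this process's virtual address space (POSIX only).

    Once the soft limit is reached, further allocations raise
    MemoryError rather than letting the process grow past the cap.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))

# e.g. call cap_address_space(int(250e6)) early in the worker process
```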
