
distributed.nanny.environ.MALLOC_TRIM_THRESHOLD_ is ineffective #5971

@crusaderky

Ubuntu 21.10 x86/64
distributed 2022.3.0

Setting the MALLOC_TRIM_THRESHOLD_ env variable seems to be effective at making memory deallocation more reactive.
However, the config variable that is supposed to set it (distributed.nanny.environ.MALLOC_TRIM_THRESHOLD_) doesn't seem to do anything, which suggests that the variable is being set after the worker process has started, whereas it should be set before spawning it.
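To illustrate why the ordering matters, here is a minimal sketch (not the nanny's actual code) showing that a child process only sees an env variable if it is present in the environment passed at spawn time. The helper name `child_sees` is made up for this example. My understanding is that glibc reads its malloc tuning variables once, when the allocator initialises in the new process, so a value written into os.environ after startup would come too late.

```python
import os
import subprocess
import sys


def child_sees(name, env):
    """Spawn a fresh Python interpreter with `env` and return the value it sees for `name`.

    Hypothetical helper, for illustration only.
    """
    code = f"import os; print(os.environ.get({name!r}, '<unset>'))"
    out = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


# Start from a copy of the current environment without the variable
parent_env = dict(os.environ)
parent_env.pop("MALLOC_TRIM_THRESHOLD_", None)
before = child_sees("MALLOC_TRIM_THRESHOLD_", parent_env)

# Add the variable *before* spawning: the child inherits it from the start
parent_env["MALLOC_TRIM_THRESHOLD_"] = "65536"
after = child_sees("MALLOC_TRIM_THRESHOLD_", parent_env)

print(before, after)
```

The second child has the variable in its environment from its very first instruction, which is what the worker process needs.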

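# Case 1: rely on the dask config alone to set MALLOC_TRIM_THRESHOLD_
import dask.array
import distributed

client = distributed.Client(n_workers=1, memory_limit="2 GiB")

N = 7_000         # number of chunks
S = 160 * 1024    # chunk size in bytes (160 KiB)

# N chunks of S bytes each (float64, so S // 8 elements per chunk), roughly 1.07 GiB in total
a = dask.array.random.random(N * S // 8, chunks=S // 8)
a = a.persist()
distributed.wait(a)
# Release the data; worker memory should be returned to the OS shortly afterwards
del a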

Result:
Managed: 0
Unmanaged: 1.16 GiB

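# Case 2: same test, but export the variable in the client process before the worker is spawned
import os
import dask.array
import dask.config
import distributed

# Copy the configured value into the environment so that the worker process inherits it
os.environ["MALLOC_TRIM_THRESHOLD_"] = str(
    dask.config.get("distributed.nanny.environ.MALLOC_TRIM_THRESHOLD_")
)
client = distributed.Client(n_workers=1, memory_limit="2 GiB")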

N = 7_000
S = 160 * 1024

a = dask.array.random.random(N * S // 8, chunks=S // 8)
a = a.persist()
distributed.wait(a)
del a

Result:
Managed: 0
Unmanaged: 151 MiB

Production Workaround

Set the env variable in the shell before starting dask-worker:

export MALLOC_TRIM_THRESHOLD_=65536
dask-worker <address>
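If the worker is launched from Python rather than from a shell, the same workaround can be sketched as below. This is only an illustration under assumptions: `worker_env` and `start_worker` are hypothetical helper names, the scheduler address has to be supplied by the caller, and the dask-worker executable is assumed to be on PATH.

```python
import os
import subprocess


def worker_env(trim_threshold=65536):
    """Return a copy of the current environment with MALLOC_TRIM_THRESHOLD_ set.

    Hypothetical helper; the default mirrors the value used in the shell example.
    """
    env = dict(os.environ)
    env["MALLOC_TRIM_THRESHOLD_"] = str(trim_threshold)
    return env


def start_worker(address, trim_threshold=65536):
    """Launch dask-worker with the variable already present in its environment.

    Assumes the dask-worker CLI is installed and on PATH.
    """
    return subprocess.Popen(
        ["dask-worker", address],
        env=worker_env(trim_threshold),
    )
```

Because the variable is part of the environment handed to the new process, it is in place before the worker's allocator initialises, which is the same effect as the export above.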

Labels

bug (Something is broken), memory
