Description
I'm working on a change to the GUI that will show, for each worker, how much of the worker's RSS is taken up by dask keys (as reported by the sum of the sizeof() outputs) vs. everything else.
I also intend to split the "everything else" into "old" and "new" memory, defined as follows:
- "dask keys": WorkerState.nbytes, in other words the sum of the output of sizeof() for each dask key stored (not counting those spilled to disk)
- "other old": minimum of (RSS - dask keys) over the measurements taken in the last 30 seconds (configurable)
- "other new": latest measure of RSS - dask keys - "other old"
The idea is that this should let us smooth out temporary peaks caused by delays in gc, delays in the python memory manager releasing the RAM to the OS, and memory temporarily allocated by the task functions.
I intend to use this more stable measure of (dask keys + other old) in all future non-time-critical heuristics that need to measure the total RAM usage of a worker, namely the larger rebalance() rewrite I'm busy with.
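To make the three definitions above concrete, here is a minimal sketch of the split. It is only an illustration: `MemoryTracker`, `MemorySplit`, and `update()` are hypothetical names, not existing distributed API, and it assumes the caller passes in the RSS and the `WorkerState.nbytes` value at each sample. Clamping (RSS - dask keys) at zero is also an assumption, to cope with sizeof() overestimating; when that clamp kicks in, the three parts no longer add up to the RSS.

```python
from collections import deque
from typing import NamedTuple


class MemorySplit(NamedTuple):
    dask_keys: int  # sum of sizeof() over the keys held in memory
    other_old: int  # min of (RSS - dask keys) over the recent window
    other_new: int  # latest (RSS - dask keys) minus other_old


class MemoryTracker:
    """Hypothetical helper that tracks one worker; not part of distributed."""

    def __init__(self, window: float = 30.0):
        self.window = window  # seconds; the "configurable" 30 s
        self._samples = deque()  # (timestamp, RSS - dask keys)

    def update(self, timestamp: float, rss: int, dask_keys: int) -> MemorySplit:
        # Assumption: clamp at 0 in case sizeof() overestimates
        other = max(0, rss - dask_keys)
        self._samples.append((timestamp, other))
        # Drop samples older than the window; the sample just appended
        # is never dropped, so the deque cannot become empty here
        while self._samples[0][0] < timestamp - self.window:
            self._samples.popleft()
        other_old = min(value for _, value in self._samples)
        return MemorySplit(dask_keys, other_old, other - other_old)


tracker = MemoryTracker(window=30.0)
tracker.update(0.0, rss=1000, dask_keys=400)
split = tracker.update(10.0, rss=1500, dask_keys=400)
print(split)
```

In this example the temporary 500-byte spike at t=10 lands in "other new"; it only migrates to "other old" if it is still there once every lower sample has aged out of the window.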
In the GUI, the top-left graph would change as follows:
At the moment, the bars are fully blue and represent the whole RSS.
After the change, for each worker you'd have 3 stacked bars (blue: keys, dark grey: other old, light grey: other new) which add up to the RSS.
The hover tooltip for the individual workers will also change to match the top textbox.
The bar currently changes color to yellow and then red to warn of high memory usage; I'd replace that with a yellow/red box around the bar or something similar.
In the "workers" tab, I plan to add 3 new columns next to "memory" to break it down.
XREF #4614 for the implementation of a drop-down to opt in/out of the extra columns to avoid making the table too crowded.
CC @jacobtomlinson, @jsignell, @jrbourbeau for opinions.
Note that all this excludes the dask keys currently spilled to disk. I'm considering adding a fourth bar on top for them.
