The need for memory backpressure has been discussed at length, and a full solution is arguably necessary to cover many of the use cases in the pangeo community. While efforts in that direction do seem to be underway, I would like to suggest a simpler stop-gap solution for the shorter term. The basic issue is that workers can overeagerly open new data at a rate faster than other tasks free up memory; it is best understood via the figure under "root task overproduction" here.
All my tasks that consume memory are root tasks, so is there any way to deprioritize root tasks? This should be a lot simpler than solving the general problem, because it requires neither (a) keeping records of how much memory each task uses, nor (b) workers changing their priorities over the course of a computation based on new information. All I need is for workers to be discouraged from (or prevented from) opening new data until the task chains that have already begun are as complete as possible. This will obviously waste time as workers wait, but I would happily accept a considerable slowdown if it means that I definitely won't run out of memory.
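To show the sort of thing I mean, here is a rough sketch using Dask's existing `priority` annotation. The `open_chunk`/`reduce_chunk` functions are hypothetical stand-ins for the real data-opening and memory-relieving tasks, and I don't know whether a static priority alone is enough to stop workers eagerly running ready root tasks, which is exactly the question:

```python
import dask
from dask import delayed

@delayed
def open_chunk(i):
    # Stand-in for opening a large chunk of data.
    return [i] * 1000

@delayed
def reduce_chunk(chunk):
    # Stand-in for a memory-relieving reduction: small result, frees the big input.
    return sum(chunk)

# Deprioritize the root/open tasks relative to everything downstream.
with dask.annotate(priority=-10):
    chunks = [open_chunk(i) for i in range(4)]

totals = [reduce_chunk(c) for c in chunks]
print(dask.compute(*totals))
```

The open question is whether low priority actually holds root tasks back when they are the only runnable tasks on a worker, or whether the worker runs them anyway.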
This problem applies to any computation where the data is largest when opened and is then steadily reduced (which covers most pangeo workflows, with the important exception of the groupby stuff, IIUC), but where the opening task and the memory-relieving tasks can't be directly fused into one.
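The effect of execution order on peak memory can be illustrated with a toy model (plain Python, not Dask code): each chain is an "open" task that allocates one unit of memory followed by a "reduce" task that frees it, and a single worker always runs the lowest-priority-number ready task first:

```python
import heapq

def peak_memory(n_chains, reduce_prio):
    """Simulate one worker running tasks in priority order (lower runs sooner).

    Opens always have priority 1. reduce_prio=0 models deprioritized roots
    (reduces preferred over new opens); reduce_prio=2 models eager behaviour
    (all opens run before any reduce).
    """
    ready = [(1, ("open", i)) for i in range(n_chains)]
    heapq.heapify(ready)
    mem = peak = 0
    while ready:
        _, (kind, i) = heapq.heappop(ready)
        if kind == "open":
            mem += 1                      # opening allocates one unit
            peak = max(peak, mem)
            heapq.heappush(ready, (reduce_prio, ("reduce", i)))
        else:
            mem -= 1                      # reducing frees it
    return peak

print(peak_memory(8, reduce_prio=2))  # eager opens: peak = 8
print(peak_memory(8, reduce_prio=0))  # deprioritized roots: peak = 1
```

With eager opens the peak grows with the number of chains, while with deprioritized roots it stays bounded, which is the whole point of the stop-gap.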
Is there some way of implementing this? I'm happy to provide an example use case if that would help, or to try hacking away at the distributed code, but I wanted to check whether this is even possible first.
cc @gjoseph92, and @rabernat, @dcherian, @TomAugspurger, because this is related to the discussion we had in the pangeo community meeting the other week.