Allow worker to prioritize tasks based on memory production/consumption #5251

Open
mrocklin wants to merge 3 commits into dask:main from mrocklin:worker-memory-priority


@mrocklin
Member

This adds a worker.py::TaskPrefix class that tracks consumption and
production of all computed tasks, grouped by task prefix.

Then we use these values when determining priorities: tasks that produce
five times more data than they consume are deprioritized, and tasks that
consume five times more data than they produce are prioritized.

See #5250 for background
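
For readers skimming the thread, here is a rough sketch of the idea described above. It is not the code in this PR: `PrefixStats`, `key_prefix`, `memory_adjustment`, and the `RATIO` constant are hypothetical names chosen for illustration, the prefix splitting is a simplification, and the convention that smaller values sort first belongs to this sketch only.

```python
from collections import defaultdict
from dataclasses import dataclass

RATIO = 5  # the "five times" threshold from the description


@dataclass
class PrefixStats:
    """Running byte totals for all tasks sharing one prefix (hypothetical)."""

    consumed: int = 0  # total bytes of input data read by these tasks
    produced: int = 0  # total bytes of output data created by these tasks

    def record(self, input_nbytes: int, output_nbytes: int) -> None:
        self.consumed += input_nbytes
        self.produced += output_nbytes


def key_prefix(key: str) -> str:
    """Simplified prefix: 'sum-d4' -> 'sum'. Assumption, not dask's own rule."""
    return key.rsplit("-", 1)[0]


def memory_adjustment(stats: PrefixStats) -> int:
    """Return -1 to run sooner, +1 to run later, 0 for no change.

    In this sketch, smaller values sort first.
    """
    if stats.consumed > RATIO * stats.produced:
        return -1  # frees memory on balance, so prioritize
    if stats.produced > RATIO * stats.consumed:
        return 1  # fills memory on balance, so deprioritize
    return 0


# Pretend the worker has already finished one task of each kind.
prefixes = defaultdict(PrefixStats)
prefixes[key_prefix("open_dataset-a1")].record(0, 100_000_000)
prefixes[key_prefix("sum-b2")].record(100_000_000, 8)

# Order the ready tasks by the adjustment; sorted() is stable, so ties
# keep their original relative order.
ready = ["open_dataset-c3", "sum-d4", "other-e5"]
ordered = sorted(ready, key=lambda k: memory_adjustment(prefixes[key_prefix(k)]))
```

With these made-up measurements, the memory-freeing `sum` task moves ahead of the unmeasured task, and the memory-producing `open_dataset` task moves to the back. A prefix with no recorded history gets no adjustment, which keeps the sketch neutral until it has seen data.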

@mrocklin
Member Author

OK, I've added pausing, although it's ugly and not very future-reader friendly. This will have to be cleaned up, but it is probably simple enough for people to take a look at if they're interested.

test_resources.py rightfully complained
@jrbourbeau
Member

Thanks for pushing this up @mrocklin. I'll plan to read through #5250 and review this tomorrow

@TomNicholas

TomNicholas commented May 11, 2022

I'm coming up against workers over-eagerly consuming memory, and wondering what the status of this effort is?

Really I'm just looking for a way to get workers to deprioritise certain memory-consuming root tasks (xr.open_dataset tasks). I can open another issue if that's worthwhile.

@gjoseph92
Collaborator

@TomNicholas see #5223 and #5555 for discussion of the underlying problem. We're focused on core stability issues (deadlocks) right now, so changes to the scheduling algorithm to address this are not getting attention at the moment.

Opening a separate issue to discuss workarounds for root task overproduction might be useful. There may be tricks you can play using worker resources, though I haven't had much success with them personally.

4 participants