
Add min and max worker flags to adaptive cluster#1618

Closed
jacobtomlinson wants to merge 1 commit into dask:master from informatics-lab:adaptive-min-max

Conversation

@jacobtomlinson
Member

Description

Add min and max worker flags to adaptive cluster.

We've been using adaptive clusters on Kubernetes (on EC2) and have found there is an initial spin-up delay before tasks start computing, because the first couple of scale-up calls take time to complete.

This PR adds a minimum number of workers to keep alive in the adaptive cluster, so that tasks start processing immediately.

We've also added a maximum number of workers while we were at it.

Example usage

dask-scheduler --port 8786 --bokeh-port 8787 --min-workers 10 --max-workers 100 --preload /opt/scheduler/adaptive.py
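
The /opt/scheduler/adaptive.py preload script referenced above isn't part of this diff; a minimal sketch of what such a script might look like, assuming the standard dask_setup preload hook and a user-supplied Kubernetes-backed cluster object (KubeCluster here stands in for illustrative application code, not a distributed API):

# adaptive.py -- illustrative preload script
from distributed.deploy import Adaptive

from mycompany.kube import KubeCluster  # hypothetical user code implementing
                                        # scale_up()/scale_down() against Kubernetes


def dask_setup(scheduler):
    # dask-scheduler calls this hook with the Scheduler instance at start-up;
    # min_workers/max_workers would be the attributes this PR adds to the
    # scheduler from the new CLI flags.
    cluster = KubeCluster(scheduler)
    scheduler.adaptive = Adaptive(scheduler, cluster)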

@mrocklin
Member

mrocklin commented Dec 8, 2017

Thanks for this, I'm glad to see people experimenting with these policies on Kubernetes. Two main comments regarding the code here:

  1. This could use tests
  2. It would be good to avoid modifying the scheduler directly. We've intentionally kept the Adaptive code separate from the Scheduler code and it would be useful to keep it that way in the short term. I expect other Adaptive strategies to arise and we can't have all of them adding attributes to the scheduler. Do you think it would be possible to put the min/max_workers values on the Adaptive class and specify them in your preload script?

For Kubernetes in particular I've been thinking that maybe the right thing to do here isn't to scale-up/scale-down but rather to set a target number of workers given what we know about the workload. For example, we can look at Scheduler.total_occupancy and get a pretty good estimate of how many CPU-seconds we need. This is larger in scope than this PR, but might be worth considering.
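
A rough sketch of that occupancy-based idea, taking total_occupancy as the summed expected runtime (in CPU-seconds) of outstanding work; the five-second target duration and the clamping bounds are illustrative choices, not anything defined by this PR:

import math

def target_workers(scheduler, target_duration=5.0, minimum=0, maximum=1000):
    # total_occupancy approximates the CPU-seconds of work currently outstanding;
    # dividing by how long we are willing to wait gives a worker count.
    cpu_seconds = scheduler.total_occupancy
    target = int(math.ceil(cpu_seconds / target_duration))
    # Clamp to the configured bounds.
    return min(max(target, minimum), maximum)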

@niallrobinson

niallrobinson commented Dec 8, 2017

Hi @mrocklin (I co-wrote the PR with @jacobtomlinson ),

  1. That's fair - we thought you might say that 😄
  2. We agree, but without a fairly significant refactor we couldn't see how to do this, as a lot of the logic about adding/removing nodes is contained in the Scheduler. (I guess this is what you're driving at with "larger scope than this PR".)

(btw I'm presenting some of this stuff at AGU on Monday, if you're about - talk called "Jade: using on-demand cloud analysis to give scientists back their flow")

@rbubley
Contributor

rbubley commented Dec 8, 2017

I've also been looking at extending Adaptive, and there are various strategies for which scheduler.workers_to_close() is too inflexible.

Perhaps this function itself should be moved into the Adaptive class, as that seems to be the only place it's used? This would mean that future improvements in Adaptive strategies would be less likely to need to touch the scheduler.

@jacobtomlinson
Member Author

jacobtomlinson commented Dec 8, 2017

I agree with @rbubley.

We did start with min_workers and max_workers in the adaptive class but found ourselves basically reimplementing workers_to_close, as it wasn't flexible enough for what we were trying to do. Hence we ended up editing the scheduler itself.

One thing I was thinking while working on this is that scaling up and down in the adaptive class is inconsistent. Scaling up tells you how many workers you should have in total, while scaling down effectively says "I've terminated these workers, please garbage collect them". While this makes sense from a practical point of view, perhaps the adaptive class could handle retiring the workers as well as removing them. And for scaling up, perhaps the request should be "please add three more", as that feels more consistent with "please remove these three".

@mrocklin
Member

mrocklin commented Dec 8, 2017

I've raised a separate issue for adaptive kubernetes clusters here, just in case this conversation extends beyond the lifetime of this PR. #1619

@mrocklin
Member

mrocklin commented Dec 8, 2017

We did start with min_workers and max_workers in the adaptive class but found ourselves basically reimplementing workers_to_close as it wasn't flexible enough for what we were trying to do.

Perhaps we can improve workers_to_close? What was restrictive?

@jacobtomlinson
Member Author

jacobtomlinson commented Dec 8, 2017 via email

@mrocklin
Member

mrocklin commented Dec 8, 2017

I would be fine with a minimum keyword in the retire_workers method if that would solve things.

@rbubley
Contributor

rbubley commented Dec 10, 2017

A slightly more general way would be to have a list of do_not_retire workers.

If you only care about a numeric minimum you could just select some arbitrarily, but in some cases there may be distinguished workers that you want to keep alive: perhaps you get a discount if the same worker stays alive for a long time, or perhaps these workers are using harvested time from computers doing something else, etc.

This could be achieved with a function parameter, but could also be achieved by using the existing worker resources functionality - which in some ways might make more sense: they could declare themselves with a resource of “always-on”, or something.
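
For illustration, the resources route might look like the following: the worker declares an "always-on" resource at start-up (dask-worker already accepts --resources), and the scaling logic drops such workers from the retirement candidates. The worker_resources attribute name below is an assumption about where the scheduler records per-worker resources, and the helper itself is hypothetical:

dask-worker scheduler-address:8786 --resources "always-on=1"

def keep_always_on(scheduler, candidates):
    # Remove any worker that declared an "always-on" resource from the list of
    # retirement candidates; everything else stays eligible.
    pinned = {w for w, resources in scheduler.worker_resources.items()
              if "always-on" in resources}
    return [w for w in candidates if w not in pinned]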

@mrocklin
Member

Have you had concrete experience with systems that need this functionality or is this more hypothetical?

@rbubley
Contributor

rbubley commented Dec 10, 2017

I wouldn't say 'need': it's a want. Both of the examples I gave are live issues on our roadmap.

@asford
Contributor

asford commented Dec 14, 2017

Would you mind looking at #1632 and checking if the modifications made in that PR could be used in the resolution of this issue?

At a high level, it would move the question of adaptive scaling behavior into Adaptive subclasses, as previously suggested, by providing an override hook in Adaptive, while maintaining compatibility with the current scheduler API.

As @rbubley suggested, it seems a touch strange that this logic lives in the scheduler, but I can see a compelling argument for providing basic scale-down behavior in the scheduler interface, as is currently provided by workers_to_close and retire_workers. Adding the ability to hook into and override this behavior via Adaptive seems preferable to extending the core scheduler class with arbitrarily complex scaling logic.

@jacobtomlinson
Member Author

@asford I see #1632 as a complementary PR to this one. I need to make some changes to this one anyway before it can be merged; however, I get the feeling progress will be slow this week with AGU.

@asford
Contributor

asford commented Dec 14, 2017

It seems that one of the primary reasons this change touches scheduler.py is to provide support for command-line configuration of your autoscaling logic. I've run into the same issue, but I don't think it's a great idea to overload the scheduler as a configuration container and/or rely on modification of the dask-scheduler CLI.

Would a change allowing pass-through arguments to modules loaded via --preload be a workable solution for you? I'd envision this functioning in a manner similar to a compiler, in which subtool-specific flags may be specified on the command line. For example, clang -Xlinker --foo=bar passes --foo=bar as arguments to ld. In this case your interface might look like:

dask-scheduler --preload scheduler_autoscaling.py -Xpreload --min-workers=2 -Xpreload --max-workers=10

Where scheduler_autoscaling.py is an application-specific plugin you've written that handles your scaling logic.
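
Purely as a sketch of the proposed (not yet existing) passthrough: the preload script could parse whatever flags were forwarded to it with argparse. preload_argv is just a stand-in name for however those arguments would be handed over.

import argparse

def parse_scaling_args(preload_argv):
    # preload_argv would be the forwarded flags, e.g.
    # ["--min-workers=2", "--max-workers=10"]
    parser = argparse.ArgumentParser()
    parser.add_argument("--min-workers", type=int, default=0)
    parser.add_argument("--max-workers", type=int, default=None)
    return parser.parse_args(preload_argv)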

@mrocklin
Member

Passing through keyword arguments to the preload script makes sense to me. In general I'm usually in favor of improving the functioning of escape hatches like preload scripts.

@asford
Contributor

asford commented Dec 14, 2017

@jacobtomlinson Just to clarify, I think this PR and #1632 are compatible and I would be stoked to see all of this logic land somewhere in distributed together.

I've opened #1634 as a strawman implementation of the "command line passthrough" concept. Would you mind taking a look at that pull and seeing if it could be used as a base for this functionality?

@jacobtomlinson
Member Author

@asford Yes I think that would be a good option. Let me know when it's in and I'll update this PR to make use of it.

@jhamman
Member

jhamman commented Feb 13, 2018

I'm looking for an update on this issue. @jacobtomlinson, do you expect to revisit things here (#1632 was merged)?

@jacobtomlinson
Member Author

@jhamman yes, keen to revisit if @mrocklin is still happy with the proposal.

@mrocklin
Member

In general I agree that adding min/max workers to Adaptive is a good idea

mrocklin mentioned this pull request Mar 2, 2018
@mrocklin
Member

mrocklin commented Mar 2, 2018

I've implemented min/max in Adaptive here: #1797

Note that this doesn't pass anything through the scheduler. This just focuses on the Adaptive piece in isolation.
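
If #1797 lands as described, usage would presumably look something like the following; the minimum/maximum keyword names come from that PR, while the LocalCluster pairing and the Adaptive call signature are only for illustration:

from distributed import LocalCluster
from distributed.deploy import Adaptive

cluster = LocalCluster(n_workers=0)
# minimum/maximum bound how far Adaptive will scale the cluster in either direction.
adaptive = Adaptive(cluster.scheduler, cluster, minimum=10, maximum=100)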

@jhamman
Member

jhamman commented Mar 3, 2018

I suppose this can be closed now?

@mrocklin
Member

mrocklin commented Mar 3, 2018

Thanks for starting this @jacobtomlinson !

mrocklin closed this Mar 3, 2018
jacobtomlinson deleted the adaptive-min-max branch March 5, 2018 09:19