Additional optimizations to stealing#4445
Conversation
There are some cases where we enter the `if`, but wouldn't set `next_time`. So go back to giving this a default value.
| if finish == "processing": | ||
| self.put_key_in_stealable(ts) | ||
|
|
||
| if start == "processing": | ||
| elif start == "processing": |
There was a problem hiding this comment.
IIUC these can't both be true. So changed this to elif
There was a problem hiding this comment.
Indeed
distributed/distributed/scheduler.py
Lines 5518 to 5519 in e98d57d
out of curiosity, is elif faster?
There was a problem hiding this comment.
If one matches the first case initially, yes. Though admittedly it is not a huge difference
In [1]: %%timeit s = "a"
...: if s == "a":
...: pass
...: elif s == "b":
...: pass
...:
...:
20.9 ns ± 0.166 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [2]: %%timeit s = "a"
...: if s == "a":
...: pass
...: if s == "b":
...: pass
...:
...:
39.3 ns ± 0.172 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)| if finish == "processing": | ||
| self.put_key_in_stealable(ts) | ||
|
|
||
| if start == "processing": | ||
| elif start == "processing": |
There was a problem hiding this comment.
Indeed
distributed/distributed/scheduler.py
Lines 5518 to 5519 in e98d57d
out of curiosity, is elif faster?
| estimate the task duration to be 2x current-runtime, otherwise we set it | ||
| to be the average duration. | ||
| """ | ||
| exec_time: double = ws._executing.get(ts, 0) |
There was a problem hiding this comment.
Is it true that if ts is not in ws._executing, this will return 0 ?
There was a problem hiding this comment.
The main point is that we don't want to enter the if below (we want to enter the else instead). 0 was just a sensible default. Though we could also consider -1, which we have used elsewhere (as that is also an obviously bogus value for time).
duration is guaranteed to be positive (though maybe there is a vanishingly small chance that it is 0). In any event exec_time > 2 * duration would be False when exec_time is 0, which is what we are trying to accomplish here.
|
Looks good! Thanks @jakirkham |
|
Thanks all! 😄 |
Annotates variables in
reevaluate_occupancyto improve efficiency there. As well as delaying construction of values that may not be needed.Further optimizes
put_key_in_stealableby only retrievingwsandworkerifcost_multiplieris notNone, which is the only case where they are used. Avoiding grabbing them otherwise.Finally simplify check for executing task in
set_duration_estimateby just using.get(...)instead of checking for the key and then getting the value.