Skip to content

Additional optimizations to stealing#4445

Merged
quasiben merged 14 commits intodask:masterfrom
jakirkham:opt_steal_2
Jan 21, 2021
Merged

Additional optimizations to stealing#4445
quasiben merged 14 commits intodask:masterfrom
jakirkham:opt_steal_2

Conversation

@jakirkham
Copy link
Copy Markdown
Member

@jakirkham jakirkham commented Jan 20, 2021

Annotates variables in reevaluate_occupancy to improve efficiency there. As well as delaying construction of values that may not be needed.

Further optimizes put_key_in_stealable by only retrieving ws and worker if cost_multiplier is not None, which is the only case where they are used. Avoiding grabbing them otherwise.

Finally simplify check for executing task in set_duration_estimate by just using .get(...) instead of checking for the key and then getting the value.

@jakirkham jakirkham changed the title Delay getting worker info until needed Additional optimizations to stealing Jan 20, 2021
There are some cases where we enter the `if`, but wouldn't set
`next_time`. So go back to giving this a default value.
Comment on lines 75 to +77
if finish == "processing":
self.put_key_in_stealable(ts)

if start == "processing":
elif start == "processing":
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC these can't both be true. So changed this to elif

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed

if start == finish:
return {}

out of curiosity, is elif faster?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one matches the first case initially, yes. Though admittedly it is not a huge difference

In [1]: %%timeit s = "a"
   ...: if s == "a":
   ...:     pass
   ...: elif s == "b":
   ...:     pass
   ...: 
   ...: 
20.9 ns ± 0.166 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [2]: %%timeit s = "a"
   ...: if s == "a":
   ...:     pass
   ...: if s == "b":
   ...:     pass
   ...: 
   ...: 
39.3 ns ± 0.172 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

@jakirkham jakirkham requested a review from mrocklin January 20, 2021 22:47
Comment on lines 75 to +77
if finish == "processing":
self.put_key_in_stealable(ts)

if start == "processing":
elif start == "processing":
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed

if start == finish:
return {}

out of curiosity, is elif faster?

estimate the task duration to be 2x current-runtime, otherwise we set it
to be the average duration.
"""
exec_time: double = ws._executing.get(ts, 0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true that if ts is not in ws._executing, this will return 0 ?

Copy link
Copy Markdown
Member Author

@jakirkham jakirkham Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main point is that we don't want to enter the if below (we want to enter the else instead). 0 was just a sensible default. Though we could also consider -1, which we have used elsewhere (as that is also an obviously bogus value for time).

duration is guaranteed to be positive (though maybe there is a vanishingly small chance that it is 0). In any event exec_time > 2 * duration would be False when exec_time is 0, which is what we are trying to accomplish here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jakirkham

@quasiben
Copy link
Copy Markdown
Member

quasiben commented Jan 21, 2021

Looks good! Thanks @jakirkham

@quasiben quasiben merged commit 2bc8138 into dask:master Jan 21, 2021
@jakirkham jakirkham deleted the opt_steal_2 branch January 21, 2021 19:26
@jakirkham
Copy link
Copy Markdown
Member Author

Thanks all! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants