You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This code piece seems to be unreachable since ts.processing_on to be None is only possible for tasks with a different state than processing but tasks with a different state should not be stealable. The dict stealable_unknown_durations should therefore never be populated.
Tracing back the git history I found the following commit
which refactored the stealing code to use the taskprefix containers. There, the logic for the dict population changed semantically from catching a key error to checking for the existence of the worker state object. I would argue that my initial argument about the unreachability still holds.
before this, we can only see one commit which factors the work steal code out into an extension. Unfortunately, this commit also seemed to rewrite a huge piece of the logic which is why there is no 1:1 mapping available. Browsing the original scheduler based work stealing code, we can find a variable stealable_unknown_durations but it wasn't used anywhere https://github.com/dask/distributed/blob/e866e11936cc5c2e5bb5af48a529c93236043a71/distributed/scheduler.py
Hrm, I'm surprised by this, but not incredibly so. Tests pass and this reduces complexity, so I'm happy to merge. If this ends up having negative effects can I depend on you to fix this @fjetter ?
The good thing about this change is that if it goes south we should see it as an AttributeError: 'NoneType' object has no attribute 'processing' exception since for this sections to have any effect the code first needs to pass the ws is None guard. It should not show up as some weird, hard to track down performance degradation and I'm fairly confident that we should be able to track such an issue down easily to this change, i.e. if that's the case you may ping me and I'll take care of it :)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This code piece seems to be unreachable since
ts.processing_onto beNoneis only possible for tasks with a different state thanprocessingbut tasks with a different state should not bestealable. The dictstealable_unknown_durationsshould therefore never be populated.Tracing back the git history I found the following commit
8404684
which refactored the stealing code to use the taskprefix containers. There, the logic for the dict population changed semantically from catching a key error to checking for the existence of the worker state object. I would argue that my initial argument about the unreachability still holds.
before this, we can only see one commit which factors the work steal code out into an extension. Unfortunately, this commit also seemed to rewrite a huge piece of the logic which is why there is no 1:1 mapping available. Browsing the original scheduler based work stealing code, we can find a variable
stealable_unknown_durationsbut it wasn't used anywherehttps://github.com/dask/distributed/blob/e866e11936cc5c2e5bb5af48a529c93236043a71/distributed/scheduler.py
ref #3591