Conversation
b3e14ab to
62ded0f
Compare
|
The good: all files except scheduler.py are now passing mypy checks. |
8bda7e9 to
3de84a1
Compare
jrbourbeau
left a comment
There was a problem hiding this comment.
Thanks @crusaderky!
As this is a relatively large change (though one I think there's interest in), we should be sure to get feedback from folks who regularly contribute to distributed. cc @fjetter, @jakirkham, @gjoseph92, @mrocklin, @jacobtomlinson, @madsbk thoughts on adding type annotations? (also feel free to rope others into this discussion)
|
I think this is a great idea, thanks @crusaderky! |
|
As discussed in the latest coiled meeting, I have reverted all changes to scheduler.py and the pre-commit hook, to publish them in a second iterative PR. |
|
Generally ok with this. However I do have one concern. In particular I'd like us to be aware that |
ian-r-rose
left a comment
There was a problem hiding this comment.
Thanks @crusaderky, this is awesome
|
|
||
| # cache clients for re-use in remote magic | ||
| remote_magic._clients = {} | ||
| remote_magic._clients = {} # type: ignore |
There was a problem hiding this comment.
| remote_magic._clients = {} # type: ignore | |
| remote_magic._clients: Dict[str, BlockingKernelClient] = {} |
There was a problem hiding this comment.
Thanks, fixed. We should always use lowercase dict[...], list[...], etc. and put from __future__ import annotations at the top of the module.
There was a problem hiding this comment.
Correction: can't.
distributed/_ipython_utils.py:133: error: Type cannot be declared in assignment to non-self attribute
distributed/_ipython_utils.py:133: error: "Callable[[Any, Any], Any]" has no attribute "_clients"
|
|
||
|
|
||
| LONG_VERSION_PY = {} | ||
| LONG_VERSION_PY: dict = {} |
There was a problem hiding this comment.
I think this file is missing a from __future__ import __annotations to make this work.
There was a problem hiding this comment.
that's only necessary if you use bracket-stile dict[...] or if the annotations create circular dependencies
|
|
||
|
|
||
| def _get_global_client(): | ||
| def _get_global_client() -> Client | None: |
There was a problem hiding this comment.
Note to myself and others: 3.10 syntax, but backwards compatible with from __future__ import annotations, AFAIK.
There was a problem hiding this comment.
correct. 3.10 syntax works on all python versions as long as your version of mypy is recent.
Notable exception: cast, since it is executed at runtime, needs Python 3.10 style annotations wrapped in a string.
| self, | ||
| minimum: int = 0, | ||
| maximum: int = math.inf, | ||
| maximum: int | float = math.inf, |
There was a problem hiding this comment.
Not a huge deal, but I wonder if we could narrow further to int | Literal[math.inf]
There was a problem hiding this comment.
Sadly, no: https://www.python.org/dev/peps/pep-0586/
The following are provisionally disallowed for simplicity. We can consider allowing them in future extensions of this PEP.
- Floats: e.g. Literal[3.14]. Representing Literals of infinity or NaN in a clean way is tricky; real-world APIs are unlikely to vary their behavior based on a float parameter.
| from array import array | ||
| from enum import Enum | ||
| from functools import partial | ||
| from types import ModuleType |
|
|
||
|
|
||
| def key_split_group(x): | ||
| def key_split_group(x) -> str: |
There was a problem hiding this comment.
Does it work to narrow x to str | bytes | tuple[Hashable] or similar? I'd love to have written down the real scope of a key.
There was a problem hiding this comment.
No, because (for some reason I didn't investigate) unexpected types are accepted and return a hardcoded "Other".
| return key_split_group(x.decode()) | ||
| else: | ||
| return key_split(x) | ||
| return "Other" |
There was a problem hiding this comment.
Note for reviewers: this avoids some redundant checks in key_split, which would itself just return "Other" here.
| if isinstance(preload, str): | ||
| preload = [preload] | ||
| if preload_argv and isinstance(preload_argv[0], str): | ||
| preload_argv = [cast("list[str]", preload_argv)] * len(preload) |
There was a problem hiding this comment.
I played around with this one as well to try to avoid the cast. Annoying that mypy is unable to narrow based on the above check.
As noted above, this is going to cause issues with Cythonization in the scheduler. If we can't leave that file alone, then unfortunately I don't think we are prepared to adopt MyPy yet. |
|
Please move all conversation specific to scheduler.py to #5348 |
| return super()._repr_html_(cluster_status=cluster_status) | ||
|
|
||
|
|
||
| clusters_to_close = weakref.WeakSet() |
There was a problem hiding this comment.
never populated
There was a problem hiding this comment.
Good catch! I believe we now handled here
distributed/distributed/deploy/spec.py
Lines 654 to 660 in 43d3866
| pass | ||
| async with Scheduler(dashboard_address=":0") as s: | ||
| s.add_plugin(plugin) | ||
| await plugin.start(s) |
There was a problem hiding this comment.
#5348 removes the plugins= parameter from Scheduler and Worker. It was exclusively used here.
There was a problem hiding this comment.
Maybe use Scheduler.register_scheduler_plugin instead? Just a suggestion, not meant to be a blocking comment
#5348 removes the plugins= parameter from Scheduler and Worker. It was exclusively used here.
There may be user code which uses plugins=
| from .utils import frame_split_size, msgpack_opts, pack_frames_prelude, unpack_frames | ||
|
|
||
| lazy_registrations = {} | ||
|
|
|
This for me is done - @ian-r-rose are there further comments? |
jrbourbeau
left a comment
There was a problem hiding this comment.
Thanks @crusaderky! I left several small comments / questions, but overall this looks good to me
Also, just checking but there's an empty distributed/py.typed file being added here, is that intentional?
setup.cfg
Outdated
| [options] | ||
| zip_safe = False # https://mypy.readthedocs.io/en/latest/installed_packages.html | ||
|
|
There was a problem hiding this comment.
I'm wondering how this is different from zip_safe=False in setup.py
Line 110 in 43d3866
There was a problem hiding this comment.
Ah. I didn't notice it.
| # TODO: Use tornado's AnyThreadEventLoopPolicy, instead of class below, | ||
| # once tornado > 6.0.3 is available. | ||
| if WINDOWS and hasattr(asyncio, "WindowsSelectorEventLoopPolicy"): | ||
| if WINDOWS: |
There was a problem hiding this comment.
Just noting for future readers that asyncio.WindowsSelectorEventLoopPolicy was added in Python 3.7 https://docs.python.org/3/whatsnew/3.7.html
| """ | ||
| with dask.config.set( | ||
| {"distributed.worker.preload": text, "distributed.nanny.preload": text} | ||
| {"distributed.worker.preload": [text], "distributed.nanny.preload": [text]} |
There was a problem hiding this comment.
It violated distributed-schema.yaml. It's purely conincidental that process_preloads() will accept a single string.
|
|
||
| compressions = {None: {"compress": identity, "decompress": identity}} | ||
| if TYPE_CHECKING: | ||
| from typing_extensions import Literal |
There was a problem hiding this comment.
Should we include typing_extensions in our conda environment file(s)?
There was a problem hiding this comment.
No, because we are running everything through pre-commit. typing_extensions shouldn't be in the env files for the same reason flake8, black, isort and mypy aren't.
There was a problem hiding this comment.
To clarify: if you look at #5348, typing_extensions is installed there exclusively in the mypy virtualenv by .pre-conmmit-config.yaml.
There was a problem hiding this comment.
That makes sense, thanks for clarifying
| return super()._repr_html_(cluster_status=cluster_status) | ||
|
|
||
|
|
||
| clusters_to_close = weakref.WeakSet() |
There was a problem hiding this comment.
Good catch! I believe we now handled here
distributed/distributed/deploy/spec.py
Lines 654 to 660 in 43d3866
| if self.cluster: | ||
| return self.cluster.loop | ||
| else: | ||
| return IOLoop.current() |
There was a problem hiding this comment.
Is this another "mypy caught a bug" case?
There was a problem hiding this comment.
No. It caught an antipattern where the parent class expected properties that only exist in the children classes.
| logging_names: dict[str | int, int | str] = {} | ||
| logging_names.update(logging._levelToName) # type: ignore | ||
| logging_names.update(logging._nameToLevel) # type: ignore |
There was a problem hiding this comment.
We're loosing a copy() here, just wanted to double check that's okay
There was a problem hiding this comment.
We're replacing a copy() with an update() for the sake of readability. There's no functional difference.
| if TYPE_CHECKING: | ||
| try: | ||
| import ucp | ||
| from ucp import create_endpoint as ucx_create_endpoint | ||
| from ucp import create_listener as ucx_create_listener | ||
| except ImportError: | ||
| pass | ||
| else: | ||
| ucp = None # type: ignore | ||
| ucx_create_endpoint = None # type: ignore | ||
| ucx_create_listener = None # type: ignore | ||
|
|
There was a problem hiding this comment.
@pentschev @madsbk @quasiben could one of you review the changes to this module?
There was a problem hiding this comment.
Thanks for the ping here @jrbourbeau . We can't import ucp here, we have to do it from init_once, otherwise UCX will be imported before we can set any of its configurations. However, it seems it's only being imported here when TYPE_CHECKING is defined, when is that going to happen in Distributed?
There was a problem hiding this comment.
However, it seems it's only being imported here when TYPE_CHECKING is defined, when is that going to happen in Distributed?
Today we don't run any type checking as part of the development process, but after #5348 is merged we'll start running mypy as a pre-commit hook. Given that this ucp import will only happen during type checking, do you think the proposed changes here are okay?
There was a problem hiding this comment.
If it's guaranteed that TYPE_CHECKING is False for all runtime situations, then I think this is no problem.
There was a problem hiding this comment.
That should be the case. From https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING:
It is False at runtime
cc @crusaderky for confirmation
There was a problem hiding this comment.
In that case, we should be good for now. If there's any reason why TYPE_CHECKING is True at runtime, we'll need to make changes to cover that.
| _shutdown = _close | ||
|
|
There was a problem hiding this comment.
Just noting to future readers that there's a _shutdown method defined a few lines below
distributed/preloading.py
Outdated
| if TYPE_CHECKING: | ||
| from .core import Server | ||
|
|
There was a problem hiding this comment.
That's usually the reason for this pattern, yes. However, as far as I can tell it's not actually needed here, since Server doesn't depend on preloading, and everything that imports preloading has already imported Server somewhere.
distributed/active_memory_manager.py
Outdated
| if not candidates: | ||
| return None | ||
| return min(candidates, key=self.workers_memory.get) | ||
| return min(candidates, key=self.workers_memory.get) # type: ignore |
There was a problem hiding this comment.
I cringe a bit to suggest this, but self.workers_memory.__getitem__ passes muster here, since it will raise if the key doesn't exist
|
|
||
| if not self.new_spec: | ||
| raise ValueError("To scale by cores= you must specify cores per worker") | ||
| assert False, "unreachable" |
There was a problem hiding this comment.
This should be triggered if none of the above keys are in new_spec. Without this assertion, the proper return type of this function would be Optional[int]. Perhaps a more helpful error would be something similar to the ValueError above?
| @status.setter | ||
| def status(self, value): | ||
| raise NotImplementedError() | ||
|
|
There was a problem hiding this comment.
Was curious about this as well -- it's because this inherits from Worker, where status isn't read-only. Without this it complains that we are swapping a read-write property with a read-only one, which, fair enough.
Co-authored-by: James Bourbeau <jrbourbeau@users.noreply.github.com>
|
All review comments have been incorporated |
Yes, this file is used by mypy when it validates third party code that imports distributed, and it signals that mypy should fetch type annotations from the library itself (as opposed to look for a separate stubs package). |
|
@jakirkham earlier you expressed some hesitation around adopting |
|
Yeah I'm ok with that James. Thanks for checking :) |
jrbourbeau
left a comment
There was a problem hiding this comment.
Sounds good!
Thanks @crusaderky for your work on this and addressing all the review comments from @ian-r-rose and me
|
Woo! |
Partial work towards #2803.
This PR covers everything but scheduler.py and actually running mypy in CI.
Continues in #5348.