-
-
Notifications
You must be signed in to change notification settings - Fork 12k
NEP: update NEP 42 with discussion of type hinting applications #17447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
As discussed in numpy#16759, the new DType classes provide a good path forward for making `ndarray` generic over DType. Update NEP 42 to discuss those applications in more detail.
| avoiding crowding the ``np`` namespace as discussed above. For a user | ||
| defined DType:: | ||
|
|
||
| class UserDtype(dtype): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
| class UserDtype(dtype): ... | |
| class UserScalar: ... | |
| class UserDtype(np.dtype[UserScalar]): ... |
or if we want the scalar type to be variable:
| class UserDtype(dtype): ... | |
| T = typing.TypeVar("T") | |
| class UserDtype(np.dtype[T]): ... |
EDIT1: Made np.generic superclass of UserScalar and use it as bound for the typevar.
EDIT2: Reverted the previous edit (see #17447 (comment)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure-in the NEP there is this line:
The notation works equally well with built-in and user-defined DTypes
but I'm not actually sure how the __class_getitem__ would work with a user-defined scalar type. I have more been expecting that people would define new DType classes directly (maybe not even defining a corresponding scalar type!), which is why I emphasized the DType class here and left the scalars out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@person142 it will work for DTypes requesting it (not necessarily for all DTypes). And in that case it will work by NumPy having global known_dtype mapping basically. The reason we need something in any case, is to support:
np.array([user_scalar])
without explicitly passing a dtype=, which I would like to support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Maybe I should include both examples then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update here c564fcb#diff-89eb8c11ba758f8d3a296f44372c3254R293 for user-defined scalar types. AFAICT supporting np.dtype[UserScalar] mapping to UserDtype will require adding a single overload inside the NumPy stubs, which seems ok since we'll already have the known_dtype mapping. LMK if you see a way around that though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn’t it already a subclass in the example? (I meant for it to be.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was specifically talking about the suggestion in #17447 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does say "generic" currently. Further down, I say "Scalars will be expected to derive from a NumPy scalar", but honestly, I am not sure we need to stick with it. I do not see a huge advantage, because I do not believe in "inheriting" the dtype attributes. When you inherit it, you mix up contra and covariance for a container and AFAIK that just doesn't work.
The reason for saying "from a NumPy scalar", is because I do not like np.generic. It comes with expectations on memory layout on the NumPy side and quite frankly I don't want to use it. When we reach this stage, I would rather consider creating a np.generic baseclass without all of those expectations.
Does subclassing even help with typing? It seems like there is a solution that will always work and it also seems to me like users will always have to define this once for runtime and once for static typing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or from a different angle: Does subclassing really matter, as opposed to what is pretty much equivalent to an ABC you can register with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or from a different angle: Does subclassing really matter, as opposed to what is pretty much equivalent to an ABC you can register with?
For static typing the only important thing is that there is some common baseclass or protocol for automatically annotating the likes of np.array([user_scalar]).
Exactly how this is accomplished doesn't really matter (well, like you said, it might matter a lot during runtime...):
- Directly subclassing
np.generic. - Adding
np.genericas a superclass only during type checking. - Adding a new ABC (which may or may not be available at runtime).
- Adding a new protocol (which, again, may or may not be available at runtime).
- etc...
When we reach this stage, I would rather consider creating a
np.genericbaseclass without all of those expectations.
+1 to this suggestion!
doc/neps/nep-0042-new-dtypes.rst
Outdated
| and is inspired by and potentially useful for type hinting. | ||
| to get the DType class corresponding to a scalar type. (Note that with | ||
| the acceptance of PEP 585 [2]_ there is precedent for | ||
| ``np.dtype[np.int64]`` referring to a *subclass* of ``np.dtype``.) The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ``np.dtype[np.int64]`` referring to a *subclass* of ``np.dtype``.) The | |
| ``np.dtype[np.int64]`` referring to a *subtype* of ``np.dtype``.) The |
e.g. list[int] is a subtype of list[object] but not a subclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I meant subclass here, but it has apparently been too long since I have read PEP 585-it only mentions
>>> isinstance(list[str], types.GenericAlias)
True
i.e. the proxy classes for list[int] do not subclass list. I'll remove this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess both is correct here, I am actually not sure which one is more precise with respect to how "type" is used in typing :).
doc/neps/nep-0042-new-dtypes.rst
Outdated
| The notation works equally well with built-in and user-defined DTypes | ||
| and is inspired by and potentially useful for type hinting. | ||
| to get the DType class corresponding to a scalar type. (Note that with | ||
| the acceptance of PEP 585 [2]_ there is precedent for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| the acceptance of PEP 585 [2]_ there is precedent for | |
| the acceptance of :PEP:`585` there is precedent for |
Minor.
doc/neps/nep-0042-new-dtypes.rst
Outdated
| This getter eliminates the need to create an explicit name for every | ||
| DType, crowding the ``np`` namespace; the getter itself signifies the type. | ||
| DType, crowding the ``np`` namespace; the getter itself signifies the | ||
| type. It also opens up the possibility of making ``np.ndarray`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| type. It also opens up the possibility of making ``np.ndarray`` | |
| type. It also opens the possibility of making ``np.ndarray`` |
Minor.
doc/neps/nep-0042-new-dtypes.rst
Outdated
| DType, crowding the ``np`` namespace; the getter itself signifies the type. | ||
| DType, crowding the ``np`` namespace; the getter itself signifies the | ||
| type. It also opens up the possibility of making ``np.ndarray`` | ||
| generic over DType class via annotations like:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| generic over DType class via annotations like:: | |
| generic over DType class using annotations like:: |
Minor. Google style guide says of "via": "Don't use."
doc/neps/nep-0042-new-dtypes.rst
Outdated
| avoiding crowding the ``np`` namespace as discussed above. For a user | ||
| defined DType:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| avoiding crowding the ``np`` namespace as discussed above. For a user | |
| defined DType:: | |
| avoiding crowding the ``np`` namespace as discussed above. For a | |
| user-defined DType:: |
Minor.
doc/neps/nep-0042-new-dtypes.rst
Outdated
|
|
||
| class UserDtype(dtype): ... | ||
|
|
||
| one can simply do ``np.ndarray[UserDtype]``, keeping annotations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| one can simply do ``np.ndarray[UserDtype]``, keeping annotations | |
| one can use ``np.ndarray[UserDtype]``, keeping annotations |
Minor. Google style guide recommends against "simply."
doc/neps/nep-0042-new-dtypes.rst
Outdated
| class UserDtype(dtype): ... | ||
|
|
||
| one can simply do ``np.ndarray[UserDtype]``, keeping annotations | ||
| concise in that case without introducing any extra boilerplate in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| concise in that case without introducing any extra boilerplate in | |
| concise in that case without introducing boilerplate in |
Minor.
doc/neps/nep-0042-new-dtypes.rst
Outdated
| to return a ``uint8`` or ``float32`` array respectively. This is | ||
| further described in the documentation for :func:`numpy.result_type`. | ||
| .. [2] https://www.python.org/dev/peps/pep-0585/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .. [2] https://www.python.org/dev/peps/pep-0585/ |
No longer needed with change proposed in https://github.com/numpy/numpy/pull/17447/files#r499271335
|
|
||
| Since getter calls won't be needed often, this is unlikely to be burdensome. | ||
| Classes can also offer concise alternatives. | ||
| in ``numpy.typing``, thus keeping annotations concise but still |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point including it in numpy.typing, might have to do this sooner rather than later to make solving the pickling issue simpler.
Also clean up the language a bit.
|
Thanks @person142 and @BvB93 , I don't feel like worrying too much about the |
As discussed in #16759, the new DType classes provide a good path forward for making
ndarraygeneric over DType. Update NEP 42 to discuss those applications in more detail.I'm sure more discussion will be needed here, but this is my attempt to get the ball rolling.
Ping @BvB93 @seberg.