ENH,API: Add a protocol for representing nested sequences #18155
BvB93 wants to merge 5 commits into numpy:master
The generated documentation:

I can see this being nice to have, so I am not opposed. But I am a bit surprised that we want to make nested sequences a common thing for most commands? I.e. I would expect it to be mainly a valid input for
This is true for scripts that utilize
import numpy as np

# No easy way of grabbing the signature of `np.array` and putting it in `func`
# (besides manually copying it, that is)
def func(a):  # ???
    return np.array(a) * 10

In practice this means that, unfortunately, we're stuck with quite a bit of code duplication and thus the need for the likes of
Closing this for now, as further testing has unfortunately revealed a number of detrimental mypy bugs/limitations 😕 :
from typing import TypeVar, List, overload
import numpy.typing as npt

@overload
def func1(a: npt.NestedSequence[bool]) -> bool: ...
@overload
def func1(a: npt.NestedSequence[int]) -> int: ...

int1 = [[1]]
int2: List[List[int]]

# note: Revealed type is 'builtins.int'
# This is ok
reveal_type(func1(int1))

# note: Revealed type is 'builtins.int'
# This is also ok
reveal_type(func1(int2))

# note: Revealed type is 'builtins.bool'
# This is bad; how did we end up at the `bool` overload all of a sudden????
reveal_type(func1([[1]]))

T = TypeVar("T")

def func2(a: npt.NestedSequence[T]) -> T: ...

# error: Argument 1 to "func2" has incompatible type "List[List[int]]"; expected "NestedSequence[<nothing>]"
reveal_type(func2(int1))

# error: Argument 1 to "func2" has incompatible type "List[List[int]]"; expected "NestedSequence[<nothing>]"
reveal_type(func2(int2))

# note: Revealed type is 'Any'
reveal_type(func2([[1]]))

This PR adds (and exposes) a protocol representing nested sequences of arbitrary depth: npt.NestedSequence.

Despite the lack of formal support for recursion in mypy (python/mypy#731), it turns out we don't need this at all for representing recursive sequences. As this PR demonstrates, a simple protocol is sufficient here.

The advantages of the protocol over the currently used union-based approach are threefold:
pre-defined number of nesting levels (e.g. 4).
levels not captured by aforementioned union (e.g. ENH: Add dtype support to the array comparison ops #18128 (comment)).
compared to a union consisting of many types.
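
For context, a union-based approach with a pre-defined number of nesting levels might look roughly like the sketch below. The alias name NestedUpTo4 and the helper depth are hypothetical, made up for illustration; NumPy's actual definitions differ. When the alias is parametrized with a concrete scalar type such as int, a type checker would be expected to reject anything nested deeper than the levels that were written out by hand.

```python
from typing import Sequence, TypeVar, Union

T = TypeVar("T")

# Hypothetical alias (not NumPy's actual definition): each nesting level
# has to be written out by hand, here up to a depth of 4.
NestedUpTo4 = Union[
    Sequence[T],
    Sequence[Sequence[T]],
    Sequence[Sequence[Sequence[T]]],
    Sequence[Sequence[Sequence[Sequence[T]]]],
]


def depth(obj: object) -> int:
    """Count how many list/tuple levels wrap the first element of `obj`."""
    levels = 0
    while isinstance(obj, (list, tuple)):
        levels += 1
        if not obj:  # an empty sequence still counts as one level
            break
        obj = obj[0]
    return levels


four_levels = [[[[1]]]]    # still covered by the 4-level alias
five_levels = [[[[[1]]]]]  # one level deeper than the alias spells out
print(depth(four_levels), depth(five_levels))
```

The runtime helper is only there to make the nesting depth visible; the limitation itself is a static-typing one.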
The new npt.NestedSequence protocol introduced herein is exposed to the public numpy.typing API, as I imagine it will be rather useful for representing array-like objects, either in NumPy or downstream.
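
To illustrate the general idea of describing arbitrarily nested sequences with a single protocol, here is a minimal sketch. It is not the implementation from this PR: the class below is a stand-alone, hypothetical NestedSequence with only three methods, and flatten is an illustrative downstream helper. The self-reference is written as a string so that the class body can be evaluated; how well a given type checker handles such a protocol is a separate question, and the mypy problems reported in this thread may well apply.

```python
from typing import Any, Iterator, List, Protocol, TypeVar, Union, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class NestedSequence(Protocol[T_co]):
    """Hypothetical sketch: items are either scalars or further nested sequences."""

    def __len__(self) -> int: ...

    # The string forward reference lets the protocol refer to itself.
    def __getitem__(self, index: int) -> Union[T_co, "NestedSequence[T_co]"]: ...

    def __iter__(self) -> Iterator[Union[T_co, "NestedSequence[T_co]"]]: ...


def flatten(seq: NestedSequence[Any]) -> List[Any]:
    """Illustrative downstream helper: flatten a nested sequence of any depth."""
    out: List[Any] = []
    for item in seq:
        # str and bytes also satisfy the protocol structurally, so they are
        # treated as scalars here to avoid infinite recursion.
        if isinstance(item, NestedSequence) and not isinstance(item, (str, bytes)):
            out.extend(flatten(item))
        else:
            out.append(item)
    return out


print(flatten([[1, 2], [3, [4]]]))
```

Note that the runtime isinstance check only verifies that the three methods exist; it says nothing about the element type.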
Examples