Skip to content

BUG: change overloads to play nice with pyright.#22193

Merged
BvB93 merged 2 commits intonumpy:mainfrom
iantra:main
Sep 6, 2022
Merged

BUG: change overloads to play nice with pyright.#22193
BvB93 merged 2 commits intonumpy:mainfrom
iantra:main

Conversation

@iantra
Copy link
Copy Markdown
Contributor

@iantra iantra commented Sep 1, 2022

It seems Pyright just chooses the first matching type whenever there is ambiguity in type resolution, which leads to a NoReturn for numpy.cross in certain situations (See #22146.) As far as I can tell, there is no agreed-upon behavior or guidelines for dealing with ambiguity in type resolution, so this is valid behavior on Pyright's side (see microsoft/pyright#2521 for a discussion on this topic.)

I suppose the ideal solution would be to resolve the ambiguity, but AFAIK there is no way to do that for constructs like numpy.array([1, 2, 3]) on the numpy side, short of asking every user to include a dtype in these cases. I would love to be proven wrong though.

I think the second best solution is to have the more general overload also be the first one that matches, so that we make as few assumptions as possible about what might happen inside the function when given an NDArray[Any]. The other overloads were just changed so that they match the one for numpy.cross.

I would also consider changing the overload order in functions elsewhere in the library, if it makes sense.

I am very open to suggestions if someone has a better idea on how to deal with this.

Pyright just chooses the first matching type whenever there is ambiguity in type resolution, which leads to NoReturn for the cross function in certain situations. Other overloads were changed to match. See ticket numpy#22146
@charris
Copy link
Copy Markdown
Member

charris commented Sep 1, 2022

close/reopen

@charris charris closed this Sep 1, 2022
@charris charris reopened this Sep 1, 2022
@charris
Copy link
Copy Markdown
Member

charris commented Sep 1, 2022

@BvB93 Thoughts?

@seberg
Copy link
Copy Markdown
Member

seberg commented Sep 2, 2022

Not a typing expert, but I would think the solution is to add another overload for ArrayLike, ArrayLike -> NDarray[Any] indicating that an unknown dtype should generally pass. (That is probably not true, but the best guess when the dtype is not typed.)

@iantra
Copy link
Copy Markdown
Contributor Author

iantra commented Sep 5, 2022

After a bit more thought I realize this is a terrible solution for many reasons, but still may be a good starting point for dealing with the problem.

Not a typing expert, but I would think the solution is to add another overload for ArrayLike, ArrayLike -> NDarray[Any] indicating that an unknown dtype should generally pass. (That is probably not true, but the best guess when the dtype is not typed.)

I think the issue here is not that there is no matching overload for NDArray[Any], but rather that there are too many. Here's a very straightforward example of what I mean:

@overload
def fun(x: int) -> int:
    ...


@overload
def fun(x: Any) -> float:
    ...


x: Any = ...
y = fun(x)

reveal_type(y)

The problem with this code is that there is no "right" answer for what the type of y should be.

Some people (like me) would intuitively assume that since there is an explicitly defined overload for arguments of type Any, that's the one that would be selected and y would be a float. However, since Any matches any type, not just itself, that's not the only option.

Mypy takes the more conservative approach and assumes y is also Any, since there's no way of knowing what overload would actually be called until runtime.

Pyright takes the opposite approach and picks the first matching overload, meaning y ends up as an int. After all, x could be an int, so this is not wrong and it allows the pylance IDE autosuggestions to at least give you some options.

You might notice the issue with this code snippet is not how the type checkers deal with it, but rather a bug in the code itself. The code in the second overload is telling type checkers "if the input is anything other than an int, we return a float". That's what Any means, after all. Indeed, the code works correctly according to that definition -- changing it to b = "" makes both type checkers agree that y is a float. What we really want to say, however, is "if the input type is anything other than int, or if we don't know what the input type is, assume the return is a float". Unfortunately, there's no construct in python that I know of that is able to express this.

One option I've thought of is defining a new type that is unused anywhere else, like class _Unknown: ... and checking against that before anything else. This should match the "if we don't know what the input type is" part of our goal, at least for pyright, since it matches the Any input type (and nothing alse) and is the first defined overload. It wouldn't work with mypy, but the issues with mypy are less severe anyway since they assume the return to be Any already. This kind of feels like a hack, but it's the best idea I've got so far.

Some people (see here) have suggested adding the option to delete specific matches from a type to pyhton itself. This would be a neat solution, being able to say "match anything except the types we will define later", but unfortunately it seems to not have gained much traction.

Sorry for the wall of text and the non-conclusion, I'll keep thinking about this and see if I can come up with a better solution. In the meantime, feel free to comment with thoughts/suggestions/criticisms.

TL;DR: I messed up, the issue is way more complicated than originally assumed, I did come up with something that sort-of-works but I'm not yet sure if it's good.

@iantra iantra changed the title BUG: change overload order to play nice with pyright. BUG: change overloads to play nice with pyright. Sep 5, 2022
Copy link
Copy Markdown
Member

@BvB93 BvB93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right sorry it took some time to respond, and yes: as @iantra has noticed this is a non-trivial issues to fix due to Pyrights questionable way of dealing with overload ambiguity, especially when in the case of ndarray the semantics of the class are so very tightly linked to its (generic) data type.

While I am a bit concerned how viable this is a long term solution (which IMO is something that should be adressed upstream), your _Unknown solution does quite nicely adress the most problematic cases involving NoReturn.

Comment on lines +58 to +66
class _Unknown: ...

_ArrayLikeUnknown = Union[
_SupportsArray["dtype[_Unknown]"],
_NestedSequence[_SupportsArray["dtype[_Unknown]"]],
_Unknown,
_NestedSequence[_Unknown],
]

Copy link
Copy Markdown
Member

@BvB93 BvB93 Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, 3 things hre:

  • Can you move this to the other ArrayLike type aliases over in numpy/typing/_array_like? (and re-export it to numpy._typing)
  • I'd suggest the name _UnknownType rather than _Unknown
  • Could you add a small comment explaining its purpose, maybe in combination with a link to this PR for future reference?

Comment on lines +417 to +404
@overload
def outer(
a: _ArrayLikeObject_co,
b: _ArrayLikeObject_co,
out: None = ...,
) -> NDArray[object_]: ...
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overload should return to its previous position; I assume this is a leftover edit from something?

@iantra iantra force-pushed the main branch 2 times, most recently from 92e2904 to 4bb480f Compare September 5, 2022 15:11
@iantra
Copy link
Copy Markdown
Contributor Author

iantra commented Sep 5, 2022

Thanks for your time and comments @BvB93, I've gone ahead and fixed everything you mentioned (I think).

I do also have my concerns about longevity, especially if pyright decides to change how they handle this at some point, but it's the best I could come up with for now.

@BvB93 BvB93 merged commit 5bc5976 into numpy:main Sep 6, 2022
@BvB93
Copy link
Copy Markdown
Member

BvB93 commented Sep 6, 2022

Thanks @iantra, LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

4 participants