ENH: Make numpy scalars subtypes of their builtin counterpart by BvB93 · Pull Request #17117 · numpy/numpy

BvB93 · 2020-08-20T12:28:55Z

This pull request modifies the superclasses of six np.generic subclasses:

np.str_ & np.bytes_ are now, respectively, subclasses of str & bytes.
This relation is also true during runtime, so this is actually a fix rather than an enhancement. Moved to MAINT: Fix various issues with the np.generic annotations #17214.
np.floating & np.complexfloating are now, respectively, treated as subclasses of float & complex.
During runtime this relation does, strictly speaking, only hold for np.float64 and np.complex128.
np.integer is now treated as a subclass of int. This relation does not hold during runtime.

A few notes on a number of untouched classes:

np.timedelta64 & np.datetime64 are surprisingly incompatible with their counterparts in datetime.
bool can, unfortunately, not be subclassed (even in stub files).

Lastly, this pull request serves as an alternative to #17096.

Examples

import numpy as np

def func(a: float) -> None:
    pass

# These inputs are now all accepted
func(1.0)
func(1)
func(float32(1.0))
func(float64(1.0))
func(int64(1))

numpy/__init__.pyi

eric-wieser · 2020-08-20T13:34:55Z

At a glance, I think I prefered the alternative in #17096 - numpy scalars are frequently not a drop-in for python ones, and that type of mistake seems like something it would be nice for mypy to catch. With this PR, it forces it to be a runtime error.

eric-wieser · 2020-08-20T13:41:40Z

numpy/tests/typing/pass/scalars.py

As an example of why I think this is a mistake, mypy will allow this code even though it raises AttributeError:

def func_int(a: int) -> None: return a.to_bytes(1, byteorder="little") func_int(np.int8(1))

Note that there are already a few of such cases in mypy, as int does not implement all methods which exists in float.

>>> def func(a: float) -> bool: ... return a.is_integer() # NOTE: mypy considers `int` a subclass of `float` >>> func(1) Traceback (most recent call last): ... AttributeError: 'int' object has no attribute 'as_integer'

The number cases where the int / np.integer does work is substantially larger then those were it doesn't (just as with int & float), to the point where I feel a more pragmatic approach is justified.

Note that the problems with int vs float are slowly being resolved by adding the missing methods (https://bugs.python.org/issue26680)

numpy/__init__.pyi

…p.timedelta64` Partially reverts ed18dd33d0d9a3da65f2bf5600f3664b1d2a577a

Implements numpy#17117 (comment)

Co-Authored-By: Eric Wieser <eric-wieser@users.noreply.github.com>

BvB93 · 2020-09-01T14:49:22Z

Rebased to get rid of a merge conflict.

mattip · 2020-09-09T18:26:38Z

@BvB93 there is a merge conflict.

shoyer · 2020-09-09T19:08:15Z

I agree with @eric-wieser that I think this approach would be a mistake. NumPy scalars are not fully drop-in compatible with Python scalars, and pretending they are can lead to subtle bugs.

A few examples that come to mind:

NumPy int32 can overflow pretty easily vs. Python integers which never overflow.
The builtin round() gives different behavior for NumPy vs Python (preserving dtype vs converting to integer)

BvB93 · 2020-09-14T13:50:07Z

Closing this for now, as the consensus appears to be that the potential advantages do not outweigh the disadvantages.

BvB93 added 01 - Enhancement 41 - Static typing labels Aug 20, 2020

BvB93 commented Aug 20, 2020

View reviewed changes