Closed
Description
NumPy seems to attempt to iterate instead of calling __array__.
Consider:
import numpy as np

class ArrayLike(object):
    def __init__(self, array):
        self.array = array
    def __len__(self):
        return len(self.array)
    def __iter__(self):
        print('calling __iter__')
        return iter(self.array)
    def __getitem__(self, index):
        print('calling __getitem__ with index={}'.format(index))
        return self.array[index]
    def __array__(self, dtype=None):
        print('calling __array__')
        return np.asarray(self.array, dtype=dtype)

>>> a = ArrayLike(np.arange(3))
>>> np.array(a)
calling __array__
array([0, 1, 2])
>>> list(a)
calling __iter__
[0, 1, 2]
>>> np.array([a])
calling __array__
calling __iter__
calling __iter__
array([[0, 1, 2]])
So actually, it looks like NumPy calls __array__, and then __iter__ twice!
Is there a good reason for this behavior? If not, maybe we can fix it? Either way, a pointer to the relevant place in the codebase would be appreciated.
This causes trouble for xarray users (pydata/xarray#1247) because iterating over an xarray object creates a new xarray object for each element, which takes ~100 µs each. That adds up fast for large arrays.
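In the meantime, a possible workaround is to convert each element with np.asarray before combining them, since the session above shows that np.array(a) on a single object only calls __array__. The sketch below is not a fix for NumPy itself. The helper name stack_array_likes and the iter_calls counter are hypothetical, added only for illustration, and the extra copy=None parameter on __array__ is an assumption made so the class also fits newer NumPy versions that may pass that argument.

```python
import numpy as np


class ArrayLike:
    """Stand-in for an object that is expensive to iterate (illustrative only)."""

    def __init__(self, array):
        self.array = array
        self.iter_calls = 0  # hypothetical counter, used to observe __iter__ calls

    def __len__(self):
        return len(self.array)

    def __iter__(self):
        self.iter_calls += 1
        return iter(self.array)

    def __getitem__(self, index):
        return self.array[index]

    def __array__(self, dtype=None, copy=None):
        # copy is accepted but ignored here; assumption for newer NumPy versions
        return np.asarray(self.array, dtype=dtype)


def stack_array_likes(objs):
    """Convert each object via __array__ first, then stack the plain ndarrays."""
    arrays = [np.asarray(obj) for obj in objs]
    # np.stack requires all converted arrays to have the same shape
    return np.stack(arrays)


a = ArrayLike(np.arange(3))
result = stack_array_likes([a])
print(result)
print(a.iter_calls)  # number of times __iter__ was invoked during conversion
```

Because NumPy only ever sees real ndarrays inside the list, it has no reason to fall back to the sequence protocol on the wrapped objects. This only helps callers who control the np.array([a]) call site.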