MAINT: Reduce allocation size of empty (0 size) arrays to 1 byte by mattip · Pull Request #21477 · numpy/numpy

mattip · 2022-05-09T14:21:11Z

Allocate a minimum 1 byte when the data chunk should be 0 bytes, to avoid undefined behaviour
Make allocation match the free calculation of how much memory to release. This closes empty, zeros, ones, full allocate product of non-zero size elements, even if final array is zero-sized #16410 and BUG(?): Tests print 'uh-oh, unmatched shift_free(ptr, 8) but allocated 720' #20289

eric-wieser · 2022-05-09T14:24:31Z

numpy/core/src/multiarray/ctors.c

-         * Note: always sync this with calls to PyDataMem_UserFREE
         */
        if (nbytes == 0) {
-            nbytes = descr->elsize ? descr->elsize : 1;


Can we ensure all the strides are set to zero to avoid any concerns of undefined behavior?

OK, but array_strides_set will allow resetting the strides to any values.

I have the following behavior when a 0-size array is reshaped in NumPy 1.23:

>>> import numpy as np >>> a = np.ones((2, 0, 3)) >>> a.strides (0, 0, 0) >>> b = a.reshape((3, 0, 2)) >>> b.strides (16, 16, 8)

Do you think reshape should be also fixed to return b with strides (0, 0, 0) here, @eric-wieser @mattip ? Thanks!

ref. cupy/cupy#6807

That does look like a bug. Please open a new issue for this, the PR here was merged and further discussion will get lost.

Thanks! I opened #21918.

seberg · 2022-05-09T16:07:24Z

I am wondering if we should worry about limiting the total size also in this case. But I cannot think of any place where it would actually be used (i.e. we assume that a subset of axis cannot overflow when combined).

So, I would feel a bit safer to keep that check unmodified, but I am OK to not worry about it now.

mattip · 2022-05-09T16:21:58Z

this case, that check

which ones?

seberg · 2022-05-09T16:29:27Z

Sorry, the current loop ignores 0s, so if you create an array of shape (0, 2**60, 2**60) it currently fails just like (2**60, 2**60, 0) fails.
With the change, I think this will not happen anymore? That should probably be OK, but I am wondering if there could be operations that merge the nonzero dimensions and then overflow.

mattip · 2022-05-10T08:29:46Z

Sorry, the current loop ignores 0s, so if you create an array of shape (0, 2**60, 2**60) it currently fails just like (2**60, 2**60, 0) fails.

I restored this behaviour.

There were a few places in the code that assumed strides[-1] == elsize, so I wonder if this change will break downstream code.

eric-wieser · 2022-05-10T08:56:30Z

There were a few places in the code that assumed strides[-1] == elsize, so I wonder if this change will break downstream code.

I'd hoped this was the type of thing that RELAXED_STRIDES_CHECKING was supposed to transition downstream code away from.

numpy/core/src/multiarray/getset.c

numpy/core/src/multiarray/ctors.c

seberg · 2022-05-10T09:16:46Z

Thanks for restoring the max-size thing!

Yeah, RELAXED_STRIDES should have caught most of such issues. In this case, the array is empty, so using its stride is pretty unlikely. In practice, I expect it only affects thingse if you later grow the array and assume the strides are still compatible.
Growing an array from empty to non-empty is hopefully exceedingly rare or non-existing outside of NumPy proper.

…th `size == 0`. In order to avoid undefined behavior, we need to ensure the generated strides are all zero.

numpy/core/src/multiarray/ctors.c

seberg · 2022-05-11T09:32:31Z

@mattip I just realized, I had that PyArray_NBYTES_ALLOCATED helper for the reduction code. I am a bit surprised the reduce tests don't print the mismatch stuff now due to that not also being updated?
Not worried though, but it would still be good to update that inline-function.

EDIT: I may just push that update here later though.

We have to reset all strides in the case of a subarray dtype which may have caused the result array to not be 1-D. Since the logic is a bit complicated now, restore the logic to only do this when it really is necessary.

mattip · 2022-05-11T14:40:04Z

I think the windows error path is

import numpy.core.multiarray as _nx, numpy as np
def ndim(a):
    return np.asarray(a).ndim

dt = np.float64
delta = 1.0
start = 0.0
num = 0

y = _nx.arange(0, num, dtype=dt).reshape((-1,) + (1,) * ndim(delta))
y = y * delta
y += start

which gives "RuntimeWarning: invalid value encountered in add". After arange, y.strides == (8,). After y = y * delta, y.strides == (0,)

seberg · 2022-05-11T15:30:38Z

Ah, this is probably an invalid access in our sum implementation. The reduce path would be taken here with n == 0 (unless we prevent that earlier, and I do not think we do):

numpy/numpy/core/src/umath/loops_arithm_fp.dispatch.c.src

Lines 523 to 532 in 62293d4

    
           NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_@kind@) 
        
           (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func)) 
        
           { 
        
               if (IS_BINARY_REDUCE) { 
        
           #if @PW@ 
        
                   @type@ * iop1 = (@type@ *)args[0]; 
        
                   npy_intp n = dimensions[0]; 
        
                   *iop1 @OP@= @TYPE@_pairwise_sum(args[1], n, steps[1]); 
        
           #else

That path clearly always loads one data-point from the output (even for n == 0), and will do an operation for said data point. Even if that operation doesn't generate a NaN (which seems unlikely), that would give a warning if the loaded garbage happened to be a signalling NaN.

EDIT: Maybe all reduce checks should have an N >= 1 check?

seberg · 2022-05-11T16:45:53Z

So for the windows thing, I think the correct fix (which is a bug-fix in any case), is to never do an empty reduce-loop.

There are two ways to do that:

Modify the IS_BINARY_REDUCE macro to reject dimensions == 0 (or dimensions <= 1)
Guarantee that we never call a ufunc loop on 0 elements to begin with?

Since I do not think complex iteration paths can generate length 0 inner-loop calls, both seem very feasible to me, and I think I will make a PR for it. But I would also be happy with the second approach.

mattip · 2022-05-11T20:17:40Z

I think this PR is growing too large. The change to prevent ufuncs looping when PyArray_SIZE() == 0 should be merged first, then we can come back to this PR.

mattip · 2022-05-12T05:27:29Z

After a one-line check for 0-length loops in try_trivial_single_output_loop, the PR is passing. Maybe more is needed, would a valgrind check catch illegal indexing into 0-length arrays? If we still have doubts, it may be safer to allocate elemsize bytes rather than 1 byte.

seberg · 2022-05-12T05:38:45Z

If we fix it in the ufunc setup code, I think we should add an assert to IS_BINARY_REDUCE (which also documents the guarantee a bit). Otherwise, I don't think more is needed as such. I was trying to think of a good tests (and maybe have a quick look at accumulate).
Did you want to split out that change, anyway (Or I can do)?

mattip · 2022-05-12T06:49:50Z

I think we should add an assert to IS_BINARY_REDUCE

Is there precedence for this in the code? The macros are used like if (IS_BINARY_REDUCE()) so I am not sure how to add an assert to the macro itself

Did you want to split out that change, anyway (Or I can do)?

I originally did, but it turned out to be simpler than I thought, so I am not sure it is necessary as a separate PR.

seberg · 2022-05-12T06:57:55Z

Is there precedence for this in the code? The macros are used like if (IS_BINARY_REDUCE()) so I am not sure how to add an assert to the macro itself

Ah, I had not thought about it, but this should work (just tested briefly and seems good):

#define IS_BINARY_REDUCE (assert(*dimensions != 0),  \
        (args[0] == args[2])  \
        && (steps[0] == steps[2])  \
        && (steps[0] == 0))

Maybe a comment makes sense, that it uses the (expr, result) style to get to result only.

One thing here is that we still need to figure out the arr.dtype = ... code, that might get a bit messy. But lets just keep it for now, and I might add test in a followup.

EDIT: Well, unfortunatley, the assert is hit quickly even with the fix, so we probably need a few more smaller tweaks in the ufunc machinery to do this.

mattip · 2022-05-13T09:14:23Z

#21499 was merged, Is there any outstanding review for this PR (pushing on it a bit since it removes the annoying "uh oh" output when testing allocation policy changes)?

seberg · 2022-05-13T09:22:23Z

@mattip unfortunatley, I don't trust the arr.dtype = ... assignment code at all, this is the issue that the current version runs into, I believe:

arr = np.ones(1, dtype="i8")
arr = np.broadcast_to(arr, 10)
arr.dtype = "i1"
print(arr)  # oops

Unfortunately, I don't have a clear idea of how to fix that. Right now, I mainly see the option of making a "deep clean" here and rethinking the code almost as a whole, and I didn't try to think it through yet.
(I had briefly enterntained whether allocating a single element could help, but I don't think so.)

mattip · 2022-05-13T11:19:22Z

I pushed a fix which will allow a view of an empty array (np.prod(arr.shape) == 0), but restores the failure to set a dtype on a non-empty array when the strides do not match the elemsize. The check now requires np.prod, is that too expensive?

seberg · 2022-05-13T11:24:54Z

numpy/core/src/multiarray/getset.c

        /* resize on last axis only */
        int axis = PyArray_NDIM(self) - 1;
        if (PyArray_DIMS(self)[axis] != 1 &&
+                PyArray_SIZE(self) != 0 &&


Hah, this should work, yeah. So the one thing is that if the stride was previously 0, in theory it should remain 0 in this case (but gets replaced with elsize). I think I am OK with ignoring the potential UB problem that may follow that for now.

If we really want to fight it, I guess we would have to add asserts into array_dealloc to flush out places that need adjustment.

@eric-wieser how worried are you actually about the UB? If not very, I am tempted to suggest accepting that this path might lead to ~~this~~ UB.

@mattip if you really care about the prod, you cold order it to the end, since it will be hit very rarely then?
But honestly, this is still a Python "call", so I really doubt it matters at all.

seberg · 2022-05-17T10:03:20Z

OK, I am going to put this in a bit optimistically, since I think Matti was hoping to get this into the next release. If we got holes in this logic, it seems OK to fix it in bug-fix releases as well.

But if Eric (or anyone else) is a bit worried about the changes, we can revert again as well. However, right now, I also think that downstream testing might be good here, and merging will get us at least a bit of testing.

mattip · 2022-05-17T11:11:55Z

🚀

Since numpy/numpy#21477 the strides of a 0-sized array are 0 instead of the itemsize. This causes `array_core()` to erroneously attempt to index the array with numeric indices instead of empty slices. Since this is a special case (empty arrays), a simple solution seems to be for `array_core()` to return the array itself when the array is 0-sized. In order to maintain consistency of the `array_core()` function between the simulator and hardware targets, we make this change to both implementations.

In NumPy 1.23, the strides of empty arrays are 0 instead of the item size, due to numpy/numpy#21477 - however, `np.empty_like` seems to create non-zero-strided arrays from a zero-strided empty array, and copying to the host from a device array with zero strides fails a compatibility check in Numba. This commit works around the issue by calling `copy_to_host()` with no arguments, allowing Numba to create an array on the host that is compatible with the device array - the resulting implementation is functionally equivalent and slightly simpler, so I believe this change could remain permanant rather than requiring a revert later.

…7089) In NumPy 1.23, the strides of empty arrays are 0 instead of the item size, due to numpy/numpy#21477 - however, `np.empty_like` seems to create non-zero-strided arrays from a zero-strided empty array, and copying to the host from a device array with zero strides fails a compatibility check in Numba. This commit works around the issue by calling `copy_to_host()` with no arguments, allowing Numba to create an array on the host that is compatible with the device array - the resulting implementation is functionally equivalent and slightly simpler, so I believe this change could remain permanant rather than requiring a revert later.

…ask#7089) In NumPy 1.23, the strides of empty arrays are 0 instead of the item size, due to numpy/numpy#21477 - however, `np.empty_like` seems to create non-zero-strided arrays from a zero-strided empty array, and copying to the host from a device array with zero strides fails a compatibility check in Numba. This commit works around the issue by calling `copy_to_host()` with no arguments, allowing Numba to create an array on the host that is compatible with the device array - the resulting implementation is functionally equivalent and slightly simpler, so I believe this change could remain permanant rather than requiring a revert later.

eric-wieser reviewed May 9, 2022

View reviewed changes

eric-wieser reviewed May 10, 2022

View reviewed changes

numpy/core/src/multiarray/getset.c Outdated Show resolved Hide resolved

eric-wieser reviewed May 10, 2022

View reviewed changes

numpy/core/src/multiarray/ctors.c Outdated Show resolved Hide resolved

mattip mentioned this pull request May 10, 2022

ENH: Allow object and subarray dtypes in fromiter #20993

Merged

eric-wieser and others added 4 commits May 10, 2022 12:57

MAINT: Shrink the size of memory allocations to 1 bytes for arrays wi…

5003a7e

…th `size == 0`. In order to avoid undefined behavior, we need to ensure the generated strides are all zero.

if any dims is 0, allocate array of length 1

4cbc171

fix from review: make sure all strides are 0 when nbytes is 0

85cdecd

fixes from review, fixes to assumptions about strides

0f8bb86

mattip force-pushed the reduce-empty-allocation branch from 90ab6b9 to 0f8bb86 Compare May 10, 2022 10:21

eric-wieser reviewed May 10, 2022

View reviewed changes

numpy/core/src/multiarray/ctors.c Outdated Show resolved Hide resolved

mattip force-pushed the reduce-empty-allocation branch from a69de90 to 0f8bb86 Compare May 11, 2022 05:36

from review, unconditionally set the stride

e598b11

mattip force-pushed the reduce-empty-allocation branch from 92bbe08 to e598b11 Compare May 11, 2022 08:54

seberg added 2 commits May 11, 2022 13:37

BUG: Fix FromIter stride resetting logic

9e09a9a

We have to reset all strides in the case of a subarray dtype which may have caused the result array to not be 1-D. Since the logic is a bit complicated now, restore the logic to only do this when it really is necessary.

MAINT: Remove now unnecessary PyArray_NBYTES_ALLOCATED helper

5a1fc81

ENH: short-cut trivial ufunc loops when array.size == 0

ede6df5

mattip mentioned this pull request May 12, 2022

ENH: avoid looping when dimensions[0] == 0 or array.size == 0 #21499

Merged

BUG: ensure view of non-empty array with strides == (0,) raises

28cacc8

seberg reviewed May 13, 2022

View reviewed changes

seberg changed the title ~~Reduce empty allocation~~ MAINT: Reduce allocation size of empty (0 size) arrays to 1 byte May 17, 2022

github-actions bot added the 03 - Maintenance label May 17, 2022

seberg merged commit dd5ab7b into numpy:main May 17, 2022

seberg mentioned this pull request Jun 13, 2022

Testing NumPy 1.23 RC 3 on public CI. numba/numba#8153

Closed

This was referenced Jun 28, 2022

Fix for NumPy 1.23 cupy/cupy#6807

Merged

Reshaped 0-size arrays have non-zero strides #21918

Open

brandonwillard mentioned this pull request Sep 28, 2022

Broken GEMM C implementation when using NumPy >= 1.23 aesara-devs/aesara#1220

Closed

quasiben mentioned this pull request Sep 29, 2022

Catch BaseException on UCX read error dask/distributed#6996

Merged

2 tasks

gmarkall mentioned this pull request Sep 29, 2022

test_serialize_numba: Workaround issue with np.empty_like in NumPy 1.23 dask/distributed#7089

Merged

2 tasks

mattip deleted the reduce-empty-allocation branch December 27, 2022 16:52

seberg mentioned this pull request Jan 27, 2023

BUG: PyArray_SimpleNew with partial zero-dim array segfaults in numpy >= 1.23 #23115

Closed

Uh oh!

Conversation

mattip commented May 9, 2022

Uh oh!

eric-wieser May 9, 2022

Choose a reason for hiding this comment

Uh oh!

mattip May 9, 2022

Choose a reason for hiding this comment

Uh oh!

takagi Jun 28, 2022

Choose a reason for hiding this comment

Uh oh!

mattip Jun 30, 2022

Choose a reason for hiding this comment

Uh oh!

takagi Jul 4, 2022

Choose a reason for hiding this comment

Uh oh!

seberg commented May 9, 2022

Uh oh!

mattip commented May 9, 2022

Uh oh!

seberg commented May 9, 2022

Uh oh!

mattip commented May 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eric-wieser commented May 10, 2022

Uh oh!

Uh oh!

Uh oh!

seberg commented May 10, 2022

Uh oh!

Uh oh!

seberg commented May 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mattip commented May 11, 2022

Uh oh!

seberg commented May 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

seberg commented May 11, 2022

Uh oh!

mattip commented May 11, 2022

Uh oh!

mattip commented May 12, 2022

Uh oh!

seberg commented May 12, 2022

Uh oh!

mattip commented May 12, 2022

Uh oh!

seberg commented May 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mattip commented May 13, 2022

Uh oh!

seberg commented May 13, 2022

Uh oh!

mattip commented May 13, 2022

Uh oh!

seberg May 13, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seberg May 13, 2022

Choose a reason for hiding this comment

Uh oh!

seberg commented May 17, 2022

Uh oh!

mattip commented May 17, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

mattip commented May 10, 2022 •

edited

Loading

seberg commented May 11, 2022 •

edited

Loading

seberg commented May 11, 2022 •

edited

Loading

seberg commented May 12, 2022 •

edited

Loading

seberg May 13, 2022 •

edited

Loading