Closed
Labels
57 - Close? (Issues which may be closable unless discussion continued)
Description
Functions like np.zeros, np.empty, np.ones and np.full can be used to allocate large arrays easily. However, they always allocate memory, even when the array contains no data because one of the shape elements is 0. In particular, they seem to allocate as if the zeros in the shape were ignored, i.e. np.product([x for x in shape if x != 0]) elements.
If any of the shape elements is zero, no memory access can ever happen, so the allocation seems unnecessary.
If the shape has a zero, one can get an equivalent array using broadcast_to, which won't allocate.
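As a quick sanity check of the size estimate above, the product of the non-zero shape elements can be computed directly and compared with the numbers reported by the reproduction code. This is only a sketch: the helper name `nonzero_product` is made up for illustration, and the shape and dtype are the ones used in the reproduction.

```python
import numpy as np

shape = (0, 100, 0, 200, 0, 300, 0)
dtype = np.dtype(np.uint8)


def nonzero_product(shape):
    """Product of the non-zero shape elements (hypothetical helper)."""
    return int(np.prod([x for x in shape if x != 0], dtype=np.int64))


# Bytes that would be needed if the zeros in the shape were ignored
expected_bytes = nonzero_product(shape) * dtype.itemsize
print(expected_bytes)  # 6000000

# The array itself holds no elements at all
print(np.empty(shape, dtype=dtype).size)  # 0
```

The 6000000 bytes computed here line up with the roughly 6000080 bytes that tracemalloc reports for np.empty, np.ones and np.full; the small remainder is presumably object overhead.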
Reproducing code example:
import numpy as np
import tracemalloc as tm

def test(f):
    # Trace the allocations made while f builds an empty array
    tm.start()
    arr = f((0, 100, 0, 200, 0, 300, 0), dtype=np.uint8)
    mem, _ = tm.get_traced_memory()  # returns (current, peak); keep current
    tm.stop()
    print(f"{f.__name__}: allocated memory: {mem} bytes, array size: {arr.size}")

def full(shape, dtype):
    return np.full(shape, 123, dtype=dtype)

test(np.zeros)
test(np.empty)
test(np.ones)
test(full)

# workaround, use broadcasting
def broadcast(shape, dtype):
    return np.broadcast_to(dtype(), shape)

test(broadcast)

zeros: allocated memory: 6000248 bytes, array size: 0
empty: allocated memory: 6000080 bytes, array size: 0
ones: allocated memory: 6000080 bytes, array size: 0
full: allocated memory: 6000080 bytes, array size: 0
broadcast: allocated memory: 3680 bytes, array size: 0
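One thing to keep in mind with the workaround: np.broadcast_to returns a read-only view, so the result matches in shape, dtype and size but is not a drop-in replacement in every respect. A minimal sketch comparing the two, with variable names chosen only for illustration:

```python
import numpy as np

shape = (0, 100, 0, 200, 0, 300, 0)

allocated = np.zeros(shape, dtype=np.uint8)
view = np.broadcast_to(np.uint8(), shape)  # broadcast a scalar 0, as in the workaround

# Same shape, dtype and (empty) size
print(view.shape == allocated.shape)  # True
print(view.dtype == allocated.dtype)  # True
print(view.size, view.nbytes)         # 0 0

# But the broadcast result is a read-only view
print(view.flags.writeable)       # False
print(allocated.flags.writeable)  # True
```

Since the array has no elements, the read-only flag can never cause a failed write to an element, but code that inspects or sets `flags.writeable` may still notice the difference.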
Numpy/Python version information:
numpy.__version__: 1.18.4
sys.version: 3.6.9 (default, Jul 10 2019, 12:25:55) [GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.46.4)]