Closed
Description
A segmentation fault occurs when a Pandas DataFrame with columns of incompatible types is converted to a NumPy array and a slice of that array is summed.
I bisected the regression to commit b946f09 (#18469, #18450) - cc @ahaldane @seberg
Reproducing code example:
Requirements:
numpy==1.20.2
pandas==1.2.4
Code:
import datetime
import numpy as np
import pandas
df = pandas.DataFrame(data={
    'x': [datetime.datetime.now() for _ in range(3)],
    'y': [0 for _ in range(3)],
})
np.sum(df.to_numpy()[:2])
Error message:
GDB traceback:
$ gdb --args python test.py
GNU gdb (GDB) 10.1
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...
(No debugging symbols found in python)
(gdb) run
Starting program: /home/mike/.virtualenvs/numpy/bin/python test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff4236640 (LWP 10386)]
[New Thread 0x7ffff3a35640 (LWP 10387)]
[New Thread 0x7fffef234640 (LWP 10388)]
[New Thread 0x7fffeea33640 (LWP 10389)]
[New Thread 0x7fffea232640 (LWP 10390)]
[New Thread 0x7fffe7a31640 (LWP 10391)]
[New Thread 0x7fffe5230640 (LWP 10392)]
[New Thread 0x7fffe2a2f640 (LWP 10393)]
[New Thread 0x7fffe022e640 (LWP 10394)]
[New Thread 0x7fffdda2d640 (LWP 10395)]
[New Thread 0x7fffdb22c640 (LWP 10396)]
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffff6b925a8 in PyArray_Item_XDECREF () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
(gdb) bt
#0 0x00007ffff6b925a8 in PyArray_Item_XDECREF () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
#1 0x00007ffff6b82c85 in npyiter_clear_buffers () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
#2 0x00007ffff6b8529e in NpyIter_Deallocate () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
#3 0x00007ffff6c0e76f in PyUFunc_ReduceWrapper () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
#4 0x00007ffff6d1159b in PyUFunc_GenericReduction () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
#5 0x00007ffff6d13a90 in ufunc_reduce () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
#6 0x00007ffff7d1fed3 in ?? () from /usr/lib/libpython3.9.so.1.0
#7 0x00007ffff7d1ef0b in PyObject_Call () from /usr/lib/libpython3.9.so.1.0
#8 0x00007ffff7d02cc7 in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#9 0x00007ffff7cfbfdd in ?? () from /usr/lib/libpython3.9.so.1.0
#10 0x00007ffff7d0e01e in _PyFunction_Vectorcall () from /usr/lib/libpython3.9.so.1.0
#11 0x00007ffff7cfe29f in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#12 0x00007ffff7cfbfdd in ?? () from /usr/lib/libpython3.9.so.1.0
#13 0x00007ffff7d0e01e in _PyFunction_Vectorcall () from /usr/lib/libpython3.9.so.1.0
#14 0x00007ffff6ab1af4 in array_implement_array_function () from /home/mike/.virtualenvs/numpy/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
#15 0x00007ffff7d1ff22 in ?? () from /usr/lib/libpython3.9.so.1.0
#16 0x00007ffff7d063aa in _PyObject_MakeTpCall () from /usr/lib/libpython3.9.so.1.0
#17 0x00007ffff7d01c65 in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#18 0x00007ffff7cfbfdd in ?? () from /usr/lib/libpython3.9.so.1.0
#19 0x00007ffff7d0e01e in _PyFunction_Vectorcall () from /usr/lib/libpython3.9.so.1.0
#20 0x00007ffff7d01acf in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#21 0x00007ffff7cfbfdd in ?? () from /usr/lib/libpython3.9.so.1.0
#22 0x00007ffff7cfb9a1 in _PyEval_EvalCodeWithName () from /usr/lib/libpython3.9.so.1.0
#23 0x00007ffff7dc0913 in PyEval_EvalCode () from /usr/lib/libpython3.9.so.1.0
#24 0x00007ffff7dd0b0d in ?? () from /usr/lib/libpython3.9.so.1.0
#25 0x00007ffff7dcc3fb in ?? () from /usr/lib/libpython3.9.so.1.0
#26 0x00007ffff7c6b695 in ?? () from /usr/lib/libpython3.9.so.1.0
#27 0x00007ffff7c6aa49 in PyRun_SimpleFileExFlags () from /usr/lib/libpython3.9.so.1.0
#28 0x00007ffff7de253a in Py_RunMain () from /usr/lib/libpython3.9.so.1.0
#29 0x00007ffff7db3939 in Py_BytesMain () from /usr/lib/libpython3.9.so.1.0
#30 0x00007ffff7a2db25 in __libc_start_main () from /usr/lib/libc.so.6
#31 0x000055555555504e in _start ()
Note that the sum is expected to fail, and it does fail as expected when summing a single row (np.sum(df.to_numpy()[:1])) and when summing all the rows, either with slicing (np.sum(df.to_numpy()[:3])) or without it (np.sum(df.to_numpy())).
The segmentation fault occurs only when the slice selects a subset of the array rows that contains more than one row.
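To compare the three slices without taking down the interpreter that runs the check, each case can be executed in a child process. The following is a minimal sketch, not part of the original report: the helper names `run_isolated` and `describe` and the `STOP` placeholder are made up for illustration, and catching `TypeError` in the child assumes that is the exception raised by the expected failure. On POSIX systems a child killed by a signal shows up as a negative return code.

```python
import signal
import subprocess
import sys

# Reproducer from the report, with the slice bound left as a placeholder.
# The try/except assumes the "expected" failure is a TypeError.
REPRO_TEMPLATE = """
import datetime
import numpy as np
import pandas
df = pandas.DataFrame(data={
    'x': [datetime.datetime.now() for _ in range(3)],
    'y': [0 for _ in range(3)],
})
try:
    np.sum(df.to_numpy()[:STOP])
except TypeError:
    pass
"""


def run_isolated(code):
    """Run `code` in a fresh interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return result.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # POSIX: a negative status means the child was killed by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "python error (exit %d)" % returncode


if __name__ == "__main__":
    # Slice bounds discussed above: 1 row, 2 rows, all 3 rows.
    for stop in (1, 2, 3):
        code = REPRO_TEMPLATE.replace("STOP", str(stop))
        print(stop, describe(run_isolated(code)))
```

On an affected build, the two-row case would be expected to report a signal rather than a clean exit, while the other two cases exit normally.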
NumPy/Python version information:
- NumPy: 1.20.2
- Python: 3.9.3
- Within virtualenv (no Docker)
>>> import sys, numpy; print(numpy.__version__, sys.version)
1.20.2 3.9.3 (default, Apr 8 2021, 23:35:02)
[GCC 10.2.0]