BUG: fix thread safety of array_getbuffer#30667
Conversation
This comment was marked as spam.
This comment was marked as spam.
ngoldbaum
left a comment
There was a problem hiding this comment.
Some comments and questions inline
| /* Fill in information (and add it to _buffer_info if necessary) */ | ||
| Py_BEGIN_CRITICAL_SECTION(self); | ||
| info = _buffer_get_info( | ||
| &((PyArrayObject_fields *)self)->_buffer_info, obj, flags); |
There was a problem hiding this comment.
_buffer_get_info calls _buffer_info_new unconditionally, which in turn allocates space for the buffer linked list using PyObject_Malloc. Doesn't that release critical sections?
It also looks like there are a bunch of code paths for void data types that call into the C API in a manner that might release critical sections. I'm referring to the HASFIELDS and HASSUBARRAY code paths. Is that a problem?
Also, It's a preexisting issue in this code, but isn't it incorrect to use PyObject_Malloc there? I think it should be using RawMalloc instead, since it's not a PyObject subtype.
There was a problem hiding this comment.
Yeah, I think I did that for a while. I think it should be PyMem_Malloc, since RawMalloc won't use the python allocator at all (which we want though, since it is faster).
There was a problem hiding this comment.
_buffer_get_info calls _buffer_info_new unconditionally, which in turn allocates space for the buffer linked list using PyObject_Malloc. Doesn't that release critical sections?
memory allocations APIs never release the thread state so the critical sections are never released.
It also looks like there are a bunch of code paths for void data types that call into the C API in a manner that might release critical sections. I'm referring to the HASFIELDS and HASSUBARRAY code paths. Is that a problem?
The code is reading values from tuples and converting ints to C ints and vice-versa. All that is safe because any allocated objects is local and not shared across the thread which would cause critical section to get released.
There was a problem hiding this comment.
All that is safe because any allocated objects is local and not shared across the thread which would cause critical section to get released.
It has puzzled me a bit before when they can get released, but happy if this should be safe.
(It might be cool to have a clearer description in the C-API docs, but that is a different matter. For myself, I decided right now that I'll assume pretty much anything may release it.)
Things might get a bit more complicated if we have an array that is both C- and F-contiguous and is exported both ways. However, I don't think it changes anything w.r.t. to this.
b2d5cd1 to
1d0f432
Compare
seberg
left a comment
There was a problem hiding this comment.
Thanks @kumaraditya303, looks right to me and the test look good too. Let's get it in!
| /* Fill in information (and add it to _buffer_info if necessary) */ | ||
| Py_BEGIN_CRITICAL_SECTION(self); | ||
| info = _buffer_get_info( | ||
| &((PyArrayObject_fields *)self)->_buffer_info, obj, flags); |
There was a problem hiding this comment.
All that is safe because any allocated objects is local and not shared across the thread which would cause critical section to get released.
It has puzzled me a bit before when they can get released, but happy if this should be safe.
(It might be cool to have a clearer description in the C-API docs, but that is a different matter. For myself, I decided right now that I'll assume pretty much anything may release it.)
Things might get a bit more complicated if we have an array that is both C- and F-contiguous and is exported both ways. However, I don't think it changes anything w.r.t. to this.
Add critical sections around buffer info creation/mutation.
BUG: fix thread safety of `array_getbuffer` (#30667)
Fixes #30648