Skip to content

BUG: fix thread safety of array_getbuffer#30667

Merged
seberg merged 7 commits intonumpy:mainfrom
kumaraditya303:get_buffer_fix
Jan 27, 2026
Merged

BUG: fix thread safety of array_getbuffer#30667
seberg merged 7 commits intonumpy:mainfrom
kumaraditya303:get_buffer_fix

Conversation

@kumaraditya303
Copy link
Copy Markdown
Contributor

Fixes #30648

@kumaraditya303 kumaraditya303 added the 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) label Jan 17, 2026
@Harshit-shaw

This comment was marked as spam.

Copy link
Copy Markdown
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments and questions inline

/* Fill in information (and add it to _buffer_info if necessary) */
Py_BEGIN_CRITICAL_SECTION(self);
info = _buffer_get_info(
&((PyArrayObject_fields *)self)->_buffer_info, obj, flags);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_buffer_get_info calls _buffer_info_new unconditionally, which in turn allocates space for the buffer linked list using PyObject_Malloc. Doesn't that release critical sections?

It also looks like there are a bunch of code paths for void data types that call into the C API in a manner that might release critical sections. I'm referring to the HASFIELDS and HASSUBARRAY code paths. Is that a problem?

Also, It's a preexisting issue in this code, but isn't it incorrect to use PyObject_Malloc there? I think it should be using RawMalloc instead, since it's not a PyObject subtype.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I did that for a while. I think it should be PyMem_Malloc, since RawMalloc won't use the python allocator at all (which we want though, since it is faster).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_buffer_get_info calls _buffer_info_new unconditionally, which in turn allocates space for the buffer linked list using PyObject_Malloc. Doesn't that release critical sections?

memory allocations APIs never release the thread state so the critical sections are never released.

It also looks like there are a bunch of code paths for void data types that call into the C API in a manner that might release critical sections. I'm referring to the HASFIELDS and HASSUBARRAY code paths. Is that a problem?

The code is reading values from tuples and converting ints to C ints and vice-versa. All that is safe because any allocated objects is local and not shared across the thread which would cause critical section to get released.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that is safe because any allocated objects is local and not shared across the thread which would cause critical section to get released.

It has puzzled me a bit before when they can get released, but happy if this should be safe.
(It might be cool to have a clearer description in the C-API docs, but that is a different matter. For myself, I decided right now that I'll assume pretty much anything may release it.)

Things might get a bit more complicated if we have an array that is both C- and F-contiguous and is exported both ways. However, I don't think it changes anything w.r.t. to this.

Copy link
Copy Markdown
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kumaraditya303, looks right to me and the test look good too. Let's get it in!

/* Fill in information (and add it to _buffer_info if necessary) */
Py_BEGIN_CRITICAL_SECTION(self);
info = _buffer_get_info(
&((PyArrayObject_fields *)self)->_buffer_info, obj, flags);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that is safe because any allocated objects is local and not shared across the thread which would cause critical section to get released.

It has puzzled me a bit before when they can get released, but happy if this should be safe.
(It might be cool to have a clearer description in the C-API docs, but that is a different matter. For myself, I decided right now that I'll assume pretty much anything may release it.)

Things might get a bit more complicated if we have an array that is both C- and F-contiguous and is exported both ways. However, I don't think it changes anything w.r.t. to this.

@seberg seberg added the 09 - Backport-Candidate PRs tagged should be backported label Jan 27, 2026
@seberg seberg merged commit 5a18ce3 into numpy:main Jan 27, 2026
75 checks passed
charris pushed a commit to charris/numpy that referenced this pull request Jan 27, 2026
Add critical sections around buffer info creation/mutation.
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Jan 27, 2026
charris added a commit that referenced this pull request Jan 27, 2026
BUG: fix thread safety of `array_getbuffer` (#30667)
@kumaraditya303 kumaraditya303 deleted the get_buffer_fix branch February 1, 2026 01:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

00 - Bug 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: data races in _buffer_get_info when multiple threads try to access buffer of ndarray

5 participants