Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-112529: Use GC heaps for GC allocations in free-threaded builds #114157

Merged
merged 2 commits into from Jan 20, 2024

Conversation

colesbury
Copy link
Contributor

@colesbury colesbury commented Jan 16, 2024

The free-threaded build's garbage collector implementation will need to find GC objects by traversing mimalloc heaps. This hooks up the allocation calls with the correct heaps by using a thread-local "current_obj_heap" variable. There is no effect on the default (with GIL) build.

The free-threaded build's garbage collector implementation will need to
find GC objects by traversing mimalloc heaps. This hooks up the
allocation calls with the correct heaps by using a thread-local
"current_obj_heap" variable.
@colesbury
Copy link
Contributor Author

@DinoV, this is the follow on from #113263. In particular, it hooks up the part you didn't like :-)

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

_PyObject_MallocWithType and _PyObject_ReallocWithType looks very redundent.
We can extract them as something like this, but I am not sure whether it is worth doing this or if it will cause any performance issues. If possible, extracting them as macros will be beneficial to understanding the code.

# psudo
switch_heap_per_type(tp);
malloc or realloc
switch_heap(_Py_MIMALLOC_HEAP_OBJECT);

Copy link
Contributor

@DinoV DinoV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not a big fan of the smuggling through thread state, but I don't have a better solution other than extending the existing PyMemAllocator API in some way or another.

I'd love it if we could use ctx here but that's not how ctx works and if someone intercepts the allocator we won't get our ctx. I wonder if it'd be worth adding a GC domain that would take an extra void* (which would avoid the problem with "what do we do if we need another heap in the future" problem). Pinging @vstinner who I believe wasn't around the last time this was discussed to see if he might have further opinions.

#endif
void *mem = PyObject_Malloc(size);
#ifdef Py_GIL_DISABLED
m->current_object_heap = &m->heaps[_Py_MIMALLOC_HEAP_OBJECT];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this shouldn't be strictly necessary? We should always be going through this API to allocate otherwise we might get the wrong heap. We could keep this NULL in debug builds and assert that it's not NULL when we hit the allocator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can call PyObject_Malloc() directly -- that doesn't go through this API. Only calls that might allocate GC objects need to go through this. So we have to restore current_object_heap to the non-GC object heap.

You could structure this a bit differently and have PyObject_Malloc() also go through this sort of thing, but I think that would end up being more complicated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if you call PyObject_Malloc directly aren't things now broken with the pre-header? It seems that Py_TPFLAGS_MANAGED_DICT and Py_TPFLAGS_MANAGED_WEAKREF are both documented as being available to extension objects which would be the primary callers of PyObject_Malloc. Maybe we at least need to document that if you're using these flags you need to call PyType_GenericAlloc in Doc/c-api/typeobj.rst.

I suppose before this you'd be probably broken if you called PyObject_Malloc, but it would have crashed and burned pretty obviously, and you could figure out to call _PyType_PreHeaderSize. Now it seems critical that you call PyType_GenericAlloc or _PyObject_MallocWithType or you'll get really subtle crashes in multi-threaded scenarios.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can call PyObject_Malloc() for non-GC objects, but not for GC-objects. GC objects have had "pre-headers" (PyGC_Head) before Py_TPFLAGS_MANAGED_DICT was introduced.

These restrictions have been in place since Python 2.2a3:
https://github.com/python/cpython/blob/72abb8c5d487ead9eb115fec8132ccef5ba189e5/Misc/HISTORY#L24391-L24398

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, true, I hadn't considered that the pre-header implies GC, that makes sense!

@corona10
Copy link
Member

@colesbury Please let me know if the PR is ready to be merged. cc @DinoV

@colesbury
Copy link
Contributor Author

Yes, it's ready to be merged

@corona10 corona10 merged commit 1d6d5e8 into python:main Jan 20, 2024
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants