Add a convenience for making local streams in python #3355
Merged
andresy
approved these changes
Apr 2, 2026
This was referenced Apr 13, 2026
This PR started as a small change to make it easy to create thread-local streams in Python. However, I think our teardown logic was quite wrong, and the problem only surfaced when running the server, which does its computation in a dedicated background thread.
Basically, thread-local objects are torn down when their thread exits, which is likely to be before the interpreter exits. In that case the destructors of the C++ objects run and try to clean up Python objects without holding the GIL. The fix is the standard one: acquire the GIL in the destructor and clean up while holding it.
For the main thread, though, the interpreter will probably be torn down before the thread-local objects. When those destructors run there is no Python and no GIL, so they segfault. To handle this we register exit handlers that do the cleanup, and the C++ objects check whether there is still cleanup to do and otherwise do nothing.
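The exit-handler idea can be sketched in pure Python. This is only an illustration of the pattern, not the code in this PR: `LocalResource`, `release`, and `release_count` are hypothetical names, and the real objects are C++ types rather than Python classes. The point is that cleanup is idempotent, so whichever of the exit handler or the destructor runs second finds nothing left to do.

```python
import atexit


class LocalResource:
    """Hypothetical stand-in for an object that owns interpreter-side state."""

    def __init__(self, name):
        self.name = name
        self.released = False
        self.release_count = 0  # only here to make the behavior observable
        # Run cleanup from an exit handler, while the interpreter is still
        # alive. Note that registering a bound method keeps the object
        # alive until the handlers have run.
        atexit.register(self.release)

    def release(self):
        # Idempotent: do the real cleanup at most once.
        if self.released:
            return
        self.released = True
        self.release_count += 1

    def __del__(self):
        # The destructor checks whether cleanup is still pending and
        # otherwise does nothing.
        self.release()


resource = LocalResource("stream")
resource.release()  # manual cleanup
resource.release()  # second call is a no-op
```

Calling `release()` manually, from the exit handler, and from the destructor is safe in any order because only the first call does any work.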
The above means that correct usage of MLX requires either manual cleanup of some kind, or that the main thread join all other threads before it exits.