[Backport][v1.75.x][Python] aio: skip grpc/aio shutdown if py interpreter is finalizing #40649
Merged
sergiitk merged 1 commit into grpc:v1.75.x on Sep 17, 2025
Note: do not merge until 1.75.0 is out. This is intended for 1.75.1.
…rpc#40447)

This PR changes the logic of `shutdown_grpc_aio` to skip `_actual_aio_shutdown` if the Python interpreter is already [being finalized](https://docs.python.org/3.14/glossary.html#term-interpreter-shutdown) (cleaning up resources, destroying objects, preparing for program exit, etc.). `_actual_aio_shutdown` involves `PollerCompletionQueue` shutdown, followed by the core [`grpc_shutdown`](https://grpc.github.io/grpc/core/grpc_8h.html#a35f55253e80714c17f4f3a0657e06f1b) API call.

Reasoning:

1. During finalization, in some cases the resources we're accessing may already be freed, and the order is not deterministic. Some of the resources that may be unloaded prior to the `_actual_aio_shutdown` call: `_global_aio_state`, the `AsyncIOEngine` enum, or even Python libraries like `sys`. This leads to errors like `AttributeError: 'NoneType' object has no attribute 'POLLER'`.
2. `PollerCompletionQueue.shutdown()` will try to wait on its poller thread to finish gracefully. In py3.14, `PythonFinalizationError` is raised when `Thread.join()` is called during finalization. I think the logic here is similar to (1): these threads may have already been deallocated.

Note that in some cases users were able to prevent `_actual_aio_shutdown` from being called by manually calling `init_grpc_aio` prior to initializing any grpc objects. This resulted in an incorrect positive refcount, which prevents `_actual_aio_shutdown` from being run. Before the finalization check above was added, this side effect was sometimes misused to avoid a deadlock on finalization (grpc#22365).

This PR:

- Fixes grpc#39520
- Fixes grpc#22365
- Fixes grpc#38679
- Fixes grpc#33342
- Fixes grpc#36655

Closes grpc#40447

COPYBARA_INTEGRATE_REVIEW=grpc#40447 from sergiitk:fix/aio/shutdown 11114f6
PiperOrigin-RevId: 804971756
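For readers unfamiliar with the pattern, here is a minimal pure-Python sketch of a refcounted shutdown that is skipped while the interpreter is finalizing. It is not the gRPC source: `AioState`, `init_aio`, `shutdown_aio`, and `_actual_shutdown` are hypothetical stand-ins for `_global_aio_state`, `init_grpc_aio`, `shutdown_grpc_aio`, and `_actual_aio_shutdown`, and the use of `sys.is_finalizing()` is an assumption about how such a check could be written, not a claim about how the PR implements it.

```python
import sys
import threading


class AioState:
    """Hypothetical stand-in for the module-level aio state."""

    def __init__(self):
        self.lock = threading.RLock()
        self.refcount = 0
        self.shutdown_calls = 0  # counts how often the real teardown ran


def _actual_shutdown(state):
    # Placeholder for the expensive teardown (poller shutdown, core shutdown).
    state.shutdown_calls += 1


def init_aio(state):
    # Each init call takes one reference on the shared state.
    with state.lock:
        state.refcount += 1


def shutdown_aio(state, is_finalizing=sys.is_finalizing):
    """Drop one reference; run the teardown only when it is safe to do so.

    Returns True if the teardown ran, False if it was skipped.
    """
    with state.lock:
        if state.refcount <= 0:
            raise RuntimeError("shutdown_aio called more often than init_aio")
        state.refcount -= 1
        if state.refcount > 0:
            return False  # other users still hold a reference
        if is_finalizing():
            return False  # interpreter is exiting; resources may be gone
        _actual_shutdown(state)
        return True
```

In this sketch, an extra unmatched `init_aio` call keeps the refcount positive forever, so the teardown never runs. That mirrors the workaround described above, and the finalization check is what makes it unnecessary at interpreter exit.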
eaf018a to 5326c8c
sreenithi approved these changes on Sep 17, 2025
Backport of #40447 to v1.75.x.
This PR changes the logic of `shutdown_grpc_aio` to skip `_actual_aio_shutdown` if the Python interpreter is already being finalized (cleaning up resources, destroying objects, preparing for program exit, etc.). `_actual_aio_shutdown` involves `PollerCompletionQueue` shutdown, followed by the core `grpc_shutdown` API call.

Reasoning:

1. During finalization, in some cases the resources we're accessing may already be freed, and the order is not deterministic. Some of the resources that may be unloaded prior to the `_actual_aio_shutdown` call: `_global_aio_state`, the `AsyncIOEngine` enum, or even Python libraries like `sys`. This leads to errors like `AttributeError: 'NoneType' object has no attribute 'POLLER'`.
2. `PollerCompletionQueue.shutdown()` will try to wait on its poller thread to finish gracefully. In py3.14, `PythonFinalizationError` is raised when `Thread.join()` is called during finalization. I think the logic here is similar to (1): these threads may have already been deallocated.

Note that in some cases users were able to prevent `_actual_aio_shutdown` from being called by manually calling `init_grpc_aio` prior to initializing any grpc objects. This resulted in an incorrect positive refcount, which prevents `_actual_aio_shutdown` from being run. Before the finalization check above was added, this side effect was sometimes misused to avoid a deadlock on finalization (#22365).

This PR:

- Fixes #39520 (`PythonFinalizationError: cannot join thread at interpreter shutdown` when using Python 3.14 with workaround)
- Fixes #22365
- Fixes #38679
- Fixes #33342
- Fixes #36655
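To illustrate the `Thread.join()` concern from the reasoning above, the following is a rough sketch of a poller-like object that signals its thread to stop but only joins it when the interpreter is not finalizing. `PollerSketch` is a hypothetical class for illustration, not gRPC's `PollerCompletionQueue`, and checking `sys.is_finalizing()` before the join is an assumed approach.

```python
import sys
import threading


class PollerSketch:
    """Hypothetical poller with a background thread; not gRPC's class."""

    def __init__(self):
        self._stop = threading.Event()
        # The thread simply blocks until it is told to stop.
        self._thread = threading.Thread(target=self._stop.wait, daemon=True)
        self._thread.start()

    def is_alive(self):
        return self._thread.is_alive()

    def shutdown(self, is_finalizing=sys.is_finalizing):
        """Ask the thread to stop; join it only outside finalization.

        Returns True if the thread was joined, False if the join was skipped.
        """
        self._stop.set()
        if is_finalizing():
            # Joining here may raise PythonFinalizationError on py3.14.
            return False
        self._thread.join()
        return True
```

The `is_finalizing` parameter is injected only so that the skip path can be exercised in a test without actually shutting down the interpreter.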