Open
Labels
- lang: python (Issues specific to Python.)
- priority: p3 (Desirable enhancement or fix. May not be included in next release.)
- type: docs (Improvement to the documentation for an API.)
Description
Background
I was helping a customer in https://stackoverflow.com/q/64880779/101923 who was attempting to use the multiprocessing module with the BigQuery Storage API. Now that I know to look for it, I see this note on the client library documentation landing page:
Because this client uses grpcio library, it is safe to share instances across threads. In multiprocessing scenarios, the best practice is to create client instances after the invocation of os.fork() by multiprocessing.Pool or multiprocessing.Process.
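The ordering that the note asks for might look like the sketch below: the pool forks first, and each worker then builds its own client in a pool initializer. FakeClient, init_worker, read_stream, and run are hypothetical names used only for illustration; FakeClient is a stand-in and not the real BigQuery Storage client, so this shows the ordering only and has not been checked against grpcio. The "fork" start method is chosen to match the os.fork() case in the note and is only available on POSIX systems.

```python
import multiprocessing
import os


class FakeClient:
    """Hypothetical stand-in for a gRPC-backed client (not a real API)."""

    def __init__(self):
        # Remember which process constructed this instance.
        self.created_in_pid = os.getpid()


# One client per worker process, set by init_worker() after the fork.
_client = None


def init_worker():
    """Pool initializer: runs once inside each worker, after os.fork()."""
    global _client
    _client = FakeClient()


def read_stream(stream_name):
    """Report whether the client in use was created in this same process."""
    return stream_name, _client.created_in_pid == os.getpid()


def run(stream_names, workers=2):
    # The parent never constructs a client; only the forked workers do.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=workers, initializer=init_worker) as pool:
        return pool.map(read_stream, stream_names)


if __name__ == "__main__":
    print(run(["stream-0", "stream-1", "stream-2"]))
```

Each result pairs a stream name with a flag that is true when the worker used a client it created itself rather than one inherited from the parent.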
Problems encountered
- The note claims that multiprocessing can be done. This is (probably?) accurate, but difficult to follow without an example. Customers have to create all worker pools / processes before creating any client objects. This isn't always feasible. For example, with the BigQuery Storage API, the ideal number of workers depends on the number of streams returned by the session create request.
- I wish there were a link to more information about why grpcio has trouble with multiprocessing.
- This note is awkwardly placed between reference documentation links. It's been in the templates for a few months now and I just recently discovered it.
- This note is not linkable -- would be better on its own page or at least having an anchor tag that can be linked to.
Proposed solution
- Add an anchor tag to the note / make a separate page that can be linked to.
Some options:
- We tell people not to use multiprocessing at all and link to known issues.
- We create some code examples that do use multiprocessing successfully. Make sure these are run in system tests to catch regressions in grpcio.
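One shape such an example could take, for the case where the worker count depends on the number of streams, is sketched below. The idea (an assumption, not something verified against grpcio) is to run the session request in a short-lived child process so the parent never holds a client, then size the real pool from the returned stream names. FakeReadClient and its create_session method are hypothetical stand-ins and do not mirror the real BigQuery Storage API; list_streams, init_worker, read_stream, and read_all are likewise illustrative names. The "fork" start method is POSIX-only.

```python
import multiprocessing
import os


class FakeReadClient:
    """Hypothetical stand-in for a gRPC-backed read client (not a real API)."""

    def __init__(self):
        self.created_in_pid = os.getpid()

    def create_session(self, stream_count):
        # Pretend the service returned this many stream names.
        return [f"stream-{i}" for i in range(stream_count)]


def list_streams(stream_count):
    """Runs in a child process, so the parent never owns a client."""
    client = FakeReadClient()
    return client.create_session(stream_count)


# One client per worker process, set by init_worker() after the fork.
_client = None


def init_worker():
    global _client
    _client = FakeReadClient()


def read_stream(stream_name):
    """Report whether this worker used a client it created itself."""
    return stream_name, _client.created_in_pid == os.getpid()


def read_all(stream_count):
    ctx = multiprocessing.get_context("fork")
    # Step 1: ask for the streams from a throwaway single-worker pool.
    with ctx.Pool(processes=1) as pool:
        streams = pool.apply(list_streams, (stream_count,))
    if not streams:
        return []  # Pool(processes=0) would raise ValueError.
    # Step 2: size the real pool from the number of streams.
    with ctx.Pool(processes=len(streams), initializer=init_worker) as pool:
        return pool.map(read_stream, streams)


if __name__ == "__main__":
    print(read_all(3))
```

A system test built on this shape would swap the stand-in for the real client; whether the real client behaves the same way after a fork is exactly what such a test would need to confirm.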
References
- GRPC note about multiprocessing https://github.com/grpc/grpc/blob/master/doc/fork_support.md (out of date?)
- Enable fork support for Python by default -- grpc/grpc#19158
- Hanging bug with Python fork -- Client-side Python fork support can hang on Python 3.7 grpc/grpc#18075