[Python] stronger / more discoverable discouragement of using multiprocessing module is needed #902

@tswast

Background

I was helping a customer in https://stackoverflow.com/q/64880779/101923 who was attempting to use the multiprocessing module with the BigQuery Storage API. Now that I know to look for it, I see the following note on the client library documentation landing page:

Because this client uses grpcio library, it is safe to share instances across threads. In multiprocessing scenarios, the best practice is to create client instances after the invocation of os.fork() by multiprocessing.Pool or multiprocessing.Process.

Problems encountered

  • The note claims that multiprocessing can work. This is (probably?) accurate, but it is difficult to follow without an example. Customers have to create all worker pools / processes before creating any client objects, which isn't always feasible. For example, with the BigQuery Storage API, the ideal number of workers depends on the number of streams returned by the session create request.
  • I wish there were a link to more information about why grpcio has trouble with multiprocessing.
  • The note is awkwardly placed between reference documentation links. It's been in the templates for a few months now and I only recently discovered it.
  • The note is not linkable. It would be better on its own page, or at least with an anchor tag that can be linked to.

Proposed solution

  • Add an anchor tag to the note / make a separate page that can be linked to.

Some options:

  1. We tell people not to use multiprocessing at all and link to known issues.
  2. We create some code examples that do use multiprocessing successfully. Make sure these are run in system tests to catch regressions in grpcio.
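
If we go with option 2, a system test could look roughly like the sketch below. It is only an outline under assumptions: `FakeClient`, `_worker`, `run_workers`, and `test_clients_created_after_fork` are hypothetical names, and a real test would construct the actual client and make a real API call inside the worker. The timeout is there because a regression might show up as a hung worker rather than as an exception. Like the note, it assumes fork-based processes on POSIX.

```python
import multiprocessing
import os


class FakeClient:
    """Hypothetical stand-in; a real system test would build the actual client."""

    def __init__(self):
        self.created_in_pid = os.getpid()

    def ping(self):
        # Placeholder for a real API call.
        return "ok"


def _worker(results):
    # The client is created inside the child, after the fork.
    client = FakeClient()
    results.put((client.created_in_pid, os.getpid(), client.ping()))


def run_workers(count=3, timeout=30.0):
    ctx = multiprocessing.get_context("fork")
    results = ctx.Queue()
    procs = [ctx.Process(target=_worker, args=(results,)) for _ in range(count)]
    for proc in procs:
        proc.start()
    # Collect results before joining; get() raises queue.Empty if a
    # worker produces nothing within the timeout.
    collected = [results.get(timeout=timeout) for _ in procs]
    for proc in procs:
        proc.join(timeout)
        if proc.is_alive():
            proc.terminate()
            raise RuntimeError("worker did not exit within the timeout")
        if proc.exitcode != 0:
            raise RuntimeError(f"worker exited with code {proc.exitcode}")
    return collected


def test_clients_created_after_fork():
    parent_pid = os.getpid()
    for client_pid, worker_pid, reply in run_workers():
        # Each client was built in the worker that used it, not in the parent.
        assert client_pid == worker_pid != parent_pid
        assert reply == "ok"


if __name__ == "__main__":
    test_clients_created_after_fork()
    print("all workers created their own client")
```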

References

Labels

  • lang: python (Issues specific to Python.)
  • priority: p3 (Desirable enhancement or fix. May not be included in next release.)
  • type: docs (Improvement to the documentation for an API.)