get_client returns different Clients in different worker threads #5466

@gjoseph92

Similar to #3827, get_client is not thread-safe: calling it within a task returns different Client instances in different worker threads.

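from distributed import Client, get_client
from distributed.utils_test import cluster


def test_get_client_threadsafe_sync():
    def f(x):
        # Runs on a worker thread; returns the id of whichever Client get_client() hands back
        return get_client().id

    # One worker process with 4 threads, so tasks run concurrently in the same worker
    with cluster(nworkers=1, worker_kwargs={"nthreads": 4}) as (scheduler, workers):
        with Client(scheduler["address"]) as client:
            futures = client.map(f, range(100))
            ids = client.gather(futures)
            # Every task should have seen the same Client
            assert len(set(ids)) == 1

# pytest output:
# >         assert len(set(ids)) == 1
# E         assert 4 == 1
# E           +4
# E           -1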
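
To make the symptom concrete without a cluster, here is a toy model using only the standard library. It does not reproduce distributed's internals and makes no claim about where the real bug lives. `FakeClient`, `get_client_per_thread`, and `get_client_shared` are hypothetical names invented for this sketch. It only shows how a getter that hands out one instance per thread produces 4 distinct objects on 4 threads, while a lock-guarded shared instance produces 1.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

N_THREADS = 4


class FakeClient:
    """Hypothetical stand-in for a client object; NOT distributed.Client."""


_local = threading.local()


def get_client_per_thread():
    # Each thread lazily creates and caches its own instance.
    if not hasattr(_local, "client"):
        _local.client = FakeClient()
    return _local.client


_lock = threading.Lock()
_shared = None


def get_client_shared():
    # One instance for the whole process; the lock guards the check-then-create step.
    global _shared
    with _lock:
        if _shared is None:
            _shared = FakeClient()
        return _shared


def collect_clients(getter):
    # The barrier forces all N_THREADS tasks to be running at the same time,
    # so each one is guaranteed to execute on a different pool thread.
    barrier = threading.Barrier(N_THREADS)

    def task(_):
        barrier.wait()
        return getter()

    with ThreadPoolExecutor(max_workers=N_THREADS) as pool:
        # FakeClient uses identity-based hashing, so the set counts distinct instances.
        return set(pool.map(task, range(N_THREADS)))


per_thread_clients = collect_clients(get_client_per_thread)
shared_clients = collect_clients(get_client_shared)
print(len(per_thread_clients), len(shared_clients))
```

The failing assertion above (`4 == 1` with `nthreads=4`) has the same shape as the per-thread case here: one distinct instance per worker thread.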

If you use a Client within a task, there's a good chance that the Futures it produces will not be usable within another task. For example, if you create some Futures in one task, return them as the task's result, and then try to reuse them in a subsequent task, you may find, intermittently, that you can't get_client().gather(...) those Futures.

This is a somewhat uncommon use-case, but if you are doing it, it's very difficult to debug.

FWIW doing

client = get_worker().client

seems to avoid the issue.
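
A minimal sketch of how the workaround could be checked, mirroring the test above but reading the Client from the worker instead of calling get_client. The helper names `worker_client_id` and `count_distinct_client_ids` are made up for this example, and `scheduler_address` is assumed to point at an already-running scheduler. The imports sit inside the functions only so the snippet can be loaded without starting anything.

```python
def worker_client_id(_):
    # Runs as a task on a worker thread.
    from distributed import get_worker

    # Workaround from this issue: use the worker's own Client attribute.
    return get_worker().client.id


def count_distinct_client_ids(scheduler_address, n_tasks=100):
    # Hypothetical helper: run many tasks and count how many different
    # Client ids they observed.
    from distributed import Client

    with Client(scheduler_address) as client:
        futures = client.map(worker_client_id, range(n_tasks))
        ids = client.gather(futures)
    return len(set(ids))
```

If the workaround holds, `count_distinct_client_ids(scheduler["address"])` should come back as 1 for a single multi-threaded worker. That has only been observed informally here, not verified.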

Environment:

  • Distributed version: a1b67b8
  • Python version: 3.9.5
  • Operating System: macOS
  • Install method (conda, pip, source): source
