Functional change for get_worker #7696

@quasiben

In #7580 we removed the ability for get_worker to return the worker when it is called from a non-task function. Concretely, the following is no longer supported:

from dask.distributed import Client, get_worker

def main():
    c = Client()
    def foo():
        worker = get_worker()
        print(worker)

    c.run(foo)

if __name__ == "__main__":
    main()

client.run runs a function on each worker, but no task is involved, so get_worker returns None. RAPIDS folks have been relying on the old behavior, for example:

https://github.com/rapidsai/raft/blob/05d899b36b76545d2439dbe47e4659d644ced227/python/raft-dask/raft_dask/common/comms.py#L410-L414

We are fixing this by relying on the dask_worker arg for client.run functions:

If your function takes an input argument named dask_worker then that variable will be populated with the worker itself.
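As a minimal sketch of that approach (not the actual Dask-CUDA or RAFT fix), the function below declares a `dask_worker` parameter and lets client.run fill it in. The names `report_address` and `collect_worker_addresses` are hypothetical, and the single-worker in-process cluster is an assumption made only to keep the example small.

```python
from dask.distributed import Client


def report_address(dask_worker):
    # client.run passes the Worker object here because the parameter
    # is named exactly `dask_worker`; no get_worker() call is needed.
    return dask_worker.address


def collect_worker_addresses():
    # Assumption: a one-worker, in-process cluster with the dashboard
    # disabled, used purely for illustration.
    with Client(
        processes=False,
        n_workers=1,
        threads_per_worker=1,
        dashboard_address=None,
    ) as client:
        # client.run returns a dict mapping worker address -> return value.
        return client.run(report_address)


if __name__ == "__main__":
    print(collect_worker_addresses())
```

Since client.run keys its result by worker address, each key here should match the address the function reports from inside the worker.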

We have a fix for this in Dask-CUDA and are working on one for RAFT. I think we should include a statement like the following in the get_worker documentation to warn legacy users of the change:

If you need access to the worker in a non-task function, such as one passed to client.run, add an argument named dask_worker to the function (xref: client.run)
