Skip to content

Transferring Actor handle between workers makes Actor unreleasable #4936

@gjoseph92

Description

@gjoseph92

When worker A calls gather_dep on an Actor task, it gets sent an Actor handle by worker B where the Actor is running. When that handle is deserialized on worker A, it gets a Client and creates a Future reference holding onto that Actor's key. The scheduler now notes that worker A's Client desires that key.

When the actual user's Client tries to release the Actor, the scheduler notes that worker A's Client still holds a reference to it, so it is not released.

More complex case:

A user submits a task where one of the dependencies is marked as an Actor, like:

with dask.annotate(workers=workers[0]):
    counter = dask.delayed(Counter)()
with dask.annotate(workers=workers[1]):
    intermediate = dask.delayed(lambda c: None)(counter)
with dask.annotate(workers=workers[0]):
    final = dask.delayed(lambda x, c: x)(intermediate, counter)

final.compute(actors=counter, optimize_graph=False)

In this case, the user doesn't even hold a reference to the Actor. But when the final task completes and the scheduler runs _propagate_forgotten to release its dependencies (including Counter), it sees that some Client holds a reference to the Counter, so it doesn't release it—when in fact the client holding the reference is workers[1]'s Actor handle.

This is what's causing test failures in #4925, now that we're more likely to schedule tasks on workers that don't hold any dependencies.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions