Skip to content

Actor class that is exported from driver will not be exported from workers in next Ray session. #4843

@robertnishihara

Description

@robertnishihara

The following script will hang.

import ray

@ray.remote
class Actor(object):
    def method(self):
        pass

@ray.remote
def f(actor_class):
    handle = actor_class.remote()
    ray.get(handle.method.remote())

ray.init(num_cpus=2)

a = Actor.remote()
ray.get(a.method.remote())

ray.shutdown()

ray.init(num_cpus=2)

ray.get(f.remote(Actor))  # This line hangs.

ray.shutdown()

Right now, the Actor actor class has a field _last_export_session (which is a counter), which keeps track of when the actor class definition was last exported (to workers through the GCS). If it was exported in a given Ray session (that is, the span between a call to ray.init and ray.shutdown), then in a subsequent Ray session, the definition will still be exported again from the driver, but not from workers, so in situations where only the workers are in a position to export the definition (as in the example above), it won't get exported.

The trouble with the counter is that the counter on the workers is not set correctly. This can be fixed by using a driver ID instead of a counter.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething that is supposed to be working; but isn't

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions