[core] Reference counting not working as expected in local mode. #8917

@allenyin55

What is the problem?

The following script runs correctly in regular mode but crashes with a reference counting check failure when Ray is started in local mode (ray.init(local_mode=True)).

Ray version and other system information (Python version, TensorFlow version, OS):
Ray: 0.8.5
Python: 3.7.4
OS: MacOS 10.15.3

Reproduction (REQUIRED)

import ray
import time

ray.init(local_mode=True)


@ray.remote
class A:
    def __init__(self):
        self.tasks = []

    def get_tasks(self):
        # Returns a list of ObjectIDs, i.e. object references nested in the return value.
        return self.tasks

    def do_work(self):
        # Launch a task and keep its ObjectID on the actor.
        object_id = work.remote()
        self.tasks.append(object_id)


@ray.remote
def work():
    time.sleep(10)


@ray.remote
def check_tasks():
    # Look up the detached actor by name and fetch its stored ObjectIDs.
    A = ray.util.get_actor("test_actor")
    tasks = ray.get(A.get_tasks.remote())
    print(tasks)


handle = A.options(detached=True, name="test_actor").remote()
handle.do_work.remote()
check_tasks.remote()

This script fails with the following error:

F0612 11:25:53.486153 277904832 reference_count.cc:732]  Check failed: !owner_address.worker_id.IsNil()
*** Check failure stack trace: ***
    @        0x1022e3762  google::LogMessage::~LogMessage()
    @        0x101f49135  ray::RayLog::~RayLog()
    @        0x101c65b31  ray::ReferenceCounter::AddNestedObjectIdsInternal()
    @        0x101c6d5aa  ray::ReferenceCounter::AddNestedObjectIds()
    @        0x101c0e1cd  ray::CoreWorker::AllocateReturnObjects()
    @        0x101b6dad3  __pyx_f_3ray_7_raylet_10CoreWorker_store_task_outputs()
    @        0x101bb5ad5  __pyx_f_3ray_7_raylet_execute_task()
    @        0x101ba36c7  __pyx_f_3ray_7_raylet_task_execution_handler()
    @        0x101bbd9fb  std::__1::__function::__func<>::operator()()
    @        0x101c0329d  ray::CoreWorker::ExecuteTask()
    @        0x101c0b0f5  ray::CoreWorker::ExecuteTaskLocalMode()
    @        0x101c0c869  ray::CoreWorker::SubmitActorTask()
    @        0x101b8ac53  __pyx_pw_3ray_7_raylet_10CoreWorker_43submit_actor_task()
    @        0x1012e34e8  _PyMethodDef_RawFastCallKeywords
    @        0x1012efe24  _PyMethodDescr_FastCallKeywords
    @        0x10141fd65  call_function
    @        0x10141c9ed  _PyEval_EvalFrameDefault
    @        0x10141134a  _PyEval_EvalCodeWithName
    @        0x1012e3313  _PyFunction_FastCallKeywords
    @        0x10141fc67  call_function
    @        0x10141da7e  _PyEval_EvalFrameDefault
    @        0x10141134a  _PyEval_EvalCodeWithName
    @        0x1012e3313  _PyFunction_FastCallKeywords
    @        0x10141fc67  call_function
    @        0x10141d9c5  _PyEval_EvalFrameDefault
    @        0x10141134a  _PyEval_EvalCodeWithName
    @        0x1012e3313  _PyFunction_FastCallKeywords
    @        0x10141fc67  call_function
    @        0x10141c9ed  _PyEval_EvalFrameDefault
    @        0x10141134a  _PyEval_EvalCodeWithName
    @        0x1012e3313  _PyFunction_FastCallKeywords
    @        0x10141fc67  call_function
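
One possible reading of the trace, offered as a guess rather than a diagnosis: the check fires in AddNestedObjectIds while an actor task's return value is being stored, which would fit get_tasks returning a list that contains an ObjectID. The toy model below is not Ray code. It uses only the standard library, and every name in it (ToyObjectId, ToyReferenceCounter, store_task_outputs, the "worker-1" ID) is hypothetical. It assumes, without confirmation from the Ray source, that an ObjectID created in local mode carries no owner worker ID, and shows how a nested reference like that would trip a check of the same shape.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ToyObjectId:
    """Hypothetical stand-in for an ObjectID.

    owner_worker_id=None plays the role of a nil worker ID.
    """
    name: str
    owner_worker_id: Optional[str] = None


@dataclass
class ToyReferenceCounter:
    """Hypothetical counter that tracks which IDs are nested in a return object."""
    nested: Dict[str, List[str]] = field(default_factory=dict)

    def add_nested_object_ids(self, return_id: str, inner_ids: List[ToyObjectId]) -> None:
        for inner in inner_ids:
            # Same shape as the failed check: !owner_address.worker_id.IsNil()
            if inner.owner_worker_id is None:
                raise RuntimeError(
                    f"Check failed: nested id {inner.name!r} has a nil owner worker ID"
                )
        self.nested[return_id] = [inner.name for inner in inner_ids]


def store_task_outputs(counter: ToyReferenceCounter, return_id: str, value: list) -> None:
    """Register any object IDs found inside a task's return value."""
    inner_ids = [item for item in value if isinstance(item, ToyObjectId)]
    if inner_ids:
        counter.add_nested_object_ids(return_id, inner_ids)


if __name__ == "__main__":
    counter = ToyReferenceCounter()

    # An ID with an owner: registering it as a nested reference succeeds.
    owned = ToyObjectId("work_result", owner_worker_id="worker-1")
    store_task_outputs(counter, "get_tasks_return", [owned])

    # Assumed local-mode case: the ID has no owner, so the check fails.
    unowned = ToyObjectId("work_result_local_mode")
    try:
        store_task_outputs(counter, "get_tasks_return_local_mode", [unowned])
    except RuntimeError as err:
        print(err)
```

If this reading is right, the failure depends on an actor method returning ObjectIDs, not on the detached or named actor options, but that has not been verified here.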

If we cannot run your script, we cannot fix your issue.

  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.

Metadata

Labels: P1 (Issue that should be fixed within a few weeks), bug (Something that is supposed to be working; but isn't)