[Core] [Bug] Recursive ray.util.multiprocessing.Pool deadlock #21488

@yogeveran

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Core

What happened + What you expected to happen

Using ray.util.multiprocessing.Pool recursively (creating a Pool inside a function that is itself mapped by another Pool) results in a deadlock.
Creating Ray actors recursively does not get stuck.
I assume a solution based on the same principle as #1920 could solve this issue.

Running the reproduction script below produces the following output and then hangs:

2022-01-09 15:20:50,690 INFO services.py:1265 -- View the Ray dashboard at omitted
2022-01-09 15:21:10,746 WARNING worker.py:1215 -- The actor or task with ID ffffffffffffffffe7315b869e1f68be487f8f1301000000 cannot be scheduled right now. You can ignore this message if this Ray cluster is expected to auto-scale or if you specified a runtime_env for this actor or task, which may take time to install. Otherwise, this is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increasing the resources available to this Ray cluster.
Required resources for this actor or task: {CPU: 1.000000}
Available resources on this node: {0.000000/1.000000 CPU, 31.779781 GiB/31.779781 GiB memory, 1.000000/1.000000 GPU, 15.889890 GiB/15.889890 GiB object_store_memory, 1.000000/1.000000 node:172.17.0.2, 1.000000/1.000000 accelerator_type:RTX}
In total there are 0 pending tasks and 1 pending actors on this node.
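
The log above reports 0 of 1 CPU available and one pending actor that requires 1 CPU. One way to read this is that the outer Pool's actor holds the only CPU while it waits for the inner Pool's actor, which can therefore never be scheduled. The following is a minimal toy model of that accounting in plain Python. It is not Ray's scheduler; `ToyCluster`, `inner_pool_can_start`, and the `release_while_blocked` flag are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """Toy CPU bookkeeping. This is NOT Ray's real scheduler."""
    total_cpus: int
    used_cpus: int = 0

    def try_start_actor(self, cpus: int = 1) -> bool:
        # An actor starts only if enough CPUs are still free.
        if self.used_cpus + cpus > self.total_cpus:
            return False
        self.used_cpus += cpus
        return True

    def release(self, cpus: int = 1) -> None:
        # Hand CPUs back to the cluster.
        self.used_cpus = max(0, self.used_cpus - cpus)


def inner_pool_can_start(total_cpus: int, release_while_blocked: bool) -> bool:
    """Return True if the inner pool's actor could be scheduled."""
    cluster = ToyCluster(total_cpus)
    # The outer pool starts one actor, which claims a CPU.
    if not cluster.try_start_actor():
        return False
    # The outer actor now blocks, waiting on the inner pool.
    # Assumption: a fix would free the blocked actor's CPU at this point.
    if release_while_blocked:
        cluster.release()
    # The inner pool tries to start its own actor.
    return cluster.try_start_actor()


if __name__ == "__main__":
    print(inner_pool_can_start(1, release_while_blocked=False))  # False
    print(inner_pool_can_start(1, release_while_blocked=True))   # True
```

In this model, with one CPU the inner actor stays pending forever unless the blocked outer actor gives its CPU back, which matches the "cannot be scheduled right now" warning.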

Versions / Dependencies

ray version 1.9.1

Reproduction script

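import numpy as np
import ray
from ray.util.multiprocessing import Pool


def poolit_a(idx):
    # Inner pool: created inside a function that the outer pool maps over.
    with Pool(ray_address='auto') as pool:
        return list(pool.map(np.sqrt, np.arange(0, 2, 1)))


def poolit_b():
    # Outer pool: each mapped call to poolit_a opens its own Pool.
    with Pool(ray_address='auto') as pool:
        return list(pool.map(poolit_a, range(2, 4, 1)))


if __name__ == '__main__':
    try:
        ray.init(num_cpus=1)
        print(poolit_b())  # never returns: the deadlock occurs here
    finally:
        ray.shutdown()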

Anything else

The deadlock occurs on every run.

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Labels

  • QS: Quantsight triage label
  • bug: Something that is supposed to be working; but isn't
  • triage: Needs triage (eg: priority, bug/not-bug, and owning component)
