Description
Search before asking
- I searched the issues and found no similar issues.
Ray Component
Ray Core
What happened + What you expected to happen
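Using ray.util.multiprocessing.Pool recursively, that is, creating a Pool inside a function that is itself running in a Pool worker, results in a deadlock.
Creating Ray actors recursively does not hang in the same way.
I assume a solution based on the same principle as this one could solve the issue:
#1920
Running the reproduction script below produces the following output and then hangs: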
2022-01-09 15:20:50,690 INFO services.py:1265 -- View the Ray dashboard at omitted
2022-01-09 15:21:10,746 WARNING worker.py:1215 -- The actor or task with ID ffffffffffffffffe7315b869e1f68be487f8f1301000000 cannot be scheduled right now. You can ignore this message if this Ray cluster is expected to auto-scale or if you specified a runtime_env for this actor or task, which may take time to install. Otherwise, this is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increasing the resources available to this Ray cluster.
Required resources for this actor or task: {CPU: 1.000000}
Available resources on this node: {0.000000/1.000000 CPU, 31.779781 GiB/31.779781 GiB memory, 1.000000/1.000000 GPU, 15.889890 GiB/15.889890 GiB object_store_memory, 1.000000/1.000000 node:172.17.0.2, 1.000000/1.000000 accelerator_type:RTX}
In total there are 0 pending tasks and 1 pending actors on this node.
Versions / Dependencies
ray version 1.9.1
Reproduction script
import numpy as np
import ray
from ray.util.multiprocessing import Pool


def poolit_a(idx):
    # Inner pool, created inside a worker of the outer pool.
    with Pool(ray_address='auto') as pool:
        return list(pool.map(np.sqrt, np.arange(0, 2, 1)))


def poolit_b():
    # Outer pool; each task opens its own nested pool.
    with Pool(ray_address='auto') as pool:
        return list(pool.map(poolit_a, range(2, 4, 1)))


if __name__ == '__main__':
    try:
        ray.init(num_cpus=1)
        print(poolit_b())
    finally:
        ray.shutdown()
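Until nested pools work, one possible workaround is to flatten the nested map so that only a single Pool is ever created. The log above suggests the only CPU is already claimed when the nested pool's actor is requested, so avoiding the inner Pool should sidestep the hang. This is only a sketch, not a fix for the underlying issue: the helper names inner_work, flat_nested_map and run_with_ray are hypothetical, and math.sqrt stands in for np.sqrt to keep the example dependency-free.

```python
import math
from itertools import product


def inner_work(pair):
    # Hypothetical stand-in for the body of the inner map: the outer index
    # is carried along but unused, just as idx is unused in poolit_a.
    _idx, x = pair
    return math.sqrt(x)


def flat_nested_map(pool, outer_items, inner_items):
    # Run every (outer, inner) combination through ONE pool, then regroup
    # the flat result list into one sub-list per outer item.
    outer_items = list(outer_items)
    inner_items = list(inner_items)
    flat = list(pool.map(inner_work, list(product(outer_items, inner_items))))
    n = len(inner_items)
    return [flat[i * n:(i + 1) * n] for i in range(len(outer_items))]


def run_with_ray():
    # Assumes Ray is installed; imported lazily so the helpers above can be
    # exercised with any object that offers a multiprocessing-style map().
    import ray
    from ray.util.multiprocessing import Pool

    ray.init(num_cpus=1)
    try:
        with Pool() as pool:
            return flat_nested_map(pool, range(2, 4), range(0, 2))
    finally:
        ray.shutdown()
```

Calling run_with_ray() should give the same nested list that poolit_b() was meant to return, but it needs only one pool's worth of CPUs. The obvious limitation is that it only applies when the inner work can be expressed as independent items up front.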
Anything else
The deadlock occurs on every run.
Are you willing to submit a PR?
- Yes I am willing to submit a PR!