Closed
Labels
bug: Something that is supposed to be working; but isn't
triage: Needs triage (eg: priority, bug/not-bug, and owning component)
Description
What is the problem?
Ray version and other system information (Python version, TensorFlow version, OS): Windows, PyTorch
Sometimes ray.get_gpu_ids() returns an empty list even though num_gpus=1, and worker creation then fails with the following IndexError:
(pid=19720) ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=2656, ip=192.168.50.5)
(pid=19720) File "python\ray\_raylet.pyx", line 535, in ray._raylet.execute_task
(pid=19720) File "python\ray\_raylet.pyx", line 485, in ray._raylet.execute_task.function_executor
(pid=19720) File "C:\Users\Julius\Anaconda3\envs\minerl-rllib\lib\site-packages\ray\_private\function_manager.py", line 563, in actor_method_executor
(pid=19720) return method(__ray_actor, *args, **kwargs)
(pid=19720) File "C:\Users\Julius\Anaconda3\envs\minerl-rllib\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 550, in __init__
(pid=19720) self._build_policy_map(
(pid=19720) File "C:\Users\Julius\Anaconda3\envs\minerl-rllib\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 1345, in _build_policy_map
(pid=19720) self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
(pid=19720) File "C:\Users\Julius\Anaconda3\envs\minerl-rllib\lib\site-packages\ray\rllib\policy\policy_map.py", line 127, in create_policy
(pid=19720) self[policy_id] = class_(observation_space, action_space,
(pid=19720) File "C:\Users\Julius\Anaconda3\envs\minerl-rllib\lib\site-packages\ray\rllib\policy\policy_template.py", line 256, in __init__
(pid=19720) self.parent_cls.__init__(
(pid=19720) File "C:\Users\Julius\Anaconda3\envs\minerl-rllib\lib\site-packages\ray\rllib\policy\torch_policy.py", line 159, in __init__
(pid=19720) self.device = self.devices[0]
(pid=19720) IndexError: list index out of range
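The last frame of the traceback indexes into an empty device list. The sketch below illustrates that failure mode in plain Python, without importing Ray or PyTorch. It assumes the device list is derived from the GPU ids assigned to the worker. build_devices, pick_device_unchecked and pick_device_checked are hypothetical names for illustration only, not RLlib functions, and the actual logic in torch_policy.py may differ.

```python
from typing import List, Sequence


def build_devices(gpu_ids: Sequence[int]) -> List[str]:
    """Hypothetical stand-in: one device string per GPU id assigned to the worker."""
    return [f"cuda:{i}" for i in range(len(gpu_ids))]


def pick_device_unchecked(gpu_ids: Sequence[int]) -> str:
    """Mirrors the failing pattern: index 0 of a list that may be empty."""
    devices = build_devices(gpu_ids)
    return devices[0]  # IndexError when gpu_ids is empty


def pick_device_checked(gpu_ids: Sequence[int], num_gpus: int) -> str:
    """Same selection, but with an explicit error when GPUs were requested and none were assigned."""
    devices = build_devices(gpu_ids)
    if num_gpus > 0 and not devices:
        raise RuntimeError(
            f"num_gpus={num_gpus} was requested, but the worker sees no GPU ids"
        )
    # Fall back to the CPU only when no GPU was requested in the first place.
    return devices[0] if devices else "cpu"


if __name__ == "__main__":
    print(pick_device_checked([0], num_gpus=1))
    try:
        pick_device_unchecked([])
    except IndexError as exc:
        print(f"IndexError: {exc}")
```

With a check like this the error would name the mismatch between the requested and the visible GPUs instead of surfacing as a bare IndexError. It would not explain why the id list is empty in the first place.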
Reproduction (REQUIRED)
Please provide a short code snippet (less than 50 lines if possible) that can be copy-pasted to reproduce the issue. The snippet should have no external library dependencies (i.e., use fake or mock data / environments):
Not exactly sure what is required to reproduce it... will update when I find out
If the code snippet cannot be run by itself, the issue will be closed with "needs-repro-script".
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.