Closed
Labels
bug (Something that is supposed to be working; but isn't), tune (Tune-related issues)
Description
What is the problem?
This is a trivial issue that does not actually impact performance as far as I can tell. If you launch a cluster with a configuration that sets gpus = 0, the error below is still printed; it seems a check for that case could avoid this error message:
(pid=4628) Exception in thread Thread-2:
(pid=4628) Traceback (most recent call last):
(pid=4628)   File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
(pid=4628)     self.run()
(pid=4628)   File "/home/ubuntu/algo/lib/python3.6/site-packages/ray/tune/utils/util.py", line 89, in run
(pid=4628)     self._read_utilization()
(pid=4628)   File "/home/ubuntu/algo/lib/python3.6/site-packages/ray/tune/utils/util.py", line 65, in _read_utilization
(pid=4628)     for gpu in GPUtil.getGPUs():
(pid=4628)   File "/home/ubuntu/algo/lib/python3.6/site-packages/GPUtil/GPUtil.py", line 102, in getGPUs
(pid=4628)     deviceIds = int(vals[i])
(pid=4628) ValueError: invalid literal for int() with base 10: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
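The kind of check suggested above could look roughly like the following. This is only a sketch, not the actual code in ray/tune/utils/util.py: the function name `read_gpu_utilization` and the parameters `get_gpus` and `num_gpus` are hypothetical, and it assumes each GPU object exposes a `load` attribute, as GPUtil's GPU objects do. The GPU query is passed in as a callable so the sketch runs without GPUtil installed.

```python
def read_gpu_utilization(get_gpus, num_gpus):
    """Return per-GPU load values, or [] when GPUs are unavailable.

    get_gpus: callable returning objects with a `.load` attribute
              (GPUtil.getGPUs is the intended candidate; hypothetical wiring).
    num_gpus: number of GPUs the cluster was configured with (hypothetical).
    """
    # Skip the query entirely when the configuration requests no GPUs.
    if num_gpus == 0:
        return []
    try:
        return [gpu.load for gpu in get_gpus()]
    except Exception:
        # nvidia-smi can be installed but unable to reach the driver, in which
        # case parsing its output fails; report no readings instead of letting
        # the monitoring thread die with a traceback.
        return []
```

With GPUtil available, this would be called as `read_gpu_utilization(GPUtil.getGPUs, num_gpus)`. Swallowing every exception is a deliberately blunt choice for a background monitoring thread; a narrower `except ValueError` would also cover the failure shown in the traceback.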
Ray version and other system information (Python version, TensorFlow version, OS):
Ray 0.8.1, TF 2.1, RHEL 7.7
Reproduction (REQUIRED)
Please provide a script that can be run to reproduce the issue. The script should have no external library dependencies (i.e., use fake or mock data / environments):
If we cannot run your script, we cannot fix your issue.
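The failing step can be sketched without a GPU, Ray, or GPUtil. The snippet below is a simplified stand-in, not GPUtil's actual code: `parse_device_ids` is a hypothetical helper that assumes the parser converts the first comma-separated field of each nvidia-smi output line to an integer, which is consistent with the `int(vals[i])` frame in the traceback. Feeding it the message nvidia-smi prints when it cannot reach the driver raises the same kind of ValueError.

```python
# Text nvidia-smi printed in place of CSV data (copied from the traceback).
SMI_ERROR = (
    "NVIDIA-SMI has failed because it couldn't communicate with the "
    "NVIDIA driver. Make sure that the latest NVIDIA driver is "
    "installed and running."
)


def parse_device_ids(smi_output):
    """Hypothetical stand-in: parse the first CSV field of each line as an int."""
    ids = []
    for line in smi_output.splitlines():
        vals = line.split(", ")
        ids.append(int(vals[0]))  # fails when the line is an error message
    return ids


try:
    parse_device_ids(SMI_ERROR)
except ValueError as exc:
    error_text = str(exc)
    print(error_text)
```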
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.