[rllib] TensorFlow 1.14 doesn't work with GPUs any longer #9631

### What is the problem?

Using a recent nightly build of Ray/RLlib, you can't train using GPUs with TensorFlow 1.14 due to an API mismatch.

`rollout_worker.py` assumes that `tf.config` has a function `list_physical_devices`, but in 1.14 it's only `experimental_list_devices`, so you get:

```
AttributeError: module 'tensorflow._api.v1.config' has no attribute 'list_physical_devices'
```

Here's the code in question in `rollout_worker.py`:

```python
if (ray.is_initialized() and
        ray.worker._mode() != ray.worker.LOCAL_MODE):
    # Check available number of GPUs
    if not ray.get_gpu_ids():
        logger.debug(
            "Creating policy evaluation worker {}".format(
                worker_index) +
            " on CPU (please ignore any CUDA init errors)")
    elif (policy_config["framework"] in ["tf2", "tf", "tfe"] and
          not tf.config.list_physical_devices("GPU")) or \
            (policy_config["framework"] == "torch" and
             not torch.cuda.is_available()):
        raise RuntimeError(
            "GPUs were assigned to this worker by Ray, but "
            "your DL framework ({}) reports GPU acceleration is "
            "disabled. This could be due to a bad CUDA- or {} "
            "installation.".format(
                policy_config["framework"],
                policy_config["framework"]))
```

vs the API in `tensorflow/_api/v1/config/__init__.py`:

```python
from tensorflow.python.eager.context import list_devices as experimental_list_devices
```
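One possible way to bridge the two APIs is a small fallback helper. This is only a sketch, not the RLlib fix: `list_gpus` is a hypothetical name, and it assumes that `experimental_list_devices()` returns device-name strings such as `/job:localhost/replica:0/task:0/device:GPU:0`. The helper takes the config namespace as an argument so it can be exercised here with stand-in objects instead of a real TensorFlow install.

```python
from types import SimpleNamespace


def list_gpus(tf_config):
    """Return GPU devices using whichever listing function `tf_config` has.

    `tf_config` is expected to be `tf.config` (or an object shaped like it).
    """
    # Newer API: the call that rollout_worker.py currently makes.
    list_physical = getattr(tf_config, "list_physical_devices", None)
    if list_physical is not None:
        return list(list_physical("GPU"))

    # TF 1.14 (per this issue): only experimental_list_devices is exposed.
    # Assumption: it returns device-name strings; keep the plain GPU ones.
    list_all = getattr(tf_config, "experimental_list_devices", None)
    if list_all is not None:
        return [name for name in list_all() if "device:GPU:" in name]

    raise AttributeError("no known device-listing function on tf.config")


# Stand-ins for the two API shapes (hypothetical, for illustration only).
fake_new_config = SimpleNamespace(
    list_physical_devices=lambda kind: ["GPU:0"] if kind == "GPU" else [])
fake_tf114_config = SimpleNamespace(
    experimental_list_devices=lambda: [
        "/job:localhost/replica:0/task:0/device:CPU:0",
        "/job:localhost/replica:0/task:0/device:GPU:0",
    ])

gpus_new = list_gpus(fake_new_config)
gpus_114 = list_gpus(fake_tf114_config)
```

With a real install, the check in `rollout_worker.py` would then read `not list_gpus(tf.config)` instead of calling `tf.config.list_physical_devices("GPU")` directly.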

and here's the full stacktrace:

```
Failure # 1 (occurred at 2020-07-22_08-30-52)
Traceback (most recent call last):
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 471, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 430, in fetch_result
    result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/worker.py", line 1532, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train() (pid=7711, ip=10.128.0.4)
  File "python/ray/_raylet.pyx", line 433, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 468, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 472, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 473, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 426, in ray._raylet.execute_task.function_executor
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 88, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 475, in __init__
    super().__init__(config, logger_creator)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/tune/trainable.py", line 232, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 639, in setup
    self._init(self.config, self.env_creator)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 102, in _init
    env_creator, self._policy, config, self.config["num_workers"])
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 709, in _make_workers
    logdir=self.logdir)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 67, in __init__
    RolloutWorker, env_creator, policy, 0, self._local_config)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 296, in _make_worker
    extra_python_environs=extra_python_environs)
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 415, in __init__
    not tf.config.list_physical_devices("GPU")) or \
  File "/home/andrew/miniconda3/envs/ray_nightly_tf14/lib/python3.7/site-packages/tensorflow/python/util/deprecation_wrapper.py", line 106, in __getattr__
    attr = getattr(self._dw_wrapped_module, name)
AttributeError: module 'tensorflow._api.v1.config' has no attribute 'list_physical_devices'
```

### Reproduction (REQUIRED)
- Ray: latest nightly wheel as of 2020-07-22
- TensorFlow: 1.14
- Python: 3.7
- OS: Ubuntu 20.04

```python
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer
tune.run(PPOTrainer,
         config={
             "env": "CartPole-v0",
             "num_workers": 4,
             "num_envs_per_worker": 2,
             "num_gpus": 0.5,
             "num_gpus_per_worker": 0.1,
         })
```
