[RLlib] Bugs in PyTorch version of DDPG

Hello,

I spotted some bugs in the implementation of DDPG in PyTorch.
* The gradient clipping is not implemented correctly: it reads the `grad_norm_clipping` parameter instead of `grad_clip`, and it uses the function `ray.rllib.utils.torch_ops.minimize_and_clip`, which is not correct. All the other algorithms seem to use `ray.rllib.agents.a3c.a3c_torch_policy.apply_grad_clipping`. I propose to replace `minimize_and_clip` by the A3C function in `torch_ops`.

* In the PyTorch model of DDPG, the values used to bound the action space (range and minimum) are not tracked by PyTorch because they are not registered as parameters. As a result, they are not converted to CUDA tensors when the model is moved to the GPU, which results in an error.

* The target model is placed on the GPU even if Ray was not configured to use the GPU.
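
To illustrate the last two points, here is a minimal sketch in plain PyTorch, not the actual RLlib code. `BoundedPolicyHead`, `pick_device`, and the `num_gpus` argument are hypothetical names used only for this example. It registers the action bounds as buffers (one possible option; a non-trainable `nn.Parameter` would also be tracked) so that they follow the module to whatever device it is moved to, and it places both the model and the target model on the GPU only when one was requested and is available.

```python
import torch
import torch.nn as nn


class BoundedPolicyHead(nn.Module):
    """Hypothetical stand-in for the DDPG policy output layer."""

    def __init__(self, obs_dim, low, high):
        super().__init__()
        self.linear = nn.Linear(obs_dim, len(low))
        low_t = torch.as_tensor(low, dtype=torch.float32)
        high_t = torch.as_tensor(high, dtype=torch.float32)
        # Buffers move with the module on .to(device) / .cuda() and are
        # stored in state_dict, but they are not trainable parameters.
        self.register_buffer("action_min", low_t)
        self.register_buffer("action_range", high_t - low_t)

    def forward(self, obs):
        # Squash to [0, 1], then rescale to [low, high].
        squashed = torch.sigmoid(self.linear(obs))
        return squashed * self.action_range + self.action_min


def pick_device(num_gpus):
    """Use the GPU only if it was requested and is actually available."""
    if num_gpus > 0 and torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")


# Assumption: num_gpus mirrors whatever the trainer config requests.
device = pick_device(num_gpus=0)
model = BoundedPolicyHead(obs_dim=3, low=[-2.0], high=[2.0]).to(device)
target_model = BoundedPolicyHead(obs_dim=3, low=[-2.0], high=[2.0]).to(device)
target_model.load_state_dict(model.state_dict())

actions = model(torch.zeros(4, 3, device=device))
```

Because the bounds are buffers, `model.to(device)` moves them together with the weights, so the rescaling in `forward` never mixes CPU and CUDA tensors.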

I will make a PR with everything, but I don't know if I should replace `minimize_and_clip`.
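
For reference, this is roughly the behavior I have in mind for the clipping step: clip the global gradient norm of each optimizer parameter group after `backward()` and before `step()`. It is a sketch in plain PyTorch under that assumption, not a copy of `apply_grad_clipping` or `minimize_and_clip`. The helper name `clip_gradients` is hypothetical, and its `grad_clip` argument stands in for the config value of the same name.

```python
import torch
import torch.nn as nn


def clip_gradients(optimizer, grad_clip):
    """Clip the global grad norm of each param group.

    Returns the norms measured before clipping (hypothetical helper).
    """
    norms = []
    if grad_clip is None:
        return norms
    for group in optimizer.param_groups:
        params = [p for p in group["params"] if p.grad is not None]
        if params:
            total_norm = nn.utils.clip_grad_norm_(params, grad_clip)
            norms.append(float(total_norm))
    return norms


model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# A deliberately large loss so that the gradients exceed the clip value.
loss = (model(torch.ones(8, 4)) * 100.0).sum()
optimizer.zero_grad()
loss.backward()

norms_before = clip_gradients(optimizer, grad_clip=1.0)
norm_after = torch.norm(
    torch.stack([p.grad.norm() for p in model.parameters()])
)
optimizer.step()
```

The norm is taken jointly over all parameters in a group, so the direction of the update is preserved and only its magnitude is scaled down.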

- [x] I have verified my script runs in a clean environment and reproduces the issue.
- [x] I have verified the issue also occurs with the [latest wheels](https://docs.ray.io/en/latest/installation.html).


[RLlib] Bugs in PyTorch version of DDPG #9667
