[rllib] Port DDPG to the build_tf_policy pattern #5242
ericl merged 32 commits into ray-project:master
This reverts commit 5f64551.
Test FAILed.
Force-pushed from fd50692 to 10be568.
Test PASSed.
What do these changes do?
This ports DDPG to the policy builder pattern. This is the last major algorithm that needed to be ported.
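For readers unfamiliar with the builder idea, here is a minimal, library-free sketch of what "assembling a policy class from plain functions" can look like. It is not RLlib's `build_tf_policy` signature and not the DDPG loss from this PR: `build_policy`, `toy_config`, `toy_actions`, and `toy_td_loss` are hypothetical names invented for illustration only.

```python
import math


def build_policy(name, loss_fn, get_default_config=None, action_fn=None):
    """Hypothetical helper: assemble a policy class from plain functions."""

    class PolicyTemplate:
        def __init__(self, config=None):
            # Merge user overrides on top of the defaults supplied to the builder.
            defaults = get_default_config() if get_default_config else {}
            self.config = {**defaults, **(config or {})}

        def compute_actions(self, obs):
            return action_fn(self, obs)

        def loss(self, batch):
            return loss_fn(self, batch)

    PolicyTemplate.__name__ = name
    PolicyTemplate.__qualname__ = name
    return PolicyTemplate


def toy_config():
    # Assumed defaults, chosen only for this example.
    return {"gamma": 0.99, "action_scale": 2.0}


def toy_actions(policy, obs):
    # Deterministic toy actor: squash the feature sum, then scale it.
    scale = policy.config["action_scale"]
    return [scale * math.tanh(sum(row)) for row in obs]


def toy_td_loss(policy, batch):
    # Mean squared one-step TD error against precomputed next-state Q-values.
    gamma = policy.config["gamma"]
    errors = []
    for q, r, done, next_q in zip(
        batch["q"], batch["rewards"], batch["dones"], batch["next_q"]
    ):
        target = r + gamma * (1.0 - done) * next_q
        errors.append((q - target) ** 2)
    return sum(errors) / len(errors)


ToyDDPGPolicy = build_policy(
    "ToyDDPGPolicy", toy_td_loss, toy_config, toy_actions
)
```

The point of the sketch is the design choice: the algorithm-specific pieces (config, action computation, loss) are passed in as functions, so the class boilerplate lives in one place instead of being repeated per algorithm.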
Pendulum performance seems to be on par. @joneswong could you check if parameter noise exploration still works as expected? There were a lot of changes around its handling in that code.
fyi @qxcv @gehring
Related issue number
Closes #4822
Closes #4788
Linter
I've run scripts/format.sh to lint the changes in this PR.