Cheetah-Trainer

This is an implementation of the Automated Residual Reinforcement Learning algorithm.

Requirements

- Python 3.7
- tensorflow
- tf2rl
- pybullet
- gym
- wandb

How to use

Basic Usage

python main.py --gait sine --policy TD3 --optimiser TBPSA

- --gait selects the gait pattern: line, sine, rose, or triangle.
- --policy selects the RL agent: SAC or TD3.
- --optimiser selects the parameter optimiser: BO, CMA, or TBPSA.
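The accepted values above can be enforced at parse time. The following is a minimal sketch of how such a command line could be declared with argparse; it is not taken from main.py. The function name build_parser is hypothetical, and making all three flags required is an assumption (the real script may define defaults).

```python
import argparse

# Value sets copied from the usage description above.
GAITS = ("line", "sine", "rose", "triangle")
POLICIES = ("SAC", "TD3")
OPTIMISERS = ("BO", "CMA", "TBPSA")


def build_parser():
    """Hypothetical parser mirroring the three documented flags."""
    parser = argparse.ArgumentParser(description="Cheetah-Trainer launcher (sketch)")
    # Assumption: every flag is required; the real main.py may provide defaults.
    parser.add_argument("--gait", choices=GAITS, required=True,
                        help="gait pattern")
    parser.add_argument("--policy", choices=POLICIES, required=True,
                        help="RL agent")
    parser.add_argument("--optimiser", choices=OPTIMISERS, required=True,
                        help="black-box parameter optimiser")
    return parser


# Parse the same arguments as the basic usage command.
args = build_parser().parse_args(
    ["--gait", "sine", "--policy", "TD3", "--optimiser", "TBPSA"]
)
print(args.gait, args.policy, args.optimiser)
```

With choices set, argparse rejects an unsupported value (for example a gait that is not in the list) with an error message before any training starts.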

More Arguments

- state-mode: Changes the state representation of the RL module.
- leg-action-mode: Changes the action representation of the RL module.
- optimisation-mask: Changes the parameter search space of the black-box optimiser.
- num-history-observation: Controls whether stacked states are used as the RL observation.
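To illustrate what using stacked states as the RL observation can mean, here is a small sketch of history stacking. It does not reproduce this repository's code: the class HistoryStack, its parameters num_history and obs_dim, and the zero-padding at the start of an episode are all assumptions made for the example.

```python
from collections import deque


class HistoryStack:
    """Hypothetical helper: keeps the last `num_history` states and
    returns them concatenated as a single flat observation."""

    def __init__(self, num_history, obs_dim):
        self.obs_dim = obs_dim
        # Assumption: missing history at episode start is padded with zeros.
        self.frames = deque(
            ([0.0] * obs_dim for _ in range(num_history)),
            maxlen=num_history,
        )

    def push(self, obs):
        """Add the newest state and return the stacked observation,
        ordered from oldest to newest."""
        if len(obs) != self.obs_dim:
            raise ValueError("unexpected observation size")
        self.frames.append(list(obs))  # the oldest frame is dropped automatically
        return [value for frame in self.frames for value in frame]


stack = HistoryStack(num_history=3, obs_dim=2)
print(stack.push([1.0, 2.0]))
print(stack.push([3.0, 4.0]))
```

Stacking gives the agent access to recent motion (for example joint velocities implied by consecutive positions) at the cost of a larger observation vector.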

Acknowledgment

**Parts of this implementation are based on tf2rl.**
