CoRL 2025 (Oral Presentation)
[arXiv] • [Project Page]
Abstract: Efficient robot control often requires balancing task performance with energy expenditure. A common approach in reinforcement learning (RL) is to penalize energy use directly as part of the reward function. This requires carefully tuning weight terms to avoid undesirable trade-offs where energy minimization harms task success. In this work, we propose a hyperparameter-free gradient optimization method to minimize energy expenditure without conflicting with task performance. Inspired by recent works in multitask learning, our method applies policy gradient projection between task and energy objectives to derive policy updates that minimize energy expenditure in ways that do not impact task performance. We evaluate this technique on standard locomotion benchmarks from DM-Control and HumanoidBench and demonstrate a 64% reduction in energy usage while maintaining comparable task performance. Further, we conduct experiments on a Unitree GO2 quadruped showcasing Sim2Real transfer of energy-efficient policies. Our method is easy to implement in standard RL pipelines with minimal code changes, is applicable to any policy gradient method, and offers a principled alternative to reward shaping for energy-efficient control policies.
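To give a feel for the gradient-projection idea the abstract describes, here is a minimal NumPy sketch in the spirit of PCGrad-style projection: when the energy gradient conflicts with the task gradient (negative dot product), its conflicting component is removed before the two are combined. The function name and the final combination step are illustrative assumptions, not the exact PEGrad update rule.

```python
import numpy as np

def project_energy_gradient(g_task: np.ndarray, g_energy: np.ndarray) -> np.ndarray:
    """Combine task and energy gradients so the update never opposes the task.

    If the gradients conflict (negative dot product), subtract from g_energy
    its component along g_task; otherwise leave g_energy unchanged.
    """
    dot = float(g_task @ g_energy)
    if dot < 0.0:
        # Remove the component of g_energy that points against g_task.
        g_energy = g_energy - (dot / float(g_task @ g_task)) * g_task
    return g_task + g_energy  # non-conflicting combined update direction

# Toy example with conflicting gradients:
g_t = np.array([1.0, 0.0])   # task gradient
g_e = np.array([-1.0, 1.0])  # energy gradient, opposes the task along axis 0
update = project_energy_gradient(g_t, g_e)
```

After projection the combined update has a non-negative inner product with the task gradient, so the energy objective can only act in directions the task is indifferent to.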
- Clone the repository:

  ```bash
  git clone --recurse-submodules https://github.com/pvskand/PEGrad.git
  cd PEGrad
  ```
- Create the conda environment:

  ```bash
  conda env create -f pegrad-env.yml
  conda activate pegrad-env
  ```
- Install `uv` (if not already installed):

  ```bash
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```

  Or via pip:

  ```bash
  pip install uv
  ```
- Install dependencies using `uv` (much faster than pip):

  ```bash
  uv pip install -e .
  ```
- Set up HumanoidBench:

  ```bash
  cd src/pegrad/leanrl/envs/humanoid-bench
  uv pip install -e .
  cd ../../
  ```
Before training, you need to set up your Weights & Biases account and project. First run:

```bash
wandb login
```

Then change the `entity` and `project` fields in the `config.yaml` file to your own:

```yaml
entity: your_wandb_entity
project: your_wandb_project
```

To train with the default configuration on the HumanoidBench `h1-walk-v0` task, run:
```bash
python -m leanrl.sac.sac_pegrad
```

To train on other HumanoidBench environments, run:

```bash
python -m leanrl.sac.sac_pegrad env_id=humanoidbench/h1-run-v0 seed=1
```

To train on the `quadruped-run` environment in DM-Control, run:

```bash
python -m leanrl.sac.sac_pegrad env_id=dmcontrol/quadruped-run
```

To train on the `dog-run` environment in DM-Control, run:

```bash
python -m leanrl.sac.sac_pegrad env_id=dmcontrol/dog-run
```

In case of the following GLFW error:

```
GLFW error 65537: b'The GLFW library is not initialized'
```

try setting the environment variable `MUJOCO_GL` to `egl`:

```bash
export MUJOCO_GL=egl
```
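If exporting the variable in your shell is inconvenient (e.g. inside a notebook or a launcher script), the same backend can be selected from Python, provided it happens before MuJoCo is imported, since `MUJOCO_GL` is read at import time. Using `setdefault` here is a deliberate choice so an existing user setting is not overridden:

```python
import os

# Must run before `import mujoco` / dm_control; keeps any value already set.
os.environ.setdefault("MUJOCO_GL", "egl")
```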
PEGrad is licensed under the MIT License.
We thank the authors of the following projects for their work:
If you find this work useful, please consider citing:
```bibtex
@article{PEGRAD,
  author  = {Peri, Skand and Perincherry*, Akhil and Pandit*, Bikram and Lee, Stefan},
  title   = {Non-conflicting Energy Minimization in Reinforcement Learning based Robot Control},
  journal = {Conference on Robot Learning},
  year    = {2025},
}
```