
Non-conflicting Energy Minimization in Reinforcement Learning based Robot Control

CoRL 2025 (Oral Presentation)
[arXiv][Project Page]




Abstract: Efficient robot control often requires balancing task performance with energy expenditure. A common approach in reinforcement learning (RL) is to penalize energy use directly as part of the reward function. This requires carefully tuning weight terms to avoid undesirable trade-offs where energy minimization harms task success. In this work, we propose a hyperparameter-free gradient optimization method to minimize energy expenditure without conflicting with task performance. Inspired by recent works in multitask learning, our method applies policy gradient projection between task and energy objectives to derive policy updates that minimize energy expenditure in ways that do not impact task performance. We evaluate this technique on standard locomotion benchmarks of DM-Control and HumanoidBench and demonstrate a reduction of 64% energy usage while maintaining comparable task performance. Further, we conduct experiments on a Unitree GO2 quadruped showcasing Sim2Real transfer of energy efficient policies. Our method is easy to implement in standard RL pipelines with minimal code changes, is applicable to any policy gradient method, and offers a principled alternative to reward shaping for energy efficient control policies.
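The projection step described above can be sketched in a few lines (a minimal, hypothetical illustration in the spirit of PCGrad-style gradient surgery; the actual implementation in this repository operates on flattened policy-parameter gradients, and the function name here is illustrative, not from the codebase):

```python
def project_energy_gradient(g_task, g_energy, eps=1e-12):
    """Combine task and energy gradients so that the energy term never
    opposes the task update (hypothetical sketch, PCGrad-style).

    g_task, g_energy: flattened gradient vectors as lists of floats.
    """
    # Dot product tells us whether the two objectives conflict.
    dot = sum(t * e for t, e in zip(g_task, g_energy))
    if dot < 0.0:  # gradients conflict
        # Remove the component of g_energy that points against g_task.
        scale = dot / (sum(t * t for t in g_task) + eps)
        g_energy = [e - scale * t for t, e in zip(g_task, g_energy)]
    # The combined update minimizes energy only in directions that
    # do not degrade the task objective.
    return [t + e for t, e in zip(g_task, g_energy)]
```

For example, with `g_task = [1, 0]` and a conflicting `g_energy = [-1, 1]`, the conflicting component is removed and the combined update becomes `[1, 1]`, which still makes full progress on the task objective.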




Setup instructions

  1. Clone the repository:

    git clone --recurse-submodules https://github.com/pvskand/PEGrad.git
    cd PEGrad
  2. Create conda environment:

    conda env create -f pegrad-env.yml
    conda activate pegrad-env
  3. Install uv (if not already installed):

    curl -LsSf https://astral.sh/uv/install.sh | sh

    Or via pip: pip install uv

  4. Install dependencies using uv (much faster than pip):

    uv pip install -e .
  5. Set up Humanoid Bench:

    cd src/pegrad/leanrl/envs/humanoid-bench
    uv pip install -e .
    cd ../../

Training with PEGrad

Before training, you need to set up your Weights & Biases account and project. You can do this by first running:

 wandb login

Then, change the entity and project in the config.yaml file to your own.

entity: your_wandb_entity
project: your_wandb_project

To train with default configurations on Humanoid Bench h1-walk-v0, run:

 python -m leanrl.sac.sac_pegrad 

To train on other Humanoid Bench environments, run:

 python -m leanrl.sac.sac_pegrad env_id=humanoidbench/h1-run-v0 seed=1

To train on quadruped-run environment in DM-Control, run:

 python -m leanrl.sac.sac_pegrad env_id=dmcontrol/quadruped-run

To train on dog-run environment in DM-Control, run:

 python -m leanrl.sac.sac_pegrad env_id=dmcontrol/dog-run

FAQ

  1. In case of the following GLFW error:

     GLFW error 65537: b'The GLFW library is not initialized'

     try setting the environment variable MUJOCO_GL to egl:

     export MUJOCO_GL=egl

License

PEGrad is licensed under the MIT License.


Acknowledgments

We thank the authors of the following projects for their work:


Citation

If you find this work useful, please consider citing:

@inproceedings{PEGRAD,
  author    = {Peri, Skand and Perincherry*, Akhil and Pandit*, Bikram and Lee, Stefan},
  title     = {Non-conflicting Energy Minimization in Reinforcement Learning based Robot Control},
  booktitle = {Conference on Robot Learning},
  year      = {2025},
}

About

Official implementation of Non-conflicting Energy Minimization in Reinforcement Learning based Robot Control, CoRL 2025.