
Robust humanoid robot vehicle ingress with a finite state machine integrated with deep reinforcement learning

  • Original Article
  • Published in the International Journal of Machine Learning and Cybernetics

Abstract

The ingress task is crucial for a humanoid robot that must drive a land vehicle to reach its destination quickly. Previous work is inefficient at granting robots the ability to enter a vehicle from random starting positions and orientations, or to withstand vehicle elasticity, both of which are hard to model. Deep Reinforcement Learning (DRL) can be introduced to address these issues, but previous applications of DRL to humanoid control tend to use the same reward terms throughout the control process, which is unsuitable for the ingress task with its many distinct states. This letter proposes a novel Finite State Machine control method integrated with Deep Reinforcement Learning for the humanoid ingress task. The controller collects the robot's status at the end of each state and immediately adjusts the next move accordingly. In simulation, it achieves a 97% ingress success rate under random initial displacement and vehicle elasticity.
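The core idea of a state machine whose transitions are driven by the robot's status at the end of each state can be sketched as follows. This is a minimal illustration, not the authors' implementation: the state names, the `Status` fields, and the transition checks are all hypothetical stand-ins for the paper's actual FSM states and per-state DRL policies.

```python
from dataclasses import dataclass


@dataclass
class Status:
    # Hypothetical end-of-state observations; the real controller would
    # use richer signals (joint angles, contact forces, pose errors, ...).
    at_door: bool = False
    foot_inside: bool = False
    seated: bool = False
    fallen: bool = False


def transition(state: str, obs: Status) -> str:
    """FSM transition: the status collected at the end of each state
    immediately selects the robot's next move."""
    if obs.fallen:
        return "RECOVER"
    if state == "APPROACH" and obs.at_door:
        return "STEP_IN"
    if state == "STEP_IN" and obs.foot_inside:
        return "SIT_DOWN"
    if state == "SIT_DOWN" and obs.seated:
        return "DONE"
    # Otherwise stay in the current state and keep executing its
    # dedicated (DRL-trained) policy with state-specific rewards.
    return state


def run_ingress(observations):
    """Step the FSM through a sequence of end-of-state observations."""
    state = "APPROACH"
    trace = [state]
    for obs in observations:
        state = transition(state, obs)
        trace.append(state)
    return trace
```

Splitting the task this way lets each state carry its own reward terms during training, instead of one reward shaped for the whole ingress motion.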


[Figures 1–12 omitted]


Data availability

The code and data used in this paper are available upon request to the first author or the corresponding author.


Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62073041, and in part by “111” Project under Grant B08043.

Author information

Contributions

Chenzheng Wang contributed to the conception and design of the study. Pierre Gergondet and Chenzheng Wang implemented the source code for the actual controller. Yue Dong and Kehong Chen were responsible for data collection. Xuechao Chen and Zhangguo Yu assumed advisor roles during the research and also contributed to manuscript revision. The final manuscript was revised and approved by all authors.

Corresponding author

Correspondence to Xuechao Chen.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, C., Chen, X., Yu, Z. et al. Robust humanoid robot vehicle ingress with a finite state machine integrated with deep reinforcement learning. Int. J. Mach. Learn. & Cyber. 16, 2537–2551 (2025). https://doi.org/10.1007/s13042-024-02407-w

