
Robust humanoid robot vehicle ingress with a finite state machine integrated with deep reinforcement learning

  • Original Article
  • Published in the International Journal of Machine Learning and Cybernetics

Abstract

The ingress task is crucial for a humanoid robot that must drive a land vehicle to reach its destination quickly. Previous work is inefficient at granting robots the ability to enter a vehicle from random starting positions and orientations, or to withstand vehicle elasticity, both of which are hard to model. Deep Reinforcement Learning (DRL) can be introduced to address these issues, but previous applications of DRL to humanoid control tend to use the same reward terms throughout the control process, which is unsuitable for the ingress task with its many distinct states. This letter proposes a novel Finite State Machine control method integrated with Deep Reinforcement Learning for the humanoid ingress task. The controller collects the robot's status at the end of each state and immediately adjusts the next move accordingly. In simulation, it achieves a 97% ingress success rate under random initial displacement and vehicle elasticity.
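The core idea of a state machine whose transitions are driven by the robot's status at the end of each state can be sketched as follows. This is a minimal illustration, not the authors' implementation: the state names, the `Status` fields, and the transition checks are all hypothetical stand-ins for the paper's actual FSM states and per-state DRL policies.

```python
from dataclasses import dataclass


@dataclass
class Status:
    # Hypothetical end-of-state observations; the real controller would
    # use richer signals (joint angles, contact forces, pose errors, ...).
    at_door: bool = False
    foot_inside: bool = False
    seated: bool = False
    fallen: bool = False


def transition(state: str, obs: Status) -> str:
    """FSM transition: the status collected at the end of each state
    immediately selects the robot's next move."""
    if obs.fallen:
        return "RECOVER"
    if state == "APPROACH" and obs.at_door:
        return "STEP_IN"
    if state == "STEP_IN" and obs.foot_inside:
        return "SIT_DOWN"
    if state == "SIT_DOWN" and obs.seated:
        return "DONE"
    # Otherwise stay in the current state and keep executing its
    # dedicated (DRL-trained) policy with state-specific rewards.
    return state


def run_ingress(observations):
    """Step the FSM through a sequence of end-of-state observations."""
    state = "APPROACH"
    trace = [state]
    for obs in observations:
        state = transition(state, obs)
        trace.append(state)
    return trace
```

Splitting the task this way lets each state carry its own reward terms during training, instead of one reward shaped for the whole ingress motion.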


[Figures 1–12 omitted]


Data availability

The code and data used in this paper are available upon request to the first author or the corresponding author.


Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62073041, and in part by “111” Project under Grant B08043.

Author information

Contributions

Chenzheng Wang contributed to the conception and design of the study. Pierre Gergondet and Chenzheng Wang implemented the source code for the actual controller. Yue Dong and Kehong Chen were responsible for data collection. Xuechao Chen and Zhangguo Yu assumed advisor roles during the research and also contributed to manuscript revision. The final manuscript was revised and approved by all authors.

Corresponding author

Correspondence to Xuechao Chen.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, C., Chen, X., Yu, Z. et al. Robust humanoid robot vehicle ingress with a finite state machine integrated with deep reinforcement learning. Int. J. Mach. Learn. & Cyber. 16, 2537–2551 (2025). https://doi.org/10.1007/s13042-024-02407-w

