Abstract
The ingress task is crucial when a humanoid robot must drive a land vehicle to reach its destination quickly. Previous work is inefficient at granting robots the ability to enter a vehicle from arbitrary starting positions and orientations, or to withstand vehicle elasticity, both of which are hard to model. Deep Reinforcement Learning (DRL) can be introduced to address these issues. However, previous applications of DRL to humanoid control tend to use the same reward terms for the whole control process, which is unsuitable for the ingress task with its many distinct states. This letter proposes a novel Finite State Machine (FSM) control method integrated with DRL for the humanoid ingress task. It collects the robot's status at the end of each state and immediately adjusts the next move accordingly. In simulation, it achieves a 97% ingress success rate under random initial displacement and vehicle elasticity.
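The core idea of the abstract can be sketched as a state machine in which each ingress phase is driven by its own learned policy, and the robot's status at the end of one state is handed to the next so the controller can immediately compensate for residual displacement or elasticity. The sketch below is a minimal illustration of that structure only; the state names, the `status` dictionary, and the dummy policies are assumptions for illustration, not the paper's actual controller or API.

```python
class IngressFSM:
    """Toy FSM where each phase has its own policy (stand-in for DRL)."""

    STATES = ["approach", "grasp_handle", "step_in", "sit_down", "done"]

    def __init__(self, policies):
        # policies: dict mapping state name -> callable(status) -> action
        self.policies = policies
        self.state = "approach"

    def step(self, status):
        """Run one control step; advance when the state reports completion."""
        action = self.policies[self.state](status)
        if status.get("state_complete", False) and self.state != "done":
            # The end-of-state status flows into the next state's policy,
            # letting it adjust immediately for the measured displacement.
            self.state = self.STATES[self.STATES.index(self.state) + 1]
        return action


# Dummy policies standing in for per-state trained DRL policies.
policies = {name: (lambda status, n=name: f"{n}-action")
            for name in IngressFSM.STATES}

fsm = IngressFSM(policies)
print(fsm.step({"state_complete": True}))  # -> "approach-action"
print(fsm.state)                           # -> "grasp_handle"
```

In a real controller the dummy lambdas would be replaced by trained policies, and the transition test would come from the robot's sensed state rather than a flag.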
Data availability
The code and data used in this paper are available upon request to the first author or the corresponding author.
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 62073041, and in part by “111” Project under Grant B08043.
Author information
Contributions
Chenzheng Wang contributed to the conception and design of the study. Pierre Gergondet and Chenzheng Wang implemented the source code for the actual controller. Yue Dong and Kehong Chen were responsible for data collection. Xuechao Chen and Zhangguo Yu assumed advisor roles during the research and also contributed to manuscript revision. The final manuscript was revised and approved by all authors.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, C., Chen, X., Yu, Z. et al. Robust humanoid robot vehicle ingress with a finite state machine integrated with deep reinforcement learning. Int. J. Mach. Learn. & Cyber. 16, 2537–2551 (2025). https://doi.org/10.1007/s13042-024-02407-w

