Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains

2405.17227

Published 5/28/2024 by Shangqun Yu, Nisal Perera, Daniel Marew, Donghyun Kim

Abstract

This paper addresses the challenge of terrain-adaptive dynamic locomotion in humanoid robots, a problem traditionally tackled by optimization-based methods or reinforcement learning (RL). Optimization-based methods, such as model-predictive control, excel in finding optimal reaction forces and achieving agile locomotion, especially in quadrupeds, but struggle with the nonlinear hybrid dynamics of legged systems and the real-time computation of step location, timing, and reaction forces. Conversely, RL-based methods show promise in navigating dynamic and rough terrains but are limited by their extensive data requirements. We introduce a novel locomotion architecture that integrates a neural network policy, trained through RL in simplified environments, with a state-of-the-art motion controller combining model-predictive control (MPC) and whole-body impulse control (WBIC). The policy efficiently learns high-level locomotion strategies, such as gait selection and step positioning, without the need for full dynamics simulations. This control architecture enables humanoid robots to dynamically navigate discrete terrains, making strategic locomotion decisions (e.g., walking, jumping, and leaping) based on ground height maps. Our results demonstrate that this integrated control architecture achieves dynamic locomotion with significantly fewer training samples than conventional RL-based methods and can be transferred to different humanoid platforms without additional training. The control architecture has been extensively tested in dynamic simulations, accomplishing terrain height-based dynamic locomotion for three different robots.

Overview

This paper presents a novel approach for learning generic and dynamic locomotion of humanoid robots across a variety of discrete terrains.
The proposed method combines a neural network policy, trained with reinforcement learning (RL) in simplified environments, with a motion controller built on model-predictive control (MPC) and whole-body impulse control (WBIC).
The authors evaluate the approach in dynamic simulations on three different robots and report that it transfers between humanoid platforms without additional training.

Plain English Explanation

The paper describes a new way for humanoid robots to move around in different types of environments. Humanoid robots are designed to look and move like humans, with two legs, two arms, and a torso. The researchers developed a control system that lets these robots cross discrete terrains with varying ground heights by choosing when to walk, jump, or leap.

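The key idea is to split the problem in two. A neural network handles the high-level decisions, such as which gait to use and where to place the next step. It learns these through reinforcement learning: the robot tries actions, receives feedback on whether they succeeded, and gradually improves. A separate model-based controller then works out the forces needed to carry out each decision. Because the network only has to learn the high-level strategy, it can be trained in simplified environments rather than in a full physics simulation.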

The researchers tested their approach extensively in dynamic simulations with three different robots. The results show that the robots can move across discrete terrains using far less training data than conventional reinforcement learning methods, and that the same system can be moved to a different humanoid without retraining.

Technical Explanation

The paper presents a locomotion architecture that pairs an RL-trained neural network policy with a model-based motion controller to enable generic and dynamic locomotion of humanoid robots across discrete terrains. The key contributions are:

Hierarchical Control Architecture: A neural network policy makes high-level locomotion decisions, such as gait selection and step positioning, while a motion controller combining model-predictive control (MPC) and whole-body impulse control (WBIC) executes them.
Sample-Efficient Training: Because the policy is trained in simplified environments that do not require full dynamics simulation, the authors report that it needs significantly fewer training samples than conventional RL-based methods.
Cross-Platform Transfer: The architecture is demonstrated on three different robots in dynamic simulation, and the authors report that it transfers to different humanoid platforms without additional training.
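To make the idea of training in a simplified environment concrete, here is a toy sketch in which a step-selection strategy is learned on a purely kinematic 1-D terrain, with no dynamics simulated. It is not the authors' method: the abstract describes a neural network policy, whereas this example uses tabular Q-learning, and the terrain layout, rewards, hyperparameters, and the mapping of gait names to step sizes are all assumptions made for illustration.

```python
import random

# Toy 1-D discrete terrain: 1 = solid foothold, 0 = gap (hypothetical layout).
TERRAIN = [1, 1, 0, 1, 1, 0, 0, 1, 1, 1]
GOAL = len(TERRAIN) - 1
# Action index -> (gait name, cells advanced). The gait names echo the
# abstract's examples; the step sizes are assumptions.
ACTIONS = {0: ("walk", 1), 1: ("jump", 2), 2: ("leap", 3)}


def step(pos, action):
    """Kinematic transition: move the foothold forward, no dynamics simulated."""
    new_pos = min(pos + ACTIONS[action][1], GOAL)
    if TERRAIN[new_pos] == 0:
        return new_pos, -1.0, True   # landed in a gap: episode fails
    if new_pos == GOAL:
        return new_pos, 1.0, True    # reached the far end: episode succeeds
    return new_pos, 0.0, False


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in TERRAIN]
    for _ in range(episodes):
        pos, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[pos][a])
            new_pos, reward, done = step(pos, action)
            target = reward if done else reward + gamma * max(q[new_pos])
            q[pos][action] += alpha * (target - q[pos][action])
            pos = new_pos
    return q


def greedy_rollout(q):
    """Follow the greedy policy from the start; return (gait names, final cell)."""
    pos, done, gaits = 0, False, []
    while not done:
        action = max(range(len(ACTIONS)), key=lambda a: q[pos][a])
        gaits.append(ACTIONS[action][0])
        pos, _, done = step(pos, action)
    return gaits, pos
```

Calling `greedy_rollout(train())` returns the sequence of gaits the learned table selects and the cell where the rollout ends. Each training step here is a table lookup and an integer addition, which is the sense in which a simplified environment is cheap compared with a full dynamics simulation.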

The policy takes ground height maps as input and outputs strategic locomotion decisions (e.g., whether to walk, jump, or leap, and where to place the next step). It does not compute reaction forces itself; that is left to the MPC and WBIC layers of the motion controller. The policy is trained with reinforcement learning.
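The abstract describes a split between a policy that makes high-level decisions from a ground height map and a motion controller (MPC plus WBIC) that executes them. The sketch below illustrates only the interface between those two layers. Everything in it is hypothetical: `StepCommand`, `high_level_policy`, and `motion_controller` are illustrative names, the hand-written rule stands in for the learned neural network, and the thresholds and the association of each gait with a terrain feature are assumptions rather than values from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StepCommand:
    gait: str             # "walk", "jump", or "leap"
    step_length: float    # forward distance to the next foothold, in metres
    height_change: float  # foothold height relative to the stance foot, in metres


def high_level_policy(height_map: Sequence[float], cell_size: float = 0.1,
                      stride_cells: int = 3, flat_tol: float = 0.05,
                      max_height_change: float = 0.3) -> StepCommand:
    """Hand-written stand-in for the learned policy (all thresholds are assumptions).

    height_map[i] is the ground height i cells ahead of the stance foot.
    """
    if len(height_map) <= stride_cells:
        raise ValueError("height map is shorter than the nominal stride")
    base = height_map[0]
    nominal = height_map[stride_cells] - base
    if abs(nominal) <= flat_tol:
        return StepCommand("walk", stride_cells * cell_size, nominal)
    if abs(nominal) <= max_height_change:
        return StepCommand("jump", stride_cells * cell_size, nominal)
    # The nominal foothold is unusable (e.g. a gap): search farther ahead.
    for i in range(stride_cells + 1, len(height_map)):
        change = height_map[i] - base
        if abs(change) <= max_height_change:
            return StepCommand("leap", i * cell_size, change)
    raise ValueError("no reachable foothold in the height map")


def motion_controller(cmd: StepCommand) -> str:
    """Placeholder for the MPC + WBIC layer, which would compute reaction
    forces and joint commands; here it only reports the request."""
    return (f"{cmd.gait}: foothold {cmd.step_length:.2f} m ahead, "
            f"{cmd.height_change:+.2f} m")
```

For example, `motion_controller(high_level_policy([0.0, 0.0, -1.0, -1.0, -1.0, 0.0, 0.0, 0.0]))` passes a height map with a gap to the rule and hands the resulting command to the placeholder controller. The point of the split is that the policy's output is small and platform-independent (a gait label and a foothold), which is consistent with the abstract's claim that the architecture transfers across humanoid platforms without additional training.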

Through extensive testing in dynamic simulations, the authors demonstrate terrain height-based dynamic locomotion on three different robots.

Critical Analysis

The paper presents a promising approach for enabling humanoid robots to navigate complex environments, which is an important capability for real-world deployment. However, there are a few potential limitations and areas for further research:

Sim-to-Real Gap: The reported results come from dynamic simulation. Performance on hardware may differ because of unmodeled real-world factors, so validating the architecture on physical robots is a natural next step.
Scalability to More Terrains: The approach is evaluated on discrete terrains described by ground height maps. Extending it to a broader range of terrain types and conditions would be valuable.
Computational Efficiency: The deep neural network policy may be computationally intensive, which could limit its deployment on resource-constrained robot platforms. Investigating more efficient policy representations or on-device optimization techniques could help address this concern.
Safety and Robustness: While the approach demonstrates robust locomotion, further research is needed to ensure the safety and reliability of the system, particularly when deployed in real-world scenarios with unpredictable hazards.

Overall, the paper presents an important step forward in the field of legged locomotion for humanoid robots, and the proposed techniques could have significant implications for the development of more capable and versatile robotic systems.

Conclusion

This paper introduces a locomotion architecture that combines an RL-trained neural network policy with an MPC and WBIC motion controller, enabling humanoid robots to navigate discrete terrains by walking, jumping, and leaping based on ground height maps. The authors demonstrate the method in dynamic simulations on three different robots and report that it needs significantly fewer training samples than conventional RL-based methods.

The proposed framework represents an important advancement in the field of legged robotics, as it provides a pathway for developing more versatile and capable humanoid robots that can operate in a wide range of real-world settings. While the paper highlights some potential limitations and areas for further research, the overall approach represents a significant step forward in the quest to create robots that can move and interact with the world as adeptly as humans.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Agile and versatile bipedal robot tracking control through reinforcement learning

Jiayi Li, Linqi Ye, Yi Cheng, Houde Liu, Bin Liang

The remarkable athletic intelligence displayed by humans in complex dynamic movements such as dancing and gymnastics suggests that the balance mechanism in biological beings is decoupled from specific movement patterns. This decoupling allows for the execution of both learned and unlearned movements under certain constraints while maintaining balance through minor whole-body coordination. To replicate this balance ability and body agility, this paper proposes a versatile controller for bipedal robots. This controller achieves ankle and body trajectory tracking across a wide range of gaits using a single small-scale neural network, which is based on a model-based IK solver and reinforcement learning. We consider a single step as the smallest control unit and design a universally applicable control input form suitable for any single-step variation. Highly flexible gait control can be achieved by combining these minimal control units with high-level policy through our extensible control interface. To enhance the trajectory-tracking capability of our controller, we utilize a three-stage training curriculum. After training, the robot can move freely between target footholds at varying distances and heights. The robot can also maintain static balance without repeated stepping to adjust posture. Finally, we evaluate the tracking accuracy of our controller on various bipedal tasks, and the effectiveness of our control framework is verified in the simulation environment.

4/15/2024

cs.RO cs.LG

Learning Robust Autonomous Navigation and Locomotion for Wheeled-Legged Robots

Joonho Lee, Marko Bjelonic, Alexander Reske, Lorenz Wellhausen, Takahiro Miki, Marco Hutter

Autonomous wheeled-legged robots have the potential to transform logistics systems, improving operational efficiency and adaptability in urban environments. Navigating urban environments, however, poses unique challenges for robots, necessitating innovative solutions for locomotion and navigation. These challenges include the need for adaptive locomotion across varied terrains and the ability to navigate efficiently around complex dynamic obstacles. This work introduces a fully integrated system comprising adaptive locomotion control, mobility-aware local navigation planning, and large-scale path planning within the city. Using model-free reinforcement learning (RL) techniques and privileged learning, we develop a versatile locomotion controller. This controller achieves efficient and robust locomotion over various rough terrains, facilitated by smooth transitions between walking and driving modes. It is tightly integrated with a learned navigation controller through a hierarchical RL framework, enabling effective navigation through challenging terrain and various obstacles at high speed. Our controllers are integrated into a large-scale urban navigation system and validated by autonomous, kilometer-scale navigation missions conducted in Zurich, Switzerland, and Seville, Spain. These missions demonstrate the system's robustness and adaptability, underscoring the importance of integrated control systems in achieving seamless navigation in complex environments. Our findings support the feasibility of wheeled-legged robots and hierarchical RL for autonomous navigation, with implications for last-mile delivery and beyond.

5/6/2024

cs.RO cs.LG cs.SY eess.SY

Learning H-Infinity Locomotion Control

Junfeng Long, Wenye Yu, Quanyi Li, Zirui Wang, Dahua Lin, Jiangmiao Pang

Stable locomotion in precipitous environments is an essential task for quadruped robots, requiring the ability to resist various external disturbances. Recent neural policies enhance robustness against disturbances by learning to resist external forces sampled from a fixed distribution in the simulated environment. However, the force generation process doesn't consider the robot's current state, making it difficult to identify the most effective direction and magnitude that can push the robot to the most unstable but recoverable state. Thus, challenging cases in the buffer are insufficient to optimize robustness. In this paper, we propose to model the robust locomotion learning process as an adversarial interaction between the locomotion policy and a learnable disturbance that is conditioned on the robot state to generate appropriate external forces. To make the joint optimization stable, our novel $H_{\infty}$ constraint mandates the bound of the ratio between the cost and the intensity of the external forces. We verify the robustness of our approach in both simulated environments and real-world deployment, on quadrupedal locomotion tasks and a more challenging task where the quadruped performs locomotion merely on hind legs. Training and deployment code will be made public.

6/13/2024

cs.RO

Adaptive Force-Based Control of Dynamic Legged Locomotion over Uneven Terrain

Mohsen Sombolestan, Quan Nguyen

Agile-legged robots have proven to be highly effective in navigating and performing tasks in complex and challenging environments, including disaster zones and industrial settings. However, these applications normally require the capability of carrying heavy loads while maintaining dynamic motion. Therefore, this paper presents a novel methodology for incorporating adaptive control into a force-based control system. Recent advancements in the control of quadruped robots show that force control can effectively realize dynamic locomotion over rough terrain. By integrating adaptive control into the force-based controller, our proposed approach can maintain the advantages of the baseline framework while adapting to significant model uncertainties and unknown terrain impact models. Experimental validation was successfully conducted on the Unitree A1 robot. With our approach, the robot can carry heavy loads (up to 50% of its weight) while performing dynamic gaits such as fast trotting and bounding across uneven terrains.

4/9/2024

cs.RO