Humanoid Parkour Learning

2406.10759

Published 6/18/2024 by Ziwen Zhuang, Shenzhe Yao, Hang Zhao

Abstract

Parkour is a grand challenge for legged locomotion, even for quadruped robots, requiring active perception and various maneuvers to overcome multiple challenging obstacles. Existing methods for humanoid locomotion either optimize a trajectory for a single parkour track or train a reinforcement learning policy only to walk with a significant amount of motion references. In this work, we propose a framework for learning an end-to-end vision-based whole-body-control parkour policy for humanoid robots that overcomes multiple parkour skills without any motion prior. Using the parkour policy, the humanoid robot can jump on a 0.42m platform, leap over hurdles, 0.8m gaps, and much more. It can also run at 1.8m/s in the wild and walk robustly on different terrains. We test our policy in indoor and outdoor environments to demonstrate that it can autonomously select parkour skills while following the rotation command of the joystick. We override the arm actions and show that this framework can easily transfer to humanoid mobile manipulation tasks. Videos can be found at https://humanoid4parkour.github.io


Overview

This paper presents an approach to teaching humanoid robots parkour: dynamic locomotion over obstacles such as platforms, hurdles, and gaps.
The researchers developed a learning framework that trains a single end-to-end, vision-based whole-body-control policy covering multiple parkour skills, including jumping onto a 0.42m platform and leaping over hurdles and 0.8m gaps, without any motion prior.
According to the abstract, the policy was tested in indoor and outdoor environments, where the robot selects parkour skills autonomously while following the joystick's rotation command. It also runs at 1.8m/s in the wild and walks robustly on different terrains.

Plain English Explanation

The researchers have created a system that teaches humanoid robots to do parkour, a type of athletic movement in which people jump, climb, and balance over obstacles and uneven surfaces. The robot learns these skills on its own rather than being directly programmed, and it does so without copying reference motions. Using vision, it picks the right move for the obstacle in front of it, such as jumping onto a platform or leaping over a gap, while a person steers its heading with a joystick. This is a step towards robots that can move through complex, unstructured environments more effectively.

Technical Explanation

The paper presents a framework for training an end-to-end, vision-based whole-body-control parkour policy for humanoid robots. According to the abstract, a single policy covers multiple parkour skills and is learned without any motion prior, in contrast to existing humanoid methods that either optimize a trajectory for a single parkour track or train a walking policy with a significant amount of motion references. Three related works provide context; they are not components of this system:

Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains: Combines a neural network policy trained through RL with a motion controller built on model-predictive control and whole-body impulse control, so that the policy makes high-level decisions such as gait selection and step positioning from ground height maps.
HumanPlus: Humanoid Shadowing and Imitation from Humans: A full-stack system in which humanoids learn motion and autonomous skills from human data. The parkour policy, by contrast, uses no motion prior.
Agile and Versatile Bipedal Robot Tracking and Control through Contact-Adaptive Whole-Body Behaviors: Work on whole-body control and tracking for bipedal robots.

The abstract reports tests in indoor and outdoor environments, where the humanoid jumps onto a 0.42m platform, leaps over hurdles and 0.8m gaps, runs at 1.8m/s, and selects parkour skills autonomously while following the joystick's rotation command.
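The abstract notes that the arm actions can be overridden so that the framework transfers to mobile manipulation. The following is a minimal sketch of that idea only, not the authors' implementation: the joint count, the index layout (12 leg and waist joints followed by 8 arm joints), and the function name `override_arm_actions` are all hypothetical and chosen for illustration.

```python
from typing import Dict, List


def override_arm_actions(
    policy_action: List[float],
    arm_targets: Dict[int, float],
) -> List[float]:
    """Return a copy of policy_action with the selected joints replaced.

    policy_action: per-joint targets produced by the locomotion policy.
    arm_targets:   joint index -> externally supplied target (hypothetical
                   source, e.g. a manipulation controller or teleoperation).
    """
    merged = list(policy_action)  # copy so the policy output is not mutated
    for joint_index, target in arm_targets.items():
        if not 0 <= joint_index < len(merged):
            raise IndexError(f"joint index {joint_index} out of range")
        merged[joint_index] = target
    return merged


# Hypothetical layout (an assumption, not taken from the paper):
# joints 0..11 are legs and waist, joints 12..19 are arms.
NUM_JOINTS = 20
ARM_JOINTS = range(12, 20)

policy_action = [0.1] * NUM_JOINTS           # stand-in for the policy output
arm_command = {j: 0.5 for j in ARM_JOINTS}   # stand-in for an arm command
merged_action = override_arm_actions(policy_action, arm_command)
```

In this sketch the leg and waist targets pass through unchanged, so the locomotion behavior stays with the learned policy while the arms follow the external command.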

Critical Analysis

The paper presents a compelling approach to teaching humanoid robots parkour skills, which could have significant implications for developing more agile and adaptable robotic systems. The abstract reports tests in both indoor and outdoor environments, but it does not state how reliably each skill succeeds, so readers should consult the full paper for quantitative results.

Additionally, the abstract does not discuss the safety and ethical concerns associated with deploying such highly dynamic robotic systems in unstructured, public spaces where humans may be present. Careful consideration of these issues will be necessary as the technology continues to advance.

Conclusion

This research is a step towards humanoid robots that can navigate and operate in complex, unstructured environments. By training an end-to-end, vision-based whole-body-control policy without any motion prior, the researchers show that a single policy can cover a diverse set of parkour skills, and that overriding the arm actions lets the framework transfer to humanoid mobile manipulation tasks. As the technology continues to evolve, it will be crucial to address the safety and ethical implications of deploying such highly dynamic robotic systems in real-world settings.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains

Shangqun Yu, Nisal Perera, Daniel Marew, Donghyun Kim

This paper addresses the challenge of terrain-adaptive dynamic locomotion in humanoid robots, a problem traditionally tackled by optimization-based methods or reinforcement learning (RL). Optimization-based methods, such as model-predictive control, excel in finding optimal reaction forces and achieving agile locomotion, especially in quadruped, but struggle with the nonlinear hybrid dynamics of legged systems and the real-time computation of step location, timing, and reaction forces. Conversely, RL-based methods show promise in navigating dynamic and rough terrains but are limited by their extensive data requirements. We introduce a novel locomotion architecture that integrates a neural network policy, trained through RL in simplified environments, with a state-of-the-art motion controller combining model-predictive control (MPC) and whole-body impulse control (WBIC). The policy efficiently learns high-level locomotion strategies, such as gait selection and step positioning, without the need for full dynamics simulations. This control architecture enables humanoid robots to dynamically navigate discrete terrains, making strategic locomotion decisions (e.g., walking, jumping, and leaping) based on ground height maps. Our results demonstrate that this integrated control architecture achieves dynamic locomotion with significantly fewer training samples than conventional RL-based methods and can be transferred to different humanoid platforms without additional training. The control architecture has been extensively tested in dynamic simulations, accomplishing terrain height-based dynamic locomotion for three different robots.

5/28/2024

cs.RO

HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation

Carmelo Sferrazza, Dun-Ming Huang, Xingyu Lin, Youngwoon Lee, Pieter Abbeel

Humanoid robots hold great promise in assisting humans in diverse environments and tasks, due to their flexibility and adaptability leveraging human-like morphology. However, research in humanoid robots is often bottlenecked by the costly and fragile hardware setups. To accelerate algorithmic research in humanoid robots, we present a high-dimensional, simulated robot learning benchmark, HumanoidBench, featuring a humanoid robot equipped with dexterous hands and a variety of challenging whole-body manipulation and locomotion tasks. Our findings reveal that state-of-the-art reinforcement learning algorithms struggle with most tasks, whereas a hierarchical learning approach achieves superior performance when supported by robust low-level policies, such as walking or reaching. With HumanoidBench, we provide the robotics community with a platform to identify the challenges arising when solving diverse tasks with humanoid robots, facilitating prompt verification of algorithms and ideas. The open-source code is available at https://humanoid-bench.github.io.

6/21/2024

cs.RO cs.AI cs.LG

HumanPlus: Humanoid Shadowing and Imitation from Humans

Zipeng Fu, Qingqing Zhao, Qi Wu, Gordon Wetzstein, Chelsea Finn

One of the key arguments for building robots that have similar form factors to human beings is that we can leverage the massive human data for training. Yet, doing so has remained challenging in practice due to the complexities in humanoid perception and control, lingering physical gaps between humanoids and humans in morphologies and actuation, and lack of a data pipeline for humanoids to learn autonomous skills from egocentric vision. In this paper, we introduce a full-stack system for humanoids to learn motion and autonomous skills from human data. We first train a low-level policy in simulation via reinforcement learning using existing 40-hour human motion datasets. This policy transfers to the real world and allows humanoid robots to follow human body and hand motion in real time using only a RGB camera, i.e. shadowing. Through shadowing, human operators can teleoperate humanoids to collect whole-body data for learning different tasks in the real world. Using the data collected, we then perform supervised behavior cloning to train skill policies using egocentric vision, allowing humanoids to complete different tasks autonomously by imitating human skills. We demonstrate the system on our customized 33-DoF 180cm humanoid, autonomously completing tasks such as wearing a shoe to stand up and walk, unloading objects from warehouse racks, folding a sweatshirt, rearranging objects, typing, and greeting another robot with 60-100% success rates using up to 40 demonstrations. Project website: https://humanoid-ai.github.io/

6/18/2024

cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY

HYPERmotion: Learning Hybrid Behavior Planning for Autonomous Loco-manipulation

Jin Wang, Rui Dai, Weijie Wang, Luca Rossini, Francesco Ruscelli, Nikos Tsagarakis

Enabling robots to autonomously perform hybrid motions in diverse environments can be beneficial for long-horizon tasks such as material handling, household chores, and work assistance. This requires extensive exploitation of intrinsic motion capabilities, extraction of affordances from rich environmental information, and planning of physical interaction behaviors. Despite recent progress has demonstrated impressive humanoid whole-body control abilities, they struggle to achieve versatility and adaptability for new tasks. In this work, we propose HYPERmotion, a framework that learns, selects and plans behaviors based on tasks in different scenarios. We combine reinforcement learning with whole-body optimization to generate motion for 38 actuated joints and create a motion library to store the learned skills. We apply the planning and reasoning features of the large language models (LLMs) to complex loco-manipulation tasks, constructing a hierarchical task graph that comprises a series of primitive behaviors to bridge lower-level execution with higher-level planning. By leveraging the interaction of distilled spatial geometry and 2D observation with a visual language model (VLM) to ground knowledge into a robotic morphology selector to choose appropriate actions in single- or dual-arm, legged or wheeled locomotion. Experiments in simulation and real-world show that learned motions can efficiently adapt to new tasks, demonstrating high autonomy from free-text commands in unstructured scenes. Videos and website: hy-motion.github.io/

6/24/2024

cs.RO cs.AI cs.LG