Agile and versatile bipedal robot tracking control through reinforcement learning

arXiv: 2404.08246

Published 4/15/2024 by Jiayi Li, Linqi Ye, Yi Cheng, Houde Liu, Bin Liang

Abstract

The remarkable athletic intelligence displayed by humans in complex dynamic movements such as dancing and gymnastics suggests that the balance mechanism in biological beings is decoupled from specific movement patterns. This decoupling allows for the execution of both learned and unlearned movements under certain constraints while maintaining balance through minor whole-body coordination. To replicate this balance ability and body agility, this paper proposes a versatile controller for bipedal robots. This controller achieves ankle and body trajectory tracking across a wide range of gaits using a single small-scale neural network, which is based on a model-based IK solver and reinforcement learning. We consider a single step as the smallest control unit and design a universally applicable control input form suitable for any single-step variation. Highly flexible gait control can be achieved by combining these minimal control units with high-level policy through our extensible control interface. To enhance the trajectory-tracking capability of our controller, we utilize a three-stage training curriculum. After training, the robot can move freely between target footholds at varying distances and heights. The robot can also maintain static balance without repeated stepping to adjust posture. Finally, we evaluate the tracking accuracy of our controller on various bipedal tasks, and the effectiveness of our control framework is verified in the simulation environment.

Overview

The paper proposes a versatile controller for bipedal robots that can execute a wide range of learned and unlearned movements while maintaining balance.
The controller uses a small-scale neural network based on a model-based inverse kinematics solver and reinforcement learning.
The approach treats a single step as the smallest control unit and designs a universally applicable control input form.
The controller is trained using a three-stage curriculum to enhance its trajectory-tracking capabilities.
The robot can move between target footholds, maintain static balance, and perform various bipedal tasks effectively in simulation.

Plain English Explanation

Humans can perform complex, dynamic movements such as dancing and gymnastics while keeping their balance. This suggests that the balance mechanism in biological beings is not tied to specific movement patterns. Instead, it supports both learned and unlearned movements, within certain constraints, with balance maintained through minor whole-body coordination.

To replicate this balance and agility in bipedal robots, the researchers developed a versatile controller. It uses a single small neural network that builds on a model-based inverse kinematics solver and is trained with reinforcement learning. The key idea is to treat a single step as the smallest control unit and to design a universal control input form that can describe any single-step variation.

By combining these minimal control units with high-level policies, the researchers were able to achieve highly flexible gait control. To further enhance the trajectory-tracking capabilities, the controller was trained using a three-stage curriculum.

After training, the robot demonstrated impressive abilities. It could move freely between target footholds at varying distances and heights, and it could also maintain static balance without repeated stepping to adjust its posture. The researchers evaluated the tracking accuracy of the controller on various bipedal tasks, and the effectiveness of the control framework was verified in a simulation environment.

Technical Explanation

The core of the proposed controller is a single small-scale neural network that builds on a model-based inverse kinematics (IK) solver and reinforcement learning, and that tracks ankle and body trajectories across a wide range of gaits. By treating a single step as the smallest control unit, the researchers designed a universally applicable control input form suitable for any single-step variation. A high-level policy can then combine these units through an extensible control interface.
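To make the idea of a single-step control unit more concrete, here is a minimal Python sketch. It is not the paper's implementation: the `StepCommand` fields, the smoothstep-plus-sine swing profile, the planar two-link leg, and the 0.4 m link lengths are all assumptions chosen for illustration. The sketch shows how one step command could be turned into an ankle reference, and how a simple analytic IK solver could convert that reference into hip and knee angles.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StepCommand:
    """Hypothetical single-step control unit (field names are illustrative)."""
    dx: float        # forward offset of the target foothold (m)
    dz: float        # height change of the target foothold (m)
    apex: float      # extra swing clearance added at mid-step (m)
    duration: float  # time allotted to the step (s)


def swing_ankle_target(cmd: StepCommand, t: float) -> Tuple[float, float]:
    """Reference swing-ankle position (x, z) relative to lift-off at time t."""
    s = min(max(t / cmd.duration, 0.0), 1.0)  # step phase clamped to [0, 1]
    blend = 3 * s**2 - 2 * s**3               # smoothstep, zero slope at both ends
    x = cmd.dx * blend
    z = cmd.dz * blend + cmd.apex * math.sin(math.pi * s)
    return x, z


def two_link_leg_ik(x: float, z: float,
                    thigh: float = 0.4, shank: float = 0.4) -> Tuple[float, float]:
    """Analytic IK for an assumed planar two-link leg.

    (x, z) is the ankle position relative to the hip, with z negative below
    the hip. Returns (hip, knee) in radians. The hip angle is measured from
    straight down, positive forward; the knee angle is negative when flexed.
    """
    d2 = x * x + z * z
    d = math.sqrt(d2)
    if d > thigh + shank or d < abs(thigh - shank):
        raise ValueError("ankle target is out of reach")
    cos_knee = (d2 - thigh**2 - shank**2) / (2 * thigh * shank)
    knee = -math.acos(max(-1.0, min(1.0, cos_knee)))  # pick the knee-forward solution
    hip = math.atan2(x, -z) - math.atan2(shank * math.sin(knee),
                                         thigh + shank * math.cos(knee))
    return hip, knee


def leg_fk(hip: float, knee: float,
           thigh: float = 0.4, shank: float = 0.4) -> Tuple[float, float]:
    """Forward kinematics of the same leg, used to check the IK result."""
    x = thigh * math.sin(hip) + shank * math.sin(hip + knee)
    z = -thigh * math.cos(hip) - shank * math.cos(hip + knee)
    return x, z
```

In a learned controller of the kind described here, an IK solution like this would more plausibly serve as a reference that the policy corrects than as the final joint command, but that division of labor is also an assumption of this sketch.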

To enhance the trajectory-tracking capability of the controller, the researchers employed a three-stage training curriculum. The first stage focused on learning basic stepping motions, the second stage introduced more complex step variations, and the final stage combined the learned skills into a versatile gait controller.
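A staged curriculum of this kind can be sketched as a small scheduler that widens the range of sampled foothold targets once tracking error is low enough. The stage names below follow the description above, but every number (step ranges, promotion thresholds, window size) and the promotion rule itself are hypothetical and not taken from the paper.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    max_step_length: float  # largest sampled foothold distance (m), assumed
    max_step_height: float  # largest sampled foothold height change (m), assumed
    promote_below: float    # mean tracking error (m) needed to advance, assumed


# Hypothetical three-stage schedule; all values are placeholders.
STAGES = [
    Stage("basic stepping", 0.2, 0.0, 0.05),
    Stage("varied steps", 0.4, 0.1, 0.04),
    Stage("combined gaits", 0.6, 0.2, 0.0),
]


class Curriculum:
    """Advance to the next stage when recent mean tracking error is small."""

    def __init__(self, stages: List[Stage], window: int = 3):
        self.stages = stages
        self.window = window
        self.index = 0
        self.recent: List[float] = []

    @property
    def stage(self) -> Stage:
        return self.stages[self.index]

    def sample_target(self, rng: random.Random) -> Tuple[float, float]:
        """Draw a foothold offset (length, height) within the current stage limits."""
        s = self.stage
        return (rng.uniform(-s.max_step_length, s.max_step_length),
                rng.uniform(-s.max_step_height, s.max_step_height))

    def report(self, tracking_error: float) -> bool:
        """Record one episode's error; return True if the stage advanced."""
        self.recent = (self.recent + [tracking_error])[-self.window:]
        is_last = self.index == len(self.stages) - 1
        if is_last or len(self.recent) < self.window:
            return False
        if sum(self.recent) / self.window < self.stage.promote_below:
            self.index += 1
            self.recent = []  # start the next stage with a fresh window
            return True
        return False
```

In a training loop, `sample_target` would be called at the start of each episode and `report` at its end, so that task difficulty grows only after the easier stage is tracked well.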

After training, the robot demonstrated impressive abilities in simulation. It could freely move between target footholds at varying distances and heights, and it could also maintain static balance without repeated stepping adjustments. The researchers evaluated the tracking accuracy of the controller on various bipedal tasks, such as navigating through obstacles and performing dynamic maneuvers.
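One common way to quantify tracking accuracy is the root-mean-square error between reference and measured positions along a trajectory. The short sketch below illustrates that metric; the summary does not state which metric the paper actually reports, so treat the choice of RMSE and the function name `tracking_rmse` as assumptions.

```python
import math
from typing import Sequence


def tracking_rmse(reference: Sequence[Sequence[float]],
                  actual: Sequence[Sequence[float]]) -> float:
    """Root-mean-square Euclidean distance between two sampled trajectories.

    Each trajectory is a sequence of points (for example, ankle positions in
    metres) sampled at the same time instants.
    """
    if len(reference) != len(actual) or len(reference) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((r - a) ** 2 for r, a in zip(ref_point, act_point))
        for ref_point, act_point in zip(reference, actual)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Example with made-up numbers: 3 cm and 4 cm errors at two samples.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
actual = [(0.0, 0.0, 0.03), (1.0, 0.04, 0.0)]
error = tracking_rmse(reference, actual)
```

The same function can be applied to body (torso) trajectories, which gives one number per tracked quantity and per task.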

Critical Analysis

The paper presents a promising approach to versatile and agile bipedal locomotion. Its central insight, that balance can be decoupled from specific movement patterns, is a key contribution and is consistent with the capabilities observed in human athletes.

However, the paper does not provide extensive details on the specific neural network architecture or the reinforcement learning techniques used. Additionally, the evaluation is limited to simulation environments, and it would be valuable to see how the controller performs in real-world experiments with a physical robot.

Furthermore, the paper does not address potential issues with the scalability of the approach, such as the ability to handle more complex environments or adapt to unexpected disturbances. Exploring the robustness and generalization of the controller in more challenging scenarios would be an important area for future research.

Conclusion

The proposed versatile controller for bipedal robots represents an important step towards replicating the balance and agility observed in human athletes. By decoupling balance from specific movement patterns and using a small-scale neural network, the researchers have developed a flexible control framework that can handle a wide range of gaits and maintain balance through minor whole-body coordination.

While the evaluation is limited to simulation, the results are promising and show the potential of this approach for more dynamic and adaptable bipedal locomotion. Further research is needed on scalability, robustness, and real-world performance, but the work provides a useful foundation for legged robotics and for human-like movement in robots.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains

Shangqun Yu, Nisal Perera, Daniel Marew, Donghyun Kim

This paper addresses the challenge of terrain-adaptive dynamic locomotion in humanoid robots, a problem traditionally tackled by optimization-based methods or reinforcement learning (RL). Optimization-based methods, such as model-predictive control, excel in finding optimal reaction forces and achieving agile locomotion, especially in quadrupeds, but struggle with the nonlinear hybrid dynamics of legged systems and the real-time computation of step location, timing, and reaction forces. Conversely, RL-based methods show promise in navigating dynamic and rough terrains but are limited by their extensive data requirements. We introduce a novel locomotion architecture that integrates a neural network policy, trained through RL in simplified environments, with a state-of-the-art motion controller combining model-predictive control (MPC) and whole-body impulse control (WBIC). The policy efficiently learns high-level locomotion strategies, such as gait selection and step positioning, without the need for full dynamics simulations. This control architecture enables humanoid robots to dynamically navigate discrete terrains, making strategic locomotion decisions (e.g., walking, jumping, and leaping) based on ground height maps. Our results demonstrate that this integrated control architecture achieves dynamic locomotion with significantly fewer training samples than conventional RL-based methods and can be transferred to different humanoid platforms without additional training. The control architecture has been extensively tested in dynamic simulations, accomplishing terrain height-based dynamic locomotion for three different robots.

5/28/2024

cs.RO

Optimal Gait Control for a Tendon-driven Soft Quadruped Robot by Model-based Reinforcement Learning

Xuezhi Niu, Kaige Tan, Lei Feng

This study presents an innovative approach to optimal gait control for a soft quadruped robot enabled by four Compressible Tendon-driven Soft Actuators (CTSAs). Improving our previous studies of using model-free reinforcement learning for gait control, we employ model-based reinforcement learning (MBRL) to further enhance the performance of the gait controller. Compared to rigid robots, the proposed soft quadruped robot has better safety, less weight, and a simpler mechanism for fabrication and control. However, the primary challenge lies in developing sophisticated control algorithms to attain optimal gait control for fast and stable locomotion. The research employs a multi-stage methodology, including state space restriction, data-driven model training, and reinforcement learning algorithm development. Compared to benchmark methods, the proposed MBRL algorithm, combined with post-training, significantly improves the efficiency and performance of gait control policies. The developed policy is both robust and adaptable to the robot's deformable morphology. The study concludes by highlighting the practical applicability of these findings in real-world scenarios.

6/12/2024

cs.RO cs.SY eess.SY

I-CTRL: Imitation to Control Humanoid Robots Through Constrained Reinforcement Learning

Yashuai Yan, Esteve Valls Mascaro, Tobias Egle, Dongheui Lee

This paper addresses the critical need for refining robot motions that, despite achieving a high visual similarity through human-to-humanoid retargeting methods, fall short of practical execution in the physical realm. Existing techniques in the graphics community often prioritize visual fidelity over physics-based feasibility, posing a significant challenge for deploying bipedal systems in practical applications. Our research introduces a constrained reinforcement learning algorithm to produce physics-based high-quality motion imitation onto legged humanoid robots that enhance motion resemblance while successfully following the reference human trajectory. We name our framework: I-CTRL. By reformulating the motion imitation problem as a constrained refinement over non-physics-based retargeted motions, our framework excels in motion imitation with simple and unique rewards that generalize across four robots. Moreover, our framework can follow large-scale motion datasets with a unique RL agent. The proposed approach signifies a crucial step forward in advancing the control of bipedal robots, emphasizing the importance of aligning visual and physical realism for successful motion imitation.

5/15/2024

cs.RO cs.AI

A Model Predictive Capture Point Control Framework for Robust Humanoid Balancing via Ankle, Hip, and Stepping Strategies

Myeong-Ju Kim, Daegyu Lim, Gyeongjae Park, Jaeheung Park

The robust balancing capability of humanoid robots has been considered one of the crucial requirements for their mobility in real environments. In particular, many studies have been devoted to the efficient implementation of human-inspired ankle, hip, and stepping strategies, to endow humanoids with human-level balancing capability. In this paper, a robust balance control framework for humanoids is proposed. Firstly, a Model Predictive Control (MPC) framework is proposed for Capture Point (CP) tracking control, enabling the integration of ankle, hip, and stepping strategies within a single framework. Additionally, a variable weighting method is introduced that adjusts the weighting parameters of the Centroidal Angular Momentum (CAM) damping control. Secondly, a hierarchical structure of the MPC and a stepping controller is proposed, allowing for step time optimization. The robust balancing performance of the proposed method is validated through simulations and real robot experiments. Furthermore, a superior balancing performance is demonstrated compared to a state-of-the-art Quadratic Programming (QP)-based CP controller that employs the ankle, hip, and stepping strategies. The supplementary video is available at https://youtu.be/7Y4CykTpgrw

5/14/2024

cs.RO