Learning H-Infinity Locomotion Control

2404.14405

Published 6/13/2024 by Junfeng Long, Wenye Yu, Quanyi Li, Zirui Wang, Dahua Lin, Jiangmiao Pang

Abstract

Stable locomotion in precipitous environments is an essential task for quadruped robots, requiring the ability to resist various external disturbances. Recent neural policies enhance robustness against disturbances by learning to resist external forces sampled from a fixed distribution in the simulated environment. However, the force generation process doesn't consider the robot's current state, making it difficult to identify the most effective direction and magnitude that can push the robot to the most unstable but recoverable state. Thus, challenging cases in the buffer are insufficient to optimize robustness. In this paper, we propose to model the robust locomotion learning process as an adversarial interaction between the locomotion policy and a learnable disturbance that is conditioned on the robot state to generate appropriate external forces. To make the joint optimization stable, our novel $H_{\infty}$ constraint mandates the bound of the ratio between the cost and the intensity of the external forces. We verify the robustness of our approach in both simulated environments and real-world deployment, on quadrupedal locomotion tasks and a more challenging task where the quadruped performs locomotion merely on hind legs. Training and deployment code will be made public.

Overview

This paper presents a method for learning H-infinity locomotion control, which aims to make quadruped robots robust to external disturbances.
The proposed approach models robust locomotion learning as an adversarial interaction between the locomotion policy and a learnable disturbance that is conditioned on the robot state, with an H-infinity constraint that bounds the ratio between the cost and the intensity of the external forces.
The researchers evaluate their method in simulation and in real-world deployment, on quadrupedal locomotion tasks and on a harder task where the quadruped moves on its hind legs only.

Plain English Explanation

The researchers in this paper have developed a new way to train the controller of a four-legged (quadruped) robot. The goal is to keep the robot moving stably even when it is pushed by unpredictable external forces.

The key idea is to combine two different techniques: reinforcement learning and H-infinity control. Reinforcement learning is a type of machine learning where the robot learns by trial and error, getting rewards for desired behaviors. H-infinity control is a mathematical framework for designing robust control systems that can handle uncertainty.
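For readers who want the quantity behind the name, the block below gives the textbook definition of the H-infinity norm. It is standard robust control background, not a formula taken from the paper, and the symbols $T$, $w$, $z$, and $\gamma$ are chosen here for illustration. For a stable system $T$ that maps a disturbance signal $w$ to a performance output $z$, starting from rest, the norm is the worst-case energy gain, and bounding it by $\gamma$ bounds the output energy by the disturbance energy:

$$
\|T\|_{\infty} \;=\; \sup_{w \neq 0} \frac{\|z\|_{2}}{\|w\|_{2}},
\qquad
\|T\|_{\infty} \le \gamma
\;\Longleftrightarrow\;
\int_{0}^{\infty} \|z(t)\|^{2}\,dt \;\le\; \gamma^{2} \int_{0}^{\infty} \|w(t)\|^{2}\,dt
\quad \text{for all } w \in L_{2}.
$$

The abstract's constraint on "the ratio between the cost and the intensity of the external forces" has the same shape: a performance measure divided by a disturbance measure, kept below a bound.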

By blending these approaches, the researchers train the robot against a learned adversary that chooses each push based on the robot's current state, instead of against pushes drawn at random. This exposes the controller to the hardest pushes it can still recover from, which could be useful for robots that need to cross steep or rough terrain while being knocked around.

The paper describes experiments conducted in simulation and on a real quadruped robot to evaluate the method. According to the abstract, the results verify its robustness both in ordinary four-legged locomotion and when the robot walks on its hind legs only.

Technical Explanation

The researchers propose a framework for learning H-infinity locomotion control, which casts robust locomotion learning as an adversarial interaction between a locomotion policy and a learnable disturbance.

The key components of their approach are:

Fixed-distribution baseline: Recent neural policies gain robustness by learning to resist external forces sampled from a fixed distribution in simulation. Because that sampling ignores the robot's current state, it rarely finds the direction and magnitude that push the robot to its most unstable but still recoverable state.
Locomotion policy: A neural network policy is trained to keep the quadruped moving stably while external forces are applied to it.
Learnable disturbance: A second learned component is conditioned on the robot state and generates the external forces, so the policy is trained against pushes chosen for the situation it is actually in.
H-infinity constraint: To keep the joint optimization of the policy and the disturbance stable, the constraint bounds the ratio between the cost and the intensity of the external forces.
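The difference between a fixed force distribution and a state-conditioned disturbance, as described in the abstract, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: `sample_fixed_force`, `LinearDisturber`, the linear map, the state dimension, and the `max_force` value are all hypothetical choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)


def sample_fixed_force(max_force=50.0):
    """Baseline: draw a force from a fixed distribution, ignoring the robot state."""
    direction = rng.normal(size=3)
    direction /= np.linalg.norm(direction)
    return direction * rng.uniform(0.0, max_force)


class LinearDisturber:
    """Hypothetical state-conditioned disturber: a linear map from state to force.

    A learned disturber would more likely be a neural network; a linear map
    keeps the sketch short while still making the force depend on the state.
    """

    def __init__(self, state_dim, max_force=50.0):
        self.weights = rng.normal(scale=0.1, size=(3, state_dim))
        self.max_force = max_force

    def force(self, state):
        raw = self.weights @ state
        norm = np.linalg.norm(raw)
        # Clip the magnitude so the push stays within a physical limit.
        if norm > self.max_force:
            raw = raw * (self.max_force / norm)
        return raw


# Usage: the disturber's output depends on the state, the baseline's does not.
disturber = LinearDisturber(state_dim=4)
state = np.array([0.3, -0.1, 0.05, 1.2])  # placeholder state vector
state_conditioned_force = disturber.force(state)
random_force = sample_fixed_force()
```

In an adversarial setup, the disturber's weights would be updated to make the policy's task harder while the policy is updated to resist; that training loop is omitted here because the page does not describe it.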

The researchers verify their approach in simulated environments and in real-world deployment, on quadrupedal locomotion tasks and on a more challenging task where the quadruped performs locomotion on its hind legs only. They also state that training and deployment code will be made public.
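The abstract describes the H-infinity constraint as a bound on the ratio between the cost and the intensity of the external forces. One plausible reading is sketched below, under stated assumptions: cost is summed over an episode, intensity is the summed squared force magnitude, and the bound is a scalar `eta`. The function names, the squared-norm choice, and `eta` are hypothetical; the paper's exact formulation may differ.

```python
import numpy as np


def hinf_ratio(costs, forces, eps=1e-8):
    """Ratio of accumulated cost to accumulated disturbance intensity.

    costs:  per-step scalar costs, shape (T,)
    forces: per-step external forces, shape (T, 3)
    eps:    small constant that avoids division by zero when no force is applied
    """
    costs = np.asarray(costs, dtype=float)
    forces = np.asarray(forces, dtype=float)
    intensity = np.sum(np.linalg.norm(forces, axis=1) ** 2)
    return float(np.sum(costs) / (intensity + eps))


def satisfies_bound(costs, forces, eta):
    """True if the cost-to-intensity ratio stays at or below the bound eta."""
    return hinf_ratio(costs, forces) <= eta


# Usage on a three-step toy episode.
costs = [1.0, 2.0, 1.0]                    # total cost 4
forces = [[1, 0, 0], [0, 2, 0], [0, 0, 1]]  # total squared magnitude 1 + 4 + 1 = 6
ratio = hinf_ratio(costs, forces)           # about 0.667
within_loose_bound = satisfies_bound(costs, forces, eta=1.0)   # True
within_tight_bound = satisfies_bound(costs, forces, eta=0.5)   # False
```

Read this way, the bound stops the disturbance from winning simply by pushing harder: a stronger push raises the denominator, so it only counts against the policy if the cost grows faster than the force intensity.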

Critical Analysis

The paper presents a promising approach for improving the robustness of quadruped locomotion control. Pairing learned policies with an H-infinity constraint is a novel and well-motivated idea, as it brings a bounded-gain notion from robust control into data-driven training.

However, some limitations and potential issues that could matter for real-world deployment remain open. For example, the sample efficiency of the learning process, a crucial factor for practical applications, is not discussed in the abstract. Additionally, although the method is verified in simulation and in real-world deployment, its generalization to other robot designs or more complex environments is not fully demonstrated.

Further research could investigate ways to improve the sample efficiency of the learning process, such as by incorporating model-based techniques or transfer learning from other control tasks. It would also be valuable to test the method on a wider range of robot platforms and in more challenging real-world scenarios, to better understand its strengths, limitations, and potential for practical deployment.

Conclusion

This paper presents a novel approach to learning H-infinity locomotion control for quadruped robots, training a locomotion policy against a learnable, state-conditioned disturbance. The proposed method aims to keep locomotion stable and robust even in the face of strong external disturbances.

The key contributions of this work include an adversarial training framework stabilized by an H-infinity constraint on the ratio between cost and disturbance intensity, as well as the demonstration of the approach's effectiveness in simulation and in real-world deployment.

While the paper shows promising results, further research is needed to address certain limitations and explore the broader applicability of the method. Nonetheless, this work represents an important step towards more robust and reliable locomotion control for legged robots, with potential implications for a wide range of real-world applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains

Shangqun Yu, Nisal Perera, Daniel Marew, Donghyun Kim

This paper addresses the challenge of terrain-adaptive dynamic locomotion in humanoid robots, a problem traditionally tackled by optimization-based methods or reinforcement learning (RL). Optimization-based methods, such as model-predictive control, excel in finding optimal reaction forces and achieving agile locomotion, especially in quadruped, but struggle with the nonlinear hybrid dynamics of legged systems and the real-time computation of step location, timing, and reaction forces. Conversely, RL-based methods show promise in navigating dynamic and rough terrains but are limited by their extensive data requirements. We introduce a novel locomotion architecture that integrates a neural network policy, trained through RL in simplified environments, with a state-of-the-art motion controller combining model-predictive control (MPC) and whole-body impulse control (WBIC). The policy efficiently learns high-level locomotion strategies, such as gait selection and step positioning, without the need for full dynamics simulations. This control architecture enables humanoid robots to dynamically navigate discrete terrains, making strategic locomotion decisions (e.g., walking, jumping, and leaping) based on ground height maps. Our results demonstrate that this integrated control architecture achieves dynamic locomotion with significantly fewer training samples than conventional RL-based methods and can be transferred to different humanoid platforms without additional training. The control architecture has been extensively tested in dynamic simulations, accomplishing terrain height-based dynamic locomotion for three different robots.

5/28/2024

cs.RO

Rethinking Robustness Assessment: Adversarial Attacks on Learning-based Quadrupedal Locomotion Controllers

Fan Shi, Chong Zhang, Takahiro Miki, Joonho Lee, Marco Hutter, Stelian Coros

Legged locomotion has recently achieved remarkable success with the progress of machine learning techniques, especially deep reinforcement learning (RL). Controllers employing neural networks have demonstrated empirical and qualitative robustness against real-world uncertainties, including sensor noise and external perturbations. However, formally investigating the vulnerabilities of these locomotion controllers remains a challenge. This difficulty arises from the requirement to pinpoint vulnerabilities across a long-tailed distribution within a high-dimensional, temporally sequential space. As a first step towards quantitative verification, we propose a computational method that leverages sequential adversarial attacks to identify weaknesses in learned locomotion controllers. Our research demonstrates that, even state-of-the-art robust controllers can fail significantly under well-designed, low-magnitude adversarial sequence. Through experiments in simulation and on the real robot, we validate our approach's effectiveness, and we illustrate how the results it generates can be used to robustify the original policy and offer valuable insights into the safety of these black-box policies. Project page: https://fanshi14.github.io/me/rss24.html

6/3/2024

cs.RO cs.LG

Physically Consistent Online Inertial Adaptation for Humanoid Loco-manipulation

James Foster, Stephen McCrory, Christian DeBuys, Sylvain Bertrand, Robert Griffin

The ability to accomplish manipulation and locomotion tasks in the presence of significant time-varying external loads is a remarkable skill of humans that has yet to be replicated convincingly by humanoid robots. Such an ability will be a key requirement in the environments we envision deploying our robots: dull, dirty, and dangerous. External loads constitute a large model bias, which is typically unaccounted for. In this work, we enable our humanoid robot to engage in loco-manipulation tasks in the presence of significant model bias due to external loads. We propose an online estimation and control framework involving the combination of a physically consistent extended Kalman filter for inertial parameter estimation coupled to a whole-body controller. We showcase our results both in simulation and in hardware, where weights are mounted on Nadia's wrist links as a proxy for engaging in tasks where large external loads are applied to the robot.

5/14/2024

cs.RO

Agile and versatile bipedal robot tracking control through reinforcement learning

Jiayi Li, Linqi Ye, Yi Cheng, Houde Liu, Bin Liang

The remarkable athletic intelligence displayed by humans in complex dynamic movements such as dancing and gymnastics suggests that the balance mechanism in biological beings is decoupled from specific movement patterns. This decoupling allows for the execution of both learned and unlearned movements under certain constraints while maintaining balance through minor whole-body coordination. To replicate this balance ability and body agility, this paper proposes a versatile controller for bipedal robots. This controller achieves ankle and body trajectory tracking across a wide range of gaits using a single small-scale neural network, which is based on a model-based IK solver and reinforcement learning. We consider a single step as the smallest control unit and design a universally applicable control input form suitable for any single-step variation. Highly flexible gait control can be achieved by combining these minimal control units with high-level policy through our extensible control interface. To enhance the trajectory-tracking capability of our controller, we utilize a three-stage training curriculum. After training, the robot can move freely between target footholds at varying distances and heights. The robot can also maintain static balance without repeated stepping to adjust posture. Finally, we evaluate the tracking accuracy of our controller on various bipedal tasks, and the effectiveness of our control framework is verified in the simulation environment.

4/15/2024

cs.RO cs.LG