Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

arXiv:2111.09971

Published 4/4/2024 by Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni

Abstract

This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations. We assume that a model of the system dynamics and a state estimator are available along with corresponding error bounds, e.g., estimated from data in practice. We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety, as defined through controlled forward invariance of a safe set. We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior, e.g., data collected from a human operator or an expert controller. When the parametrization of the ROCBF is linear, then we show that, under mild assumptions, the optimization problem is convex. Along with the optimization problem, we provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds that guarantee validity of the obtained ROCBF. Towards obtaining a practical control algorithm, we propose an algorithmic implementation of our theoretical framework that accounts for assumptions made in our framework in practice. We validate our algorithm in the autonomous driving simulator CARLA and demonstrate how to learn safe control laws from simulated RGB camera images.

Overview

This paper addresses the problem of learning safe control laws from partial observations of expert demonstrations.
The researchers assume the availability of a system dynamics model and state estimator, along with corresponding error bounds.
The key idea is to use robust output control barrier functions (ROCBFs) to guarantee safety, defined as the controlled forward invariance of a safe set.
The researchers formulate an optimization problem to learn ROCBFs from expert demonstrations exhibiting safe system behavior.
When the ROCBF parametrization is linear, the optimization problem is shown to be convex under mild assumptions.
The researchers also provide verifiable conditions to ensure the validity of the obtained ROCBF.
An algorithmic implementation is proposed to account for practical assumptions in the framework.
The approach is validated in the CARLA autonomous driving simulator, learning safe control laws from simulated RGB camera images.
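The validity conditions mentioned above (data density, smoothness, error bounds) follow a familiar pattern that can be sketched in a few lines of Python. This is not the paper's actual condition, only the generic Lipschitz argument behind conditions of this kind: if a constraint function is L-Lipschitz, holds with margin `margin` at every sample, and every point of the region lies within `eps` of some sample, then it holds on the whole region whenever `margin >= L * eps`. The one-dimensional region, the sample sets, and the values of `lipschitz` and `margin` are all assumptions chosen for illustration.

```python
def covering_radius_1d(samples, lo, hi):
    """Largest distance from any point of [lo, hi] to its nearest sample.

    Assumes at least one sample and that all samples lie inside [lo, hi].
    """
    s = sorted(samples)
    half_gaps = [(b - a) / 2.0 for a, b in zip(s, s[1:])]
    return max([s[0] - lo, hi - s[-1]] + half_gaps)


def margin_certifies_region(margin, lipschitz, eps):
    """True if a margin at the samples extends to every point within eps of a sample."""
    return margin >= lipschitz * eps


# Hypothetical numbers: a constraint with Lipschitz constant 0.5 and margin 0.1.
dense = [-0.5, -0.25, 0.0, 0.25, 0.5]
sparse = [-0.5, 0.5]

eps_dense = covering_radius_1d(dense, -0.5, 0.5)
eps_sparse = covering_radius_1d(sparse, -0.5, 0.5)

dense_ok = margin_certifies_region(0.1, 0.5, eps_dense)
sparse_ok = margin_certifies_region(0.1, 0.5, eps_sparse)
print(eps_dense, dense_ok, eps_sparse, sparse_ok)
```

The same margin certifies the densely sampled region but not the sparsely sampled one, which is the trade-off the conditions capture: sparser data or a less smooth model calls for a larger margin.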

Plain English Explanation

The paper explores a way to learn safe control strategies by observing how experts, such as human drivers or advanced controllers, operate a system. The key idea is to use a mathematical construct called a "robust output control barrier function" (ROCBF) to ensure that the learned control strategy keeps the system within a "safe" region of operation.

Imagine you're trying to teach a self-driving car how to navigate safely. You have access to a model of how the car behaves and a way to estimate the car's current state, but these models aren't perfect - they have some inherent uncertainty or error. The researchers' approach allows you to learn a control strategy from watching an expert human driver, and the ROCBF ensures that the learned strategy will keep the car within safe bounds, even in the presence of these modeling uncertainties.
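To make the idea of staying inside a safe region despite estimation error concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a toy one-dimensional system `x_dot = u`, a safe set `x <= x_max`, and a state estimate whose error is bounded by a hypothetical `delta`. All names (`robust_safety_filter`, `simulate`, `alpha`, `delta`) are illustrative.

```python
import random


def robust_safety_filter(x_hat, u_nom, x_max, delta, alpha):
    """Clip the nominal input so that h(x) = x_max - x stays nonnegative.

    Only the estimate x_hat is available, with |x - x_hat| <= delta assumed,
    so the filter uses the worst-case barrier value h(x_hat) - delta.
    """
    h_worst = (x_max - x_hat) - delta
    # For x_dot = u, the barrier condition dh/dt >= -alpha * h reads u <= alpha * h.
    return min(u_nom, alpha * h_worst)


def simulate(steps=2000, dt=0.01, x0=0.0, x_max=1.0, delta=0.05, alpha=2.0, seed=0):
    """Euler simulation with a nominal controller that pushes toward the boundary."""
    rng = random.Random(seed)
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x_hat = x + rng.uniform(-delta, delta)  # bounded estimation error
        u = robust_safety_filter(x_hat, u_nom=1.0, x_max=x_max, delta=delta, alpha=alpha)
        x = x + dt * u  # Euler step of x_dot = u
        trajectory.append(x)
    return trajectory


traj = simulate()
print(max(traj))
```

Because the filter budgets for the worst-case estimation error, the true state never crosses `x_max` in this discretized simulation as long as `alpha * dt <= 1`. The price is conservatism: the state can stop short of the boundary by an amount on the order of `delta`.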

When the ROCBF is parametrized linearly, the optimization problem used to learn it is convex under mild assumptions, so it can be solved efficiently with standard solvers. The researchers also provide conditions for checking that the learned ROCBF is valid. These conditions depend on how densely the expert demonstrations cover the relevant states, how smooth the system model and state estimator are, and how large the error bounds are.
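The role of the linear parametrization can be illustrated with a small Python sketch. It is a simplified stand-in, not the paper's formulation: it assumes a one-dimensional system `x_dot = u`, a hypothetical feature map `phi(x) = (1, x^2)`, hand-picked safe and unsafe sample points, and an expert input `u = -x`. With `h(x) = theta . phi(x)`, every constraint is linear in `theta`, so the feasible set is an intersection of half-spaces and therefore convex. The sketch finds a feasible point with cyclic projections onto those half-spaces; a real implementation would more likely hand the constraints to an LP or QP solver.

```python
def features(x):
    """Feature map phi(x) = (1, x^2), so h(x) = theta[0] + theta[1] * x**2."""
    return (1.0, x * x)


def feature_grad(x):
    """Derivative of phi with respect to x."""
    return (0.0, 2.0 * x)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def build_constraints(safe_x, unsafe_x, expert_u, alpha, margin):
    """Return rows (a, b) encoding a . theta >= b; each row is linear in theta."""
    rows = []
    for x in safe_x:                    # h(x) >= margin on demonstrated states
        rows.append((features(x), margin))
    for z in unsafe_x:                  # h(z) <= -margin on states labeled unsafe
        p = features(z)
        rows.append(((-p[0], -p[1]), margin))
    for x, u in zip(safe_x, expert_u):  # dh/dx * u + alpha * h(x) >= 0 along expert inputs
        p, dp = features(x), feature_grad(x)
        rows.append(((dp[0] * u + alpha * p[0], dp[1] * u + alpha * p[1]), 0.0))
    return rows


def max_violation(rows, theta):
    return max(max(0.0, b - dot(a, theta)) for a, b in rows)


def cyclic_projections(rows, sweeps=500, tol=1e-9):
    """Project onto each violated half-space in turn until all rows hold (up to tol)."""
    theta = [0.0, 0.0]
    for _ in range(sweeps):
        for a, b in rows:
            gap = b - dot(a, theta)
            if gap > 0.0:
                scale = gap / dot(a, a)
                theta = [theta[0] + scale * a[0], theta[1] + scale * a[1]]
        if max_violation(rows, theta) <= tol:
            break
    return theta


safe_x = [-0.5, -0.25, 0.0, 0.25, 0.5]
unsafe_x = [-1.5, -1.0, 1.0, 1.5]
expert_u = [-x for x in safe_x]         # hypothetical stabilizing expert
rows = build_constraints(safe_x, unsafe_x, expert_u, alpha=1.0, margin=0.1)
theta = cyclic_projections(rows)


def h(x):
    return dot(theta, features(x))


print(theta, h(0.0), h(1.0))
```

The learned `h` is positive on the demonstrated states and negative on the points labeled unsafe. With a nonlinear parametrization such as a neural network, the same constraints would no longer be linear in the parameters and the convexity argument would not apply.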

Ultimately, this approach allows for the safe deployment of autonomous systems by learning from expert demonstrations, rather than relying solely on potentially imperfect models or hand-tuned control strategies.

Technical Explanation

The key technical components of the paper are:

Robust Output Control Barrier Functions (ROCBFs): The researchers use ROCBFs to guarantee the safety of the learned control strategy, defined as the controlled forward invariance of a "safe set" in the system's state space. ROCBFs account for uncertainties in the system model and state estimation.
Optimization Problem for Learning ROCBFs: The researchers formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior. When the ROCBF parametrization is linear, the optimization problem is shown to be convex under mild assumptions.
Verifiable Conditions for ROCBF Validity: The researchers provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds to ensure the validity of the obtained ROCBF.
Algorithmic Implementation: The researchers propose an algorithmic implementation of their theoretical framework that accounts for the assumptions the framework makes when it is applied in practice.
Validation in CARLA Simulator: The researchers validate their approach in the CARLA autonomous driving simulator, demonstrating the ability to learn safe control laws from simulated RGB camera images.
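For readers unfamiliar with barrier functions, the notion of "controlled forward invariance of a safe set" used above can be written out. The block below states the standard control barrier function (CBF) condition from the literature, followed by one illustrative way that a model error bound can enter. It is background under stated assumptions, not the paper's exact ROCBF definition, which additionally handles output measurements and state estimation error. The symbols \hat{f}, \hat{g}, d, and \bar{\Delta} are introduced here for illustration only.

```latex
% Control-affine system and safe set defined by a function h:
\dot{x} = f(x) + g(x)\,u, \qquad \mathcal{C} = \{\, x : h(x) \ge 0 \,\}.

% h is a CBF if, for every x in a domain containing \mathcal{C},
\sup_{u \in \mathcal{U}} \Big[ \nabla h(x)^\top \big( f(x) + g(x)\,u \big) + \alpha\big(h(x)\big) \Big] \ge 0,
% where \alpha is an extended class-K function. Under standard regularity
% conditions, any controller choosing inputs that satisfy the bracketed
% inequality keeps trajectories starting in \mathcal{C} inside \mathcal{C}.

% Illustrative robust variant (assumption): the true dynamics are
% \dot{x} = \hat{f}(x) + \hat{g}(x)\,u + d with \|d\| \le \bar{\Delta}.
% A sufficient condition on the nominal model is then
\nabla h(x)^\top \big( \hat{f}(x) + \hat{g}(x)\,u \big) + \alpha\big(h(x)\big) \;\ge\; \|\nabla h(x)\|\,\bar{\Delta},
% because \nabla h(x)^\top d \ge -\|\nabla h(x)\|\,\|d\| by the Cauchy-Schwarz inequality.
```

The right-hand side of the last inequality shows why larger error bounds make the condition harder to satisfy: the nominal model has to provide enough margin to absorb the worst-case error.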

Critical Analysis

The paper presents a compelling approach to learning safe control strategies from expert demonstrations, which could significantly impact the development of autonomous systems. However, some potential limitations and areas for further research include:

Reliance on System Models and Error Bounds: The approach assumes that a system dynamics model and a state estimator are available together with bounds on their errors. In practice these bounds must themselves be estimated, for example from data, and the safety guarantees hold only if the bounds are correct. Further research could explore techniques to relax these assumptions.
Scalability to Complex Systems: How well the approach scales to more complex, high-dimensional systems remains to be investigated.
Robustness to Diverse Expert Behaviors: The paper assumes that the expert demonstrations exhibit safe system behavior. In practice, demonstrations may include unsafe or near-unsafe behavior, and the approach's ability to handle such data should be explored.
Bridging the Gap to Real-World Deployment: While the validation in the CARLA simulator is promising, further research is needed to bridge the gap between simulation and real-world deployment, addressing factors such as sensor noise, environmental variability, and hardware limitations.

Overall, the approach is a promising way to learn safe control strategies from expert demonstrations, but further research is needed to address these limitations and extend it to more complex, real-world scenarios.

Conclusion

This paper proposes a framework for learning safe control laws from partial observations of expert demonstrations. By using robust output control barrier functions (ROCBFs) to guarantee safety, the researchers provide a systematic way to use expert knowledge while explicitly accounting for bounded errors in the system model and state estimator.

The key contributions of this work include the formulation of an optimization problem for learning ROCBFs that is convex when the parametrization is linear, verifiable conditions for ROCBF validity, and a demonstration of the approach in the CARLA autonomous driving simulator. Together they suggest that learning from expert demonstrations can complement traditional model-based control strategies in the development of safe autonomous systems.

While the paper is a promising step forward, further research is needed to address the limitations identified above: the reliance on system models with known error bounds, the scalability to complex systems, and the robustness to diverse expert behaviors. Progress on these points would support the safe deployment of autonomous technologies in transportation, robotics, and related areas.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Synthesis and verification of robust-adaptive safe controllers

Simin Liu, Kai S. Yun, John M. Dolan, Changliu Liu

Safe control with guarantees generally requires the system model to be known. It is far more challenging to handle systems with uncertain parameters. In this paper, we propose a generic algorithm that can synthesize and verify safe controllers for systems with constant, unknown parameters. In particular, we use robust-adaptive control barrier functions (raCBFs) to achieve safety. We develop new theories and techniques using sum-of-squares that enable us to pose synthesis and verification as a series of convex optimization problems. In our experiments, we show that our algorithms are general and scalable, applying them to three different polynomial systems of up to moderate size (7D). Our raCBFs are currently the most effective way to guarantee safety for uncertain systems, achieving 100% safety and up to 55% performance improvement over a robust baseline.

4/4/2024

eess.SY cs.RO cs.SY

Learning Piecewise Residuals of Control Barrier Functions for Safety of Switching Systems using Multi-Output Gaussian Processes

Mohammad Aali, Jun Liu

Control barrier functions (CBFs) have recently been introduced as a systematic tool to ensure safety by establishing set invariance. When combined with a control Lyapunov function (CLF), they form a safety-critical control mechanism. However, the effectiveness of CBFs and CLFs is closely tied to the system model. In practice, model uncertainty can jeopardize safety and stability guarantees and may lead to undesirable performance. In this paper, we develop a safe learning-based control strategy for switching systems in the face of uncertainty. We focus on the case that a nominal model is available for a true underlying switching system. This uncertainty results in piecewise residuals for each switching surface, impacting the CLF and CBF constraints. We introduce a batch multi-output Gaussian process (MOGP) framework to approximate these piecewise residuals, thereby mitigating the adverse effects of uncertainty. A particular structure of the covariance function enables us to convert the MOGP-based chance constraints CLF and CBF into second-order cone constraints, which leads to a convex optimization. We analyze the feasibility of the resulting optimization and provide the necessary and sufficient conditions for feasibility. The effectiveness of the proposed strategy is validated through a simulation of a switching adaptive cruise control system.

4/22/2024

eess.SY cs.RO cs.SY

Constructive Safety-Critical Control: Synthesizing Control Barrier Functions for Partially Feedback Linearizable Systems

Max H. Cohen, Ryan K. Cosner, Aaron D. Ames

Certifying the safety of nonlinear systems, through the lens of set invariance and control barrier functions (CBFs), offers a powerful method for controller synthesis, provided a CBF can be constructed. This paper draws connections between partial feedback linearization and CBF synthesis. We illustrate that when a control affine system is input-output linearizable with respect to a smooth output function, then, under mild regularity conditions, one may extend any safety constraint defined on the output to a CBF for the full-order dynamics. These more general results are specialized to robotic systems where the conditions required to synthesize CBFs simplify. The CBFs constructed from our approach are applied and verified in simulation and hardware experiments on a quadrotor.

6/6/2024

eess.SY cs.RO cs.SY

Learning Control Barrier Functions and their application in Reinforcement Learning: A Survey

Maeva Guerrier, Hassan Fouad, Giovanni Beltrame

Reinforcement learning is a powerful technique for developing new robot behaviors. However, typical lack of safety guarantees constitutes a hurdle for its practical application on real robots. To address this issue, safe reinforcement learning aims to incorporate safety considerations, enabling faster transfer to real robots and facilitating lifelong learning. One promising approach within safe reinforcement learning is the use of control barrier functions. These functions provide a framework to ensure that the system remains in a safe state during the learning process. However, synthesizing control barrier functions is not straightforward and often requires ample domain knowledge. This challenge motivates the exploration of data-driven methods for automatically defining control barrier functions, which is highly appealing. We conduct a comprehensive review of the existing literature on safe reinforcement learning using control barrier functions. Additionally, we investigate various techniques for automatically learning the Control Barrier Functions, aiming to enhance the safety and efficacy of Reinforcement Learning in practical robot applications.

4/29/2024

cs.LG cs.AI cs.RO cs.SY eess.SY