Autonomous Drifting Based on Maximal Safety Probability Learning

Read original: arXiv:2409.03160 - Published 9/6/2024 by Hikaru Hoshino, Jiaxing Li, Arnav Menon, John M. Dolan, Yorie Nakahira

Overview

This paper presents a learning-based framework for autonomous drifting built on the concept of maximal safety probability.
The proposed method aims to enable autonomous vehicles to perform drifting maneuvers safely.
The key idea is to learn a policy that maximizes the probability of staying safe, using physics-informed reinforcement learning that needs only sparse binary rewards rather than a hand-shaped reward or a reference trajectory.

Plain English Explanation

The paper describes a new way for self-driving cars to perform a special driving technique called "drifting." Drifting is when a car intentionally enters a controlled skid around a corner, which can be fun and flashy but also risky.

The researchers developed a system that allows autonomous vehicles to learn how to drift safely. The core of their approach is to learn a driving policy that maximizes the probability that the vehicle stays safe. Because the policy is trained to keep this safety probability as high as possible, the autonomous vehicle can perform drifting while minimizing the risk of accidents or loss of control.

This is a complex problem because drifting involves rapidly changing vehicle dynamics that are difficult to model accurately. It is also hard to learn, because the only feedback is a sparse binary signal indicating whether the vehicle stayed safe. The researchers address this with physics-informed reinforcement learning, in which a physics loss plays a role analogous to reward shaping. This allows the autonomous vehicle to learn how to drift safely without a hand-designed reward or a specific reference trajectory.

The benefit of this approach is that it could enable autonomous vehicles to safely perform advanced driving maneuvers like drifting, which have practical applications in fields like motorsports and automotive testing. By incorporating safety directly into the decision-making, this method helps ensure the autonomous vehicle will prioritize preventing accidents over purely optimizing for speed or performance.

Technical Explanation

The key technical components of this work are:

Maximal Safety Probability Objective: Instead of optimizing a hand-designed reward, the policy is learned to maximize the probability that the vehicle remains safe. This avoids laborious reward shaping, but it is numerically challenging because the algorithm must optimize the policy from binary rewards that are sparse in time.
Physics-Informed Reinforcement Learning: The core of the approach is a physics-informed reinforcement learning method that can learn this maximally safe policy efficiently. It adds a physics loss to the learning objective, and this loss plays a role analogous to reward shaping.
No Reference Trajectory: Unlike existing drift control methods, the approach does not require a specific reference trajectory or complex reward shaping. Safe behaviors are learned from sparse binary rewards alone.
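To make the notion of a safety probability concrete, one common way to formalize it is sketched below. The notation is an assumption made for illustration and is not taken from the paper: x_t is the vehicle state at time t, C is the set of safe states, T is the time horizon, and pi is the driving policy.

```latex
\Psi^{\pi}(x, T) = \Pr\bigl[\, x_t \in \mathcal{C} \ \text{for all } t \in [0, T] \;\big|\; x_0 = x,\ \pi \,\bigr],
\qquad
\Psi^{*}(x, T) = \sup_{\pi} \Psi^{\pi}(x, T)
```

Under this reading, an episode that returns 1 when the vehicle stays inside the safe set for the whole horizon and 0 otherwise has an expected return equal to the safety probability, and the safest behavior is the one that attains the supremum.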

The researchers evaluate their approach through simulation experiments covering two scenarios: lane keeping in a normal cornering scenario and safe drifting in a high-speed racing scenario. In both, the results indicate that safe behavior can be learned from sparse binary rewards alone, without a reference trajectory.
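As a rough illustration of how a safety probability can be measured in simulation, the sketch below estimates it by repeated rollouts. It is a toy under stated assumptions, not the paper's method or simulator: a hypothetical one-dimensional lane-keeping model with a proportional steering gain, Gaussian noise, and made-up parameter values. Each episode yields one binary outcome (1 if the vehicle never leaves the lane, 0 otherwise), and the safety probability is estimated as the fraction of safe episodes. All function names are hypothetical.

```python
import random


def rollout_is_safe(gain, rng, steps=100, dt=0.05, noise=0.6, half_width=1.0):
    """Simulate one episode; return 1 if the car never leaves the lane, else 0."""
    y = 0.0  # lateral offset from the lane centre (toy 1-D state, assumed)
    for _ in range(steps):
        steer = -gain * y  # proportional feedback toward the centre line
        y += dt * steer + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        if abs(y) > half_width:
            return 0  # binary outcome: the safe set was left at some point
    return 1  # stayed inside the safe set for the whole horizon


def estimate_safety_probability(gain, episodes=2000, seed=0):
    """Monte Carlo estimate: the fraction of episodes that stay safe."""
    rng = random.Random(seed)
    safe = sum(rollout_is_safe(gain, rng) for _ in range(episodes))
    return safe / episodes


def safest_gain(candidate_gains, episodes=2000, seed=0):
    """Return the candidate with the highest estimated safety probability."""
    estimates = {
        g: estimate_safety_probability(g, episodes, seed) for g in candidate_gains
    }
    best = max(estimates, key=estimates.get)
    return best, estimates


if __name__ == "__main__":
    best, estimates = safest_gain([0.0, 1.0, 4.0])
    for gain, p in estimates.items():
        print(f"gain={gain:.1f}  estimated safety probability={p:.3f}")
    print("safest candidate:", best)
```

In this toy model a stronger steering gain keeps the vehicle closer to the lane centre, so its estimated safety probability is higher. A reinforcement learning method would search over a much richer policy class than three fixed gains, but the quantity being maximized has the same binary, episode-level character.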

Critical Analysis

The paper makes a valuable contribution by proposing a safety-oriented approach to autonomous drifting, an important but challenging problem in self-driving car research. However, there are some potential limitations and areas for further work:

It is not clear how well the learned policy would generalize to completely novel driving scenarios or edge cases that were not encountered during training.
The simulation experiments provide a proof-of-concept, but testing the approach on physical self-driving car platforms would be an important next step to validate its real-world performance and robustness.
The computational cost of training and running the learned policy is not discussed here, which could be an important consideration for real-time autonomous driving applications.

Overall, this work represents a promising step towards enabling autonomous vehicles to perform advanced maneuvers like drifting in a safe and reliable manner. Further research to address the limitations and expand the capabilities of the approach could lead to significant advancements in self-driving car technology.

Conclusion

This paper presents a novel approach for autonomous drifting based on learning the maximal safety probability. By using physics-informed reinforcement learning to learn a maximally safe policy from sparse binary rewards, the proposed method allows autonomous vehicles to perform this advanced driving technique while prioritizing safety.

The key benefits of this work are the potential to enable autonomous vehicles to safely execute drifting and other advanced maneuvers, which could have practical applications in motorsports, automotive testing, and beyond. The safety-centric approach demonstrated in this research represents an important step towards developing self-driving cars that can navigate complex driving scenarios with a high degree of reliability and robustness.

This summary was produced with help from an AI and may contain inaccuracies. Consult the original paper (arXiv:2409.03160) for details.

Related Papers

Autonomous Drifting Based on Maximal Safety Probability Learning

Hikaru Hoshino, Jiaxing Li, Arnav Menon, John M. Dolan, Yorie Nakahira

This paper proposes a novel learning-based framework for autonomous driving based on the concept of maximal safety probability. Efficient learning requires rewards that are informative of desirable/undesirable states, but such rewards are challenging to design manually due to the difficulty of differentiating better states among many safe states. On the other hand, learning policies that maximize safety probability does not require laborious reward shaping but is numerically challenging because the algorithms must optimize policies based on binary rewards sparse in time. Here, we show that physics-informed reinforcement learning can efficiently learn this form of maximally safe policy. Unlike existing drift control methods, our approach does not require a specific reference trajectory or complex reward shaping, and can learn safe behaviors only from sparse binary rewards. This is enabled by the use of the physics loss that plays an analogous role to reward shaping. The effectiveness of the proposed approach is demonstrated through lane keeping in a normal cornering scenario and safe drifting in a high-speed racing scenario.

9/6/2024

A Safety-Oriented Self-Learning Algorithm for Autonomous Driving: Evolution Starting from a Basic Model

Shuo Yang, Caojun Wang, Zhenyu Ma, Yanjun Huang, Hong Chen

Autonomous driving vehicles with self-learning capabilities are expected to evolve in complex environments to improve their ability to cope with different scenarios. However, most self-learning algorithms suffer from low learning efficiency and a lack of safety, which limits their applications. This paper proposes a safety-oriented self-learning algorithm for autonomous driving, which focuses on how to achieve evolution from a basic model. Specifically, a basic model based on the transformer encoder is designed to extract and output policy features from a small number of demonstration trajectories. To improve the learning efficiency, a policy mixed approach is developed. The basic model provides initial values to improve exploration efficiency, and the self-learning algorithm enhances the adaptability and generalization of the model, enabling continuous improvement without external intervention. Finally, an actor approximator based on receding horizon optimization is designed considering the constraints of the environmental input to ensure safety. The proposed method is verified in a challenging mixed traffic environment with pedestrians and vehicles. Simulation and real-vehicle test results show that the proposed method can safely and efficiently learn appropriate autonomous driving behaviors. Compared with reinforcement learning and behavior cloning methods, it can achieve comprehensive improvement in learning efficiency and performance under the premise of ensuring safety.

8/23/2024

A Safe and Efficient Self-evolving Algorithm for Decision-making and Control of Autonomous Driving Systems

Shuo Yang, Liwen Wang, Yanjun Huang, Hong Chen

Autonomous vehicles with a self-evolving ability are expected to cope with unknown scenarios in the real-world environment. Taking advantage of a trial-and-error mechanism, reinforcement learning is able to self-evolve by learning the optimal policy, and it is particularly well suited to solving decision-making problems. However, reinforcement learning suffers from safety issues and low learning efficiency, especially in the continuous action space. Therefore, the motivation of this paper is to address the above problem by proposing a hybrid Mechanism-Experience-Learning augmented approach. Specifically, to realize efficient self-evolution, a driving tendency based on analogy with human driving experience is proposed to reduce the search space of the autonomous driving problem, while a constrained optimization problem based on a mechanistic model is designed to ensure safety during the self-evolving process. Experimental results show that the proposed method is capable of generating safe and reasonable actions in various complex scenarios, improving the performance of the autonomous driving system. Compared to conventional reinforcement learning, the safety and efficiency of the proposed algorithm are greatly improved. The training process is collision-free, and the training time is equivalent to less than 10 minutes in the real world.

8/23/2024

Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints

Siow Meng Low, Akshat Kumar

In safe Reinforcement Learning (RL), safety cost is typically defined as a function dependent on the immediate state and actions. In practice, safety constraints can often be non-Markovian due to the insufficient fidelity of state representation, and safety cost may not be known. We therefore address a general setting where safety labels (e.g., safe or unsafe) are associated with state-action trajectories. Our key contributions are: first, we design a safety model that specifically performs credit assignment to assess contributions of partial state-action trajectories on safety. This safety model is trained using a labeled safety dataset. Second, using RL-as-inference strategy we derive an effective algorithm for optimizing a safe policy using the learned safety model. Finally, we devise a method to dynamically adapt the tradeoff coefficient between reward maximization and safety compliance. We rewrite the constrained optimization problem into its dual problem and derive a gradient-based method to dynamically adjust the tradeoff coefficient during training. Our empirical results demonstrate that this approach is highly scalable and able to satisfy sophisticated non-Markovian safety constraints.

5/7/2024