Continual Driving Policy Optimization with Closed-Loop Individualized Curricula

Read original: arXiv:2309.14209 - Published 8/14/2024 by Haoyi Niu, Yizhou Xu, Xingjian Jiang, Jianming Hu

Overview

Autonomous vehicle (AV) safety remains a major concern because rare, safety-critical scenarios are scarce in real-world driving data.
Researchers have focused on generating high-risk scenarios to test AV models, but limited work has been done on using these scenarios to iteratively improve AV performance.
The paper presents a framework called Closed-Loop Individualized Curricula (CLIC) that aims to address this challenge.

Plain English Explanation

The paper tackles the problem of improving the safety of autonomous vehicles (AVs). One of the key challenges is that real-world driving data does not contain enough rare and high-risk scenarios that could reveal the limitations of AV systems. To address this, researchers have developed methods to generate these types of risky driving scenarios and use them to test AV models.

However, the paper argues that simply testing AVs on these scenarios is not enough. The key insight is that these scenario libraries could be used to iteratively improve the AV models themselves, rather than just evaluating them. The paper presents a framework called Closed-Loop Individualized Curricula (CLIC) that does exactly this.

CLIC works as a loop. At each iteration it first estimates how likely the current AV is to fail in each scenario from the library. It then re-samples the library according to those estimates, so that the scenarios chosen for the next round of training match the AV's current weaknesses. This exposes the AV to progressively more challenging situations, tailored to its current capabilities. The goal is to get as much value as possible out of the pre-collected scenario library and use it to drive continuous improvement of the AV system.
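The evaluate, select, train cycle described above can be sketched as a toy loop. This is only an illustration under strong assumptions, not the authors' implementation: the policy is reduced to a single hypothetical `skill` number, each scenario to a hypothetical `difficulty` in [0, 1], and both the failure estimate and the training update are made-up placeholder formulas.

```python
import random


def closed_loop(library, skill, iterations=5, batch=4, seed=0):
    """Toy evaluate -> select -> train loop over a fixed scenario library.

    `library` maps scenario name -> difficulty in [0, 1]; `skill` is a
    single number standing in for the AV policy. Both are illustrative
    assumptions, not the paper's representation.
    """
    rng = random.Random(seed)
    names = list(library)
    history = [skill]
    for _ in range(iterations):
        # Evaluate: placeholder failure estimate, higher when the scenario
        # is harder than the current skill (floor of 0.05 keeps easy cases).
        fail = {n: min(1.0, max(0.0, library[n] - skill) + 0.05) for n in names}
        # Select: re-sample the library in proportion to estimated failure.
        chosen = rng.choices(names, weights=[fail[n] for n in names], k=batch)
        # Train: hypothetical update, harder selected cases give a larger step.
        skill += 0.1 * sum(fail[n] for n in chosen) / batch
        history.append(skill)
    return history


if __name__ == "__main__":
    toy_library = {"easy_merge": 0.2, "cut_in": 0.6, "hard_brake": 0.9}
    print(closed_loop(toy_library, skill=0.3))
```

The small floor on the failure estimate is a design choice of this sketch: it keeps easy scenarios in the sampling pool with low weight, which loosely mirrors the idea of retaining proficiency on simpler cases.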

The paper demonstrates that this approach leads to better management of risky scenarios, while still maintaining proficiency in simpler driving situations. In other words, it helps the AV become safer and more capable over time, by focusing its training on the areas where it needs the most improvement.

Technical Explanation

The key components of the CLIC framework are:

AV Evaluation: CLIC frames AV evaluation as a collision prediction task, where it estimates the probability of AV failure in each scenario.
Scenario Selection: Based on the failure probabilities, CLIC re-samples from the historical scenario library to create individualized training curricula that align with the AV's current capabilities.
AV Training: The AV is then trained on these tailored scenario sets, allowing it to gradually improve its performance on more challenging situations.
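As a rough illustration of the first two components, the sketch below treats AV Evaluation as binary collision prediction and Scenario Selection as weighted re-sampling. It assumes each scenario is described by a hypothetical numeric feature vector and uses a small hand-written logistic regression as the predictor; the paper's actual predictor, features, and sampling rule are not specified here, so every function name and parameter is an assumption.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_failure_predictor(features, collided, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent.

    `features` is a list of scenario feature vectors and `collided` holds
    0/1 outcomes from past rollouts. Returns weights; the last is the bias.
    """
    dim = len(features[0])
    w = [0.0] * (dim + 1)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(features, collided):
            # zip stops at len(x), so the bias w[-1] is added separately.
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])
            err = p - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[-1] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def predict_failure(w, x):
    """Estimated probability that the AV fails in a scenario with features x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])


def sample_curriculum(library, w, k, seed=0):
    """Re-sample k scenario indices, weighted by predicted failure probability."""
    rng = random.Random(seed)
    probs = [predict_failure(w, x) for x in library]
    indices = rng.choices(range(len(library)), weights=probs, k=k)
    return indices, probs


if __name__ == "__main__":
    # Hypothetical one-dimensional feature, e.g. a normalized risk score.
    past_features = [[0.1], [0.2], [0.8], [0.9]]
    past_collided = [0, 0, 1, 1]
    weights = fit_failure_predictor(past_features, past_collided)
    picked, probs = sample_curriculum([[0.15], [0.5], [0.85]], weights, k=5)
    print(picked, [round(p, 2) for p in probs])
```

The selected indices would then be handed to whatever training routine updates the driving policy, and the predictor would be refit on fresh outcomes before the next iteration.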

The authors show that CLIC outperforms other curriculum-based training strategies. It enables better management of risky scenarios while maintaining proficiency in simpler cases. This is achieved by maximizing the utility of the pre-collected scenario library to drive continuous, closed-loop improvement of the AV system.

Critical Analysis

The paper presents a promising approach to address a critical challenge in autonomous vehicle safety. By leveraging pre-collected scenario libraries for iterative policy optimization, CLIC aims to overcome the limitations of real-world driving data.

However, the paper does not discuss potential limitations or concerns with this approach. For example, the reliance on simulated scenarios raises questions about the generalizability of the improvements to real-world conditions. Additionally, the computational overhead of the scenario evaluation and selection process may be a practical concern for deployment.

Further research is needed to understand the broader implications of this framework, such as its scalability, robustness to noisy or biased scenario data, and the potential for negative side effects (e.g., overfitting to the training scenarios). Systematic comparisons to other AV safety testing and improvement techniques would also help to contextualize the contributions of this work.

Conclusion

The paper presents a novel framework called Closed-Loop Individualized Curricula (CLIC) that aims to address the challenge of improving autonomous vehicle safety. By leveraging pre-collected scenario libraries and tailoring training curricula to the AV's current capabilities, CLIC demonstrates substantial improvements in managing risky driving situations while maintaining proficiency in simpler cases.

This work highlights the potential of using scenario-based approaches to drive continuous, closed-loop optimization of AV systems. While further research is needed to address the limitations and broader implications, the CLIC framework represents a promising step towards enhancing the safety and capabilities of autonomous vehicles.

Related Papers

Continual Driving Policy Optimization with Closed-Loop Individualized Curricula

Haoyi Niu, Yizhou Xu, Xingjian Jiang, Jianming Hu

The safety of autonomous vehicles (AV) has been a long-standing top concern, stemming from the absence of rare and safety-critical scenarios in the long-tail naturalistic driving distribution. To tackle this challenge, a surge of research in scenario-based autonomous driving has emerged, with a focus on generating high-risk driving scenarios and applying them to conduct safety-critical testing of AV models. However, limited work has been explored on the reuse of these extensive scenarios to iteratively improve AV models. Moreover, it remains intractable and challenging to filter through gigantic scenario libraries collected from other AV models with distinct behaviors, attempting to extract transferable information for current AV improvement. Therefore, we develop a continual driving policy optimization framework featuring Closed-Loop Individualized Curricula (CLIC), which we factorize into a set of standardized sub-modules for flexible implementation choices: AV Evaluation, Scenario Selection, and AV Training. CLIC frames AV Evaluation as a collision prediction task, where it estimates the chance of AV failures in these scenarios at each iteration. Subsequently, by re-sampling from historical scenarios based on these failure probabilities, CLIC tailors individualized curricula for downstream training, aligning them with the evaluated capability of AV. Accordingly, CLIC not only maximizes the utilization of the vast pre-collected scenario library for closed-loop driving policy optimization but also facilitates AV improvement by individualizing its training with more challenging cases out of those poorly organized scenarios. Experimental results clearly indicate that CLIC surpasses other curriculum-based training strategies, showing substantial improvement in managing risky scenarios, while still maintaining proficiency in handling simpler cases.

8/14/2024

Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation

Hanlin Tian, Kethan Reddy, Yuxiang Feng, Mohammed Quddus, Yiannis Demiris, Panagiotis Angeloudis

This paper introduces CRITICAL, a novel closed-loop framework for autonomous vehicle (AV) training and testing. CRITICAL stands out for its ability to generate diverse scenarios, focusing on critical driving situations that target specific learning and performance gaps identified in the Reinforcement Learning (RL) agent. The framework achieves this by integrating real-world traffic dynamics, driving behavior analysis, surrogate safety measures, and an optional Large Language Model (LLM) component. It is proven that the establishment of a closed feedback loop between the data generation pipeline and the training process can enhance the learning rate during training, elevate overall system performance, and augment safety resilience. Our evaluations, conducted using the Proximal Policy Optimization (PPO) and the HighwayEnv simulation environment, demonstrate noticeable performance improvements with the integration of critical case generation and LLM analysis, indicating CRITICAL's potential to improve the robustness of AV systems and streamline the generation of critical scenarios. This ultimately serves to hasten the development of AV agents, expand the general scope of RL training, and ameliorate validation efforts for AV safety.

4/15/2024

Life-long Learning and Testing for Automated Vehicles via Adaptive Scenario Sampling as A Continuous Optimization Process

Jingwei Ge, Pengbo Wang, Cheng Chang, Yi Zhang, Danya Yao, Li Li

Sampling critical testing scenarios is an essential step in intelligence testing for Automated Vehicles (AVs). However, due to the lack of prior knowledge on the distribution of critical scenarios in sampling space, we can hardly efficiently find the critical scenarios or accurately evaluate the intelligence of AVs. To solve this problem, we formulate the testing as a continuous optimization process which iteratively generates potential critical scenarios and meanwhile evaluates these scenarios. A bi-level loop is proposed for such life-long learning and testing. In the outer loop, we iteratively learn space knowledge by evaluating AV in the already sampled scenarios and then sample new scenarios based on the retained knowledge. Outer loop stops when all generated samples cover the whole space. While to maximize the coverage of the space in each outer loop, we set an inner loop which receives newly generated samples in outer loop and outputs the updated positions of these samples. We assume that points in a small sphere-like subspace can be covered (or represented) by the point in the center of this sphere. Therefore, we can apply a multi-rounds heuristic strategy to move and pack these spheres in space to find the best covering solution. The simulation results show that faster and more accurate evaluation of AVs can be achieved with more critical scenarios.

5/3/2024

Dynamically Expanding Capacity of Autonomous Driving with Near-Miss Focused Training Framework

Ziyuan Yang, Zhaoyang Li, Jianming Hu, Yi Zhang

The long-tail distribution of real driving data poses challenges for training and testing autonomous vehicles (AV), where rare yet crucial safety-critical scenarios are infrequent. And virtual simulation offers a low-cost and efficient solution. This paper proposes a near-miss focused training framework for AV. Utilizing the driving scenario information provided by sensors in the simulator, we design novel reward functions, which enable background vehicles (BV) to generate near-miss scenarios and ensure gradients exist not only in collision-free scenes but also in collision scenarios. And then leveraging the Robust Adversarial Reinforcement Learning (RARL) framework for simultaneous training of AV and BV to gradually enhance AV and BV capabilities, as well as generating near-miss scenarios tailored to different levels of AV capabilities. Results from three testing strategies indicate that the proposed method generates scenarios closer to near-miss, thus enhancing the capabilities of both AVs and BVs throughout training.

6/6/2024