Life-long Learning and Testing for Automated Vehicles via Adaptive Scenario Sampling as A Continuous Optimization Process

Read original: arXiv:2405.00696 - Published 5/3/2024 by Jingwei Ge, Pengbo Wang, Cheng Chang, Yi Zhang, Danya Yao, Li Li

Overview

Automated Vehicles (AVs) need extensive testing to evaluate their intelligence and safety
Sampling critical testing scenarios is challenging due to lack of knowledge about the distribution of critical scenarios
The paper proposes a continuous optimization process to iteratively generate and evaluate potential critical scenarios

Plain English Explanation

The paper addresses the challenge of testing the intelligence and safety of Automated Vehicles (AVs). Testing AVs involves identifying and evaluating critical scenarios - situations that are particularly challenging for the AV's decision-making and behavior. However, because we don't have a good understanding of where these critical scenarios are likely to occur in the vast space of possible driving situations, it's difficult to efficiently find and test them.

To solve this problem, the researchers propose an iterative optimization process. In this approach, the system first evaluates the AV's performance on the scenarios it has already sampled. It then uses that information to strategically sample new, potentially critical scenarios, with the goal of gradually covering the entire space of possible scenarios.

The key innovation is a bi-level loop structure. The outer loop focuses on learning about the overall scenario space and generating new samples to cover it. The inner loop then works to maximize the coverage of the space with each new set of samples, using a heuristic strategy to efficiently "pack" the samples.
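
The bi-level structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the 2-D unit-square scenario space, the `toy_criticality` function standing in for an AV simulation, the `propose` sampling rule, and all parameter values are hypothetical. The inner loop is left as an identity function here so that the outer loop can be read on its own.

```python
import math
import random

def toy_criticality(x):
    """Hypothetical stand-in for running the AV in scenario x (higher = more critical)."""
    # Assumed for illustration: one critical pocket around (0.7, 0.3).
    return math.exp(-((x[0] - 0.7) ** 2 + (x[1] - 0.3) ** 2) / 0.02)

def coverage(samples, radius, probes):
    """Fraction of probe points lying within `radius` of at least one sample."""
    hit = sum(any(math.dist(p, s) <= radius for s in samples) for p in probes)
    return hit / len(probes)

def propose(knowledge, batch, rng, explore=0.5):
    """Mix uniform exploration with perturbations of the most critical scenarios seen."""
    top = sorted(knowledge, key=lambda kv: kv[1], reverse=True)[:5]
    new = []
    for _ in range(batch):
        if not top or rng.random() < explore:
            new.append((rng.random(), rng.random()))
        else:
            cx, cy = rng.choice(top)[0]
            new.append((min(1.0, max(0.0, cx + rng.gauss(0, 0.05))),
                        min(1.0, max(0.0, cy + rng.gauss(0, 0.05)))))
    return new

def bilevel_test(evaluate, inner_loop=lambda new, old: new, radius=0.1,
                 batch=20, target=0.95, max_rounds=50, seed=0):
    rng = random.Random(seed)
    probes = [(rng.random(), rng.random()) for _ in range(2000)]
    knowledge = []  # (scenario, criticality) pairs retained across rounds
    for round_no in range(1, max_rounds + 1):
        new = propose(knowledge, batch, rng)              # outer loop: sample
        new = inner_loop(new, [s for s, _ in knowledge])  # inner loop: reposition
        knowledge += [(s, evaluate(s)) for s in new]      # outer loop: evaluate
        cov = coverage([s for s, _ in knowledge], radius, probes)
        if cov >= target:                                 # stop once the space is covered
            break
    return knowledge, cov, round_no

knowledge, cov, rounds = bilevel_test(toy_criticality)
print(f"{len(knowledge)} scenarios, coverage {cov:.2f}, {rounds} rounds")
```

The coverage target of 0.95 is a practical stand-in for the paper's stopping rule, which ends the outer loop when the generated samples cover the whole space.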

The simulation results reported in the paper indicate that this iterative approach evaluates AVs faster and more accurately, because it finds more of the critical scenarios that expose an AV's strengths and weaknesses. It may also complement related work on language-model-assisted AV training and on generating interactive traffic scenarios.

Technical Explanation

The key technical elements of the paper are:

Formulating testing as a continuous optimization process: The researchers frame the problem of AV testing as a continuous optimization task, where the goal is to iteratively generate and evaluate potential critical scenarios.
Bi-level loop structure: The proposed approach uses a bi-level loop to achieve this. The outer loop focuses on learning about the overall scenario space and generating new samples to cover it. The inner loop then works to maximize the coverage of the space with each new set of samples.
Heuristic sphere-packing strategy: To maximize coverage in the inner loop, the researchers assume that points in a small "sphere-like" subspace can be represented by the center point of that sphere. They then apply a multi-round heuristic strategy to efficiently "pack" these spheres in the scenario space.
Evaluation and scenario generation: The outer loop iteratively evaluates the AV's performance on the currently sampled scenarios, and then uses that information to generate new, potentially critical scenarios to sample in the next iteration.
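
The sphere-packing idea in the list above can be illustrated with a simple repulsion heuristic. This is a sketch of one plausible multi-round strategy, not the specific heuristic from the paper: the function names `pack_spheres` and `mc_coverage`, the unit-cube scenario space, the step size, and the round limit are all assumptions. In each round, every new sphere center is pushed away from any center closer than two radii, while already-tested scenarios stay fixed.

```python
import math
import random

def pack_spheres(new, fixed, radius, rounds=30, step=0.5, seed=0):
    """Spread new sphere centers so that spheres of the given radius overlap less."""
    rng = random.Random(seed)
    pts = [list(p) for p in new]
    anchors = [list(f) for f in fixed]  # already-tested scenarios stay put
    for _ in range(rounds):
        moved = False
        for i, p in enumerate(pts):
            push = [0.0] * len(p)
            for q in pts[:i] + pts[i + 1:] + anchors:
                d = math.dist(p, q)
                if d >= 2 * radius:
                    continue  # these two spheres do not overlap
                if d > 1e-12:
                    direction = [(a - b) / d for a, b in zip(p, q)]
                else:  # coincident centers: pick a random direction
                    g = [rng.gauss(0, 1) for _ in p]
                    norm = math.hypot(*g) or 1.0
                    direction = [v / norm for v in g]
                # Push proportionally to the overlap depth.
                push = [u + (2 * radius - d) * v for u, v in zip(push, direction)]
            if any(push):
                # Move a fraction of the way and clamp to the unit cube.
                pts[i] = [min(1.0, max(0.0, a + step * u)) for a, u in zip(p, push)]
                moved = True
        if not moved:
            break  # no overlaps left, packing has settled
    return [tuple(p) for p in pts]

def mc_coverage(centers, radius, n_probe=5000, seed=0):
    """Monte Carlo estimate of the covered fraction of the unit cube."""
    rng = random.Random(seed)
    dim = len(centers[0])
    hit = 0
    for _ in range(n_probe):
        probe = [rng.random() for _ in range(dim)]
        hit += any(math.dist(probe, c) <= radius for c in centers)
    return hit / n_probe

rng = random.Random(1)
raw = [(rng.random(), rng.random()) for _ in range(30)]
packed = pack_spheres(raw, fixed=[], radius=0.1)
print("coverage before packing:", mc_coverage(raw, 0.1))
print("coverage after packing: ", mc_coverage(packed, 0.1))
```

A function like this could be passed as the inner loop of an outer sampling routine: it receives the newly generated samples and returns their updated positions, which matches the input and output the paper describes for its inner loop.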

The researchers demonstrate through simulation that this approach can lead to faster and more accurate evaluation of AVs, by identifying critical scenarios that reveal important strengths and weaknesses.

Critical Analysis

The paper presents a novel and promising approach to the challenge of testing the intelligence and safety of AVs. However, there are a few potential limitations and areas for further research:

Assumptions and Simplifications: The researchers make some simplifying assumptions, such as the "sphere-like" subspaces and the heuristic packing strategy. It's unclear how well these assumptions hold in the real-world complexity of driving scenarios, and how they might impact the effectiveness of the approach.
Validation and Generalization: The paper only presents simulation results, and it's unclear how well the approach would translate to real-world AV testing. Further validation and testing on actual AV systems would be needed to assess the generalizability of the findings.
Computational Complexity: The iterative optimization process, while conceptually elegant, may be computationally intensive, especially as the scenario space grows. The scalability of the approach should be carefully considered, particularly for large-scale AV testing.
Broader Considerations: The paper focuses narrowly on the technical aspects of scenario testing. Evaluating AV intelligence and safety also depends on broader questions, such as how complete the underlying safety argumentation is and how scenarios themselves are defined.
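
The scalability concern raised under Computational Complexity can be made concrete with a standard volume-counting bound. This argument is not taken from the paper; the unit-cube scenario space $[0,1]^d$ and the radius $r = 0.1$ are assumptions chosen for illustration. Since $N$ spheres of radius $r$ can only cover the cube if their total volume is at least the cube's volume:

```latex
V_d(r) = \frac{\pi^{d/2}}{\Gamma\!\left(\tfrac{d}{2}+1\right)}\, r^{d},
\qquad
N \;\ge\; \frac{\operatorname{vol}\!\left([0,1]^d\right)}{V_d(r)} = \frac{1}{V_d(r)}.

% d = 2, r = 0.1:  V_2 = \pi r^2 \approx 3.14 \times 10^{-2}  =>  N \ge 32
% d = 6, r = 0.1:  V_6 = \tfrac{\pi^3}{6} r^6 \approx 5.17 \times 10^{-6}  =>  N \gtrsim 1.9 \times 10^{5}
```

Under these assumptions, going from two to six scenario parameters raises the minimum number of spheres from a few dozen to roughly two hundred thousand, and the true number is higher because spheres cannot tile space without overlap.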

Overall, the paper presents a promising approach, but further research and validation would be needed to fully assess its practical utility and limitations in the context of real-world AV testing and development.

Conclusion

This paper addresses a crucial challenge in the testing and evaluation of Automated Vehicles (AVs) - the difficulty of efficiently identifying and assessing critical scenarios that reveal the strengths and weaknesses of AV decision-making and behavior.

The researchers propose an innovative continuous optimization approach, using a bi-level loop structure to iteratively generate and evaluate potential critical scenarios. This allows the system to gradually build up knowledge about the distribution of critical scenarios in the vast space of possible driving situations.

While the paper presents promising simulation results, further research and validation would be needed to fully assess the practical utility and scalability of this approach in the context of real-world AV testing and development. Nonetheless, this work represents an important step towards more comprehensive and effective testing of autonomous vehicle intelligence, which is crucial for ensuring the safety and reliability of these emerging technologies.

Related Papers

Life-long Learning and Testing for Automated Vehicles via Adaptive Scenario Sampling as A Continuous Optimization Process

Jingwei Ge, Pengbo Wang, Cheng Chang, Yi Zhang, Danya Yao, Li Li

Sampling critical testing scenarios is an essential step in intelligence testing for Automated Vehicles (AVs). However, due to the lack of prior knowledge on the distribution of critical scenarios in the sampling space, it is hard to find the critical scenarios efficiently or to evaluate the intelligence of AVs accurately. To solve this problem, we formulate the testing as a continuous optimization process which iteratively generates potential critical scenarios and meanwhile evaluates these scenarios. A bi-level loop is proposed for such life-long learning and testing. In the outer loop, we iteratively learn space knowledge by evaluating the AV in the already sampled scenarios and then sample new scenarios based on the retained knowledge. The outer loop stops when all generated samples cover the whole space. To maximize the coverage of the space in each outer loop, we set an inner loop which receives the newly generated samples from the outer loop and outputs the updated positions of these samples. We assume that points in a small sphere-like subspace can be covered (or represented) by the point in the center of this sphere. Therefore, we can apply a multi-round heuristic strategy to move and pack these spheres in space to find the best covering solution. The simulation results show that faster and more accurate evaluation of AVs can be achieved with more critical scenarios.

5/3/2024

Few-Shot Scenario Testing for Autonomous Vehicles Based on Neighborhood Coverage and Similarity

Shu Li, Jingxuan Yang, Honglin He, Yi Zhang, Jianming Hu, Shuo Feng

Testing and evaluating the safety performance of autonomous vehicles (AVs) is essential before large-scale deployment. In practice, the number of testing scenarios permissible for a specific AV is severely limited by tight constraints on testing budgets and time. Under such strict limits on the number of tests, existing testing methods often lead to significant uncertainty or difficulty in quantifying evaluation results. In this paper, we formulate this problem for the first time as the few-shot testing (FST) problem and propose a systematic framework to address this challenge. To alleviate the considerable uncertainty inherent in a small testing scenario set, we frame the FST problem as an optimization problem and search for the testing scenario set based on neighborhood coverage and similarity. Specifically, under the guidance of better generalization ability of the testing scenario set on AVs, we dynamically adjust this set and the contribution of each testing scenario to the evaluation result based on coverage, leveraging the prior information of surrogate models (SMs). With certain hypotheses on SMs, a theoretical upper bound of evaluation error is established to verify the sufficiency of evaluation accuracy within the given limited number of tests. The experiment results on cut-in scenarios demonstrate a notable reduction in evaluation error and variance of our method compared to conventional testing methods, especially for situations with a strict limit on the number of scenarios.

4/24/2024

Continual Driving Policy Optimization with Closed-Loop Individualized Curricula

Haoyi Niu, Yizhou Xu, Xingjian Jiang, Jianming Hu

The safety of autonomous vehicles (AV) has been a long-standing top concern, stemming from the absence of rare and safety-critical scenarios in the long-tail naturalistic driving distribution. To tackle this challenge, a surge of research in scenario-based autonomous driving has emerged, with a focus on generating high-risk driving scenarios and applying them to conduct safety-critical testing of AV models. However, limited work has been explored on the reuse of these extensive scenarios to iteratively improve AV models. Moreover, it remains intractable and challenging to filter through gigantic scenario libraries collected from other AV models with distinct behaviors, attempting to extract transferable information for current AV improvement. Therefore, we develop a continual driving policy optimization framework featuring Closed-Loop Individualized Curricula (CLIC), which we factorize into a set of standardized sub-modules for flexible implementation choices: AV Evaluation, Scenario Selection, and AV Training. CLIC frames AV Evaluation as a collision prediction task, where it estimates the chance of AV failures in these scenarios at each iteration. Subsequently, by re-sampling from historical scenarios based on these failure probabilities, CLIC tailors individualized curricula for downstream training, aligning them with the evaluated capability of AV. Accordingly, CLIC not only maximizes the utilization of the vast pre-collected scenario library for closed-loop driving policy optimization but also facilitates AV improvement by individualizing its training with more challenging cases out of those poorly organized scenarios. Experimental results clearly indicate that CLIC surpasses other curriculum-based training strategies, showing substantial improvement in managing risky scenarios, while still maintaining proficiency in handling simpler cases.

8/14/2024

Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles

Qiujing Lu, Xuanhan Wang, Yiwei Jiang, Guangming Zhao, Mingyue Ma, Shuo Feng

The generation of corner cases has become increasingly crucial for efficiently testing autonomous vehicles prior to road deployment. However, existing methods struggle to accommodate diverse testing requirements and often lack the ability to generalize to unseen situations, thereby reducing the convenience and usability of the generated scenarios. A method that facilitates easily controllable scenario generation for efficient autonomous vehicle (AV) testing with realistic and challenging situations is greatly needed. To address this, we propose OmniTester: a multimodal Large Language Model (LLM) based framework that fully leverages the extensive world knowledge and reasoning capabilities of LLMs. OmniTester is designed to generate realistic and diverse scenarios within a simulation environment, offering a robust solution for testing and evaluating AVs. In addition to prompt engineering, we employ tools from Simulation of Urban Mobility to simplify the complexity of code generated by LLMs. Furthermore, we incorporate Retrieval-Augmented Generation and a self-improvement mechanism to enhance the LLM's understanding of scenarios, thereby increasing its ability to produce more realistic scenes. In the experiments, we demonstrate the controllability and realism of our approach in generating three types of challenging and complex scenarios. Additionally, we showcase its effectiveness in reconstructing new scenarios described in crash reports, driven by the generalization capability of LLMs.

9/11/2024