LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

arXiv:2312.04372

Published 4/5/2024 by Yunsheng Ma, Can Cui, Xu Cao, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera and 2 others

cs.CL cs.AI

Abstract

Autonomous driving (AD) has made significant strides in recent years. However, existing frameworks struggle to interpret and execute spontaneous user instructions, such as "overtake the car ahead." Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, showing potential to bridge this gap. In this paper, we present LaMPilot, a novel framework that integrates LLMs into AD systems, enabling them to follow user instructions by generating code that leverages established functional primitives. We also introduce LaMPilot-Bench, the first benchmark dataset specifically designed to quantitatively evaluate the efficacy of language model programs in AD. Adopting the LaMPilot framework, we conduct extensive experiments to assess the performance of off-the-shelf LLMs on LaMPilot-Bench. Our results demonstrate the potential of LLMs in handling diverse driving scenarios and following user instructions in driving. To facilitate further research in this area, we release our code and data at https://github.com/PurdueDigitalTwin/LaMPilot.

Overview

• Autonomous driving (AD) systems have made significant progress, but struggle to interpret and follow spontaneous user instructions like "overtake the car ahead."

• Large Language Models (LLMs) have shown impressive reasoning abilities, suggesting they could bridge this gap.

• The paper presents LaMPilot, a framework that integrates LLMs into AD systems, allowing them to generate code to execute user instructions.

• The authors also introduce LaMPilot-Bench, the first benchmark dataset designed to evaluate language model programs in autonomous driving.

Plain English Explanation

Autonomous driving technology has come a long way, but it still has difficulty understanding and following quick, spontaneous instructions from users. For example, if a passenger said "Go around that car in front of us," the autonomous system might not know how to interpret and execute that command.

However, a new type of artificial intelligence called large language models (LLMs) has shown the ability to understand and reason about complex information in very human-like ways. The researchers behind this paper thought that integrating LLMs into autonomous driving systems could help those systems better understand and carry out instructions from passengers.

So they developed a new framework called LaMPilot that does just that: it applies LLMs to autonomous driving, allowing the car to follow verbal commands by generating the code needed to carry them out. The team also created a new benchmark dataset called LaMPilot-Bench specifically for testing how well language model programs can handle different driving scenarios and instructions.

Technical Explanation

The LaMPilot framework integrates large language models (LLMs) into autonomous driving (AD) systems, enabling them to follow user instructions by generating executable code. The key components are:

• Instruction Parsing: The LLM takes a natural language instruction (e.g., "overtake the car ahead") and converts it into a structured representation.

• Instruction Grounding: The structured instruction is mapped to relevant functional primitives in the AD system (e.g., lane change, acceleration).

• Code Generation: The LLM generates executable code that leverages those primitives to carry out the user's instruction.
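The flow from instruction to executed primitives can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the LaMPilot codebase: the `EgoVehicle` class, the `change_lane` and `set_target_speed` primitives, and the `fake_llm` function, which returns a canned program in place of a real model call.

```python
from dataclasses import dataclass, field


@dataclass
class EgoVehicle:
    """Toy vehicle state; a hypothetical stand-in for a real AD stack."""
    lane: int = 1
    target_speed: float = 25.0  # metres per second
    log: list = field(default_factory=list)

    # Hypothetical functional primitives exposed to generated code.
    def change_lane(self, direction: str) -> None:
        self.lane += -1 if direction == "left" else 1
        self.log.append(f"change_lane({direction})")

    def set_target_speed(self, speed: float) -> None:
        self.target_speed = speed
        self.log.append(f"set_target_speed({speed})")


def fake_llm(instruction: str) -> str:
    """Placeholder for an LLM call: returns a canned program (assumption)."""
    programs = {
        "overtake the car ahead": (
            "change_lane('left')\n"
            "set_target_speed(30.0)\n"
            "change_lane('right')\n"
        ),
    }
    return programs[instruction]


def run_instruction(instruction: str, ego: EgoVehicle) -> EgoVehicle:
    program = fake_llm(instruction)
    # Expose only the primitives to the generated program; no builtins.
    namespace = {
        "__builtins__": {},
        "change_lane": ego.change_lane,
        "set_target_speed": ego.set_target_speed,
    }
    exec(program, namespace)
    return ego


ego = run_instruction("overtake the car ahead", EgoVehicle())
print(ego.log)
```

Restricting the namespace passed to `exec` keeps the generated program limited to the listed primitives in this toy setting. It is not a real sandbox, and a production system would need much stronger isolation.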

To evaluate this approach, the authors introduced LaMPilot-Bench, a new benchmark dataset designed to test how well language model programs can handle diverse driving scenarios and instructions. They conducted experiments using off-the-shelf LLMs on this dataset, demonstrating the potential of this approach for bridging the gap between user instructions and autonomous driving execution.
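A benchmark of this kind can be pictured as a set of scenarios, each pairing an instruction with a check on the outcome. The sketch below is only an illustration under stated assumptions: the `Scenario` structure, the `evaluate` function, the toy policy, and the use of a plain success rate are all hypothetical and do not describe the actual format or metrics of LaMPilot-Bench.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Scenario:
    """Hypothetical benchmark item: an instruction plus a success check."""
    instruction: str
    initial_state: Dict[str, float]
    is_success: Callable[[Dict[str, float]], bool]


def evaluate(policy: Callable, scenarios: List[Scenario]) -> float:
    """Run the policy on every scenario and return the fraction that pass."""
    passed = 0
    for scenario in scenarios:
        # Copy the state so one scenario cannot affect another.
        final_state = policy(scenario.instruction, dict(scenario.initial_state))
        if scenario.is_success(final_state):
            passed += 1
    return passed / len(scenarios)


def toy_policy(instruction: str, state: Dict[str, float]) -> Dict[str, float]:
    """Stand-in for an LLM-driven policy; it only understands 'slow'."""
    if "slow" in instruction:
        state["speed"] -= 5.0
    return state


scenarios = [
    Scenario("slow down", {"speed": 20.0}, lambda s: s["speed"] < 20.0),
    Scenario("speed up", {"speed": 20.0}, lambda s: s["speed"] > 20.0),
]
success_rate = evaluate(toy_policy, scenarios)
print(success_rate)  # 0.5: the toy policy handles only one of the two scenarios
```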

Critical Analysis

The LaMPilot framework represents an innovative approach to integrating natural language understanding into autonomous driving systems. By leveraging the powerful reasoning capabilities of large language models, it allows AD systems to more flexibly interpret and execute user instructions.

However, the paper acknowledges some limitations that warrant further research. The LaMPilot-Bench dataset, while a valuable contribution, is still relatively small and may not capture the full complexity of real-world driving scenarios. Additionally, the experiments only evaluated off-the-shelf LLMs, and more specialized models or fine-tuning may be required for optimal performance.

Another potential concern is the safety and reliability of allowing language model-generated code to directly control a vehicle. Extensive validation and verification would be needed to ensure the system's decision-making is robust and aligned with safe driving practices.
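One small piece of such validation could be a static check that rejects generated programs before they run. The sketch below uses Python's standard `ast` module to permit only calls to an allow-list of primitives. The primitive names are hypothetical, and this check is an illustration rather than anything described in the paper; it would be one layer among many, not a safety guarantee.

```python
import ast

# Hypothetical primitive names; a real system would define its own list.
ALLOWED_CALLS = {"change_lane", "set_target_speed"}


def validate_program(source: str) -> list:
    """Return a list of problems; an empty list means the check passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("imports are not allowed")
        elif isinstance(node, ast.Attribute):
            problems.append(f"attribute access not allowed: .{node.attr}")
        elif isinstance(node, ast.Call):
            func = node.func
            # Only direct calls to allow-listed names are accepted.
            if not (isinstance(func, ast.Name) and func.id in ALLOWED_CALLS):
                problems.append("call to a non-allow-listed function")
    return problems


safe = "change_lane('left')\nset_target_speed(30.0)\n"
unsafe = "import os\nos.system('reboot')\n"

safe_problems = validate_program(safe)
unsafe_problems = validate_program(unsafe)
print(safe_problems)
print(unsafe_problems)
```

A static check like this catches obviously out-of-scope code, but it says nothing about whether an allowed sequence of primitives is safe in the current traffic situation. That would still require simulation, runtime monitoring, or similar verification.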

Overall, this research demonstrates promising progress, but there is still work to be done to fully realize the potential of language-guided autonomous driving at scale. Continued advancements in areas like dataset curation, model development, and safety assurance will be crucial.

Conclusion

This paper presents a novel framework called LaMPilot that integrates large language models (LLMs) into autonomous driving (AD) systems, allowing them to interpret and execute spontaneous user instructions. By generating executable code that leverages established functional primitives, LaMPilot bridges the gap between high-level commands and low-level vehicle control.

The authors also introduced LaMPilot-Bench, a new benchmark dataset designed to evaluate the performance of language model programs in diverse driving scenarios. Their experiments demonstrate the potential of this approach, while also highlighting areas for further research and development.

Overall, this work represents an important step forward in making autonomous driving systems more responsive to user needs and preferences. As language models continue to advance, integrating them into AD could unlock new levels of flexibility and personalization, ultimately leading to safer, more intuitive self-driving experiences.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Personalized Autonomous Driving with Large Language Models: Field Experiments

Can Cui, Zichong Yang, Yupeng Zhou, Yunsheng Ma, Juanwu Lu, Lingxi Li, Yaobin Chen, Jitesh Panchal, Ziran Wang

Integrating large language models (LLMs) in autonomous vehicles enables conversation with AI systems to drive the vehicle. However, it also emphasizes the requirement for such systems to comprehend commands accurately and achieve higher-level personalization to adapt to the preferences of drivers or passengers over a more extended period. In this paper, we introduce an LLM-based framework, Talk2Drive, capable of translating natural verbal commands into executable controls and learning to satisfy personal preferences for safety, efficiency, and comfort with a proposed memory module. This is the first-of-its-kind multi-scenario field experiment that deploys LLMs on a real-world autonomous vehicle. Experiments showcase that the proposed system can comprehend human intentions at different intuition levels, ranging from direct commands like "can you drive faster" to indirect commands like "I am really in a hurry now." Additionally, we use the takeover rate to quantify the trust of human drivers in the LLM-based autonomous driving system, where Talk2Drive significantly reduces the takeover rate in highway, intersection, and parking scenarios. We also validate that the proposed memory module considers personalized preferences and further reduces the takeover rate by up to 65.2% compared with those without a memory module. The experiment video can be watched at https://www.youtube.com/watch?v=4BWsfPaq1Ro

5/9/2024

cs.AI

In-context Learning for Automated Driving Scenarios

Ziqi Zhou, Jingyue Zhang, Jingyuan Zhang, Boyue Wang, Tianyu Shi, Alaa Khamis

One of the key challenges in current Reinforcement Learning (RL)-based Automated Driving (AD) agents is achieving flexible, precise, and human-like behavior cost-effectively. This paper introduces an innovative approach utilizing Large Language Models (LLMs) to intuitively and effectively optimize RL reward functions in a human-centric way. We developed a framework where instructions and dynamic environment descriptions are input into the LLM. The LLM then utilizes this information to assist in generating rewards, thereby steering the behavior of RL agents towards patterns that more closely resemble human driving. The experimental results demonstrate that this approach not only makes RL agents more anthropomorphic but also reaches better performance. Additionally, various strategies for reward-proxy and reward-shaping are investigated, revealing the significant impact of prompt design on shaping an AD vehicle's behavior. These findings offer a promising direction for the development of more advanced and human-like automated driving systems. Our experimental data and source code can be found here.

5/8/2024

cs.AI

Driving Everywhere with Large Language Model Policy Adaptation

Boyi Li, Yue Wang, Jiageng Mao, Boris Ivanovic, Sushant Veer, Karen Leung, Marco Pavone

Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs). In this paper, we present LLaDA, a simple yet powerful tool that enables human drivers and autonomous vehicles alike to drive everywhere by adapting their tasks and motion plans to traffic rules in new locations. LLaDA achieves this by leveraging the impressive zero-shot generalizability of large language models (LLMs) in interpreting the traffic rules in the local driver handbook. Through an extensive user study, we show that LLaDA's instructions are useful in disambiguating in-the-wild unexpected situations. We also demonstrate LLaDA's ability to adapt AV motion planning policies in real-world datasets; LLaDA outperforms baseline planning approaches on all our metrics. Please check our website for more details: https://boyiliee.github.io/llada.

4/12/2024

cs.RO cs.AI cs.CL

Prompting Multi-Modal Tokens to Enhance End-to-End Autonomous Driving Imitation Learning with LLMs

Yiqun Duan, Qiang Zhang, Renjing Xu

The utilization of Large Language Models (LLMs) within the realm of reinforcement learning, particularly as planners, has garnered a significant degree of attention in recent scholarly literature. However, a substantial proportion of existing research predominantly focuses on planning models for robotics that transmute the outputs derived from perception models into linguistic forms, thus adopting a 'pure-language' strategy. In this research, we propose a hybrid End-to-End learning framework for autonomous driving by combining basic driving imitation learning with LLMs based on multi-modality prompt tokens. Instead of simply converting perception results from the separated train model into pure language input, our novelty lies in two aspects. 1) The end-to-end integration of visual and LiDAR sensory input into learnable multi-modality tokens, thereby intrinsically alleviating description bias by separated pre-trained perception models. 2) Instead of directly letting LLMs drive, this paper explores a hybrid setting of letting LLMs help the driving model correct mistakes and complicated scenarios. The results of our experiments suggest that the proposed methodology can attain driving scores of 49.21%, coupled with an impressive route completion rate of 91.34% in the offline evaluation conducted via CARLA. These performance metrics are comparable to the most advanced driving models.

4/9/2024

cs.RO cs.AI