Instruct Large Language Models to Drive like Humans

2406.07296

Published 6/12/2024 by Ruijun Zhang, Xianda Guo, Wenzhao Zheng, Chenming Zhang, Kurt Keutzer, Long Chen

Abstract

Motion planning in complex scenarios is the core challenge in autonomous driving. Conventional methods apply predefined rules or learn from driving data to plan the future trajectory. Recent methods seek the knowledge preserved in large language models (LLMs) and apply them in the driving scenarios. Despite the promising results, it is still unclear whether the LLM learns the underlying human logic to drive. In this paper, we propose an InstructDriver method to transform LLM into a motion planner with explicit instruction tuning to align its behavior with humans. We derive driving instruction data based on human logic (e.g., do not cause collisions) and traffic rules (e.g., proceed only when green lights). We then employ an interpretable InstructChain module to further reason the final planning reflecting the instructions. Our InstructDriver allows the injection of human rules and learning from driving data, enabling both interpretability and data scalability. Different from existing methods that experimented on closed-loop or simulated settings, we adopt the real-world closed-loop motion planning nuPlan benchmark for better evaluation. InstructDriver demonstrates the effectiveness of the LLM planner in a real-world closed-loop setting. Our code is publicly available at https://github.com/bonbon-rj/InstructDriver.

Overview

This paper explores how to instruct large language models (LLMs) to drive like humans, with the goal of a motion planner whose behavior is aligned with human driving logic.
The researchers propose InstructDriver, which turns an LLM into a motion planner through explicit instruction tuning on driving instruction data derived from human logic (e.g., do not cause collisions) and traffic rules (e.g., proceed only on a green light).
An interpretable InstructChain module then reasons toward the final plan so that the plan reflects those instructions.
The method is evaluated on the real-world closed-loop nuPlan motion planning benchmark, and the authors state that the design supports both the injection of human rules and learning from driving data.

Plain English Explanation

The paper explores a new way to make self-driving cars drive more like humans. Conventional planners either follow predefined rules or learn from recorded driving data. The researchers want to use large language models (LLMs), AI systems that can understand and generate human-like text, to plan the car's motion, and they want the model to follow the same logic a human driver would.

The key idea is to teach the LLM with explicit instructions. The training data spells out human driving logic, such as not causing collisions, and traffic rules, such as proceeding only on a green light. A module called InstructChain then has the model reason through those instructions before it commits to a final plan, which makes the decision easier to inspect.

The researchers test the planner on nuPlan, a benchmark built on real-world driving data, in a closed-loop setting. Closed-loop means the planner's own decisions determine what happens next in each scenario, rather than being scored against a fixed recording.

Technical Explanation

The paper proposes InstructDriver, a method that transforms an LLM into a motion planner through explicit instruction tuning so that its behavior aligns with human driving. Where conventional planners apply predefined rules or learn from driving data, InstructDriver is designed to accept injected human rules while still learning from data.

The key components of the proposed approach include:

Driving instruction data: The authors derive instruction data from human logic (e.g., do not cause collisions) and traffic rules (e.g., proceed only on a green light).
Explicit instruction tuning: The LLM is tuned on this data so that it acts as a motion planner aligned with human behavior.
InstructChain: An interpretable module reasons toward the final plan so that the plan reflects the instructions.
Closed-loop evaluation: The planner is evaluated on the real-world closed-loop nuPlan motion planning benchmark.
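
The abstract describes deriving driving instruction data from human logic (e.g., do not cause collisions) and traffic rules (e.g., proceed only on a green light). A minimal Python sketch of what one such instruction-tuning sample might look like is given here. The Scene fields, the gap threshold, the prompt wording, and the prompt/response layout are all assumptions made for illustration; none of them is taken from the paper or its repository.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Scene:
    """Hypothetical, heavily simplified driving scene (not the paper's format)."""
    ego_speed_mps: float
    traffic_light: str          # "green", "red", or "none"
    lead_vehicle_gap_m: float   # distance to the vehicle ahead


def derive_instructions(scene: Scene, min_gap_m: float = 10.0) -> List[str]:
    """Turn scene facts into rule-based instructions (illustrative rules only)."""
    instructions = []
    if scene.traffic_light == "red":
        instructions.append("Stop: the traffic light is red.")
    elif scene.traffic_light == "green":
        instructions.append("Proceed: the traffic light is green.")
    if scene.lead_vehicle_gap_m < min_gap_m:
        instructions.append("Slow down: avoid colliding with the vehicle ahead.")
    if not instructions:
        instructions.append("Keep the current lane and speed.")
    return instructions


def build_sample(scene: Scene,
                 expert_trajectory: List[Tuple[float, float]]) -> Dict[str, str]:
    """Pair a scene description with a target that states the instructions
    first and the logged human trajectory last (assumed layout)."""
    prompt = (
        f"Ego speed: {scene.ego_speed_mps:.1f} m/s. "
        f"Traffic light: {scene.traffic_light}. "
        f"Gap to lead vehicle: {scene.lead_vehicle_gap_m:.1f} m. "
        "Plan the future trajectory."
    )
    reasoning = " ".join(derive_instructions(scene))
    waypoints = "; ".join(f"({x:.1f}, {y:.1f})" for x, y in expert_trajectory)
    return {"prompt": prompt, "response": f"{reasoning} Plan: {waypoints}"}


if __name__ == "__main__":
    sample = build_sample(Scene(8.0, "red", 25.0), [(0.0, 0.0), (2.0, 0.0)])
    print(sample["prompt"])
    print(sample["response"])
```

Placing the instructions before the waypoints in the target text is one simple way to make the reasoning visible alongside the plan; whether InstructChain structures its output this way is not stated in the abstract.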

Critical Analysis

The paper presents a promising approach to aligning an LLM-based planner with human driving behavior. Explicit instructions and the InstructChain reasoning step are meant to make the planner's decisions interpretable rather than opaque.

However, the abstract alone leaves several questions open. It does not describe how reliable or robust the LLM planner is in edge cases or unexpected situations, and it reports effectiveness on the closed-loop nuPlan benchmark without stating how the planner compares with conventional planners. There are also questions about how far the claimed data scalability extends and how the planner performs in scenarios beyond those the benchmark covers.
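
The abstract stresses evaluation in a closed-loop setting. The difference from open-loop scoring can be illustrated with a toy car-following example in Python. Everything in it (the one-dimensional simulator, the constant-speed planner, the two metric functions) is a hypothetical illustration; it does not use the nuPlan API and does not reproduce the paper's metrics.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]          # (gap to lead vehicle in m, ego speed in m/s)
Planner = Callable[[State], float]   # returns the commanded ego speed in m/s


def open_loop_error(planner: Planner, log: List[Tuple[State, float]]) -> float:
    """Mean absolute difference between planned and logged speed.
    The planner only ever sees logged states, so its errors never compound."""
    return sum(abs(planner(state) - speed) for state, speed in log) / len(log)


def closed_loop_min_gap(planner: Planner, start: State, lead_speed: float,
                        steps: int, dt: float = 0.5) -> float:
    """Roll the planner forward in a toy simulator and return the smallest gap.
    Each command changes the next state, so small errors can accumulate."""
    gap, speed = start
    min_gap = gap
    for _ in range(steps):
        speed = planner((gap, speed))
        gap += (lead_speed - speed) * dt
        min_gap = min(min_gap, gap)
    return min_gap


def slightly_fast(state: State) -> float:
    """Toy planner that ignores the gap and always commands 10.5 m/s."""
    return 10.5


if __name__ == "__main__":
    # Logged human driver holds 10 m/s behind a lead vehicle that also drives 10 m/s.
    log = [((5.0, 10.0), 10.0)] * 10
    print("open-loop error:", open_loop_error(slightly_fast, log))
    print("closed-loop min gap:",
          closed_loop_min_gap(slightly_fast, (5.0, 10.0), lead_speed=10.0, steps=40))
```

In this toy setup the open-loop error is only 0.5 m/s, yet the closed-loop rollout ends with a negative gap, meaning the ego vehicle has run into the lead vehicle. That is the kind of failure an open-loop score can hide.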

Additionally, the paper does not explore the ethical implications of using LLMs for critical decision-making in autonomous driving, such as the potential for biases or unpredictable behavior. Further research and testing would be needed to address these concerns and validate the safety and reliability of the proposed system.

Conclusion

The paper presents InstructDriver, an approach to instructing large language models to drive in a more human-like manner. By combining explicit instruction tuning on data derived from human logic and traffic rules with the interpretable InstructChain reasoning module, the researchers aim for both interpretability and data scalability.

Evaluation on the real-world closed-loop nuPlan benchmark strengthens the case for the approach. However, open questions remain around reliability in edge cases and ethical considerations, which would need to be explored in future research.

Overall, the work presented in this paper represents an important step towards more natural and human-like autonomous driving, with significant implications for the future of transportation and mobility.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang, Yan Wang, Lijun Zhang, Si Liu

Despite real-time planners exhibiting remarkable performance in autonomous driving, the growing exploration of Large Language Models (LLMs) has opened avenues for enhancing the interpretability and controllability of motion planning. Nevertheless, LLM-based planners continue to encounter significant challenges, including elevated resource consumption and extended inference times, which pose substantial obstacles to practical deployment. In light of these challenges, we introduce AsyncDriver, a new asynchronous LLM-enhanced closed-loop framework designed to leverage scene-associated instruction features produced by LLM to guide real-time planners in making precise and controllable trajectory predictions. On one hand, our method highlights the prowess of LLMs in comprehending and reasoning with vectorized scene data and a series of routing instructions, demonstrating its effective assistance to real-time planners. On the other hand, the proposed framework decouples the inference processes of the LLM and real-time planners. By capitalizing on the asynchronous nature of their inference frequencies, our approach has successfully reduced the computational cost introduced by LLM, while maintaining comparable performance. Experiments show that our approach achieves superior closed-loop evaluation performance on nuPlan's challenging scenarios.

6/24/2024

cs.RO cs.CV

In-context Learning for Automated Driving Scenarios

Ziqi Zhou, Jingyue Zhang, Jingyuan Zhang, Boyue Wang, Tianyu Shi, Alaa Khamis

One of the key challenges in current Reinforcement Learning (RL)-based Automated Driving (AD) agents is achieving flexible, precise, and human-like behavior cost-effectively. This paper introduces an innovative approach utilizing Large Language Models (LLMs) to intuitively and effectively optimize RL reward functions in a human-centric way. We developed a framework where instructions and dynamic environment descriptions are input into the LLM. The LLM then utilizes this information to assist in generating rewards, thereby steering the behavior of RL agents towards patterns that more closely resemble human driving. The experimental results demonstrate that this approach not only makes RL agents more anthropomorphic but also reaches better performance. Additionally, various strategies for reward-proxy and reward-shaping are investigated, revealing the significant impact of prompt design on shaping an AD vehicle's behavior. These findings offer a promising direction for the development of more advanced and human-like automated driving systems. Our experimental data and source code can be found here.

5/8/2024

cs.AI

Personalized Autonomous Driving with Large Language Models: Field Experiments

Can Cui, Zichong Yang, Yupeng Zhou, Yunsheng Ma, Juanwu Lu, Lingxi Li, Yaobin Chen, Jitesh Panchal, Ziran Wang

Integrating large language models (LLMs) in autonomous vehicles enables conversation with AI systems to drive the vehicle. However, it also emphasizes the requirement for such systems to comprehend commands accurately and achieve higher-level personalization to adapt to the preferences of drivers or passengers over a more extended period. In this paper, we introduce an LLM-based framework, Talk2Drive, capable of translating natural verbal commands into executable controls and learning to satisfy personal preferences for safety, efficiency, and comfort with a proposed memory module. This is the first-of-its-kind multi-scenario field experiment that deploys LLMs on a real-world autonomous vehicle. Experiments showcase that the proposed system can comprehend human intentions at different intuition levels, ranging from direct commands like "can you drive faster" to indirect commands like "I am really in a hurry now". Additionally, we use the takeover rate to quantify the trust of human drivers in the LLM-based autonomous driving system, where Talk2Drive significantly reduces the takeover rate in highway, intersection, and parking scenarios. We also validate that the proposed memory module considers personalized preferences and further reduces the takeover rate by up to 65.2% compared with those without a memory module. The experiment video can be watched at https://www.youtube.com/watch?v=4BWsfPaq1Ro

5/9/2024

cs.AI

Driving Everywhere with Large Language Model Policy Adaptation

Boyi Li, Yue Wang, Jiageng Mao, Boris Ivanovic, Sushant Veer, Karen Leung, Marco Pavone

Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs). In this paper, we present LLaDA, a simple yet powerful tool that enables human drivers and autonomous vehicles alike to drive everywhere by adapting their tasks and motion plans to traffic rules in new locations. LLaDA achieves this by leveraging the impressive zero-shot generalizability of large language models (LLMs) in interpreting the traffic rules in the local driver handbook. Through an extensive user study, we show that LLaDA's instructions are useful in disambiguating in-the-wild unexpected situations. We also demonstrate LLaDA's ability to adapt AV motion planning policies in real-world datasets; LLaDA outperforms baseline planning approaches on all our metrics. Please check our website for more details: https://boyiliee.github.io/llada.

4/12/2024

cs.RO cs.AI cs.CL