Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)

Read original: arXiv:2402.16720 - Published 7/23/2024 by Qifeng Li, Xiaosong Jia, Shaobo Wang, Junchi Yan

Overview

The paper presents a reinforcement learning (RL) approach for autonomous driving in the CARLA-v2 simulator.
The key contribution is a "Think2Drive" framework that uses a latent world model to efficiently learn driving policies.
The framework demonstrates improved performance and sample efficiency compared to standard RL methods.

Plain English Explanation

The paper explores a new way to train self-driving car algorithms using a technique called reinforcement learning. Reinforcement learning is a type of machine learning where an agent (in this case, the self-driving car) learns by trial-and-error interactions with an environment (the simulated driving scenario).

The researchers developed a framework called "Think2Drive" that aims to make this reinforcement learning process more efficient. The core idea is to train the self-driving car to first learn an internal "world model" - a mental representation of the driving environment and how it works. Once this world model is learned, the car can "think" about and simulate different driving actions in this internal model, without having to actually try them out in the real environment.

This allows the car to explore and learn effective driving policies much faster, because most of the trial and error happens inside the learned model rather than in the comparatively slow simulator. The paper shows that Think2Drive outperforms standard reinforcement learning approaches in terms of both performance and sample efficiency (the number of environment interactions needed to learn a good policy) when tested in the CARLA-v2 driving simulator.
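To make the sample-efficiency idea concrete, here is a toy sketch in Python. It is not the paper's method: the one-dimensional dynamics, the function names (`real_env_step`, `fit_gain`, `imagined_step`) and the numbers are all assumptions chosen for illustration. A small budget of "real" steps is used to fit a model, after which any number of imagined steps can be taken without touching the environment again.

```python
import random

def real_env_step(position, action):
    """Stand-in for one costly simulator step (hypothetical 1-D dynamics)."""
    return position + 0.8 * action

def fit_gain(transitions):
    """Least-squares fit of k in: next_position - position = k * action."""
    num = sum(a * (s_next - s) for s, a, s_next in transitions)
    den = sum(a * a for _, a, _ in transitions)
    return num / den

# Collect a small number of real interactions.
rng = random.Random(0)
transitions = []
position = 0.0
for _ in range(20):
    action = rng.uniform(-1.0, 1.0)
    next_position = real_env_step(position, action)
    transitions.append((position, action, next_position))
    position = next_position

# Fit the toy "world model" from the collected data.
k = fit_gain(transitions)

def imagined_step(position, action):
    """Predict the next position with the learned model, no environment call."""
    return position + k * action

# Take many imagined steps; the real environment is never queried here.
imagined = 0.0
for _ in range(1000):
    imagined = imagined_step(imagined, 0.001)
```

In this sketch the environment is queried 20 times while the learned model is queried 1000 times, which is the kind of trade the explanation above refers to.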

Technical Explanation

The core of the Think2Drive framework is a latent world model that learns a compressed, abstract representation of the driving environment. This world model is trained using a variational autoencoder (VAE) architecture, which learns to encode the raw sensory inputs (e.g. camera images) into a low-dimensional latent space.
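As a rough illustration of the VAE-style encoding described above, the sketch below maps an observation vector to a diagonal Gaussian in latent space, samples from it with the reparameterization trick, and computes the KL term that a VAE objective would penalize. The linear "encoder", its weights, and all names are assumptions made for this example; the summary does not specify the paper's actual network.

```python
import math
import random

def encode(observation, weights_mu, weights_logvar):
    """Toy linear encoder: returns mean and log-variance of a diagonal Gaussian."""
    mu = [sum(w * x for w, x in zip(row, observation)) for row in weights_mu]
    logvar = [sum(w * x for w, x in zip(row, observation)) for row in weights_logvar]
    return mu, logvar

def sample_latent(mu, logvar, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

# Hypothetical 3-dimensional observation compressed to a 2-dimensional latent.
rng = random.Random(0)
observation = [0.5, -1.0, 2.0]
weights_mu = [[0.1, 0.0, 0.2], [0.0, 0.3, 0.0]]
weights_logvar = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

mu, logvar = encode(observation, weights_mu, weights_logvar)
z = sample_latent(mu, logvar, rng)
kl = kl_to_standard_normal(mu, logvar)
```

A real encoder would be a trained neural network over far higher-dimensional inputs; the point here is only the shape of the computation: observation in, compact latent vector out, with a KL term that keeps the latent distribution close to a prior.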

The agent then uses this learned world model to plan and simulate future trajectories in the latent space, rather than having to interact directly with the full, complex environment. This allows it to efficiently explore different driving actions and learn an effective policy, without the need for extensive trial and error in the simulator itself.
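The idea of simulating trajectories in latent space can be sketched as follows. This is a simplified stand-in, not the paper's training procedure: it only scores two hand-written policies by their imagined return. The two-number latent state (lateral offset, speed), the transition and reward functions, and both policies are hypothetical placeholders for what would be learned networks.

```python
def predict_next(z, action):
    """Hypothetical learned transition over a latent state (offset, speed)."""
    offset, speed = z
    steer, accel = action
    return (offset + 0.5 * steer, max(0.0, speed + accel))

def predict_reward(z):
    """Hypothetical learned reward: favor speed, penalize distance from lane center."""
    offset, speed = z
    return speed - 2.0 * abs(offset)

def imagined_return(z0, policy, horizon, discount=0.95):
    """Roll a policy forward inside the model and sum discounted predicted rewards."""
    z, total = z0, 0.0
    for t in range(horizon):
        z = predict_next(z, policy(z))
        total += (discount ** t) * predict_reward(z)
    return total

def centering_policy(z):
    offset, _ = z
    return (-offset, 0.1)  # steer back toward the lane center, accelerate gently

def drifting_policy(z):
    return (0.2, 0.1)  # keep steering away from the lane center

start = (1.0, 1.0)  # one unit off-center, moving at unit speed
good = imagined_return(start, centering_policy, horizon=10)
bad = imagined_return(start, drifting_policy, horizon=10)
```

Every step here is a cheap function call on a small state rather than a full simulator tick, so many candidate behaviors can be compared without any further environment interaction.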

The paper evaluates Think2Drive on a range of tasks in the CARLA-v2 driving simulator, including lane following, intersection navigation, and overtaking. The results show significant improvements in terms of both task performance and sample efficiency compared to standard RL methods and other world model-based approaches like CarDreamer.

Critical Analysis

The paper presents a well-designed study with thorough experimentation and analysis. The use of the CARLA-v2 simulator provides a quasi-realistic environment for evaluating the proposed Think2Drive framework.

One potential limitation is the reliance on a simulated environment, which may not fully capture the complexity and unpredictability of real-world driving scenarios. Further validation on physical vehicles would be valuable to assess the approach's real-world applicability.

Additionally, the paper does not provide extensive details on the training process and hyperparameter tuning, which could be important for reproducing the results. More transparency in these areas would be helpful for the research community.

Overall, the Think2Drive framework demonstrates a promising direction for improving the efficiency and performance of reinforcement learning for autonomous driving, and the paper provides a solid foundation for future research in this area.

Conclusion

The "Think2Drive" framework presented in this paper offers a novel approach to autonomous driving using a latent world model and efficient reinforcement learning. By allowing the agent to "think" and simulate driving actions in an internal, compressed representation of the environment, the system is able to learn effective driving policies much more quickly than standard RL methods.

The results in the CARLA-v2 simulator are encouraging, showing significant improvements in both task performance and sample efficiency. While further validation in real-world settings is needed, this work represents an important step forward in developing more practical and scalable reinforcement learning solutions for self-driving car systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)

Qifeng Li, Xiaosong Jia, Shaobo Wang, Junchi Yan

Real-world autonomous driving (AD) especially urban driving involves many corner cases. The lately released AD simulator CARLA v2 adds 39 common events in the driving scene, and provide more quasi-realistic testbed compared to CARLA v1. It poses new challenge to the community and so far no literature has reported any success on the new scenarios in V2 as existing works mostly have to rely on specific rules for planning yet they cannot cover the more complex cases in CARLA v2. In this work, we take the initiative of directly training a planner and the hope is to handle the corner cases flexibly and effectively, which we believe is also the future of AD. To our best knowledge, we develop the first model-based RL method named Think2Drive for AD, with a world model to learn the transitions of the environment, and then it acts as a neural simulator to train the planner. This paradigm significantly boosts the training efficiency due to the low dimensional state space and parallel computing of tensors in the world model. As a result, Think2Drive is able to run in an expert-level proficiency in CARLA v2 within 3 days of training on a single A6000 GPU, and to our best knowledge, so far there is no reported success (100% route completion)on CARLA v2. We also propose CornerCase-Repository, a benchmark that supports the evaluation of driving models by scenarios. Additionally, we propose a new and balanced metric to evaluate the performance by route completion, infraction number, and scenario density, so that the driving score could give more information about the actual driving performance.

7/23/2024

Deep Reinforcement Learning for Adverse Garage Scenario Generation

Kai Li

Autonomous vehicles need to travel over 11 billion miles to ensure their safety. Therefore, the importance of simulation testing before real-world testing is self-evident. In recent years, the release of 3D simulators for autonomous driving, represented by Carla and CarSim, marks the transition of autonomous driving simulation testing environments from simple 2D overhead views to complex 3D models. During simulation testing, experimenters need to build static scenes and dynamic traffic flows, pedestrian flows, and other experimental elements to construct experimental scenarios. When building static scenes in 3D simulators, experimenters often need to manually construct 3D models, set parameters and attributes, which is time-consuming and labor-intensive. This thesis proposes an automated program generation framework. Based on deep reinforcement learning, this framework can generate different 2D ground script codes, on which 3D model files and map model files are built. The generated 3D ground scenes are displayed in the Carla simulator, where experimenters can use this scene for navigation algorithm simulation testing.

7/2/2024

CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving

Dechen Gao, Shuangyu Cai, Hanchu Zhou, Hang Wang, Iman Soltani, Junshan Zhang

To safely navigate intricate real-world scenarios, autonomous vehicles must be able to adapt to diverse road conditions and anticipate future events. World model (WM) based reinforcement learning (RL) has emerged as a promising approach by learning and predicting the complex dynamics of various environments. Nevertheless, to the best of our knowledge, there does not exist an accessible platform for training and testing such algorithms in sophisticated driving environments. To fill this void, we introduce CarDreamer, the first open-source learning platform designed specifically for developing WM based autonomous driving algorithms. It comprises three key components: 1) World model backbone: CarDreamer has integrated some state-of-the-art WMs, which simplifies the reproduction of RL algorithms. The backbone is decoupled from the rest and communicates using the standard Gym interface, so that users can easily integrate and test their own algorithms. 2) Built-in tasks: CarDreamer offers a comprehensive set of highly configurable driving tasks which are compatible with Gym interfaces and are equipped with empirically optimized reward functions. 3) Task development suite: This suite streamlines the creation of driving tasks, enabling easy definition of traffic flows and vehicle routes, along with automatic collection of multi-modal observation data. A visualization server allows users to trace real-time agent driving videos and performance metrics through a browser. Furthermore, we conduct extensive experiments using built-in tasks to evaluate the performance and potential of WMs in autonomous driving. Thanks to the richness and flexibility of CarDreamer, we also systematically study the impact of observation modality, observability, and sharing of vehicle intentions on AV safety and efficiency. All code and documents are accessible on https://github.com/ucd-dare/CarDreamer.

7/29/2024

A Platform-Agnostic Deep Reinforcement Learning Framework for Effective Sim2Real Transfer in Autonomous Driving

Dianzhao Li, Ostap Okhrin

Deep Reinforcement Learning (DRL) has shown remarkable success in solving complex tasks across various research fields. However, transferring DRL agents to the real world is still challenging due to the significant discrepancies between simulation and reality. To address this issue, we propose a robust DRL framework that leverages platform-dependent perception modules to extract task-relevant information and train a lane-following and overtaking agent in simulation. This framework facilitates the seamless transfer of the DRL agent to new simulated environments and the real world with minimal effort. We evaluate the performance of the agent in various driving scenarios in both simulation and the real world, and compare it to human players and the PID baseline in simulation. Our proposed framework significantly reduces the gaps between different platforms and the Sim2Real gap, enabling the trained agent to achieve similar performance in both simulation and the real world, driving the vehicle effectively.

8/29/2024