GAD-Generative Learning for HD Map-Free Autonomous Driving

2405.00515

Published 6/3/2024 by Weijian Sun, Yanbo Jia, Qi Zeng, Zihao Liu, Jiang Liao, Yue Li, Xianfeng Li

Abstract

Deep-learning-based techniques have been widely adopted for autonomous driving software stacks for mass production in recent years, focusing primarily on perception modules, with some work extending this method to prediction modules. However, the downstream planning and control modules are still designed with hefty handcrafted rules, dominated by optimization-based methods such as quadratic programming or model predictive control. This results in a performance bottleneck for autonomous driving systems in that corner cases simply cannot be solved by enumerating hand-crafted rules. We present a deep-learning-based approach that brings prediction, decision, and planning modules together with the attempt to overcome the rule-based methods' deficiency in real-world applications of autonomous driving, especially for urban scenes. The DNN model we proposed is solely trained with 10 hours of human driver data, and it supports all mass-production ADAS features available on the market to date. This method is deployed onto a Jiyue test car with no modification to its factory-ready sensor set and compute platform. The feasibility, usability, and commercial potential are demonstrated in this article.

Overview

This paper introduces GAD, a novel generative learning approach for autonomous driving that does not rely on high-definition (HD) maps.
GAD utilizes a multi-task generative adversarial network (GAN) to learn a joint distribution of the agent's state, action, and surroundings, enabling end-to-end autonomous driving without the need for expensive HD maps.
The paper also presents several benchmark tasks and evaluation metrics to assess the performance of HD map-free autonomous driving systems.

Plain English Explanation

The researchers have developed a new system called GAD (Generative Learning for HD Map-Free Autonomous Driving) that enables self-driving cars to navigate without relying on highly detailed maps. Typically, autonomous vehicles use these high-definition (HD) maps to understand the layout of the roads and surroundings. However, creating and maintaining these HD maps can be very costly and time-consuming.

GAD instead uses a machine learning technique called generative adversarial networks (GANs) to learn the relationship between the car's state, the actions it takes, and the environment around it. This allows the system to understand the driving environment without needing the expensive HD maps. The researchers have also created benchmark tasks and evaluation methods to measure how well HD map-free autonomous driving systems perform.

The key idea behind GAD is to have the system learn a comprehensive model of the driving environment through experience, rather than relying on pre-built maps. This could make autonomous driving systems more accessible and scalable, as they would no longer be dependent on the availability of detailed maps.

Technical Explanation

The paper introduces a novel generative learning approach called GAD (Generative Learning for HD Map-Free Autonomous Driving) that enables end-to-end autonomous driving without the need for high-definition (HD) maps. GAD utilizes a multi-task generative adversarial network (GAN) to learn a joint distribution of the agent's state, action, and surroundings. This allows the system to understand the driving environment and make appropriate decisions without relying on the costly and time-consuming process of creating and maintaining HD maps.

The GAN-based architecture consists of a generator network that learns to produce realistic driving scenes and a discriminator network that tries to distinguish real scenes from generated ones. Because the two networks are trained adversarially, the generator is pushed to produce driving scenes that the discriminator cannot tell apart from real-world data, which requires it to capture the underlying structure of the environment.
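The adversarial setup described above can be illustrated with the standard GAN loss terms. The sketch below is not taken from the paper: the function names are hypothetical, the inputs are plain lists of discriminator output probabilities, and the non-saturating generator loss is an assumed, commonly used choice rather than the authors' documented objective.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for one probability and a 0/1 target."""
    eps = 1e-12  # guard against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

def discriminator_loss(real_probs, fake_probs):
    """The discriminator is rewarded for D(real) near 1 and D(fake) near 0."""
    real_term = sum(bce(p, 1) for p in real_probs) / len(real_probs)
    fake_term = sum(bce(p, 0) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term

def generator_loss(fake_probs):
    """Non-saturating form (an assumption): the generator is rewarded
    when the discriminator scores its samples near 1."""
    return sum(bce(p, 1) for p in fake_probs) / len(fake_probs)

# Hypothetical discriminator outputs for a batch of real and generated scenes.
d_on_real = [0.9, 0.8]
d_on_fake = [0.2, 0.1]
print(discriminator_loss(d_on_real, d_on_fake))
print(generator_loss(d_on_fake))
```

In a full training loop the two losses would be minimized in alternation, each with respect to its own network's parameters. When the discriminator outputs 0.5 for every input, it can no longer separate real from generated samples, which is the equilibrium the adversarial game aims for.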

In addition to the generative model, the paper also presents several benchmark tasks and evaluation metrics to assess the performance of HD map-free autonomous driving systems. These include tasks such as lane following, obstacle avoidance, and intersection negotiation, as well as metrics like trajectory prediction accuracy, safety, and efficiency.
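The summary does not say how trajectory prediction accuracy is computed. One common way to quantify it, shown here purely as an assumption, is the average and final displacement error (ADE and FDE) between predicted and ground-truth waypoints. The function name and the sample coordinates are hypothetical.

```python
import math

def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) waypoints.

    ADE is the mean Euclidean distance over all time steps;
    FDE is the distance at the final time step.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return sum(dists) / len(dists), dists[-1]

# Hypothetical three-step trajectories, coordinates in metres.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
print(ade, fde)  # 1.0 2.0
```

Lower values mean the predicted path stays closer to the reference path. Safety and efficiency would need separate measures, which this sketch does not attempt.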

The key contribution of this work is the demonstration that autonomous driving can be achieved without relying on expensive HD maps, which could significantly reduce the cost and complexity of deploying these systems. The generative learning approach used in GAD represents a promising direction for making autonomous driving more accessible and scalable.

Critical Analysis

The paper presents a compelling approach to autonomous driving that does not rely on HD maps, which could have significant practical implications. By using a generative learning framework, the system can potentially adapt to a wide range of driving environments without the need for extensive map data.

However, the paper does not fully address the potential limitations and challenges of this approach. For example, it is unclear how the system would handle sudden changes in the environment, such as temporary obstacles or construction zones, which may not be captured in the learned model. Additionally, the paper does not discuss the computational and memory requirements of the GAN-based architecture, which could be a concern for real-world deployment in resource-constrained vehicles.

Furthermore, the benchmarks and evaluation metrics introduced in the paper, while a valuable contribution, may not fully capture the complex and dynamic nature of real-world driving scenarios. Additional testing and validation would be needed to ensure the robustness and safety of the GAD system in more diverse and unpredictable environments.

Despite these potential limitations, the overall approach presented in this paper represents an important step towards more accessible and scalable autonomous driving systems. By reducing the reliance on expensive HD maps, the GAD framework could pave the way for broader adoption of self-driving technologies, with the potential to improve transportation accessibility and efficiency.

Conclusion

The GAD paper introduces a novel generative learning approach for autonomous driving that does not require high-definition (HD) maps. By using a multi-task generative adversarial network (GAN), the system is able to learn a comprehensive model of the driving environment, enabling end-to-end autonomous driving without the need for costly and time-consuming map data.

This work could make autonomous driving systems more accessible and scalable. By removing the dependency on HD maps, the GAD framework could lower the barriers to deploying self-driving technologies in a wider range of settings, with the ultimate goal of improving transportation safety, accessibility, and efficiency.

While the paper raises some important questions about the limitations and challenges of this approach, the overall concept of using generative learning for HD map-free autonomous driving is a promising direction that warrants further research and development. As the field of autonomous driving continues to evolve, innovative approaches like GAD will play a crucial role in making self-driving technologies more widely available and impactful.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!
