LGmap: Local-to-Global Mapping Network for Online Long-Range Vectorized HD Map Construction

2406.13988

Published 6/21/2024 by Kuang Wu, Sulei Nian, Can Shen, Chuan Yang, Zhanbin Li

Abstract

This report introduces the first-place winning solution for the Autonomous Grand Challenge 2024 - Mapless Driving. In this report, we introduce a novel online mapping pipeline, LGmap, which is adept at long-range temporal modeling. Firstly, we propose symmetric view transformation (SVT), a hybrid view transformation module. Our approach overcomes the limitations of forward sparse feature representation and utilizes depth perception and SD prior information. Secondly, we propose a hierarchical temporal fusion (HTF) module. It employs temporal information from local to global, which empowers the construction of long-range HD maps with high stability. Lastly, we propose a novel ped-crossing resampling. The simplified ped-crossing representation accelerates the convergence of the instance-attention-based decoder. Our method achieves a UniScore of 0.66 on the Mapless Driving OpenLaneV2 test set.

Overview

This paper introduces LGmap, a Local-to-Global Mapping Network for constructing high-definition (HD) vectorized maps online.
LGmap combines local and global spatial information to build a comprehensive map, addressing challenges in existing methods.
The approach is designed for autonomous vehicles to navigate in complex, dynamic environments.

Plain English Explanation

LGmap is a system that can create detailed digital maps, known as HD maps, in real-time as a vehicle is driving. These HD maps provide crucial information for autonomous vehicles to navigate safely and efficiently.

Traditional methods for building HD maps often struggle with capturing the full complexity of real-world environments, which can change rapidly. LGmap tackles this challenge by blending local information, like what the vehicle's sensors can immediately detect, with broader global context. This allows the system to construct a more comprehensive and up-to-date map.

The key innovation in LGmap is its ability to seamlessly integrate these local and global spatial elements. This helps autonomous vehicles better understand their surroundings and make informed decisions, even in dynamic, complicated environments. By continuously updating the map as the vehicle moves, LGmap can provide a detailed, vectorized representation of the road network and other important features.

Technical Explanation

LGmap employs a two-stage architecture to build HD maps online. The first stage, the Local Mapping Network, processes sensor data from the vehicle to extract local spatial information. This includes details about the immediate environment, such as the shape and location of roads, lane markings, and obstacles.
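As an illustration of what a vectorized map element such as a lane marking can look like, the sketch below resamples a 2D polyline to a fixed number of points evenly spaced by arc length. Fixed-length polylines are a common representation in vectorized map construction, but this is an assumption for illustration only: the function name `resample_polyline` and the point counts are hypothetical, and this is not the paper's ped-crossing resampling.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample_polyline(points: List[Point], num_points: int) -> List[Point]:
    """Resample a 2D polyline to `num_points` points evenly spaced by arc length.

    Hypothetical helper for illustration; not taken from the LGmap paper.
    """
    if num_points < 2 or len(points) < 2:
        raise ValueError("need at least two input points and two output points")

    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        # Degenerate polyline: every vertex is the same point.
        return [points[0]] * num_points

    out: List[Point] = []
    seg = 0  # index of the segment that contains the current target length
    for i in range(num_points):
        target = total * i / (num_points - 1)
        # Advance to the segment whose end lies at or beyond the target.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0.0 else (target - cum[seg]) / seg_len
        t = min(max(t, 0.0), 1.0)
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        # Linear interpolation inside the segment.
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```

For example, `resample_polyline([(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)], 3)` keeps the two endpoints and places the middle point at the corner, because the corner sits exactly halfway along the path.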

The second stage, the Global Mapping Network, takes the local spatial features and integrates them with broader contextual data. This could include information from previously mapped areas, satellite imagery, or crowdsourced data. By combining the local and global perspectives, LGmap can construct a more comprehensive and accurate HD map.
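The local-to-global idea can be illustrated with a toy sketch. It assumes each frame is already expressed as confidence scores on world-aligned grid cells, fuses a short sliding window of recent frames (the local step), and then blends the result into a persistent map with an exponential moving average (the global step). The class name `LocalToGlobalFuser`, the `window` and `momentum` parameters, and the averaging rules are all hypothetical; the paper's HTF module is a learned network component, and this sketch does not reproduce it.

```python
from collections import deque
from typing import Deque, Dict, Tuple

Cell = Tuple[int, int]    # world-aligned grid cell index (assumed)
Grid = Dict[Cell, float]  # cell -> confidence score in [0, 1]


class LocalToGlobalFuser:
    """Toy local-to-global temporal fusion; not the paper's HTF module."""

    def __init__(self, window: int = 3, momentum: float = 0.8) -> None:
        # Sliding window of the most recent per-frame observations.
        self.recent: Deque[Grid] = deque(maxlen=window)
        # Persistent map accumulated over the whole drive.
        self.global_map: Grid = {}
        self.momentum = momentum

    def _fuse_local(self) -> Grid:
        """Average each cell over the frames in the window that observed it."""
        sums: Dict[Cell, float] = {}
        counts: Dict[Cell, int] = {}
        for frame in self.recent:
            for cell, score in frame.items():
                sums[cell] = sums.get(cell, 0.0) + score
                counts[cell] = counts.get(cell, 0) + 1
        return {cell: sums[cell] / counts[cell] for cell in sums}

    def update(self, frame: Grid) -> Grid:
        """Add one frame, then blend the local estimate into the global map."""
        self.recent.append(frame)
        local = self._fuse_local()
        for cell, score in local.items():
            if cell in self.global_map:
                prev = self.global_map[cell]
                # Exponential moving average keeps the global map stable.
                self.global_map[cell] = (
                    self.momentum * prev + (1.0 - self.momentum) * score
                )
            else:
                # First observation of this cell.
                self.global_map[cell] = score
        return self.global_map
```

A higher `momentum` makes the global map change more slowly, which trades responsiveness for stability; the right balance in a real system would depend on the sensor noise and how quickly the scene changes.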

According to the abstract, LGmap achieves a UniScore of 0.66 on the OpenLaneV2 test set of the Mapless Driving track, which was the first-place result in the Autonomous Grand Challenge 2024. The authors attribute the stability of long-range map construction to fusing temporal information from local to global, and they report that the simplified pedestrian-crossing representation speeds up decoder convergence.

Critical Analysis

The LGmap approach is a notable step forward in HD map construction for autonomous vehicles. By integrating local and global spatial information, the system can build more detailed and up-to-date maps than previous methods.

However, the paper does not fully address the potential challenges of scaling LGmap to larger geographic areas or handling extreme environmental conditions, such as severe weather or construction zones. Additionally, the reliance on external data sources, like crowdsourcing, introduces potential security and privacy concerns that would need to be carefully considered.

Further research could explore ways to make LGmap more self-sufficient and robust to a wider range of real-world scenarios. Incorporating techniques from MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report, HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes, and Exploring Real World Map Change Generalization of Prior-Informed HD Map Prediction Models could further improve the system's adaptability.

Conclusion

The LGmap system presents a promising approach for building high-definition, vectorized maps in real-time for autonomous vehicles. By integrating local and global spatial information, the system can construct comprehensive maps that adapt to dynamic changes in the environment, a critical capability for safe and reliable autonomous navigation.

While the paper demonstrates the effectiveness of LGmap, further research is needed to address scaling and robustness challenges. Exploring synergies with other recent mapping techniques, such as Structure-Aware Lane Graph Transformer Model and HD Maps are Lane Detection Generalizers, could help expand the capabilities of LGmap and bring fully autonomous driving closer to reality.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report

Zhongyu Yang, Mai Liu, Jinluo Xie, Yueming Zhang, Chen Shen, Wei Shao, Jichao Jiao, Tengfei Xing, Runbo Hu, Pengfei Xu

Autonomous driving without high-definition (HD) maps demands a higher level of active scene understanding. In this competition, the organizers provided the multi-perspective camera images and standard-definition (SD) maps to explore the boundaries of scene reasoning capabilities. We found that most existing algorithms construct Bird's Eye View (BEV) features from these multi-perspective images and use multi-task heads to delineate road centerlines, boundary lines, pedestrian crossings, and other areas. However, these algorithms perform poorly at the far end of roads and struggle when the primary subject in the image is occluded. Therefore, in this competition, we not only used multi-perspective images as input but also incorporated SD maps to address this issue. We employed map encoder pre-training to enhance the network's geometric encoding capabilities and utilized YOLOX to improve traffic element detection precision. Additionally, for area detection, we innovatively introduced LDTR and auxiliary tasks to achieve higher precision. As a result, our final OLUS score is 0.58.

6/17/2024

cs.CV

HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes

Ke Wu, Kaizhao Zhang, Zhiwei Zhang, Shanshuai Yuan, Muer Tie, Julong Wei, Zijun Xu, Jieru Zhao, Zhongxue Gan, Wenchao Ding

Online dense mapping of urban scenes forms a fundamental cornerstone for scene understanding and navigation of autonomous vehicles. Recent advancements in mapping methods are mainly based on NeRF, whose rendering speed is too slow to meet online requirements. 3D Gaussian Splatting (3DGS), with its rendering speed hundreds of times faster than NeRF, holds greater potential in online dense mapping. However, integrating 3DGS into a street-view dense mapping framework still faces two challenges, including incomplete reconstruction due to the absence of geometric information beyond the LiDAR coverage area and extensive computation for reconstruction in large urban scenes. To this end, we propose HGS-Mapping, an online dense mapping framework in unbounded large-scale scenes. To attain complete construction, our framework introduces Hybrid Gaussian Representation, which models different parts of the entire scene using Gaussians with distinct properties. Furthermore, we employ a hybrid Gaussian initialization mechanism and an adaptive update method to achieve high-fidelity and rapid reconstruction. To the best of our knowledge, we are the first to integrate Gaussian representation into online dense mapping of urban scenes. Our approach achieves SOTA reconstruction accuracy while only employing 66% number of Gaussians, leading to 20% faster reconstruction speed.

4/1/2024

cs.CV

Exploring Real World Map Change Generalization of Prior-Informed HD Map Prediction Models

Samuel M. Bateman, Ning Xu, H. Charles Zhao, Yael Ben Shalom, Vince Gong, Greg Long, Will Maddern

Building and maintaining High-Definition (HD) maps represents a large barrier to autonomous vehicle deployment. This, along with advances in modern online map detection models, has sparked renewed interest in the online mapping problem. However, effectively predicting online maps at a high enough quality to enable safe, driverless deployments remains a significant challenge. Recent work on these models proposes training robust online mapping systems using low quality map priors with synthetic perturbations in an attempt to simulate out-of-date HD map priors. In this paper, we investigate how models trained on these synthetically perturbed map priors generalize to performance on deployment-scale, real world map changes. We present a large-scale experimental study to determine which synthetic perturbations are most useful in generalizing to real world HD map changes, evaluated using multiple years of real-world autonomous driving data. We show there is still a substantial sim2real gap between synthetic prior perturbations and observed real-world changes, which limits the utility of current prior-informed HD map prediction models.

6/6/2024

cs.RO cs.CV

HD Maps are Lane Detection Generalizers: A Novel Generative Framework for Single-Source Domain Generalization

Daeun Lee, Minhyeok Heo, Jiwon Kim

Lane detection is a vital task for vehicles to navigate and localize their position on the road. To ensure reliable driving, lane detection models must have robust generalization performance in various road environments. However, despite the advanced performance in the trained domain, their generalization performance still falls short of expectations due to the domain discrepancy. To bridge this gap, we propose a novel generative framework using HD Maps for Single-Source Domain Generalization (SSDG) in lane detection. We first generate numerous front-view images from lane markings of HD Maps. Next, we strategically select a core subset among the generated images using (i) lane structure and (ii) road surrounding criteria to maximize their diversity. In the end, utilizing this core set, we train lane detection models to boost their generalization performance. We validate that our generative framework from HD Maps outperforms the Domain Adaptation model MLDA with +3.01%p accuracy improvement, even though we do not access the target domain images.

6/4/2024

cs.CV cs.LG cs.RO