Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map

2405.04290

Published 5/8/2024 by Yuxuan Xia, Erik Stenborg, Junsheng Fu, Gustaf Hendeby

Abstract

High-definition map with accurate lane-level information is crucial for autonomous driving, but the creation of these maps is a resource-intensive process. To this end, we present a cost-effective solution to create lane-level roadmaps using only the global navigation satellite system (GNSS) and a camera on customer vehicles. Our proposed solution utilizes a prior standard-definition (SD) map, GNSS measurements, visual odometry, and lane marking edge detection points, to simultaneously estimate the vehicle's 6D pose, its position within a SD map, and also the 3D geometry of traffic lines. This is achieved using a Bayesian simultaneous localization and multi-object tracking filter, where the estimation of traffic lines is formulated as a multiple extended object tracking problem, solved using a trajectory Poisson multi-Bernoulli mixture (TPMBM) filter. In TPMBM filtering, traffic lines are modeled using B-spline trajectories, and each trajectory is parameterized by a sequence of control points. The proposed solution has been evaluated using experimental data collected by a test vehicle driving on highway. Preliminary results show that the traffic line estimates, overlaid on the satellite image, generally align with the lane markings up to some lateral offsets.

Overview

The paper presents a cost-effective way to create lane-level roadmaps for autonomous driving using only a global navigation satellite system (GNSS) receiver and a camera on customer vehicles.
The proposed approach uses a prior standard-definition (SD) map, GNSS measurements, visual odometry, and lane marking edge detection points to simultaneously estimate the vehicle's 6D pose, its position within the SD map, and the 3D geometry of traffic lines.
The estimation of traffic lines is formulated as a multiple extended object tracking problem, which is solved using a trajectory Poisson multi-Bernoulli mixture (TPMBM) filter.

Plain English Explanation

Self-driving cars require highly detailed maps to navigate safely, but creating these high-definition (HD) maps is a resource-intensive process. This paper introduces a more cost-effective solution that uses data from existing customer vehicles to build these maps.

The key idea is to start with a basic, standard-definition (SD) map and then use the vehicle's GPS, camera, and other sensors to refine the map and add lane-level details. The system uses a combination of GPS measurements, visual odometry (estimating the vehicle's motion from camera images), and lane markings detected in the camera footage to simultaneously:

Determine the vehicle's 6D pose (its 3D position and 3D orientation)
Locate the vehicle's position on the existing SD map
Map the 3D geometry of the road and lane markings

This is done using a statistical technique called the "trajectory Poisson multi-Bernoulli mixture (TPMBM) filter." The TPMBM filter models the lane markings as B-spline curves, which can accurately capture their shape.

Technical Explanation

The proposed solution uses a Bayesian simultaneous localization and multi-object tracking filter to estimate the vehicle's 6D pose, its position within an SD map, and the 3D geometry of traffic lines. The estimation of traffic lines is formulated as a multiple extended object tracking problem, which is solved using a TPMBM filter.

In the TPMBM filtering approach, traffic lines are modeled using B-spline trajectories, where each trajectory is parameterized by a sequence of control points. This allows the system to accurately capture the shape of the lane markings.
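As a rough illustration of what "parameterized by a sequence of control points" means, the sketch below evaluates a uniform cubic B-spline in 3D. The cubic degree, the uniform knot spacing, and the names `cubic_basis`, `eval_segment`, and `sample_line` are assumptions made for this example; the summary does not state which spline order or knot layout the authors use.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def cubic_basis(t: float) -> Tuple[float, float, float, float]:
    """Uniform cubic B-spline weights for a local parameter t in [0, 1]."""
    return (
        (1 - t) ** 3 / 6,
        (3 * t**3 - 6 * t**2 + 4) / 6,
        (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6,
        t**3 / 6,
    )


def eval_segment(ctrl: Sequence[Point3], seg: int, t: float) -> Point3:
    """Evaluate the spline segment that uses control points seg..seg+3."""
    pts = ctrl[seg:seg + 4]
    if seg < 0 or len(pts) != 4:
        raise ValueError("a cubic segment needs four consecutive control points")
    w = cubic_basis(t)
    # Each coordinate is a weighted sum of the four control points.
    return tuple(sum(wi * p[k] for wi, p in zip(w, pts)) for k in range(3))


def sample_line(ctrl: Sequence[Point3], samples_per_segment: int = 5) -> List[Point3]:
    """Sample a whole traffic line (hypothetical helper) from its control points."""
    if len(ctrl) < 4:
        raise ValueError("need at least four control points")
    n_segments = len(ctrl) - 3
    out = []
    for seg in range(n_segments):
        for i in range(samples_per_segment):
            out.append(eval_segment(ctrl, seg, i / samples_per_segment))
    # Close the curve at the end of the last segment.
    out.append(eval_segment(ctrl, n_segments - 1, 1.0))
    return out
```

One property worth noting in this representation is locality: moving a single control point changes only the few segments that use it, so a line can be extended or corrected as new detections arrive without disturbing the rest of the curve.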

The input data to the system includes:

A prior SD map
GNSS measurements
Visual odometry data
Lane marking edge detection points

The system processes this data to simultaneously:

Estimate the vehicle's 6D pose (position and orientation)
Locate the vehicle's position within the SD map
Estimate the 3D geometry of the traffic lines

This is achieved using a Bayesian filtering framework that jointly estimates these three quantities as new measurements arrive. The TPMBM filter is used to track the traffic lines as extended objects, modeling them as B-spline trajectories.
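To make the idea of recursive Bayesian filtering concrete, here is a deliberately simplified sketch: a scalar Kalman filter that predicts a position along the road from an odometry increment and then corrects it with a GNSS-like measurement. This is not the TPMBM filter and not the authors' implementation; it only shows the predict/update cycle that Bayesian filters share. The function names and all numeric values are hypothetical.

```python
from typing import Tuple

Gaussian = Tuple[float, float]  # (mean, variance)


def predict(belief: Gaussian, motion: float, motion_var: float) -> Gaussian:
    """Prediction step: shift the belief by the odometry increment and
    inflate its variance by the odometry noise."""
    mean, var = belief
    return mean + motion, var + motion_var


def update(belief: Gaussian, z: float, z_var: float) -> Gaussian:
    """Update step: blend the predicted belief with a noisy position
    measurement, weighting each by its certainty."""
    mean, var = belief
    gain = var / (var + z_var)  # Kalman gain, between 0 and 1
    return mean + gain * (z - mean), (1 - gain) * var


def filter_step(belief: Gaussian, motion: float, motion_var: float,
                z: float, z_var: float) -> Gaussian:
    """One full predict-then-update cycle."""
    return update(predict(belief, motion, motion_var), z, z_var)
```

For example, starting from `(0.0, 1.0)`, an odometry increment of 2.0 with variance 0.5 gives a predicted belief of `(2.0, 1.5)`; a measurement of 2.6 with variance 1.5 then pulls the mean to about 2.3 and halves the variance to 0.75. The filter described in the paper applies the same principle to a much richer state that includes the 6D pose, the map position, and a set of traffic lines.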

The proposed solution has been evaluated using experimental data collected by a test vehicle driving on a highway. Preliminary results show that the estimated traffic lines, when overlaid on a satellite image, generally align with the actual lane markings, up to some lateral offsets.
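One possible way to put a number on such lateral offsets, assuming a reference polyline for each lane marking were available, is to measure the distance from each estimated point to the nearest reference segment. The sketch below does this in a flat 2D frame; the metric, the function names, and the sample coordinates in the usage note are assumptions for illustration and do not come from the paper.

```python
import math
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]


def point_to_segment(p: Point2, a: Point2, b: Point2) -> float:
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def lateral_offsets(estimate: Sequence[Point2],
                    reference: Sequence[Point2]) -> List[float]:
    """Distance from each estimated point to the reference polyline."""
    if len(reference) < 2:
        raise ValueError("reference polyline needs at least two points")
    segments = list(zip(reference, reference[1:]))
    return [min(point_to_segment(p, a, b) for a, b in segments) for p in estimate]


def rms(values: Sequence[float]) -> float:
    """Root-mean-square of a non-empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

With a straight reference from `(0, 0)` to `(10, 0)` and estimated points `(1, 0.3)` and `(5, -0.4)`, `lateral_offsets` returns offsets of roughly 0.3 and 0.4, and `rms` summarizes them as a single figure of about 0.35.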

Critical Analysis

The paper presents a promising approach to creating HD maps for autonomous driving in a cost-effective manner. By leveraging data from customer vehicles, the system has the potential to scale more easily than methods that rely on dedicated mapping vehicles.

However, the paper does not provide a detailed quantitative evaluation of the accuracy of the traffic line estimates. The authors only mention that the estimates "generally align" with the actual lane markings, but more rigorous validation would be needed to assess the robustness and reliability of the system.

Additionally, the paper does not address the potential privacy concerns that may arise from using customer vehicle data to build these maps. There would need to be careful consideration of data privacy and security measures to ensure user trust and acceptance of such a system.

Further research could also explore ways to localize the vehicle more accurately within the SD map and to improve the detection and tracking of lane markings, both of which could reduce the lateral offsets observed in the experimental results.

Conclusion

This paper presents a novel, cost-effective approach to creating high-definition maps for autonomous driving using data from customer vehicles. By leveraging GNSS, cameras, and a prior standard-definition map, the system can simultaneously localize the vehicle, track its position on the map, and estimate the 3D geometry of traffic lines.

The use of a TPMBM filter with lane markings modeled as B-spline trajectories is a promising technical approach for capturing the shape of traffic lines. While more validation is needed, this research represents an important step towards making HD mapping more accessible and scalable for self-driving car development.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Multi-Object Tracking with Camera-LiDAR Fusion for Autonomous Driving

Riccardo Pieroni, Simone Specchia, Matteo Corno, Sergio Matteo Savaresi

This paper presents a novel multi-modal Multi-Object Tracking (MOT) algorithm for self-driving cars that combines camera and LiDAR data. Camera frames are processed with a state-of-the-art 3D object detector, whereas classical clustering techniques are used to process LiDAR observations. The proposed MOT algorithm comprises a three-step association process, an Extended Kalman filter for estimating the motion of each detected dynamic obstacle, and a track management phase. The EKF motion model requires the current measured relative position and orientation of the observed object and the longitudinal and angular velocities of the ego vehicle as inputs. Unlike most state-of-the-art multi-modal MOT approaches, the proposed algorithm does not rely on maps or knowledge of the ego global pose. Moreover, it uses a 3D detector exclusively for cameras and is agnostic to the type of LiDAR sensor used. The algorithm is validated both in simulation and with real-world data, with satisfactory results.

5/14/2024

cs.RO cs.CV

Monocular Localization with Semantics Map for Autonomous Vehicles

Jixiang Wan, Xudong Zhang, Shuzhou Dong, Yuwei Zhang, Yuchen Yang, Ruoxi Wu, Ye Jiang, Jijunnan Li, Jinquan Lin, Ming Yang

Accurate and robust localization remains a significant challenge for autonomous vehicles. The cost of sensors and limitations in local computational efficiency make it difficult to scale to large commercial applications. Traditional vision-based approaches focus on texture features that are susceptible to changes in lighting, season, perspective, and appearance. Additionally, the large storage size of maps with descriptors and complex optimization processes hinder system performance. To balance efficiency and accuracy, we propose a novel lightweight visual semantic localization algorithm that employs stable semantic features instead of low-level texture features. First, semantic maps are constructed offline by detecting semantic objects, such as ground markers, lane lines, and poles, using cameras or LiDAR sensors. Then, online visual localization is performed through data association of semantic features and map objects. We evaluated our proposed localization framework in the publicly available KAIST Urban dataset and in scenarios recorded by ourselves. The experimental results demonstrate that our method is a reliable and practical localization solution in various autonomous driving localization tasks.

6/7/2024

cs.CV cs.RO

Automated Lane Change Behavior Prediction and Environmental Perception Based on SLAM Technology

Han Lei, Baoming Wang, Zuwei Shui, Peiyuan Yang, Penghao Liang

In addition to environmental perception sensors such as cameras, radars, etc. in the automatic driving system, the external environment of the vehicle is perceived, in fact, there is also a perception sensor that has been silently dedicated in the system, that is, the positioning module. This paper explores the application of SLAM (Simultaneous Localization and Mapping) technology in the context of automatic lane change behavior prediction and environment perception for autonomous vehicles. It discusses the limitations of traditional positioning methods, introduces SLAM technology, and compares LIDAR SLAM with visual SLAM. Real-world examples from companies like Tesla, Waymo, and Mobileye showcase the integration of AI-driven technologies, sensor fusion, and SLAM in autonomous driving systems. The paper then delves into the specifics of SLAM algorithms, sensor technologies, and the importance of automatic lane changes in driving safety and efficiency. It highlights Tesla's recent update to its Autopilot system, which incorporates automatic lane change functionality using SLAM technology. The paper concludes by emphasizing the crucial role of SLAM in enabling accurate environment perception, positioning, and decision-making for autonomous vehicles, ultimately enhancing safety and driving experience.

4/9/2024

cs.RO cs.AI cs.CV

MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report

Zhongyu Yang, Mai Liu, Jinluo Xie, Yueming Zhang, Chen Shen, Wei Shao, Jichao Jiao, Tengfei Xing, Runbo Hu, Pengfei Xu

Autonomous driving without high-definition (HD) maps demands a higher level of active scene understanding. In this competition, the organizers provided the multi-perspective camera images and standard-definition (SD) maps to explore the boundaries of scene reasoning capabilities. We found that most existing algorithms construct Bird's Eye View (BEV) features from these multi-perspective images and use multi-task heads to delineate road centerlines, boundary lines, pedestrian crossings, and other areas. However, these algorithms perform poorly at the far end of roads and struggle when the primary subject in the image is occluded. Therefore, in this competition, we not only used multi-perspective images as input but also incorporated SD maps to address this issue. We employed map encoder pre-training to enhance the network's geometric encoding capabilities and utilized YOLOX to improve traffic element detection precision. Additionally, for area detection, we innovatively introduced LDTR and auxiliary tasks to achieve higher precision. As a result, our final OLUS score is 0.58.

6/17/2024

cs.CV