RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud

2405.11536

Published 6/21/2024 by Mohamed Nagy, Naoufel Werghi, Bilal Hassan, Jorge Dias, Majid Khonji

Abstract

This work addresses limitations in recent 3D tracking-by-detection methods, focusing on identifying legitimate trajectories and addressing state estimation drift in Kalman filters. Current methods rely heavily on threshold-based filtering of false positive detections using detection scores to prevent ghost trajectories. However, this approach is inadequate for distant and partially occluded objects, where detection scores tend to drop, potentially leading to false positives exceeding the threshold. Additionally, the literature generally treats detections as precise localizations of objects. Our research reveals that noise in detections impacts localization information, causing trajectory drift for occluded objects and hindering recovery. To this end, we propose a novel online track validity mechanism that temporally distinguishes between legitimate and ghost tracks, along with a multi-stage observational gating process for incoming observations. This mechanism significantly improves tracking performance, with a 6.28% increase in HOTA and a 17.87% increase in MOTA. We also introduce a refinement to the Kalman filter that enhances noise mitigation in trajectory drift, leading to more robust state estimation for occluded objects. Our framework, RobMOT, outperforms state-of-the-art methods, including deep learning approaches, across various detectors, achieving up to a 4% margin in HOTA and 6% in MOTA. RobMOT excels under challenging conditions, such as prolonged occlusions and tracking distant objects, with up to a 59% improvement in processing latency.
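
For readers unfamiliar with the two metrics quoted in the abstract, their commonly used definitions are sketched below. These come from the general multi-object tracking evaluation literature rather than from this page. FN, FP and IDSW are the per-frame counts of missed objects, false positives and identity switches, GT is the number of ground-truth objects, and DetA and AssA are the detection and association accuracies at a localization threshold α (the final HOTA score averages over a range of α values).

```latex
\mathrm{MOTA} = 1 - \frac{\sum_t \left( \mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t \right)}{\sum_t \mathrm{GT}_t},
\qquad
\mathrm{HOTA}_\alpha = \sqrt{\mathrm{DetA}_\alpha \cdot \mathrm{AssA}_\alpha}
```

Because false positives enter MOTA directly, suppressing ghost tracks would be expected to move MOTA more than HOTA, which is consistent with the larger MOTA gain reported in the abstract.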

Overview

• This paper presents RobMOT, a robust 3D multi-object tracking system that mitigates the effects of observational noise and state estimation drift in LiDAR point cloud data.

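• Following the tracking-by-detection paradigm, RobMOT adds an online track validity mechanism that temporally distinguishes legitimate tracks from ghost tracks, together with multi-stage gating of incoming observations, to keep tracking accurate in challenging environments.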

• According to the abstract, RobMOT outperforms state-of-the-art methods, including deep learning approaches, across various detectors, by up to 4% in HOTA and 6% in MOTA.
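
The abstract contrasts per-detection score thresholds with a validity check that looks at a track over time. The sketch below illustrates that general idea only; it is not the paper's mechanism. The class name `TrackValidity`, the helper `validity_after`, the window length and both thresholds are hypothetical choices made for this example.

```python
from collections import deque


class TrackValidity:
    """Hypothetical temporal validity check (an illustration, not RobMOT's rule)."""

    def __init__(self, window=5, min_mean_score=0.5, min_hits=3):
        # Recent per-frame detection scores; None marks a frame with no match.
        self.scores = deque(maxlen=window)
        self.min_mean_score = min_mean_score
        self.min_hits = min_hits

    def observe(self, score):
        """Record one frame: a detection score in [0, 1], or None if unmatched."""
        self.scores.append(score)

    def is_legitimate(self):
        """A track is kept only if it was matched often enough, with a decent mean score."""
        hits = [s for s in self.scores if s is not None]
        if len(hits) < self.min_hits:
            return False
        return sum(hits) / len(hits) >= self.min_mean_score


def validity_after(scores):
    """Feed a sequence of per-frame scores and report the final verdict."""
    track = TrackValidity()
    for s in scores:
        track.observe(s)
    return track.is_legitimate()
```

Under these assumed settings, a single confident false positive followed by misses (`[0.9, None, None, None, None]`) is rejected, while a distant object with modest but consistent scores (`[0.6, 0.55, None, 0.7, 0.65]`) is kept, even though a fixed per-detection threshold of 0.7 would have discarded most of its detections.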

Plain English Explanation

• RobMOT is a system that can track multiple objects (like cars, pedestrians, etc.) in 3D space using data from LiDAR sensors.

• LiDAR data can be noisy and prone to errors that cause the tracked object positions to drift over time. RobMOT has special techniques to reduce the impact of these problems.

• RobMOT works by detecting objects in the LiDAR data and then associating those detections with the correct tracked objects over time. This helps it maintain accurate tracking even when the data is noisy or drifting.

• Experiments show that RobMOT outperforms other leading multi-object tracking methods on standard benchmark datasets. This makes it a valuable tool for applications like autonomous vehicles that rely on robust 3D object tracking.
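
To make "drift" concrete, here is a minimal one-dimensional constant-velocity Kalman filter in plain Python. It is a textbook filter, not the refined filter the paper proposes. The class name `Kalman1D`, the helper function, and the noise values `q` and `r` are assumptions chosen for illustration.

```python
class Kalman1D:
    """Textbook constant-velocity Kalman filter over the state [position, velocity]."""

    def __init__(self, pos, vel=0.0, q=0.01, r=0.25):
        self.x = [pos, vel]                # state estimate
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # process / measurement noise (assumed values)

    def predict(self, dt=1.0):
        """Propagate the state one step: P <- F P F^T + Q with F = [[1, dt], [0, 1]]."""
        p, v = self.x
        (p00, p01), (p10, p11) = self.P
        self.x = [p + v * dt, v]
        self.P = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r               # innovation variance
        k0, k1 = p00 / s, p10 / s      # Kalman gain
        y = z - self.x[0]              # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


def position_variance_through_occlusion(measurements, occluded_frames):
    """Track with measurements, then predict blindly; return variance before and after."""
    kf = Kalman1D(pos=measurements[0])
    for z in measurements[1:]:
        kf.predict()
        kf.update(z)
    before = kf.P[0][0]
    for _ in range(occluded_frames):
        kf.predict()  # no update: the object is occluded
    return before, kf.P[0][0]
```

During an occlusion only `predict` runs, so the position variance keeps growing and any error in the last velocity estimate is extrapolated frame after frame. A noisy detection just before the occlusion therefore has an outsized effect, which is the kind of drift the summary refers to.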

Technical Explanation

• RobMOT uses a two-stage detection and association approach to track multiple objects in 3D LiDAR point clouds.

• The detection stage employs a neural network model to identify objects of interest in each LiDAR frame. This handles the challenge of finding the objects amid the noisy sensor data.

• The association stage then matches the detected objects across frames to maintain consistent identities for each tracked object.

• According to the abstract, RobMOT's main contributions are an online track validity mechanism, a multi-stage observational gating process for incoming observations, and a refinement to the Kalman filter that mitigates noise-induced trajectory drift for occluded objects.

• Extensive experiments on the KITTI, MOTChallenge, and nuScenes benchmarks demonstrate that RobMOT outperforms state-of-the-art 3D multi-object tracking approaches.
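
The association step described above can be sketched as follows. This is a deliberately simple stand-in: a greedy nearest-neighbour matcher with a single distance gate, not the multi-stage gating or the association method used in the paper. The function name `associate`, the data layout and the 2-metre gate are assumptions made for this example.

```python
import math


def associate(tracks, detections, gate=2.0):
    """Greedily match predicted track centres to detections within `gate` metres.

    tracks: dict mapping track_id -> (x, y, z) predicted centre
    detections: list of (x, y, z) detected centres
    Returns (matches, unmatched_track_ids, unmatched_detection_indices),
    where matches maps track_id -> detection index.
    """
    # Collect every track/detection pair that passes the distance gate.
    pairs = []
    for tid, tpos in tracks.items():
        for j, dpos in enumerate(detections):
            d = math.dist(tpos, dpos)
            if d <= gate:
                pairs.append((d, tid, j))

    # Accept the closest pairs first; each track and detection is used at most once.
    pairs.sort()
    matches, used_tracks, used_dets = {}, set(), set()
    for _, tid, j in pairs:
        if tid in used_tracks or j in used_dets:
            continue
        matches[tid] = j
        used_tracks.add(tid)
        used_dets.add(j)

    unmatched_tracks = [t for t in tracks if t not in used_tracks]
    unmatched_dets = [j for j in range(len(detections)) if j not in used_dets]
    return matches, unmatched_tracks, unmatched_dets
```

For example, with tracks `{1: (0, 0, 0), 2: (10, 0, 0)}` and detections `[(10.3, 0, 0), (0.2, 0.1, 0), (50, 50, 0)]`, track 1 is matched to detection 1, track 2 to detection 0, and detection 2 is left unmatched because it lies outside every gate. Unmatched detections are the candidates that a validity mechanism would then have to classify as new objects or false positives.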

Critical Analysis

• While RobMOT shows impressive performance, the paper acknowledges that the system's effectiveness may degrade in extremely cluttered or occluded environments where object detections become very challenging.

• Additionally, the association method relies on a set of heuristic parameters that may need to be tuned for different application domains. More principled approaches to parameter selection could improve the system's generalizability.

• The paper also does not provide a deep analysis of RobMOT's failure modes or edge cases. Further investigation into the system's weaknesses could inform future improvements.

• Nonetheless, RobMOT represents a significant advancement in the field of 3D multi-object tracking, with the potential to enable more robust and reliable perception for autonomous systems.

Conclusion

• RobMOT is a novel 3D multi-object tracking system that effectively mitigates the challenges of observational noise and state estimation drift in LiDAR data.

• By combining advanced object detection and robust association techniques, RobMOT demonstrates state-of-the-art performance on several benchmark datasets, making it a valuable tool for applications like autonomous driving and robotics.

• While the system has some limitations, the innovations presented in this work represent an important step forward in the field of 3D multi-object tracking, with the potential to enable more reliable and robust perception in complex real-world environments.

Related Papers

Multi-Object Tracking with Camera-LiDAR Fusion for Autonomous Driving

Riccardo Pieroni, Simone Specchia, Matteo Corno, Sergio Matteo Savaresi

This paper presents a novel multi-modal Multi-Object Tracking (MOT) algorithm for self-driving cars that combines camera and LiDAR data. Camera frames are processed with a state-of-the-art 3D object detector, whereas classical clustering techniques are used to process LiDAR observations. The proposed MOT algorithm comprises a three-step association process, an Extended Kalman filter for estimating the motion of each detected dynamic obstacle, and a track management phase. The EKF motion model requires the current measured relative position and orientation of the observed object and the longitudinal and angular velocities of the ego vehicle as inputs. Unlike most state-of-the-art multi-modal MOT approaches, the proposed algorithm does not rely on maps or knowledge of the ego global pose. Moreover, it uses a 3D detector exclusively for cameras and is agnostic to the type of LiDAR sensor used. The algorithm is validated both in simulation and with real-world data, with satisfactory results.

5/14/2024

cs.RO cs.CV

BiTrack: Bidirectional Offline 3D Multi-Object Tracking Using Camera-LiDAR Data

Kemiao Huang, Meiying Zhang, Qi Hao

Compared with real-time multi-object tracking (MOT), offline multi-object tracking (OMOT) has the advantage of being able to perform 2D-3D detection fusion, erroneous link correction, and full track optimization, but has to deal with the challenges of bounding box misalignment and track evaluation, editing, and refinement. This paper proposes BiTrack, a 3D OMOT framework that includes modules of 2D-3D detection fusion, initial trajectory generation, and bidirectional trajectory re-optimization to achieve optimal tracking results from camera-LiDAR data. The novelty of this paper is threefold: (1) development of a point-level object registration technique that employs a density-based similarity metric to achieve accurate fusion of 2D-3D detection results; (2) development of a set of data association and track management skills that utilize a vertex-based similarity metric as well as false alarm rejection and track recovery mechanisms to generate reliable bidirectional object trajectories; (3) development of a trajectory re-optimization scheme that re-organizes track fragments of different fidelities in a greedy fashion, as well as refines each trajectory with completion and smoothing techniques. The experiment results on the KITTI dataset demonstrate that BiTrack achieves state-of-the-art performance for 3D OMOT tasks in terms of accuracy and efficiency.

6/27/2024

cs.CV cs.AI

Track Initialization and Re-Identification for 3D Multi-View Multi-Object Tracking

Linh Van Ma, Tran Thien Dat Nguyen, Ba-Ngu Vo, Hyunsung Jang, Moongu Jeon

We propose a 3D multi-object tracking (MOT) solution using only 2D detections from monocular cameras, which automatically initiates/terminates tracks as well as resolves track appearance-reappearance and occlusions. Moreover, this approach does not require detector retraining when cameras are reconfigured; only the camera matrices of the reconfigured cameras need to be updated. Our approach is based on a Bayesian multi-object formulation that integrates track initiation/termination, re-identification, occlusion handling, and data association into a single Bayes filtering recursion. However, the exact filter that utilizes all these functionalities is numerically intractable due to the exponentially growing number of terms in the (multi-object) filtering density, while existing approximations trade off some of these functionalities for speed. To this end, we develop a more efficient approximation suitable for online MOT by incorporating object features and kinematics into the measurement model, which improves data association and subsequently reduces the number of terms. Specifically, we exploit the 2D detections and extracted features from multiple cameras to provide a better approximation of the multi-object filtering density to realize the track initiation/termination and re-identification functionalities. Further, incorporating a tractable geometric occlusion model based on 2D projections of 3D objects on the camera planes realizes the occlusion handling functionality of the filter. Evaluation of the proposed solution on challenging datasets demonstrates significant improvements and robustness when camera configurations change on-the-fly, compared to existing multi-view MOT solutions. The source code is publicly available at https://github.com/linh-gist/mv-glmb-ab.

5/30/2024

cs.CV cs.IT

LEGO: Learning and Graph-Optimized Modular Tracker for Online Multi-Object Tracking with Point Clouds

Zhenrong Zhang, Jianan Liu, Yuxuan Xia, Tao Huang, Qing-Long Han, Hongbin Liu

Online multi-object tracking (MOT) plays a pivotal role in autonomous systems. The state-of-the-art approaches usually employ a tracking-by-detection method, and data association plays a critical role. This paper proposes a learning and graph-optimized (LEGO) modular tracker to improve data association performance in the existing literature. The proposed LEGO tracker integrates graph optimization and self-attention mechanisms, which efficiently formulate the association score map, facilitating the accurate and efficient matching of objects across time frames. To further enhance the state update process, the Kalman filter is added to ensure consistent tracking by incorporating temporal coherence in the object states. Our proposed method utilizing LiDAR alone has shown exceptional performance compared to other online tracking approaches, including LiDAR-based and LiDAR-camera fusion-based methods. LEGO ranked 1st at the time of submitting results to KITTI object tracking evaluation ranking board and remains 2nd at the time of submitting this paper, among all online trackers in the KITTI MOT benchmark for cars.

5/28/2024

cs.CV cs.AI