RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving

Read original: arXiv:2408.15503 - Published 9/17/2024 by Haisheng Su, Feixiang Song, Cong Ma, Wei Wu, Junchi Yan

Overview

Large-scale dataset and benchmark for multi-sensor low-speed autonomous driving
Provides diverse real-world driving scenarios and multi-sensor data
Enables research on robust perception and decision-making for autonomous vehicles

Plain English Explanation

The paper presents RoboSense, a large-scale dataset and benchmark for evaluating autonomous driving systems in low-speed scenarios. The dataset captures real-world driving scenes with three main sensor types: cameras, LiDAR, and fisheye cameras. This data lets researchers develop and test perception and prediction algorithms for near-field scene understanding, where blind spots and heavy occlusion make nearby obstacles hard to perceive.

The key aspects of the RoboSense dataset are:

Diverse Driving Scenarios: The dataset covers a wide range of real-world driving situations, including intersections, roundabouts, and parking lots, providing a realistic testbed for autonomous driving systems.
Multi-sensor Data: The dataset includes synchronized data from various sensors, enabling the development of multi-modal perception and fusion algorithms.
Comprehensive Annotations: According to the paper's abstract, the dataset annotates 1.4M 3D bounding boxes with track IDs across the full 360° view, which supports a variety of research tasks.
Benchmark Tasks: The paper formulates 6 benchmark tasks spanning perception and prediction, such as object detection, tracking, and behavior prediction, to assess the performance of autonomous driving algorithms.

By providing this extensive dataset and benchmark, the researchers aim to accelerate the development and deployment of safe and reliable autonomous driving systems that can operate in complex urban environments.

Technical Explanation

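The RoboSense dataset is a large-scale dataset and benchmark for evaluating multi-sensor low-speed autonomous driving. According to the paper's abstract, it contains more than 133K synchronized data samples with 1.4M annotated 3D bounding boxes and track IDs, forming 216K trajectories across 7.6K temporal sequences. The data was collected with a multimodal platform built around three main sensor types (camera, LiDAR, and fisheye) that supports flexible sensor configurations for either a global or a local view around the ego vehicle.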

The dataset covers diverse driving scenarios, such as intersections, roundabouts, and parking lots, which are challenging for autonomous vehicles due to the presence of various obstacles, pedestrians, and complex traffic patterns. The multi-sensor data, including synchronized camera images, fisheye images, and LiDAR point clouds, allows researchers to develop and test advanced perception and sensor fusion algorithms.
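To make the idea of synchronized multi-sensor data concrete, the sketch below pairs each LiDAR sweep with the camera frame closest in time. It is only an illustration under stated assumptions: the function names, the millisecond timestamps, and the 50 ms tolerance are hypothetical and do not describe how RoboSense itself synchronizes its sensors.

```python
from bisect import bisect_left


def nearest_index(sorted_ts, t):
    """Return the index of the timestamp in sorted_ts that is closest to t."""
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    # Compare the two neighbours; a tie goes to the earlier timestamp.
    return i - 1 if t - sorted_ts[i - 1] <= sorted_ts[i] - t else i


def synchronize(lidar_ms, camera_ms, max_gap_ms=50):
    """Pair each LiDAR sweep with the closest camera frame.

    Both inputs are sorted lists of integer timestamps in milliseconds
    (an assumed format). Sweeps with no camera frame within max_gap_ms
    are dropped rather than paired with a stale image.
    """
    pairs = []
    if not camera_ms:
        return pairs
    for lidar_idx, t in enumerate(lidar_ms):
        cam_idx = nearest_index(camera_ms, t)
        if abs(camera_ms[cam_idx] - t) <= max_gap_ms:
            pairs.append((lidar_idx, cam_idx))
    return pairs


# Hypothetical timestamps: the third LiDAR sweep (200 ms) has no camera
# frame within 50 ms, so it is left unpaired.
lidar_ms = [0, 100, 200, 300]
camera_ms = [10, 90, 260, 420]
print(synchronize(lidar_ms, camera_ms))  # [(0, 0), (1, 1), (3, 2)]
```

Dropping unmatched sweeps is one possible design choice; a fusion pipeline could instead interpolate or keep the sample with a missing-modality flag.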

The dataset provides 3D bounding box and track ID annotations across the full 360° view to support a range of research tasks, such as object detection, tracking, and behavior prediction. The researchers also define a novel matching criterion for near-field 3D perception and prediction metrics, and provide benchmarks for 6 tasks to assess the performance of autonomous driving algorithms.
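The paper's abstract highlights annotations of near-field obstacles within 5 m of the vehicle. The sketch below shows one way such a subset could be selected from 3D box annotations. The `Box3D` record, its field names, the class labels, and the use of planar distance from the ego origin are all assumptions made for illustration; they are not the dataset's actual schema or the paper's matching criterion.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical annotation record; not the real RoboSense format."""
    label: str
    x: float  # box centre in the ego frame, metres (forward)
    y: float  # metres (left)
    z: float  # metres (up)
    track_id: int


def planar_distance(box):
    """Distance from the ego origin to the box centre in the ground plane."""
    return math.hypot(box.x, box.y)


def near_field(boxes, radius_m=5.0):
    """Keep only boxes whose centre lies within radius_m of the ego vehicle."""
    return [b for b in boxes if planar_distance(b) <= radius_m]


# Made-up example annotations for a single frame.
boxes = [
    Box3D("pedestrian", 1.5, -2.0, 0.0, track_id=1),  # 2.5 m away
    Box3D("cyclist", 3.0, 3.0, 0.0, track_id=2),      # about 4.2 m away
    Box3D("vehicle", 12.0, 0.5, 0.0, track_id=3),     # about 12 m away
    Box3D("pedestrian", -0.8, 0.6, 0.0, track_id=4),  # about 1 m behind
]
print([b.track_id for b in near_field(boxes)])  # [1, 2, 4]
```

Measuring to the box centre is a simplification; for large objects close to the vehicle, distance to the nearest box edge may be a more meaningful choice.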

The RoboSense dataset aims to serve as a valuable resource for the research community, enabling the development of robust and reliable autonomous driving systems that can operate safely in complex urban environments.

Critical Analysis

The RoboSense dataset is a significant contribution to the field of autonomous driving research, as it provides a comprehensive and diverse set of real-world driving scenarios and multi-sensor data. The dataset's coverage of challenging driving situations, such as intersections and parking lots, is particularly noteworthy, as these scenarios are often underrepresented in existing datasets.

However, one potential limitation of the dataset is the focus on low-speed driving scenarios. While this is an important aspect of autonomous driving, it would be valuable to expand the dataset to include higher-speed driving scenarios, such as highway driving, to enable a more comprehensive evaluation of autonomous driving systems.

Additionally, the dataset could potentially be enriched with more detailed annotations, such as driver intention or behavior, which could further support the development of advanced perception and decision-making algorithms.

Overall, the RoboSense dataset is a valuable resource for the research community, and the insights and algorithms developed using this dataset can contribute to the advancement of safe and reliable autonomous driving systems.

Conclusion

The RoboSense dataset is a large-scale dataset and benchmark that provides a comprehensive testbed for evaluating multi-sensor low-speed autonomous driving. By incorporating diverse real-world driving scenarios and multi-modal sensor data, the dataset enables researchers to develop and test robust perception and decision-making algorithms for autonomous vehicles.

The dataset's focus on challenging urban environments, such as intersections and parking lots, is particularly noteworthy and can help accelerate the deployment of autonomous driving systems in complex, real-world settings. As the field of autonomous driving continues to evolve, resources like the RoboSense dataset will play a crucial role in driving innovation and ensuring the safety and reliability of self-driving vehicles.

Related Papers

RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving

Haisheng Su, Feixiang Song, Cong Ma, Wei Wu, Junchi Yan

Robust object detection and tracking under arbitrary sight of view is challenging yet essential for the development of Autonomous Vehicle technology. With the growing demand of unmanned function vehicles, near-field scene understanding becomes an important research topic in the areas of low-speed autonomous driving. Due to the complexity of driving conditions and diversity of near obstacles such as blind spots and high occlusion, the perception capability of near-field environment is still inferior than its farther counterpart. To further enhance the intelligent ability of unmanned vehicles, in this paper, we construct a multimodal data collection platform based on 3 main types of sensors (Camera, LiDAR and Fisheye), which supports flexible sensor configurations to enable dynamic sight of view for ego vehicle, either global view or local view. Meanwhile, a large-scale multi-sensor dataset is built, named RoboSense, to facilitate near-field scene understanding. RoboSense contains more than 133K synchronized data with 1.4M 3D bounding box and IDs annotated in the full $360^{\circ}$ view, forming 216K trajectories across 7.6K temporal sequences. It has $270\times$ and $18\times$ as many annotations of near-field obstacles within 5 m as the previous single-vehicle datasets such as KITTI and nuScenes. Moreover, we define a novel matching criterion for near-field 3D perception and prediction metrics. Based on RoboSense, we formulate 6 popular tasks to facilitate the future development of related research, where the detailed data analysis as well as benchmarks are also provided accordingly.

9/17/2024

CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments

Yang Zhou, Long Quang, Carlos Nieto-Granda, Giuseppe Loianno

In the past decade, although single-robot perception has made significant advancements, the exploration of multi-robot collaborative perception remains largely unexplored. This involves fusing compressed, intermittent, limited, heterogeneous, and asynchronous environmental information across multiple robots to enhance overall perception, despite challenges like sensor noise, occlusions, and sensor failures. One major hurdle has been the lack of real-world datasets. This paper presents a pioneering and comprehensive real-world multi-robot collaborative perception dataset to boost research in this area. Our dataset leverages the untapped potential of air-ground robot collaboration featuring distinct spatial viewpoints, complementary robot mobilities, coverage ranges, and sensor modalities. It features raw sensor inputs, pose estimation, and optional high-level perception annotation, thus accommodating diverse research interests. Compared to existing datasets predominantly designed for Simultaneous Localization and Mapping (SLAM), our setup ensures a diverse range and adequate overlap of sensor views to facilitate the study of multi-robot collaborative perception algorithms. We demonstrate the value of this dataset qualitatively through multiple collaborative perception tasks. We believe this work will unlock the potential research of high-level scene understanding through multi-modal collaborative perception in multi-robot settings.

5/24/2024

Robustness-Aware 3D Object Detection in Autonomous Driving: A Review and Outlook

Ziying Song, Lin Liu, Feiyang Jia, Yadan Luo, Guoxin Zhang, Lei Yang, Li Wang, Caiyan Jia

In the realm of modern autonomous driving, the perception system is indispensable for accurately assessing the state of the surrounding environment, thereby enabling informed prediction and planning. The key step to this system is related to 3D object detection that utilizes vehicle-mounted sensors such as LiDAR and cameras to identify the size, the category, and the location of nearby objects. Despite the surge in 3D object detection methods aimed at enhancing detection precision and efficiency, there is a gap in the literature that systematically examines their resilience against environmental variations, noise, and weather changes. This study emphasizes the importance of robustness, alongside accuracy and latency, in evaluating perception systems under practical scenarios. Our work presents an extensive survey of camera-only, LiDAR-only, and multi-modal 3D object detection algorithms, thoroughly evaluating their trade-off between accuracy, latency, and robustness, particularly on datasets like KITTI-C and nuScenes-C to ensure fair comparisons. Among these, multi-modal 3D detection approaches exhibit superior robustness, and a novel taxonomy is introduced to reorganize the literature for enhanced clarity. This survey aims to offer a more practical perspective on the current capabilities and the constraints of 3D object detection algorithms in real-world applications, thus steering future research towards robustness-centric advancements.

8/16/2024

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

Xiaosu Zhu, Hualian Sheng, Sijia Cai, Bing Deng, Shaopeng Yang, Qiao Liang, Ken Chen, Lianli Gao, Jingkuan Song, Jieping Ye

We introduce RoScenes, the largest multi-view roadside perception dataset, which aims to shed light on the development of vision-centric Bird's Eye View (BEV) approaches for more challenging traffic scenes. The highlights of RoScenes include significantly large perception area, full scene coverage and crowded traffic. More specifically, our dataset achieves surprising 21.13M 3D annotations within 64,000 $m^2$. To relieve the expensive costs of roadside 3D labeling, we present a novel BEV-to-3D joint annotation pipeline to efficiently collect such a large volume of data. After that, we organize a comprehensive study for current BEV methods on RoScenes in terms of effectiveness and efficiency. Tested methods suffer from the vast perception area and variation of sensor layout across scenes, resulting in performance levels falling below expectations. To this end, we propose RoBEV that incorporates feature-guided position embedding for effective 2D-3D feature assignment. With its help, our method outperforms state-of-the-art by a large margin without extra computational overhead on validation set. Our dataset and devkit will be made available at https://github.com/xiaosu-zhu/RoScenes.

7/8/2024