CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks

Read original: arXiv:2404.03191 - Published 5/7/2024 by Beibei Wang, Shuang Meng, Lu Zhang, Chenjie Wang, Jingjing Huang, Yao Li, Haojie Ren, Yuxuan Xiao, Yuru Peng, Jianmin Ji and 2 others

Overview

This paper introduces CORP, a new multi-modal dataset for campus-oriented roadside perception tasks.
The dataset contains camera images and LiDAR point clouds captured by 18 cameras and 9 LiDAR sensors mounted on roadside utility poles around a university campus.
The goal is to enable research on perception challenges specific to campus environments, such as detecting, tracking, and segmenting pedestrians, vehicles, and other objects of interest.

Plain English Explanation

The researchers have created a new dataset called CORP using sensors installed along the roads of a university campus rather than on a moving vehicle. Cameras and LiDAR units mounted on roadside utility poles record images and 3D point clouds from many fixed viewpoints.

The key idea is to provide a resource for researchers working on perception challenges in campus environments. Unlike highways or city streets, campus roads often have unique features like pedestrian crossings, mobility scooters, and temporary structures that can be difficult for autonomous systems to detect and understand. By having this diverse dataset, researchers can develop and test new computer vision and sensor fusion techniques tailored to these types of scenarios.

The dataset covers a wide range of objects and situations you might encounter on a university campus, from students walking between buildings to traffic signs and construction zones. This can help advance the state-of-the-art in areas like pedestrian detection, instance segmentation, object tracking, and situation awareness for self-driving vehicles, robots, and roadside infrastructure operating in these environments.

Technical Explanation

The CORP dataset was collected on a university campus using 18 cameras and 9 LiDAR sensors. The sensors have different configurations and are mounted on roadside utility poles, which provides diverse viewpoints across the campus region.

The dataset includes:

Over 205k camera images from 18 cameras
Over 102k point clouds from 9 LiDAR sensors
2D and 3D bounding box annotations
Unique IDs and pixel masks that support 3D tracking across the campus and instance segmentation
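
To make the annotation types more concrete, here is a minimal sketch of how one annotated frame could be represented in code. It follows the abstract's description (2D and 3D boxes, unique IDs for tracking, pixel masks), but every class name, field name, and sensor ID below is a hypothetical choice for illustration and not the dataset's actual file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Box3D:
    # Hypothetical 3D box layout: center (x, y, z), size (l, w, h), heading angle.
    center: Tuple[float, float, float]
    size: Tuple[float, float, float]
    yaw: float


@dataclass
class ObjectAnnotation:
    track_id: int                  # unique ID, stable across frames and sensors
    category: str                  # e.g. "pedestrian" (label names are assumed)
    box_2d: Optional[Tuple[float, float, float, float]] = None  # x1, y1, x2, y2
    box_3d: Optional[Box3D] = None
    mask: Optional[List[List[int]]] = None  # binary pixel mask, assumed encoding


@dataclass
class Frame:
    sensor_id: str                 # hypothetical name such as "cam_03"
    timestamp: float
    objects: List[ObjectAnnotation] = field(default_factory=list)


def sensors_per_track(frames: List[Frame]) -> Dict[int, Set[str]]:
    """Map each track ID to the set of sensors that observed it."""
    seen: Dict[int, Set[str]] = {}
    for frame in frames:
        for obj in frame.objects:
            seen.setdefault(obj.track_id, set()).add(frame.sensor_id)
    return seen


def tracks_seen_by_multiple_sensors(frames: List[Frame]) -> List[int]:
    """Return sorted track IDs observed by more than one sensor."""
    return sorted(t for t, s in sensors_per_track(frames).items() if len(s) > 1)
```

Keeping one ID per object across sensors is what would let a tracker follow a pedestrian from one pole's field of view into the next, which is the kind of campus-wide tracking the annotations are meant to support.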

The researchers designed the data capture and annotation process to support a range of perception tasks relevant to campus environments, including detection, 3D tracking, and instance segmentation. They also present CORP as a public benchmark for evaluating model performance on these tasks.
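
This summary does not say which metrics are used to evaluate model performance. As a general illustration only, detection benchmarks commonly score a predicted box against a ground-truth box with intersection over union (IoU). The sketch below assumes axis-aligned 2D boxes given as (x1, y1, x2, y2) and is not taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 <= x2 and y1 <= y2


def iou_2d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 2D boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap region, clamped to zero if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A prediction is then typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; the threshold is a benchmark design choice and is not specified here.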

Critical Analysis

The CORP dataset appears to be a valuable contribution to the field of autonomous navigation and robotic perception. By focusing on the unique challenges of campus environments, it addresses an important gap in existing roadside datasets which have tended to emphasize more generic urban or highway scenarios.

That said, the dataset is still limited in scope, covering only a single university campus over a relatively short time period. The scenes and objects encountered may not fully generalize to all campus settings, and the lack of temporal diversity (e.g. different weather, times of day, etc.) could constrain the applicability of models trained on CORP.

Additionally, the paper does not provide a detailed analysis of dataset biases or limitations. It would be helpful to understand things like the demographic representation in the pedestrian annotations, the prevalence of certain object classes, or potential blind spots in the sensor coverage.

Overall, CORP represents an important step forward, but continued expansion and rigorous evaluation will be needed to fully leverage its potential for advancing campus-oriented perception research. Researchers should also be mindful of the dataset's boundaries when applying models to real-world deployments.

Conclusion

The CORP dataset provides a valuable new resource for researchers working on perception challenges in campus environments. By capturing diverse sensor data from a university setting, it enables the development of computer vision and sensor fusion techniques tailored to the unique objects and scenarios found on college campuses.

While the current dataset has some limitations in scope and diversity, it lays an important foundation for advancing the state-of-the-art in areas like pedestrian detection, instance segmentation, and situation awareness for autonomous systems operating in these complex, people-centric spaces. Continued expansion and evaluation of CORP can help drive progress towards safer and more intelligent mobility solutions for campus communities.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks

Beibei Wang, Shuang Meng, Lu Zhang, Chenjie Wang, Jingjing Huang, Yao Li, Haojie Ren, Yuxuan Xiao, Yuru Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang

Numerous roadside perception datasets have been introduced to propel advancements in autonomous driving and intelligent transportation systems research and development. However, it has been observed that most of them concentrate on urban arterial roads, inadvertently overlooking residential areas such as parks and campuses that exhibit entirely distinct characteristics. In light of this gap, we propose CORP, which stands as the first public benchmark dataset tailored for multi-modal roadside perception tasks under campus scenarios. Collected on a university campus, CORP consists of over 205k images plus 102k point clouds captured from 18 cameras and 9 LiDAR sensors. These sensors with different configurations are mounted on roadside utility poles to provide diverse viewpoints within the campus region. The annotations of CORP encompass multi-dimensional information beyond 2D and 3D bounding boxes, providing extra support for 3D seamless tracking and instance segmentation with unique IDs and pixel masks for identifying targets, to enhance the understanding of objects and their behaviors distributed across the campus premises. Unlike other roadside datasets about urban traffic, CORP extends the spectrum to highlight the challenges for multi-modal perception in campuses and other residential areas.

5/7/2024

CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments

Yang Zhou, Long Quang, Carlos Nieto-Granda, Giuseppe Loianno

In the past decade, although single-robot perception has made significant advancements, multi-robot collaborative perception remains largely unexplored. This involves fusing compressed, intermittent, limited, heterogeneous, and asynchronous environmental information across multiple robots to enhance overall perception, despite challenges like sensor noise, occlusions, and sensor failures. One major hurdle has been the lack of real-world datasets. This paper presents a pioneering and comprehensive real-world multi-robot collaborative perception dataset to boost research in this area. Our dataset leverages the untapped potential of air-ground robot collaboration featuring distinct spatial viewpoints, complementary robot mobilities, coverage ranges, and sensor modalities. It features raw sensor inputs, pose estimation, and optional high-level perception annotation, thus accommodating diverse research interests. Compared to existing datasets predominantly designed for Simultaneous Localization and Mapping (SLAM), our setup ensures a diverse range and adequate overlap of sensor views to facilitate the study of multi-robot collaborative perception algorithms. We demonstrate the value of this dataset qualitatively through multiple collaborative perception tasks. We believe this work will unlock the potential research of high-level scene understanding through multi-modal collaborative perception in multi-robot settings.

5/24/2024

MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection

Till Beemelmanns, Quan Zhang, Christian Geller, Lutz Eckstein

Multi-modal 3D object detection models for automated driving have demonstrated exceptional performance on computer vision benchmarks like nuScenes. However, their reliance on densely sampled LiDAR point clouds and meticulously calibrated sensor arrays poses challenges for real-world applications. Issues such as sensor misalignment, miscalibration, and disparate sampling frequencies lead to spatial and temporal misalignment in data from LiDAR and cameras. Additionally, the integrity of LiDAR and camera data is often compromised by adverse environmental conditions such as inclement weather, leading to occlusions and noise interference. To address this challenge, we introduce MultiCorrupt, a comprehensive benchmark designed to evaluate the robustness of multi-modal 3D object detectors against ten distinct types of corruptions. We evaluate five state-of-the-art multi-modal detectors on MultiCorrupt and analyze their performance in terms of their resistance ability. Our results show that existing methods exhibit varying degrees of robustness depending on the type of corruption and their fusion strategy. We provide insights into which multi-modal design choices make such models robust against certain perturbations. The dataset generation code and benchmark are open-sourced at https://github.com/ika-rwth-aachen/MultiCorrupt.

4/23/2024

The OPNV Data Collection: A Dataset for Infrastructure-Supported Perception Research with Focus on Public Transportation

Marcel Vosshans, Alexander Baumann, Matthias Drueppel, Omar Ait-Aider, Ralf Woerner, Youcef Mezouar, Thao Dang, Markus Enzweiler

In this paper, we present our vision and ongoing work for a novel dataset designed to advance research into the interoperability of intelligent vehicles and infrastructure, specifically aimed at enhancing cooperative perception and interaction in the realm of public transportation. Unlike conventional datasets centered on ego-vehicle data, this approach encompasses both a stationary sensor tower and a moving vehicle, each equipped with cameras, LiDARs, and GNSS, while the vehicle additionally includes an inertial navigation system. Our setup features comprehensive calibration and time synchronization, ensuring seamless and accurate sensor data fusion crucial for studying complex, dynamic scenes. Emphasizing public transportation, the dataset aims to include scenes like bus station maneuvers and driving on dedicated bus lanes, reflecting the specifics of small public buses. We introduce the open-source .4mse file format for the new dataset, accompanied by a research kit. This kit provides tools such as ego-motion compensation or LiDAR-to-camera projection enabling advanced research on intelligent vehicle-infrastructure integration. Our approach does not include annotations; however, we plan to implement automatically generated labels sourced from state-of-the-art public repositories. Several aspects are still up for discussion, and timely feedback from the community would be greatly appreciated. A sneak preview of one data frame will be available in a Google Colab notebook. Moreover, we will use the related GitHub repository to collect remarks and suggestions.

7/12/2024