Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles

Read original: arXiv:2404.18411 - Published 4/30/2024 by Mingi Jeong, Arihant Chadda, Ziang Ren, Luyang Zhao, Haowen Liu, Monika Roznere, Aiwei Zhang, Yitao Jiang, Sabriel Achong, Samuel Lensgraf, Alberto Quattrini Li

Overview

This paper introduces a new dataset for autonomous maritime navigation, focused on detecting and classifying in-water obstacles to improve situational awareness for Autonomous Surface Vehicles (ASVs).
The dataset includes diverse objects encountered under varying environmental conditions, aiming to bridge the research gap in marine robotics by providing a multi-modal, annotated, and ego-centric perception dataset for object detection and classification.
The authors showcase the applicability of their dataset using deep learning-based open-source perception algorithms.
This is a work-in-progress paper, with plans to release the full dataset in a future publication.

Plain English Explanation

The researchers have created a new dataset to help autonomous boats and ships (called Autonomous Surface Vehicles or ASVs) better understand their surroundings. The dataset includes a variety of objects that an ASV might encounter in the water, like debris, buoys, or other obstacles. These objects were recorded under varying environmental conditions, such as different lighting or weather.

This is important because ASVs need to be able to "see" and "recognize" objects in the water to navigate safely. The researchers hope that this dataset will help advance the field of marine robotics by providing a comprehensive set of data for training AI systems to detect and classify different types of obstacles.

The dataset is "multi-modal," which means it combines data from more than one type of sensor rather than relying on a single one. This helps the AI systems get a more complete understanding of the environment. The dataset is also "ego-centric," which means the sensor data is recorded from the perspective of the ASV itself, mimicking how the vehicle would actually "see" the world.

The researchers have also tested their dataset using some existing deep learning-based perception algorithms, showing that it can be useful for developing these types of AI systems. Overall, this new dataset aims to improve the safety and capabilities of autonomous boats and ships by helping them better understand their aquatic environments.

Technical Explanation

The paper introduces a new multi-modal perception dataset for autonomous maritime navigation, focusing on the detection and classification of in-water obstacles. The dataset includes a diverse range of objects encountered under varying environmental conditions, such as changes in lighting, weather, and water clarity.

The dataset is designed to be "ego-centric," meaning the sensor data is recorded from the perspective of the Autonomous Surface Vehicle (ASV) itself. This is intended to mimic the real-world perception challenges faced by these autonomous systems. The multi-modal nature of the dataset, which combines data from more than one sensor type, provides a more comprehensive view of the aquatic environment than any single sensor could.
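To make the terms "multi-modal," "annotated," and "ego-centric" concrete, here is a minimal sketch of how one sample in such a dataset could be represented. The summary does not describe the dataset's actual file layout, so every name below (`Sample`, `Annotation`, the `"camera"` and `"range_sensor"` keys, the file names, and the class labels) is a hypothetical placeholder, not the authors' schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Annotation:
    """One labeled object as seen from the vehicle (hypothetical schema)."""
    label: str                               # object class, e.g. "buoy"
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Sample:
    """All sensor readings recorded on the vehicle at one moment in time."""
    timestamp: float
    # Maps a modality name to its raw reading (here just a file name).
    readings: Dict[str, Any] = field(default_factory=dict)
    annotations: List[Annotation] = field(default_factory=list)

    def labels(self) -> List[str]:
        """Return the distinct object classes in this sample, sorted."""
        return sorted({a.label for a in self.annotations})


# Assumed example values; none of these come from the paper.
sample = Sample(
    timestamp=0.0,
    readings={"camera": "frame_000.png", "range_sensor": "scan_000.bin"},
    annotations=[
        Annotation("buoy", (10.0, 20.0, 50.0, 80.0)),
        Annotation("boat", (100.0, 40.0, 220.0, 130.0)),
    ],
)
print(sample.labels())
```

Keeping every modality under one timestamp is what lets a detector or fusion method treat the readings as views of the same scene.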

To demonstrate the applicability of the dataset, the researchers run deep learning-based open-source perception algorithms on it. The authors expect the dataset to contribute to the development of the marine autonomy pipeline and to advance marine (field) robotics.
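The summary does not say which metrics the authors report. As an illustration only, object detectors are commonly scored by comparing each predicted box with an annotated box using intersection over union (IoU). The sketch below assumes axis-aligned boxes in `(x_min, y_min, x_max, y_max)` form and a 0.5 threshold, both of which are assumptions rather than details from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_label: str, pred_box: Box,
                     gt_label: str, gt_box: Box,
                     threshold: float = 0.5) -> bool:
    """A prediction counts as correct if the class matches and IoU is high enough."""
    return pred_label == gt_label and iou(pred_box, gt_box) >= threshold


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(is_true_positive("buoy", (0, 0, 10, 10), "buoy", (0, 0, 10, 8)))
```

Counting true positives this way across a dataset is the usual starting point for precision, recall, and average precision.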

Critical Analysis

The paper introduces a promising dataset, but it is still a work-in-progress, and the full dataset is not yet publicly available. The authors acknowledge the need for further research and evaluation to refine the dataset and address any potential limitations.

One open question is how diverse the dataset actually is. While the authors state that it covers a range of objects and environmental conditions, a more detailed breakdown of that coverage, and of how it compares to the real-world challenges faced by ASVs, would be valuable.
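A coverage analysis of this kind can be as simple as counting annotations per combination of object class and environmental condition and then listing the combinations that never occur. The sketch below is illustrative: the record fields `label` and `condition` and the example values are assumptions, not the dataset's real metadata.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def coverage_table(records: Iterable[Dict[str, str]]) -> Counter:
    """Count annotated objects per (object class, condition) pair."""
    return Counter((r["label"], r["condition"]) for r in records)


def missing_pairs(records: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """List class/condition combinations that have no annotations at all."""
    table = coverage_table(records)
    labels = sorted({r["label"] for r in records})
    conditions = sorted({r["condition"] for r in records})
    return [(l, c) for l in labels for c in conditions if table[(l, c)] == 0]


# Hypothetical annotation records for illustration.
records = [
    {"label": "buoy", "condition": "sunny"},
    {"label": "buoy", "condition": "overcast"},
    {"label": "boat", "condition": "sunny"},
    {"label": "buoy", "condition": "sunny"},
]
print(coverage_table(records))
print(missing_pairs(records))
```

Empty or sparse cells in such a table point to the conditions where a trained detector is least tested.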

Additionally, the paper does not provide a comprehensive comparison of how different deep learning algorithms perform on the dataset, or results on task-specific metrics. Further research could explore the dataset's suitability for a wider range of perception tasks, such as 3D mapping or multi-sensor fusion, to establish its broader applicability in marine robotics.

Conclusion

This paper presents a new multi-modal perception dataset for autonomous maritime navigation, focused on detecting and classifying in-water obstacles. The dataset aims to bridge the research gap in marine robotics by providing a comprehensive, annotated, and ego-centric dataset to support the development of AI-powered perception systems for Autonomous Surface Vehicles.

While the dataset is still a work-in-progress, the authors have demonstrated its potential by evaluating the performance of existing deep learning-based perception algorithms. The successful application of these algorithms suggests that the dataset could make a valuable contribution to the field of marine autonomy and robotics. Further research and refinement of the dataset could lead to significant advancements in the safety and capabilities of autonomous boats and ships operating in aquatic environments.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles

Mingi Jeong, Arihant Chadda, Ziang Ren, Luyang Zhao, Haowen Liu, Monika Roznere, Aiwei Zhang, Yitao Jiang, Sabriel Achong, Samuel Lensgraf, Alberto Quattrini Li

This paper introduces the first publicly accessible multi-modal perception dataset for autonomous maritime navigation, focusing on in-water obstacles within the aquatic environment to enhance situational awareness for Autonomous Surface Vehicles (ASVs). This dataset, consisting of diverse objects encountered under varying environmental conditions, aims to bridge the research gap in marine robotics by providing a multi-modal, annotated, and ego-centric perception dataset, for object detection and classification. We also show the applicability of the proposed dataset's framework using deep learning-based open-source perception algorithms that have shown success. We expect that our dataset will contribute to development of the marine autonomy pipeline and marine (field) robotics. Please note this is a work-in-progress paper about our on-going research that we plan to release in full via future publication.

4/30/2024

WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset and Benchmarks for Autonomous Driving on Water Surfaces

Shanliang Yao, Runwei Guan, Zhaodong Wu, Yi Ni, Zile Huang, Ryan Wen Liu, Yong Yue, Weiping Ding, Eng Gee Lim, Hyungjoon Seo, Ka Lok Man, Jieming Ma, Xiaohui Zhu, Yutao Yue

Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivors rescue, environmental monitoring, hydrography mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surfaces. Equipped with a 4D radar and a monocular camera, our Unmanned Surface Vehicle (USV) proffers all-weather solutions for discerning object-related information, including color, shape, texture, range, velocity, azimuth, and elevation. Focusing on typical static and dynamic objects on water surfaces, we label the camera images and radar point clouds at pixel-level and point-level, respectively. In addition to basic perception tasks, such as object detection, instance segmentation and semantic segmentation, we also provide annotations for free-space segmentation and waterline segmentation. Leveraging the multi-task and multi-modal data, we conduct benchmark experiments on the uni-modality of radar and camera, as well as the fused modalities. Experimental results demonstrate that 4D radar-camera fusion can considerably improve the accuracy and robustness of perception on water surfaces, especially in adverse lighting and weather conditions. WaterScenes dataset is public on https://waterscenes.github.io.

6/18/2024

CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments

Yang Zhou, Long Quang, Carlos Nieto-Granda, Giuseppe Loianno

In the past decade, although single-robot perception has made significant advancements, the exploration of multi-robot collaborative perception remains largely unexplored. This involves fusing compressed, intermittent, limited, heterogeneous, and asynchronous environmental information across multiple robots to enhance overall perception, despite challenges like sensor noise, occlusions, and sensor failures. One major hurdle has been the lack of real-world datasets. This paper presents a pioneering and comprehensive real-world multi-robot collaborative perception dataset to boost research in this area. Our dataset leverages the untapped potential of air-ground robot collaboration featuring distinct spatial viewpoints, complementary robot mobilities, coverage ranges, and sensor modalities. It features raw sensor inputs, pose estimation, and optional high-level perception annotation, thus accommodating diverse research interests. Compared to existing datasets predominantly designed for Simultaneous Localization and Mapping (SLAM), our setup ensures a diverse range and adequate overlap of sensor views to facilitate the study of multi-robot collaborative perception algorithms. We demonstrate the value of this dataset qualitatively through multiple collaborative perception tasks. We believe this work will unlock the potential research of high-level scene understanding through multi-modal collaborative perception in multi-robot settings.

5/24/2024

RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving

Haisheng Su, Feixiang Song, Cong Ma, Wei Wu, Junchi Yan

Robust object detection and tracking under arbitrary sight of view is challenging yet essential for the development of Autonomous Vehicle technology. With the growing demand of unmanned function vehicles, near-field scene understanding becomes an important research topic in the areas of low-speed autonomous driving. Due to the complexity of driving conditions and diversity of near obstacles such as blind spots and high occlusion, the perception capability of near-field environment is still inferior than its farther counterpart. To further enhance the intelligent ability of unmanned vehicles, in this paper, we construct a multimodal data collection platform based on 3 main types of sensors (Camera, LiDAR and Fisheye), which supports flexible sensor configurations to enable dynamic sight of view for ego vehicle, either global view or local view. Meanwhile, a large-scale multi-sensor dataset is built, named RoboSense, to facilitate near-field scene understanding. RoboSense contains more than 133K synchronized data with 1.4M 3D bounding box and IDs annotated in the full $360^{\circ}$ view, forming 216K trajectories across 7.6K temporal sequences. It has $270\times$ and $18\times$ as many annotations of near-field obstacles within 5 m as the previous single-vehicle datasets such as KITTI and nuScenes. Moreover, we define a novel matching criterion for near-field 3D perception and prediction metrics. Based on RoboSense, we formulate 6 popular tasks to facilitate the future development of related research, where the detailed data analysis as well as benchmarks are also provided accordingly. Code and dataset will be available at https://github.com/suhaisheng/RoboSense.

9/26/2024