3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

Read original: arXiv:2406.04875 - Published 6/10/2024 by Xiaobiao Du, Haiyang Sun, Shuyun Wang, Zhuojie Wu, Hongwei Sheng, Jiaying Ying, Ming Lu, Tianqing Zhu, Kun Zhan, Xin Yu

3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

Overview

Presents a new large-scale dataset called "3DRealCar" for RGB-D car modeling and detection
Dataset contains 360-degree views of cars captured in the wild across diverse environments and conditions
Provides comprehensive annotations including 3D bounding boxes, poses, and instance segmentation
Enables research on 3D car modeling, detection, and understanding in real-world settings

Plain English Explanation

This research paper introduces a new dataset called "3DRealCar" that can help advance the field of computer vision for autonomous vehicles. The dataset contains 360-degree views of real-world cars captured in various outdoor environments and situations.

Unlike many existing car datasets that are limited to specific studio or lab settings, 3DRealCar provides a more realistic and diverse set of data. It includes annotations such as 3D bounding boxes, poses, and instance segmentation that can aid in tasks like 3D car modeling, detection, and understanding.

Having a comprehensive dataset of real-world cars can enable researchers to develop more robust and generalizable computer vision models for autonomous driving applications. This is an important step towards making self-driving cars function reliably in the messy, unpredictable conditions found in the real world.

Technical Explanation

The 3DRealCar dataset is unique in that it provides 360-degree views of cars captured across diverse outdoor environments, weather conditions, and viewing angles. This contrasts with many prior car datasets that are limited to specific studio or controlled settings.

The authors leveraged a fleet of cars equipped with RGB-D sensors to capture the 3DRealCar dataset "in the wild". In addition to the raw image and depth data, the dataset includes detailed annotations such as 3D bounding boxes, car poses, and instance segmentation masks. These annotations enable a variety of research tasks related to 3D car modeling, detection, and understanding.

Extensive experiments demonstrate the value of the 3DRealCar dataset for advancing the state-of-the-art in areas like 3D object detection and 6-DoF pose estimation for cars in real-world environments. The dataset provides a more realistic and challenging benchmark compared to prior datasets, pushing the boundaries of what current computer vision techniques can achieve.

Critical Analysis

A key strength of the 3DRealCar dataset is its diversity in capturing cars across a wide range of real-world conditions. This helps ensure that models trained on the data will be robust to the complex, uncontrolled settings found in the actual deployment of autonomous vehicles.

However, the authors acknowledge that the dataset is still limited in its geographic and environmental coverage, as the data was primarily collected in a single city. Expanding the dataset to include cars from different regions and countries would further enhance its utility.

Additionally, while the 3D annotation quality is impressive, there may be some inaccuracies or inconsistencies due to the challenges of annotating data collected "in the wild". The authors could consider conducting a human evaluation of the annotation accuracy to quantify this potential issue.

Overall, the 3DRealCar dataset represents an important contribution to the field of autonomous driving research. By providing a more realistic and diverse benchmark, it pushes the community to develop more capable computer vision techniques for real-world deployment.

Conclusion

The 3DRealCar dataset introduced in this paper fills an important gap in the autonomous driving research community. By capturing 360-degree RGB-D data of cars in diverse outdoor environments, it enables the development of more robust and generalizable 3D car modeling, detection, and understanding algorithms.

The comprehensive annotations and the dataset's real-world nature make it a valuable resource for advancing the state-of-the-art in areas like object detection and pose estimation. As the field of autonomous driving continues to progress, datasets like 3DRealCar will play a crucial role in bridging the gap between lab-based research and real-world deployment.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

Xiaobiao Du, Haiyang Sun, Shuyun Wang, Zhuojie Wu, Hongwei Sheng, Jiaying Ying, Ming Lu, Tianqing Zhu, Kun Zhan, Xin Yu

3D cars are commonly used in self-driving systems, virtual/augmented reality, and games. However, existing 3D car datasets are either synthetic or low-quality, presenting a significant gap toward the high-quality real-world 3D car datasets and limiting their applications in practical scenarios. In this paper, we propose the first large-scale 3D real car dataset, termed 3DRealCar, offering three distinctive features. (1) textbf{High-Volume}: 2,500 cars are meticulously scanned by 3D scanners, obtaining car images and point clouds with real-world dimensions; (2) textbf{High-Quality}: Each car is captured in an average of 200 dense, high-resolution 360-degree RGB-D views, enabling high-fidelity 3D reconstruction; (3) textbf{High-Diversity}: The dataset contains various cars from over 100 brands, collected under three distinct lighting conditions, including reflective, standard, and dark. Additionally, we offer detailed car parsing maps for each instance to promote research in car parsing tasks. Moreover, we remove background point clouds and standardize the car orientation to a unified axis for the reconstruction only on cars without background and controllable rendering. We benchmark 3D reconstruction results with state-of-the-art methods across each lighting condition in 3DRealCar. Extensive experiments demonstrate that the standard lighting condition part of 3DRealCar can be used to produce a large number of high-quality 3D cars, improving various 2D and 3D tasks related to cars. Notably, our dataset brings insight into the fact that recent 3D reconstruction methods face challenges in reconstructing high-quality 3D cars under reflective and dark lighting conditions. textcolor{red}{href{https://xiaobiaodu.github.io/3drealcar/}{Our dataset is available here.}}

6/10/2024

DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction

Xiaobiao Du, Haiyang Sun, Ming Lu, Tianqing Zhu, Xin Yu

Self-driving industries usually employ professional artists to build exquisite 3D cars. However, it is expensive to craft large-scale digital assets. Since there are already numerous datasets available that contain a vast number of images of cars, we focus on reconstructing high-quality 3D car models from these datasets. However, these datasets only contain one side of cars in the forward-moving scene. We try to use the existing generative models to provide more supervision information, but they struggle to generalize well in cars since they are trained on synthetic datasets not car-specific. In addition, The reconstructed 3D car texture misaligns due to a large error in camera pose estimation when dealing with in-the-wild images. These restrictions make it challenging for previous methods to reconstruct complete 3D cars. To address these problems, we propose a novel method, named DreamCar, which can reconstruct high-quality 3D cars given a few images even a single image. To generalize the generative model, we collect a car dataset, named Car360, with over 5,600 vehicles. With this dataset, we make the generative model more robust to cars. We use this generative prior specific to the car to guide its reconstruction via Score Distillation Sampling. To further complement the supervision information, we utilize the geometric and appearance symmetry of cars. Finally, we propose a pose optimization method that rectifies poses to tackle texture misalignment. Extensive experiments demonstrate that our method significantly outperforms existing methods in reconstructing high-quality 3D cars. href{https://xiaobiaodu.github.io/dreamcar-project/}{Our code is available.}

7/31/2024

RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

Hongchi Xia, Yang Fu, Sifei Liu, Xiaolong Wang

We introduce a new RGB-D object dataset captured in the wild called WildRGB-D. Unlike most existing real-world object-centric datasets which only come with RGB capturing, the direct capture of the depth channel allows better 3D annotations and broader downstream applications. WildRGB-D comprises large-scale category-level RGB-D object videos, which are taken using an iPhone to go around the objects in 360 degrees. It contains around 8500 recorded objects and nearly 20000 RGB-D videos across 46 common object categories. These videos are taken with diverse cluttered backgrounds with three setups to cover as many real-world scenarios as possible: (i) a single object in one video; (ii) multiple objects in one video; and (iii) an object with a static hand in one video. The dataset is annotated with object masks, real-world scale camera poses, and reconstructed aggregated point clouds from RGBD videos. We benchmark four tasks with WildRGB-D including novel view synthesis, camera pose estimation, object 6d pose estimation, and object surface reconstruction. Our experiments show that the large-scale capture of RGB-D objects provides a large potential to advance 3D object learning. Our project page is https://wildrgbd.github.io/.

7/30/2024

360 in the Wild: Dataset for Depth Prediction and View Synthesis

Kibaek Park, Francois Rameau, Jaesik Park, In So Kweon

The large abundance of perspective camera datasets facilitated the emergence of novel learning-based strategies for various tasks, such as camera localization, single image depth estimation, or view synthesis. However, panoramic or omnidirectional image datasets, including essential information, such as pose and depth, are mostly made with synthetic scenes. In this work, we introduce a large scale 360$^{circ}$ videos dataset in the wild. This dataset has been carefully scraped from the Internet and has been captured from various locations worldwide. Hence, this dataset exhibits very diversified environments (e.g., indoor and outdoor) and contexts (e.g., with and without moving objects). Each of the 25K images constituting our dataset is provided with its respective camera's pose and depth map. We illustrate the relevance of our dataset for two main tasks, namely, single image depth estimation and view synthesis.

7/8/2024