VIPS-Odom: Visual-Inertial Odometry Tightly-coupled with Parking Slots for Autonomous Parking

Read original: arXiv:2407.05017 - Published 7/9/2024 by Xuefeng Jiang, Fangyuan Wang, Rongzhang Zheng, Han Liu, Yixiong Huo, Jinzhang Peng, Lu Tian, Emad Barsoum

Overview

This paper presents VIPS-Odom, a tightly-coupled visual-inertial odometry (VIO) system that uses detected parking slots to improve localization for autonomous parking.
VIPS-Odom fuses camera, inertial measurement unit (IMU), and wheel speed sensor measurements to estimate the vehicle's pose, and adds constraints from detected parking slots to refine the odometry.
The system targets underground parking scenarios, where dynamic lighting, sparse textures, and unstable GPS signals challenge traditional localization methods.

Plain English Explanation

VIPS-Odom is a system that helps self-driving cars park themselves more accurately. It uses cameras and motion sensors to track the car's movement and position, and it also looks for parking slots around the car. By combining this information, VIPS-Odom can better estimate where the car is and how it needs to move to park correctly, even in underground garages where GPS signals are unstable.

The key idea is that by detecting the parking spots around the car and integrating that information with the data from the cameras and sensors, VIPS-Odom can create a more detailed and reliable picture of the car's location and orientation. This allows the self-driving system to navigate the car into the parking spot more precisely, which is important for safe and efficient autonomous parking.

Technical Explanation

The VIPS-Odom system combines visual and inertial data in a tight coupling to estimate the vehicle's pose during autonomous parking maneuvers. It integrates the detection of nearby parking slots to further refine the odometry estimation.
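To make "tight coupling" concrete, here is a minimal sketch of a joint least-squares problem in which odometry steps and parking slot observations constrain the same unknowns. It is a translation-only 2D toy, not the paper's formulation: the function name `fuse_odometry_with_slot`, the single slot landmark, the weights, and the linear measurement model are all assumptions made for illustration.

```python
import numpy as np

def fuse_odometry_with_slot(odom_steps, slot_obs, w_odom=1.0, w_slot=1.0):
    """Jointly estimate 2D positions p_1..p_n and one slot landmark l.

    Hypothetical toy model (not the paper's): p_0 is fixed at the origin,
    odom_steps[k] measures p_{k+1} - p_k, and slot_obs[k] measures l - p_k.
    Weights scale the residuals before the linear least-squares solve.
    """
    odom_steps = np.asarray(odom_steps, dtype=float)
    slot_obs = np.asarray(slot_obs, dtype=float)
    n = len(odom_steps)
    if slot_obs.shape != (n + 1, 2):
        raise ValueError("need one slot observation per pose, including p_0")

    rows, rhs = [], []
    # Unknown columns: p_1..p_n occupy columns 0..n-1, the slot is column n.
    for k in range(n):                      # odometry constraints
        row = np.zeros(n + 1)
        row[k] = 1.0                        # +p_{k+1}
        if k > 0:
            row[k - 1] = -1.0               # -p_k (p_0 is fixed, so omitted)
        rows.append(w_odom * row)
        rhs.append(w_odom * odom_steps[k])
    for k in range(n + 1):                  # parking slot constraints
        row = np.zeros(n + 1)
        row[n] = 1.0                        # +l
        if k > 0:
            row[k - 1] = -1.0               # -p_k
        rows.append(w_slot * row)
        rhs.append(w_slot * slot_obs[k])

    A = np.vstack(rows)
    B = np.vstack(rhs)
    sol, *_ = np.linalg.lstsq(A, B, rcond=None)  # solves x and y columns at once
    poses = np.vstack([np.zeros(2), sol[:n]])
    return poses, sol[n]

# Synthetic example: odometry with a constant drift, exact slot observations.
true_poses = np.array([[float(k), 0.0] for k in range(6)])
true_slot = np.array([2.5, 2.0])
odom = np.tile([1.05, 0.02], (5, 1))        # biased steps (true step is [1, 0])
obs = true_slot - true_poses

dead_reckoned = np.vstack([np.zeros(2), np.cumsum(odom, axis=0)])
fused, slot_estimate = fuse_odometry_with_slot(odom, obs)

print("final error, odometry only:", np.linalg.norm(dead_reckoned[-1] - true_poses[-1]))
print("final error, with slot    :", np.linalg.norm(fused[-1] - true_poses[-1]))
```

Because every pose shares the same slot landmark, the slot observations pull the drifting odometry chain back toward a consistent trajectory. A real system would also estimate orientation, use nonlinear residuals, and weight each term by its measurement covariance.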

This approach targets underground parking scenarios, where dynamic lighting, sparse textures, and unstable GPS signals challenge traditional localization methods. In the frontend, parking slots detected from a synthesized bird's-eye-view (BEV) image are integrated with traditional feature points from the camera images, and a multi-object tracking framework tracks the state of each slot. In the backend, a tightly-coupled optimization jointly applies constraints from the inertial measurement unit (IMU), the wheel speed sensor, and the tracked parking slots to estimate the vehicle's motion.
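The paper's abstract mentions a multi-object tracking framework for parking slots. One core step of any such tracker is associating new detections with existing tracks. The sketch below uses greedy nearest-neighbour matching with a distance gate; this is a generic technique chosen for illustration, not the authors' tracker, and `associate_slots`, the slot-center representation, and the `gate` value are hypothetical.

```python
import math

def associate_slots(tracks, detections, gate=1.0):
    """Greedily match tracked slot centers to newly detected slot centers.

    tracks:     dict mapping track id -> (x, y) center in a common frame
    detections: list of (x, y) centers in the same frame
    gate:       maximum distance (same units) for a valid match (assumed value)
    Returns (matches, unmatched) where matches maps track id -> detection
    index and unmatched lists detection indices that could start new tracks.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(center, det), tid, j)
        for tid, center in tracks.items()
        for j, det in enumerate(detections)
    )
    matches, used = {}, set()
    for dist, tid, j in pairs:
        if dist > gate:
            break                      # all remaining pairs are farther away
        if tid in matches or j in used:
            continue                   # each track and detection matches once
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched

tracks = {0: (0.0, 0.0), 1: (2.5, 0.0)}
detections = [(2.6, 0.1), (0.1, -0.1), (7.5, 0.0)]
matches, unmatched = associate_slots(tracks, detections)
print(matches, unmatched)
```

Greedy matching is simple but not globally optimal; trackers often use the Hungarian algorithm instead when slots are densely packed.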

The authors evaluate VIPS-Odom on a real experimental platform: an electric vehicle equipped with the relevant sensors and running ROS2. According to the abstract, extensive experiments demonstrate the method's efficacy and its advantages over other baselines in parking scenarios.
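Positioning accuracy in odometry comparisons is often summarized as the root-mean-square position error against a reference trajectory. The sketch below shows that computation; this summary does not state which metric the paper reports, so the function `position_rmse` and the sample trajectories are illustrative assumptions, and no trajectory alignment step is included.

```python
import numpy as np

def position_rmse(estimated, ground_truth):
    """Root-mean-square position error between two time-aligned trajectories.

    Both inputs are (N, D) arrays of positions sampled at the same timestamps.
    Assumes the trajectories are already expressed in the same frame.
    """
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    if est.shape != gt.shape:
        raise ValueError("trajectories must have the same shape")
    squared_errors = np.sum((est - gt) ** 2, axis=1)  # per-pose squared distance
    return float(np.sqrt(np.mean(squared_errors)))

# Made-up sample data, for illustration only.
ground_truth = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
estimate = [[0.0, 0.0], [1.0, 0.3], [2.0, 0.4]]
print(position_rmse(estimate, ground_truth))
```

A lower value means the estimated path stays closer to the reference, which is how a slot-aided estimate and a standalone VIO estimate could be compared on the same drive.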

Critical Analysis

The VIPS-Odom system presents a promising approach to enhancing autonomous parking capabilities by tightly integrating visual-inertial odometry with parking slot detection. The authors demonstrate the system's effectiveness in improving positioning accuracy, which is crucial for precise and safe autonomous parking maneuvers.

One potential limitation of the research is the reliance on parking slot detection, which may not always be available or reliable, especially in cluttered or unstructured environments. The authors acknowledge this and suggest exploring alternative environment perception methods to further improve the system's robustness.

Additionally, the paper does not provide a detailed analysis of the computational complexity or real-time performance of the VIPS-Odom system, which would be important considerations for practical deployment in autonomous vehicles. Further investigation into the system's scalability and resource requirements would be beneficial.

Overall, the VIPS-Odom research represents a valuable contribution to the field of autonomous parking, and the authors' approach of leveraging multiple sensor modalities and environmental cues to enhance odometry estimation is a promising direction for future work in this area.

Conclusion

The VIPS-Odom system introduces a tightly-coupled visual-inertial odometry approach that integrates parking slot detection to improve localization for autonomous parking, particularly in underground parking environments where GPS is unstable. By combining camera, IMU, and wheel speed measurements with information about nearby parking slots, the system can better estimate the vehicle's pose and support more accurate parking maneuvers.

The research demonstrates the benefits of leveraging multiple sensor modalities and environmental cues to enhance odometry estimation, which is a crucial component of autonomous driving systems. While the reliance on parking slot detection may be a limitation in some scenarios, the VIPS-Odom approach represents a significant step forward in enhancing the reliability and precision of autonomous parking, with potential implications for the broader development of self-driving technologies.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

VIPS-Odom: Visual-Inertial Odometry Tightly-coupled with Parking Slots for Autonomous Parking

Xuefeng Jiang, Fangyuan Wang, Rongzhang Zheng, Han Liu, Yixiong Huo, Jinzhang Peng, Lu Tian, Emad Barsoum

Precise localization is of great importance for autonomous parking task since it provides service for the downstream planning and control modules, which significantly affects the system performance. For parking scenarios, dynamic lighting, sparse textures, and the instability of global positioning system (GPS) signals pose challenges for most traditional localization methods. To address these difficulties, we propose VIPS-Odom, a novel semantic visual-inertial odometry framework for underground autonomous parking, which adopts tightly-coupled optimization to fuse measurements from multi-modal sensors and solves odometry. Our VIPS-Odom integrates parking slots detected from the synthesized bird-eye-view (BEV) image with traditional feature points in the frontend, and conducts tightly-coupled optimization with joint constraints introduced by measurements from the inertial measurement unit, wheel speed sensor and parking slots in the backend. We develop a multi-object tracking framework to robustly track parking slots' states. To prove the superiority of our method, we equip an electronic vehicle with related sensors and build an experimental platform based on ROS2 system. Extensive experiments demonstrate the efficacy and advantages of our method compared with other baselines for parking scenarios.

7/9/2024

AVM-SLAM: Semantic Visual SLAM with Multi-Sensor Fusion in a Bird's Eye View for Automated Valet Parking

Ye Li, Wenchao Yang, Dekun Lin, Qianlei Wang, Zhe Cui, Xiaolin Qin

Accurate localization in challenging garage environments -- marked by poor lighting, sparse textures, repetitive structures, dynamic scenes, and the absence of GPS -- is crucial for automated valet parking (AVP) tasks. Addressing these challenges, our research introduces AVM-SLAM, a cutting-edge semantic visual SLAM architecture with multi-sensor fusion in a bird's eye view (BEV). This novel framework synergizes the capabilities of four fisheye cameras, wheel encoders, and an inertial measurement unit (IMU) to construct a robust SLAM system. Unique to our approach is the implementation of a flare removal technique within the BEV imagery, significantly enhancing road marking detection and semantic feature extraction by convolutional neural networks for superior mapping and localization. Our work also pioneers a semantic pre-qualification (SPQ) module, designed to adeptly handle the challenges posed by environments with repetitive textures, thereby enhancing loop detection and system robustness. To demonstrate the effectiveness and resilience of AVM-SLAM, we have released a specialized multi-sensor and high-resolution dataset of an underground garage, accessible at https://yale-cv.github.io/avm-slam_dataset, encouraging further exploration and validation of our approach within similar settings.

7/2/2024

Enhanced Parking Perception by Multi-Task Fisheye Cross-view Transformers

Antonyo Musabini, Ivan Novikov, Sana Soula, Christel Leonet, Lihao Wang, Rachid Benmokhtar, Fabian Burger, Thomas Boulay, Xavier Perrotton

Current parking area perception algorithms primarily focus on detecting vacant slots within a limited range, relying on error-prone homographic projection for both labeling and inference. However, recent advancements in Advanced Driver Assistance System (ADAS) require interaction with end-users through comprehensive and intelligent Human-Machine Interfaces (HMIs). These interfaces should present a complete perception of the parking area going from distinguishing vacant slots' entry lines to the orientation of other parked vehicles. This paper introduces Multi-Task Fisheye Cross View Transformers (MT F-CVT), which leverages features from a four-camera fisheye Surround-view Camera System (SVCS) with multihead attentions to create a detailed Bird-Eye View (BEV) grid feature map. Features are processed by both a segmentation decoder and a Polygon-Yolo based object detection decoder for parking slots and vehicles. Trained on data labeled using LiDAR, MT F-CVT positions objects within a 25m x 25m real open-road scenes with an average error of only 20 cm. Our larger model achieves an F-1 score of 0.89. Moreover the smaller model operates at 16 fps on an Nvidia Jetson Orin embedded board, with similar detection results to the larger one. MT F-CVT demonstrates robust generalization capability across different vehicles and camera rig configurations. A demo video from an unseen vehicle and camera rig is available at: https://streamable.com/jjw54x.

8/23/2024

Smart Camera Parking System With Auto Parking Spot Detection

Tuan T. Nguyen, Mina Sartipi

Given the rising urban population and the consequential rise in traffic congestion, the implementation of smart parking systems has emerged as a critical matter of concern. Smart parking solutions use cameras, sensors, and algorithms like computer vision to find available parking spaces. This method improves parking place recognition, reduces traffic and pollution, and optimizes travel time. In recent years, computer vision-based approaches have been widely used. However, most existing studies rely on manually labeled parking spots, which has implications for the cost and practicality of implementation. To solve this problem, we propose a novel approach PakLoc, which automatically localize parking spots. Furthermore, we present the PakSke module, which automatically adjust the rotation and the size of detected bounding box. The efficacy of our proposed methodology on the PKLot dataset results in a significant reduction in human labor of 94.25%. Another fundamental aspect of a smart parking system is its capacity to accurately determine and indicate the state of parking spots within a parking lot. The conventional approach involves employing classification techniques to forecast the condition of parking spots based on the bounding boxes derived from manually labeled grids. In this study, we provide a novel approach called PakSta for identifying the state of parking spots automatically. Our method utilizes object detector from PakLoc to simultaneously determine the occupancy status of all parking lots within a video frame. Our proposed method PakSta exhibits a competitive performance on the PKLot dataset when compared to other classification methods.

7/9/2024