MAVIS: Multi-Camera Augmented Visual-Inertial SLAM using SE2(3) Based Exact IMU Pre-integration

Read original: arXiv:2309.08142 - Published 7/17/2024 by Yifu Wang, Yonhon Ng, Inkyu Sa, Alvaro Parra, Cristian Rodriguez, Tao Jun Lin, Hongdong Li


Overview

Presents MAVIS, a novel optimization-based Visual-Inertial SLAM (Simultaneous Localization and Mapping) system designed for multi-camera rigs whose fields of view partially overlap
Fully exploits the benefits of wide field-of-view from multi-camera systems and the metric scale measurements provided by an inertial measurement unit (IMU)
Introduces an improved IMU pre-integration formulation to enhance tracking performance under fast rotational motion and extended integration time
Extends conventional front-end tracking and back-end optimization modules designed for monocular or stereo setups towards multi-camera systems
Validated by experiments on public datasets; won first place in all the vision-IMU tracks of the Hilti SLAM Challenge 2023

Plain English Explanation

The research paper presents a new visual-inertial SLAM system called MAVIS that is designed to work with multiple partially overlapping camera systems. SLAM is a technique used by robots and autonomous vehicles to simultaneously map their surroundings and track their own location within that map.

MAVIS takes advantage of the wide field of view provided by multiple cameras, as well as the metric-scale motion measurements provided by an inertial measurement unit (IMU). The researchers have developed an improved way of pre-integrating the IMU data, which helps tracking when the device rotates quickly or when IMU readings have to be integrated over longer intervals.

The paper also explains how the system's front-end tracking and back-end optimization modules, which are typically designed for single-camera or stereo setups, have been extended to work with multi-camera systems. These modifications have helped the MAVIS system perform well in challenging real-world scenarios.
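
As a rough illustration of what extending a monocular or stereo pipeline to several cameras involves, the sketch below stacks the reprojection errors of one landmark over every camera that observes it, using each camera's fixed extrinsic transform relative to the body (IMU) frame. It is not taken from the paper: the function names, the pinhole camera model, and the `T_wb` / `T_bc` pose conventions are all assumptions made for this example.

```python
import numpy as np

def project(K, T_cw, p_w):
    """Pinhole projection of world point p_w; returns None if behind the camera."""
    p_c = T_cw[:3, :3] @ p_w + T_cw[:3, 3]
    if p_c[2] <= 0:
        return None
    uv = K @ (p_c / p_c[2])
    return uv[:2]

def multi_camera_residuals(T_wb, extrinsics_bc, K_list, p_w, observations):
    """Stack reprojection errors of one landmark over all observing cameras.

    T_wb:          4x4 body pose in the world frame (hypothetical convention)
    extrinsics_bc: list of 4x4 camera poses in the body frame, one per camera
    K_list:        list of 3x3 intrinsic matrices, one per camera
    observations:  dict mapping camera index -> measured pixel (u, v)
    """
    T_bw = np.linalg.inv(T_wb)
    residuals = []
    for cam, uv_meas in observations.items():
        # World -> body -> camera: only the body pose is a free variable,
        # the extrinsics tie all cameras rigidly to it.
        T_cw = np.linalg.inv(extrinsics_bc[cam]) @ T_bw
        uv = project(K_list[cam], T_cw, p_w)
        if uv is not None:
            residuals.append(uv - np.asarray(uv_meas, dtype=float))
    return np.concatenate(residuals) if residuals else np.zeros(0)

# Usage: two cameras 10 cm apart observing a landmark 2 m ahead.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
T_bc0 = np.eye(4)
T_bc1 = np.eye(4)
T_bc1[0, 3] = 0.1
landmark = np.array([0.0, 0.0, 2.0])
obs = {0: (320.0, 240.0), 1: (295.0, 240.0)}
r = multi_camera_residuals(np.eye(4), [T_bc0, T_bc1], [K, K], landmark, obs)
```

Because every camera contributes residuals against the same body pose, adding cameras adds constraints without adding pose variables, which is one reason a wider combined field of view can make tracking more robust.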

The researchers tested their system on publicly available datasets. MAVIS also won first place in all the vision-IMU tracks of the Hilti SLAM Challenge 2023, scoring 1.7 times as high as the second-place entry.

Technical Explanation

The researchers have developed a novel optimization-based Visual-Inertial SLAM system called MAVIS that is designed to work with multiple partially overlapped camera systems. The key innovations include:

Exploiting the benefits of wide field-of-view from multi-camera systems and the metric scale measurements provided by an inertial measurement unit (IMU). This allows for more robust and accurate tracking and mapping.
An improved IMU pre-integration formulation based on the exponential function of an automorphism of SE_2(3), which can effectively enhance tracking performance under fast rotational motion and extended integration time. This addresses a common challenge in visual-inertial SLAM systems.
Extensions of the conventional front-end tracking and back-end optimization modules designed for monocular or stereo setups towards multi-camera systems. These modifications contribute to the system's performance in challenging scenarios.
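
To make the pre-integration item above more concrete, here is a minimal sketch that assumes the gyroscope and accelerometer readings are constant over each sample interval. Under that assumption the velocity and position increments have closed forms involving two matrices, written here as Γ1 and Γ2 (Γ1 is the SO(3) left Jacobian, which also appears in the exponential map of SE_2(3)). The common Euler-style scheme replaces them with I and I/2. This illustrates the general idea only and is not the paper's exact formulation: biases, noise propagation, and gravity are left out, and all function names are hypothetical.

```python
import numpy as np

def skew(w):
    """3x3 skew-symmetric matrix such that skew(w) @ v == cross(w, v)."""
    x, y, z = w
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def gammas(phi):
    """Return (G0, G1, G2) for rotation vector phi.

    G0 = Exp(phi), G1 = int_0^1 Exp(s*phi) ds,
    G2 = int_0^1 int_0^s Exp(u*phi) du ds.
    """
    theta = np.linalg.norm(phi)
    I, K = np.eye(3), skew(phi)
    K2 = K @ K
    if theta < 1e-3:  # Taylor expansion avoids cancellation near zero
        return I + K + K2 / 2, I + K / 2 + K2 / 6, I / 2 + K / 6 + K2 / 24
    s, c = np.sin(theta), np.cos(theta)
    G0 = I + (s / theta) * K + ((1 - c) / theta**2) * K2
    G1 = I + ((1 - c) / theta**2) * K + ((theta - s) / theta**3) * K2
    G2 = I / 2 + ((theta - s) / theta**3) * K \
        + ((theta**2 + 2 * c - 2) / (2 * theta**4)) * K2
    return G0, G1, G2

def preintegrate(gyro, accel, dt, exact=True):
    """Accumulate relative rotation, velocity and position increments.

    gyro, accel: sequences of 3-vectors in the body frame (bias-free here).
    exact=False falls back to the Euler-style approximation for comparison.
    """
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        G0, G1, G2 = gammas(np.asarray(w) * dt)
        if not exact:
            G1, G2 = np.eye(3), 0.5 * np.eye(3)
        # Position uses the velocity from the start of the interval,
        # so update it before dv and dR.
        dp = dp + dv * dt + dR @ G2 @ a * dt**2
        dv = dv + dR @ G1 @ a * dt
        dR = dR @ G0
    return dR, dv, dp

# Usage: 2 rad/s yaw with constant body-frame acceleration over 0.5 s.
w = np.array([0.0, 0.0, 2.0])
a = np.array([1.0, 0.0, 0.0])
R1, v1, p1 = preintegrate([w], [a], 0.5)                # one long step
R2, v2, p2 = preintegrate([w] * 100, [a] * 100, 0.005)  # many short steps
Re, ve, pe = preintegrate([w], [a], 0.5, exact=False)   # Euler, one step
```

With constant measurements the exact variant gives the same result whether the interval is integrated in one step or in many, while the Euler variant deviates more as the step grows. That is the regime of fast rotation or long integration time in which an exact formulation is expected to help.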

The researchers have validated the practical effectiveness of their approach through experiments on public datasets. Notably, MAVIS won first place in all the vision-IMU tracks (single and multi-session SLAM) of the Hilti SLAM Challenge 2023, with 1.7 times the score of the second-place entry.

Critical Analysis

The paper presents a well-designed and thoroughly evaluated SLAM system that addresses some of the key challenges in multi-camera visual-inertial SLAM. The improved IMU pre-integration formulation is a notable contribution that can enhance the performance of SLAM systems in demanding conditions.

However, the paper does not discuss the computational efficiency of the MAVIS system, which is an important consideration for real-world deployment, especially on resource-constrained platforms such as mobile robots or autonomous vehicles. Additionally, the paper does not provide detailed comparisons with other visual-inertial SLAM systems, such as VINS-Fusion or EVI-SAM, which could help readers better understand the relative strengths and weaknesses of MAVIS.

Furthermore, the paper does not address potential issues related to sensor synchronization, calibration, and robustness to sensor failures, which are crucial considerations for real-world deployments of multi-sensor SLAM systems. Exploring these aspects in future research could further strengthen the MAVIS system.

Conclusion

The MAVIS system presented in this paper represents a significant advancement in multi-camera visual-inertial SLAM. By combining the wide field of view of multiple cameras with the metric-scale measurements from an IMU, the researchers have developed a robust and accurate SLAM solution that outscored the other entries in the vision-IMU tracks of the Hilti SLAM Challenge 2023.

The key innovations, such as the improved IMU pre-integration formulation and the extensions to the front-end and back-end modules, contribute to the system's strong performance in challenging scenarios. The successful results in the Hilti SLAM Challenge 2023 demonstrate the practical viability of the MAVIS approach.

While the paper does not address certain practical considerations, the MAVIS system shows promise for robotics, autonomous vehicles, and other applications that require reliable, high-precision localization and mapping. Further research in this direction could yield more capable SLAM systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


MAVIS: Multi-Camera Augmented Visual-Inertial SLAM using SE2(3) Based Exact IMU Pre-integration

Yifu Wang, Yonhon Ng, Inkyu Sa, Alvaro Parra, Cristian Rodriguez, Tao Jun Lin, Hongdong Li

We present a novel optimization-based Visual-Inertial SLAM system designed for multiple partially overlapped camera systems, named MAVIS. Our framework fully exploits the benefits of wide field-of-view from multi-camera systems, and the metric scale measurements provided by an inertial measurement unit (IMU). We introduce an improved IMU pre-integration formulation based on the exponential function of an automorphism of SE_2(3), which can effectively enhance tracking performance under fast rotational motion and extended integration time. Furthermore, we extend conventional front-end tracking and back-end optimization module designed for monocular or stereo setup towards multi-camera systems, and introduce implementation details that contribute to the performance of our system in challenging scenarios. The practical validity of our approach is supported by our experiments on public datasets. Our MAVIS won the first place in all the vision-IMU tracks (single and multi-session SLAM) on Hilti SLAM Challenge 2023 with 1.7 times the score compared to the second place.

7/17/2024


Multi-Visual-Inertial System: Analysis, Calibration and Estimation

Yulin Yang, Patrick Geneva, Guoquan Huang

In this paper, we study state estimation of multi-visual-inertial systems (MVIS) and develop sensor fusion algorithms to optimally fuse an arbitrary number of asynchronous inertial measurement units (IMUs) or gyroscopes and global and/or rolling shutter cameras. We are especially interested in the full calibration of the associated visual-inertial sensors, including the IMU or camera intrinsics and the IMU-IMU (or camera) spatiotemporal extrinsics as well as the image readout time of rolling-shutter cameras (if used). To this end, we develop a new analytic combined IMU integration with intrinsics, termed ACI3, to preintegrate IMU measurements, which is leveraged to fuse auxiliary IMUs and/or gyroscopes alongside a base IMU. We model the multi-inertial measurements to include all the necessary inertial intrinsic and IMU-IMU spatiotemporal extrinsic parameters, while leveraging IMU-IMU rigid-body constraints to eliminate the necessity of auxiliary inertial poses and thus reducing computational complexity. By performing observability analysis of MVIS, we prove that the standard four unobservable directions remain no matter how many inertial sensors are used, and also identify, for the first time, degenerate motions for IMU-IMU spatiotemporal extrinsics and auxiliary inertial intrinsics. In addition to the extensive simulations that validate our analysis and algorithms, we have built our own MVIS sensor rig and collected over 25 real-world datasets to experimentally verify the proposed calibration against state-of-the-art calibration methods such as Kalibr. We show that the proposed MVIS calibration is able to achieve competing accuracy with improved convergence and repeatability, which is open sourced to better benefit the community.

9/4/2024


DVI-SLAM: A Dual Visual Inertial SLAM Network

Xiongfeng Peng, Zhihua Liu, Weiming Li, Ping Tan, SoonYong Cho, Qiang Wang

Recent deep learning based visual simultaneous localization and mapping (SLAM) methods have made significant progress. However, how to make full use of visual information as well as better integrate with inertial measurement unit (IMU) in visual SLAM has potential research value. This paper proposes a novel deep SLAM network with dual visual factors. The basic idea is to integrate both photometric factor and re-projection factor into the end-to-end differentiable structure through multi-factor data association module. We show that the proposed network dynamically learns and adjusts the confidence maps of both visual factors and it can be further extended to include the IMU factors as well. Extensive experiments validate that our proposed method significantly outperforms the state-of-the-art methods on several public datasets, including TartanAir, EuRoC and ETH3D-SLAM. Specifically, when dynamically fusing the three factors together, the absolute trajectory error for both monocular and stereo configurations on EuRoC dataset has reduced by 45.3% and 36.2% respectively.

5/28/2024


BundledSLAM: An Accurate Visual SLAM System Using Multiple Cameras

Han Song, Cong Liu, Huafeng Dai

Multi-camera SLAM systems offer a plethora of advantages, primarily stemming from their capacity to amalgamate information from a broader field of view, thereby resulting in heightened robustness and improved localization accuracy. In this research, we present a significant extension and refinement of the state-of-the-art stereo SLAM system, known as ORB-SLAM2, with the objective of attaining even higher precision. To accomplish this objective, we commence by mapping measurements from all cameras onto a virtual camera termed BundledFrame. This virtual camera is meticulously engineered to seamlessly adapt to multi-camera configurations, facilitating the effective fusion of data captured from multiple cameras. Additionally, we harness extrinsic parameters in the bundle adjustment (BA) process to achieve precise trajectory estimation. Furthermore, we conduct an extensive analysis of the role of bundle adjustment (BA) in the context of multi-camera scenarios, delving into its impact on tracking, local mapping, and global optimization. Our experimental evaluation entails comprehensive comparisons between ground truth data and the state-of-the-art SLAM system. To rigorously assess the system's performance, we utilize the EuRoC datasets. The consistent results of our evaluations demonstrate the superior accuracy of our system in comparison to existing approaches.

4/1/2024