Multi-Visual-Inertial System: Analysis, Calibration and Estimation

Read original: arXiv:2308.05303 - Published 9/4/2024 by Yulin Yang, Patrick Geneva, Guoquan Huang


Overview

The paper focuses on state estimation for multi-visual-inertial systems (MVIS), which involves fusing data from multiple asynchronous inertial measurement units (IMUs) or gyroscopes and global and/or rolling shutter cameras.
The researchers developed sensor fusion algorithms that optimally combine these sensor inputs while also calibrating the visual-inertial sensors, including the IMU and camera intrinsics and the IMU-IMU (or IMU-camera) spatiotemporal extrinsics.
They introduce a new analytic combined IMU integration with intrinsics, termed ACI3, to preintegrate IMU measurements; it is used to fuse auxiliary IMUs and/or gyroscopes alongside a base IMU.
The paper also includes observability analysis of MVIS, identifying the standard four unobservable directions as well as degenerate motions for IMU-IMU spatiotemporal extrinsics and auxiliary inertial intrinsics.

Plain English Explanation

The researchers in this paper looked at ways to combine information from multiple cameras and motion sensors (called inertial measurement units or IMUs) to get a better understanding of the position and orientation of a system. This is important for applications like robotics, augmented reality, and autonomous vehicles.

One key challenge is that the cameras and IMUs may not be perfectly synchronized or calibrated. So the researchers developed new algorithms to fuse the sensor data in an optimal way, while also calibrating the sensors to account for things like each sensor's internal parameters, the exact position and orientation of the sensors relative to one another, and the time offsets between them.

They introduced a new mathematical technique called ACI3 to help combine the measurements from multiple IMUs. And they analyzed the observability of the system - meaning they identified which properties of the system's state can be reliably estimated from the sensor data, and which ones are hard to pin down.

Overall, the goal was to create a robust and accurate way to track the motion and position of a system using a variety of visual and inertial sensors. This could have applications in areas like robotics and augmented reality.

Technical Explanation

The paper presents a sensor fusion framework for multi-visual-inertial systems (MVIS). The key contributions include:

Sensor Fusion: The researchers developed algorithms to optimally fuse data from an arbitrary number of asynchronous inertial measurement units (IMUs) or gyroscopes and global and/or rolling shutter cameras.
Sensor Calibration: They addressed the full calibration of the visual-inertial sensors, including the IMU or camera intrinsics and the IMU-IMU (or IMU-camera) spatiotemporal extrinsics, as well as the image readout time of rolling-shutter cameras.
ACI3 Preintegration: The team introduced a new analytic combined IMU integration with intrinsics, termed ACI3, to preintegrate IMU measurements. This enables fusing auxiliary IMUs and/or gyroscopes alongside a base IMU.
Observability Analysis: The researchers performed an observability analysis of MVIS, proving that the standard four unobservable directions remain, regardless of the number of inertial sensors used. They also identified degenerate motions for IMU-IMU spatiotemporal extrinsics and auxiliary inertial intrinsics.
Experimental Validation: In addition to extensive simulations, the team built their own MVIS sensor rig and collected over 25 real-world datasets. They showed that their proposed calibration method achieves competitive accuracy with improved convergence and repeatability compared to state-of-the-art calibration methods such as Kalibr.
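To make the idea of preintegration concrete, here is a minimal sketch in Python with NumPy. It is not ACI3: it is the generic discrete-time scheme with constant biases, and it ignores the IMU intrinsics and noise propagation that the paper handles analytically. The function names (`skew`, `so3_exp`, `preintegrate`) are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np

def skew(v):
    """Return the 3x3 skew-symmetric matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def so3_exp(phi):
    """Rodrigues' formula: map a rotation vector to a rotation matrix."""
    theta = np.linalg.norm(phi)
    K = skew(phi)
    if theta < 1e-9:
        return np.eye(3) + K  # first-order approximation for tiny angles
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def preintegrate(gyro, accel, dt, bg=(0.0, 0.0, 0.0), ba=(0.0, 0.0, 0.0)):
    """Accumulate relative rotation, velocity and position deltas.

    gyro, accel: (N, 3) arrays of raw samples; dt: sample period in seconds.
    bg, ba: gyroscope and accelerometer biases, assumed constant here.
    The deltas are expressed in the IMU frame of the first sample and
    do not include gravity, which is added back when they are used.
    """
    bg, ba = np.asarray(bg), np.asarray(ba)
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        a_start = dR @ (a - ba)  # specific force rotated into the start frame
        dp = dp + dv * dt + 0.5 * a_start * dt**2
        dv = dv + a_start * dt
        dR = dR @ so3_exp((w - bg) * dt)
    return dR, dv, dp

# Usage: 1 s of samples at 100 Hz, rotating about z at pi/2 rad/s.
n, dt = 100, 0.01
gyro = np.tile([0.0, 0.0, np.pi / 2], (n, 1))
accel = np.zeros((n, 3))
dR, dv, dp = preintegrate(gyro, accel, dt)
```

The point of the technique is that `dR`, `dv` and `dp` depend only on the raw samples and the biases, so they can be computed once between two camera frames and reused as a single relative-motion constraint rather than re-integrated whenever the state estimate changes.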

Critical Analysis

The paper presents a comprehensive approach to state estimation for multi-visual-inertial systems. The researchers have addressed several key challenges, including sensor synchronization, calibration, and observability analysis.
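The "four unobservable directions" in visual-inertial estimation are commonly understood as the global position (three) and the rotation about the gravity direction, or yaw (one). The sketch below illustrates this numerically under simplifying assumptions that are not from the paper: the sensors are reduced to the landmark positions and the gravity vector as seen from the body frame, and all names (`measurements`, `leaves_measurements_unchanged`) are hypothetical.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # world z axis points up

def rot_z(psi):
    """Rotation about the world z axis (yaw)."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(phi):
    """Rotation about the world x axis (roll)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def measurements(R_wb, p_wb, landmarks_w):
    """Quantities the body can sense: landmarks and gravity in the body frame."""
    landmarks_b = (landmarks_w - p_wb) @ R_wb  # each row is R_wb^T (p_f - p_wb)
    gravity_b = R_wb.T @ GRAVITY
    return landmarks_b, gravity_b

def leaves_measurements_unchanged(R_g, t_g):
    """Apply a global transform (R_g, t_g) to the pose and the map, then compare."""
    R_wb = rot_x(0.3) @ rot_z(0.7)
    p_wb = np.array([1.0, 2.0, 3.0])
    landmarks_w = np.array([[4.0, 0.0, 1.0],
                            [2.0, -3.0, 5.0],
                            [0.0, 6.0, -1.0]])
    before = measurements(R_wb, p_wb, landmarks_w)
    after = measurements(R_g @ R_wb, R_g @ p_wb + t_g, landmarks_w @ R_g.T + t_g)
    return all(np.allclose(b, a) for b, a in zip(before, after))

# Yaw plus an arbitrary translation cannot be detected from the measurements.
yaw_and_shift = leaves_measurements_unchanged(rot_z(0.9), np.array([5.0, -2.0, 1.0]))
# A roll changes how gravity appears in the body frame, so it can be detected.
roll_only = leaves_measurements_unchanged(rot_x(0.2), np.zeros(3))
```

Here `yaw_and_shift` evaluates to true and `roll_only` to false. Since 3D landmark positions in the body frame already contain the bearings a camera would observe, the same invariance holds for image measurements.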

One potential limitation is the computational complexity of fusing data from multiple IMUs and cameras. The authors report reducing this complexity by leveraging IMU-IMU rigid-body constraints, which eliminate the need to estimate separate poses for the auxiliary inertial sensors, but the scalability of the approach for larger sensor arrays could still be an issue.
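The reason a rigid-body constraint removes the auxiliary poses can be illustrated with the textbook lever-arm relation: once the mounting rotation and offset are known, what an auxiliary IMU should measure follows from the base IMU's motion alone. The sketch below assumes ideal, bias-free, time-synchronized sensors and is not the paper's full measurement model; `predict_aux_imu` and its parameter names are hypothetical.

```python
import numpy as np

def predict_aux_imu(omega_b, alpha_b, accel_b, R_ab, p_b_a):
    """Predict auxiliary IMU readings from base IMU motion on a rigid body.

    omega_b: angular velocity in the base frame (rad/s)
    alpha_b: angular acceleration in the base frame (rad/s^2)
    accel_b: specific force measured at the base IMU (m/s^2)
    R_ab:    rotation taking base-frame vectors into the auxiliary frame
    p_b_a:   position of the auxiliary IMU expressed in the base frame (m)
    """
    # Every point of a rigid body shares the same angular velocity.
    omega_a = R_ab @ omega_b
    # Extra acceleration at the offset point: tangential plus centripetal terms.
    lever = np.cross(alpha_b, p_b_a) + np.cross(omega_b, np.cross(omega_b, p_b_a))
    accel_a = R_ab @ (accel_b + lever)
    return omega_a, accel_a

# Usage: rig spinning at 2 rad/s about z, auxiliary IMU mounted 0.5 m along x.
omega_a, accel_a = predict_aux_imu(
    omega_b=np.array([0.0, 0.0, 2.0]),
    alpha_b=np.zeros(3),
    accel_b=np.array([0.0, 0.0, 9.81]),
    R_ab=np.eye(3),
    p_b_a=np.array([0.5, 0.0, 0.0]),
)
```

Because the auxiliary readings are a function of the base IMU state plus the fixed extrinsics `R_ab` and `p_b_a`, an estimator only needs to carry the base pose and those extrinsics in its state rather than one pose per inertial sensor.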

Additionally, the real-world experiments were conducted using the researchers' own custom MVIS sensor rig. It would be valuable to see the performance of their methods on a wider range of hardware setups, including commercial off-the-shelf components, to assess the generalizability of the results.

Further research could also explore the use of event-based cameras in MVIS, which could provide additional benefits in terms of temporal resolution and robustness to motion blur.

Conclusion

This paper presents a comprehensive framework for state estimation in multi-visual-inertial systems. The researchers developed novel sensor fusion algorithms and calibration methods to optimally combine data from multiple asynchronous IMUs and cameras. Their work includes important theoretical contributions, such as observability analysis, as well as experimental validation on real-world datasets.

The proposed techniques could have significant practical applications in fields like robotics, augmented reality, and autonomous vehicles, where accurate and robust motion tracking is crucial. By open-sourcing their work, the researchers aim to benefit the wider research community and drive further advancements in this area.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


Multi-Visual-Inertial System: Analysis, Calibration and Estimation

Yulin Yang, Patrick Geneva, Guoquan Huang

In this paper, we study state estimation of multi-visual-inertial systems (MVIS) and develop sensor fusion algorithms to optimally fuse an arbitrary number of asynchronous inertial measurement units (IMUs) or gyroscopes and global and/or rolling-shutter cameras. We are especially interested in the full calibration of the associated visual-inertial sensors, including the IMU or camera intrinsics and the IMU-IMU (or camera) spatiotemporal extrinsics as well as the image readout time of rolling-shutter cameras (if used). To this end, we develop a new analytic combined IMU integration with intrinsics, termed ACI3, to preintegrate IMU measurements, which is leveraged to fuse auxiliary IMUs and/or gyroscopes alongside a base IMU. We model the multi-inertial measurements to include all the necessary inertial intrinsic and IMU-IMU spatiotemporal extrinsic parameters, while leveraging IMU-IMU rigid-body constraints to eliminate the necessity of auxiliary inertial poses and thus reducing computational complexity. By performing observability analysis of MVIS, we prove that the standard four unobservable directions remain, no matter how many inertial sensors are used, and also identify, for the first time, degenerate motions for IMU-IMU spatiotemporal extrinsics and auxiliary inertial intrinsics. In addition to the extensive simulations that validate our analysis and algorithms, we have built our own MVIS sensor rig and collected over 25 real-world datasets to experimentally verify the proposed calibration against the state-of-the-art calibration method such as Kalibr. We show that the proposed MVIS calibration is able to achieve competing accuracy with improved convergence and repeatability, which is open sourced to better benefit the community.

9/4/2024


MAVIS: Multi-Camera Augmented Visual-Inertial SLAM using SE2(3) Based Exact IMU Pre-integration

Yifu Wang, Yonhon Ng, Inkyu Sa, Alvaro Parra, Cristian Rodriguez, Tao Jun Lin, Hongdong Li

We present a novel optimization-based Visual-Inertial SLAM system designed for multiple partially overlapped camera systems, named MAVIS. Our framework fully exploits the benefits of wide field-of-view from multi-camera systems, and the metric scale measurements provided by an inertial measurement unit (IMU). We introduce an improved IMU pre-integration formulation based on the exponential function of an automorphism of SE_2(3), which can effectively enhance tracking performance under fast rotational motion and extended integration time. Furthermore, we extend conventional front-end tracking and back-end optimization module designed for monocular or stereo setup towards multi-camera systems, and introduce implementation details that contribute to the performance of our system in challenging scenarios. The practical validity of our approach is supported by our experiments on public datasets. Our MAVIS won the first place in all the vision-IMU tracks (single and multi-session SLAM) on Hilti SLAM Challenge 2023 with 1.7 times the score compared to the second place.

7/17/2024


VINS-Multi: A Robust Asynchronous Multi-camera-IMU State Estimator

Luqi Wang, Yang Xu, Shaojie Shen

State estimation is a critical foundational module in robotics applications, where robustness and performance are paramount. Although in recent years, many works have been focusing on improving one of the most widely adopted state estimation methods, visual inertial odometry (VIO), by incorporating multiple cameras, these efforts predominantly address synchronous camera systems. Asynchronous cameras, which offer simpler hardware configurations and enhanced resilience, have been largely overlooked. To fill this gap, this paper presents VINS-Multi, a novel multi-camera-IMU state estimator for asynchronous cameras. The estimator comprises parallel front ends, a front end coordinator, and a back end optimization module capable of handling asynchronous input frames. It utilizes the frames effectively through a dynamic feature number allocation and a frame priority coordination strategy. The proposed estimator is integrated into a customized quadrotor platform and tested in multiple realistic and challenging scenarios to validate its practicality. Additionally, comprehensive benchmark results are provided to showcase the robustness and superior performance of the proposed estimator.

5/24/2024

Pose, Velocity and Landmark Position Estimation Using IMU and Bearing Measurements

Miaomiao Wang, Abdelhamid Tayebi

This paper investigates the estimation problem of the pose (orientation and position) and linear velocity of a rigid body, as well as the landmark positions, using an inertial measurement unit (IMU) and a monocular camera. First, we propose a globally exponentially stable (GES) linear time-varying (LTV) observer for the estimation of body-frame landmark positions and velocity, using IMU and monocular bearing measurements. Thereafter, using the gyro measurements, some landmarks known in the inertial frame and the estimates from the LTV observer, we propose a nonlinear pose observer on $SO(3)\times \mathbb{R}^3$. The overall estimation system is shown to be almost globally asymptotically stable (AGAS) using the notion of almost global input-to-state stability (ISS). Interestingly, we show that with the knowledge (in the inertial frame) of a small number of landmarks, we can recover (under some conditions) the unknown positions (in the inertial frame) of a large number of landmarks. Numerical simulation results are presented to illustrate the performance of the proposed estimation scheme.

7/26/2024