High-throughput Visual Nano-drone to Nano-drone Relative Localization using Onboard Fully Convolutional Networks

2402.13756

Published 4/3/2024 by Luca Crupi, Alessandro Giusti, Daniele Palossi

Abstract

Relative drone-to-drone localization is a fundamental building block for any swarm operations. We address this task in the context of miniaturized nano-drones, i.e., 10cm in diameter, which show an ever-growing interest due to novel use cases enabled by their reduced form factor. The price for their versatility comes with limited onboard resources, i.e., sensors, processing units, and memory, which limits the complexity of the onboard algorithms. A traditional solution to overcome these limitations is represented by lightweight deep learning models directly deployed aboard nano-drones. This work tackles the challenging relative pose estimation between nano-drones using only a gray-scale low-resolution camera and an ultra-low-power System-on-Chip (SoC) hosted onboard. We present a vertically integrated system based on a novel vision-based fully convolutional neural network (FCNN), which runs at 39Hz within 101mW onboard a Crazyflie nano-drone extended with the GWT GAP8 SoC. We compare our FCNN against three State-of-the-Art (SoA) systems. Considering the best-performing SoA approach, our model results in an R-squared improvement from 32 to 47% on the horizontal image coordinate and from 18 to 55% on the vertical image coordinate, on a real-world dataset of 30k images. Finally, our in-field tests show a reduction of the average tracking error of 37% compared to a previous SoA work and an endurance performance up to the entire battery lifetime of 4 minutes.

Overview

This paper presents a novel approach for high-throughput visual localization between nano-drones using onboard fully convolutional networks.
The authors developed a system that lets a nano-drone estimate the relative pose of another nano-drone in real time, using only a gray-scale low-resolution onboard camera and a fully convolutional neural network running on an ultra-low-power SoC.
This enables nano-drones to autonomously coordinate and collaborate without the need for external positioning systems like GPS.

Plain English Explanation

The researchers in this study wanted to find a way for tiny drones, called nano-drones, to figure out where they are in relation to each other using only the cameras on the drones themselves. This is hard because nano-drones are only about 10 cm in diameter, so they carry very limited sensors, processing power, and memory, which restricts how complex their onboard algorithms can be.

The key idea is to use a lightweight type of artificial intelligence called a fully convolutional neural network. This neural network learns from many example images of what one drone sees when another drone is in view. Once trained, it can analyze each frame from the drone's camera onboard, at 39 frames per second, and estimate where the other drone is relative to the camera.

This allows the nano-drones to autonomously coordinate and work together, even in complex environments where GPS doesn't work well, like indoors or in cities. The drones can now keep track of each other's locations in real-time using only their onboard cameras and the deep learning algorithms, without needing any external positioning systems.

Technical Explanation

The paper proposes a system for high-throughput visual localization between nano-drones using fully convolutional networks. The key technical components include:

Dataset: The abstract reports results on a real-world dataset of 30k images. It does not describe how the training data was collected.
Architecture: A novel vision-based fully convolutional neural network (FCNN) takes frames from a gray-scale low-resolution camera and estimates the relative pose of the other nano-drone. The reported regression metrics cover the horizontal and vertical image coordinates.
Evaluation: The FCNN is compared against three State-of-the-Art (SoA) systems on the real-world dataset and in in-field tracking tests.
Deployment: The model runs onboard a Crazyflie nano-drone extended with the GWT GAP8 SoC, at 39 Hz within 101 mW, without any external infrastructure.
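
As a rough illustration of how the output of a fully convolutional network can be turned into an image position, the sketch below takes a small 2D map of per-cell scores and computes its score-weighted centroid. The map size, the `decode_position` helper, and the centroid decoding itself are assumptions made for illustration; the abstract does not describe the paper's actual output format.

```python
from typing import List, Tuple


def decode_position(score_map: List[List[float]]) -> Tuple[float, float]:
    """Return the (u, v) score-weighted centroid of a 2D map, normalized to [0, 1].

    u is the horizontal coordinate (along columns), v the vertical one (along rows).
    Hypothetical helper for illustration, not the decoding used in the paper.
    """
    rows = len(score_map)
    cols = len(score_map[0])
    total = sum(sum(row) for row in score_map)
    if total <= 0:
        raise ValueError("score map must contain positive mass")
    # Use cell centers (index + 0.5) so a single active cell maps to its midpoint.
    u = sum(s * (c + 0.5) for row in score_map for c, s in enumerate(row)) / (total * cols)
    v = sum(s * (r + 0.5) for r, row in enumerate(score_map) for s in row) / (total * rows)
    return u, v


# Toy 4x4 map with all the mass in row 1, column 2.
toy_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
u, v = decode_position(toy_map)
print(f"u={u:.3f}, v={v:.3f}")
```

A dense output map like this keeps the network free of fully connected layers, which is one common reason to prefer fully convolutional designs on memory-constrained hardware.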

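According to the abstract, the FCNN runs at 39 Hz within 101 mW onboard. Against the best-performing of the three SoA systems, it raises R-squared from 32% to 47% on the horizontal image coordinate and from 18% to 55% on the vertical image coordinate, measured on a real-world dataset of 30k images. In-field tests show a 37% reduction in average tracking error compared to a previous SoA work, with continuous operation up to the full 4-minute battery lifetime.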
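
The abstract reports its regression results as R-squared, the coefficient of determination. A minimal sketch of the standard definition, R² = 1 − SS_res / SS_tot, is shown below. The `r_squared` helper and the sample values are made up for illustration and are not taken from the paper.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Illustrative normalized horizontal coordinates (not real data).
truth = [0.1, 0.4, 0.6, 0.9]
preds = [0.2, 0.4, 0.5, 0.9]
score = r_squared(truth, preds)
print(f"R^2 = {score:.3f}")
```

An R² of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean of the ground truth, so the reported 47% and 55% sit between those two extremes.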

Critical Analysis

The paper presents a compelling technical solution to a challenging problem in nano-drone coordination. However, a few caveats and areas for further research are worth noting:

The evaluation relies on a single real-world dataset of 30k images, which leaves open how well the model generalizes to environments with different lighting, textures, and occlusions. The best reported R-squared values, 47% and 55%, also leave a large share of the variance unexplained.
The abstract does not discuss robustness to sensor noise or to communication failures between drones, which will be important for real-world deployment. Endurance is demonstrated only up to the 4-minute battery lifetime.
The network runs at 39 Hz within 101 mW, but the limited onboard sensors, processing units, and memory still constrain how complex the deployed algorithms can be.
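
To put the onboard figures from the abstract (39 Hz, 101 mW) in perspective, the short calculation below derives the time and energy available per frame. It assumes that 101 mW is the average power drawn while inference runs continuously at 39 Hz, which the abstract does not state explicitly. The `per_frame_budget` helper is hypothetical.

```python
from typing import Tuple


def per_frame_budget(throughput_hz: float, power_mw: float) -> Tuple[float, float]:
    """Return (frame period in ms, energy per frame in mJ).

    Assumes power_mw is the average power during continuous inference.
    """
    if throughput_hz <= 0:
        raise ValueError("throughput must be positive")
    period_ms = 1000.0 / throughput_hz
    # mW divided by Hz gives mJ: (1e-3 J/s) / (1/s) = 1e-3 J.
    energy_mj = power_mw / throughput_hz
    return period_ms, energy_mj


period_ms, energy_mj = per_frame_budget(throughput_hz=39.0, power_mw=101.0)
print(f"{period_ms:.1f} ms per frame, {energy_mj:.2f} mJ per frame")
```

Under that assumption, each inference has roughly 25.6 ms and about 2.6 mJ available.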

Overall, this research represents an important step forward in enabling autonomous nano-drone swarms. However, further work is needed to address the practical challenges of deploying such systems in complex real-world environments.

Conclusion

This paper presents a novel deep learning-based approach for high-throughput visual localization between nano-drones. By deploying a fully convolutional neural network on an ultra-low-power SoC, the researchers built a system that lets a nano-drone estimate the relative pose of another nano-drone using only a gray-scale low-resolution onboard camera.

This result supports autonomous coordination between nano-drones without external positioning infrastructure like GPS. As nano-drone technologies continue to advance, the work is a meaningful contribution toward capable, self-organizing drone swarms, although the reported results cover tracking between nano-drones over a 4-minute battery lifetime rather than full swarm operation.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

C2FDrone: Coarse-to-Fine Drone-to-Drone Detection using Vision Transformer Networks

Sairam VC Rebbapragada, Pranoy Panda, Vineeth N Balasubramanian

A vision-based drone-to-drone detection system is crucial for various applications like collision avoidance, countering hostile drones, and search-and-rescue operations. However, detecting drones presents unique challenges, including small object sizes, distortion, occlusion, and real-time processing requirements. Current methods integrating multi-scale feature fusion and temporal information have limitations in handling extreme blur and minuscule objects. To address this, we propose a novel coarse-to-fine detection strategy based on vision transformers. We evaluate our approach on three challenging drone-to-drone detection datasets, achieving F1 score enhancements of 7%, 3%, and 1% on the FL-Drones, AOT, and NPS-Drones datasets, respectively. Additionally, we demonstrate real-time processing capabilities by deploying our model on an edge-computing device. Our code will be made publicly available.

5/1/2024

cs.CV

An Attention-Based Deep Learning Architecture for Real-Time Monocular Visual Odometry: Applications to GPS-free Drone Navigation

Olivier Brochu Dufour, Abolfazl Mohebbi, Sofiane Achiche

Drones are increasingly used in fields like industry, medicine, research, disaster relief, defense, and security. Technical challenges, such as navigation in GPS-denied environments, hinder further adoption. Research in visual odometry is advancing, potentially solving GPS-free navigation issues. Traditional visual odometry methods use geometry-based pipelines which, while popular, often suffer from error accumulation and high computational demands. Recent studies utilizing deep neural networks (DNNs) have shown improved performance, addressing these drawbacks. Deep visual odometry typically employs convolutional neural networks (CNNs) and sequence modeling networks like recurrent neural networks (RNNs) to interpret scenes and deduce visual odometry from video sequences. This paper presents a novel real-time monocular visual odometry model for drones, using a deep neural architecture with a self-attention module. It estimates the ego-motion of a camera on a drone, using consecutive video frames. An inference utility processes the live video feed, employing deep learning to estimate the drone's trajectory. The architecture combines a CNN for image feature extraction and a long short-term memory (LSTM) network with a multi-head attention module for video sequence modeling. Tested on two visual odometry datasets, this model converged 48% faster than a previous RNN model and showed a 22% reduction in mean translational drift and a 12% improvement in mean translational absolute trajectory error, demonstrating enhanced robustness to noise.

4/30/2024

cs.RO cs.CV cs.LG eess.IV

Fusing Multi-sensor Input with State Information on TinyML Brains for Autonomous Nano-drones

Luca Crupi, Elia Cereda, Daniele Palossi

Autonomous nano-drones (~10 cm in diameter), thanks to their ultra-low power TinyML-based brains, are capable of coping with real-world environments. However, due to their simplified sensors and compute units, they are still far from the sense-and-act capabilities shown in their bigger counterparts. This system paper presents a novel deep learning-based pipeline that fuses multi-sensorial input (i.e., low-resolution images and 8x8 depth map) with the robot's state information to tackle a human pose estimation task. Thanks to our design, the proposed system -- trained in simulation and tested on a real-world dataset -- improves a state-unaware State-of-the-Art baseline by increasing the R^2 regression metric up to 0.10 on the distance's prediction.

4/4/2024

cs.RO

Leveraging edge detection and neural networks for better UAV localization

Theo Di Piazza, Enric Meinhardt-Llopis, Gabriele Facciolo, Benedicte Bascle, Corentin Abgrall, Jean-Clement Devaux

We propose a novel method for geolocalizing Unmanned Aerial Vehicles (UAVs) in environments lacking Global Navigation Satellite Systems (GNSS). Current state-of-the-art techniques employ an offline-trained encoder to generate a vector representation (embedding) of the UAV's current view, which is then compared with pre-computed embeddings of geo-referenced images to determine the UAV's position. Here, we demonstrate that the performance of these methods can be significantly enhanced by preprocessing the images to extract their edges, which exhibit robustness to seasonal and illumination variations. Furthermore, we establish that utilizing edges enhances resilience to orientation and altitude inaccuracies. Additionally, we introduce a confidence criterion for localization. Our findings are substantiated through synthetic experiments.

6/4/2024

cs.CV