Distilling Tiny and Ultra-fast Deep Neural Networks for Autonomous Navigation on Nano-UAVs

Read original: arXiv:2407.12675 - Published 7/18/2024 by Lorenzo Lamberti, Lorenzo Bellone, Luka Macan, Enrico Natalizio, Francesco Conti, Daniele Palossi, Luca Benini

Overview

This paper presents a method for distilling tiny and ultra-fast deep neural networks for autonomous navigation on nano-UAVs (Unmanned Aerial Vehicles).
The key challenges addressed are the limited computational resources and power constraints of nano-UAVs, which make it difficult to run complex deep learning models for tasks like navigation.
The proposed approach involves training a larger "teacher" model and then distilling its knowledge into a smaller, more efficient "student" model that can run on the nano-UAV hardware.

Plain English Explanation

Nano-UAVs are very small drones that have limited computing power and battery life. Traditional deep learning models for tasks like navigation are often too large and complex to run efficiently on these devices. The researchers in this paper developed a way to "distill" the knowledge from a larger, more powerful deep learning model into a much smaller and faster model that can run on nano-UAVs.

The key idea is to first train a large "teacher" model that is good at the navigation task, and then use a technique called "knowledge distillation" to transfer the essential information from the teacher model into a smaller "student" model. This student model is optimized to be tiny and ultra-fast, so it can be deployed on the constrained hardware of a nano-UAV. By leveraging the knowledge from the larger teacher model, the student model can still perform well on the navigation task, even though it is much simpler and more efficient.

This approach allows nano-UAVs to benefit from deep learning for autonomous navigation without requiring the computational resources of larger drones or ground-based robots. The paper reports distilled CNNs with a memory footprint as small as 2.9 kB and inference rates of up to 139 frame/s, evaluated in the field on a commercially available nano-UAV.

Technical Explanation

The key technical contributions are:

Teacher-Student Framework: The researchers first train a larger "teacher" model that is capable of performing the navigation task well. They then use a knowledge distillation technique to transfer the essential knowledge from the teacher model into a smaller "student" model.
Model Compression and Acceleration: The student model is designed to be extremely compact and efficient, with a focus on minimizing the number of parameters and computational operations required. This is achieved through techniques like network pruning, weight quantization, and architecture search.
Optimization for Nano-UAV Hardware: The researchers carefully optimize the student model to run efficiently on the limited computing resources and power constraints of nano-UAV platforms. This includes hardware-aware design choices and careful balancing of model size, latency, and accuracy.
Evaluation on a New Dataset and In-field Tests: The authors collect a new open-source dataset of 66 k images with unified collision and steering labels, and they analyze both PULP-Dronet and the tiny CNNs running on a commercially available nano-UAV. The tiniest model, Tiny-PULP-Dronet v3, completes a never-seen-before path (a narrow obstacle-populated corridor followed by a 180° turn) with a 100% success rate at a maximum target speed of 0.5 m/s, a scenario in which PULP-Dronet consistently fails.
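
The teacher-student idea above can be sketched as a loss function. This is a minimal illustration, not the paper's training recipe: it assumes each network outputs a steering value and a collision probability, and it assumes the student is trained on a weighted sum of a ground-truth term and a teacher-matching term. The names `navigation_loss`, `distillation_loss`, and the weight `alpha` are hypothetical.

```python
import math


def bce(p: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy between a predicted probability and a (soft) target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def navigation_loss(pred: tuple, target: tuple) -> float:
    """Squared error on steering plus cross-entropy on collision probability.

    Both arguments are (steering, collision_probability) pairs.
    This pairing of losses is an assumption made for illustration.
    """
    steer_p, coll_p = pred
    steer_t, coll_t = target
    return (steer_p - steer_t) ** 2 + bce(coll_p, coll_t)


def distillation_loss(student: tuple, teacher: tuple, label: tuple,
                      alpha: float = 0.5) -> float:
    """Blend the ground-truth loss with a term that matches the teacher.

    alpha = 1.0 ignores the teacher; alpha = 0.0 ignores the labels.
    """
    hard = navigation_loss(student, label)    # student vs. ground truth
    soft = navigation_loss(student, teacher)  # student vs. teacher output
    return alpha * hard + (1.0 - alpha) * soft


if __name__ == "__main__":
    teacher_out = (0.12, 0.85)  # hypothetical teacher prediction
    label = (0.10, 1.0)         # hypothetical ground truth
    for student_out in [(0.10, 0.90), (0.80, 0.20)]:
        print(student_out, distillation_loss(student_out, teacher_out, label))
```

A student output close to both the label and the teacher yields a lower loss than one far from both, which is the signal that pushes the small network toward the teacher's behavior during training.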

Critical Analysis

The paper presents a well-designed solution to a practical problem in the field of nano-UAV autonomy. The researchers have clearly identified the key challenges of limited computational resources and power constraints, and have developed a novel approach to address them.

One potential limitation is that the work focuses solely on navigation (steering and collision avoidance) and does not consider other capabilities that nano-UAVs may require, such as object detection or environmental sensing. While the distilled models perform well on the navigation task, it would be interesting to see how they could be extended to handle a broader range of tasks.

Additionally, the paper does not provide much insight into the trade-offs between model size, latency, and accuracy. It would be helpful to see a more detailed analysis of how these factors are balanced in the design of the student models, and how they may vary depending on the specific hardware and deployment constraints of different nano-UAV platforms.
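
As a rough illustration of the latency side of this trade-off, the sketch below converts the frame rates quoted in the paper's abstract (19 frame/s for PULP-Dronet, up to 139 frame/s for the distilled CNNs) into a per-frame period and the distance flown between two consecutive inferences at the 0.5 m/s target speed. Pairing the peak frame rate with that speed is an assumption made for illustration, the function names are hypothetical, and camera, control-loop, and actuation delays are ignored.

```python
def frame_period_ms(fps: float) -> float:
    """Time between two consecutive inferences, in milliseconds."""
    return 1000.0 / fps


def distance_per_frame_mm(speed_m_s: float, fps: float) -> float:
    """Distance travelled between two consecutive inferences, in millimetres."""
    return speed_m_s / fps * 1000.0


TARGET_SPEED_M_S = 0.5  # maximum target speed quoted in the abstract

# Frame rates quoted in the abstract; labels are for display only.
for name, fps in (("PULP-Dronet", 19.0), ("distilled CNN (peak rate)", 139.0)):
    period = frame_period_ms(fps)
    distance = distance_per_frame_mm(TARGET_SPEED_M_S, fps)
    print(f"{name}: {period:.2f} ms per frame, {distance:.2f} mm flown per frame")
```

Under these assumptions, the faster network sees a new image after the drone has moved only a few millimetres, whereas the slower one acts on information that is several times staler.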

Overall, the research presented in this paper is a significant contribution to the field of embedded AI and nano-UAV autonomy. The ability to run high-performance deep learning models on resource-constrained devices is a key enabler for the widespread adoption of these technologies in real-world applications.

Conclusion

This paper presents a novel approach for distilling tiny and ultra-fast deep neural networks for autonomous navigation on nano-UAVs. By leveraging a teacher-student framework and advanced model compression techniques, the researchers have demonstrated the ability to transfer the essential knowledge from a larger, more capable model into a highly efficient student model that can run in real-time on the limited hardware of nano-UAVs.

The key significance of this work is its potential to unlock the power of deep learning for a wide range of nano-UAV applications, where the computational constraints have previously been a major barrier. The distilled models developed in this research could enable nano-UAVs to perform advanced tasks like navigation, object detection, and environmental sensing, opening up new possibilities for applications in areas such as search and rescue, infrastructure inspection, and environmental monitoring.

As nano-UAV technology continues to evolve, the techniques presented in this paper will become increasingly important for ensuring that these small, low-power devices can still benefit from the latest advancements in artificial intelligence and deep learning.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Distilling Tiny and Ultra-fast Deep Neural Networks for Autonomous Navigation on Nano-UAVs

Lorenzo Lamberti, Lorenzo Bellone, Luka Macan, Enrico Natalizio, Francesco Conti, Daniele Palossi, Luca Benini

Nano-sized unmanned aerial vehicles (UAVs) are ideal candidates for flying Internet-of-Things smart sensors to collect information in narrow spaces. This requires ultra-fast navigation under very tight memory/computation constraints. The PULP-Dronet convolutional neural network (CNN) enables autonomous navigation running aboard a nano-UAV at 19 frame/s, at the cost of a large memory footprint of 320 kB -- and with drone control in complex scenarios hindered by the disjoint training of collision avoidance and steering capabilities. In this work, we distill a novel family of CNNs with better capabilities than PULP-Dronet, but memory footprint reduced by up to 168x (down to 2.9 kB), achieving an inference rate of up to 139 frame/s; we collect a new open-source unified collision/steering 66 k images dataset for more robust navigation; and we perform a thorough in-field analysis of both PULP-Dronet and our tiny CNNs running on a commercially available nano-UAV. Our tiniest CNN, called Tiny-PULP-Dronet v3, navigates with a 100% success rate a challenging and never-seen-before path, composed of a narrow obstacle-populated corridor and a 180° turn, at a maximum target speed of 0.5 m/s. In the same scenario, the SoA PULP-Dronet consistently fails despite having 168x more parameters.

7/18/2024

A Deep Learning-based Pest Insect Monitoring System for Ultra-low Power Pocket-sized Drones

Luca Crupi, Luca Butera, Alberto Ferrante, Daniele Palossi

Smart farming and precision agriculture represent game-changer technologies for efficient and sustainable agribusiness. Miniaturized palm-sized drones can act as flexible smart sensors inspecting crops, looking for early signs of potential pest outbreaking. However, achieving such an ambitious goal requires hardware-software codesign to develop accurate deep learning (DL) detection models while keeping memory and computational needs under an ultra-tight budget, i.e., a few MB on-chip memory and a few 100s mW power envelope. This work presents a novel vertically integrated solution featuring two ultra-low power System-on-Chips (SoCs), i.e., the dual-core STM32H74 and a multi-core GWT GAP9, running two State-of-the-Art DL models for detecting the Popillia japonica bug. We fine-tune both models for our image-based detection task, quantize them in 8-bit integers, and deploy them on the two SoCs. On the STM32H74, we deploy a FOMO-MobileNetV2 model, achieving a mean average precision (mAP) of 0.66 and running at 16.1 frame/s within 498 mW. While on the GAP9 SoC, we deploy a more complex SSDLite-MobileNetV3, which scores an mAP of 0.79 and peaks at 6.8 frame/s within 33 mW. Compared to a top-notch RetinaNet-ResNet101-FPN full-precision baseline, which requires 14.9x more memory and 300x more operations per inference, our best model drops only 15% in mAP, paving the way toward autonomous palm-sized drones capable of lightweight and precise pest detection.

7/2/2024

High-throughput Visual Nano-drone to Nano-drone Relative Localization using Onboard Fully Convolutional Networks

Luca Crupi, Alessandro Giusti, Daniele Palossi

Relative drone-to-drone localization is a fundamental building block for any swarm operations. We address this task in the context of miniaturized nano-drones, i.e., 10cm in diameter, which show an ever-growing interest due to novel use cases enabled by their reduced form factor. The price for their versatility comes with limited onboard resources, i.e., sensors, processing units, and memory, which limits the complexity of the onboard algorithms. A traditional solution to overcome these limitations is represented by lightweight deep learning models directly deployed aboard nano-drones. This work tackles the challenging relative pose estimation between nano-drones using only a gray-scale low-resolution camera and an ultra-low-power System-on-Chip (SoC) hosted onboard. We present a vertically integrated system based on a novel vision-based fully convolutional neural network (FCNN), which runs at 39Hz within 101mW onboard a Crazyflie nano-drone extended with the GWT GAP8 SoC. We compare our FCNN against three State-of-the-Art (SoA) systems. Considering the best-performing SoA approach, our model results in an R-squared improvement from 32 to 47% on the horizontal image coordinate and from 18 to 55% on the vertical image coordinate, on a real-world dataset of 30k images. Finally, our in-field tests show a reduction of the average tracking error of 37% compared to a previous SoA work and an endurance performance up to the entire battery lifetime of 4 minutes.

4/3/2024

Training on the Fly: On-device Self-supervised Learning aboard Nano-drones within 20 mW

Elia Cereda, Alessandro Giusti, Daniele Palossi

Miniaturized cyber-physical systems (CPSes) powered by tiny machine learning (TinyML), such as nano-drones, are becoming an increasingly attractive technology. Their small form factor (i.e., ~10cm diameter) ensures vast applicability, ranging from the exploration of narrow disaster scenarios to safe human-robot interaction. Simple electronics make these CPSes inexpensive, but strongly limit the computational, memory, and sensing resources available on board. In real-world applications, these limitations are further exacerbated by domain shift. This fundamental machine learning problem implies that model perception performance drops when moving from the training domain to a different deployment one. To cope with and mitigate this general problem, we present a novel on-device fine-tuning approach that relies only on the limited ultra-low power resources available aboard nano-drones. Then, to overcome the lack of ground-truth training labels aboard our CPS, we also employ a self-supervised method based on ego-motion consistency. Albeit our work builds on top of a specific real-world vision-based human pose estimation task, it is widely applicable for many embedded TinyML use cases. Our 512-image on-device training procedure is fully deployed aboard an ultra-low power GWT GAP9 System-on-Chip and requires only 1MB of memory while consuming as low as 19mW or running in just 510ms (at 38mW). Finally, we demonstrate the benefits of our on-device learning approach by field-testing our closed-loop CPS, showing a reduction in horizontal position error of up to 26% vs. a non-fine-tuned state-of-the-art baseline. In the most challenging never-seen-before environment, our on-device learning procedure makes the difference between succeeding or failing the mission.

8/7/2024