YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers

Read original: arXiv:2409.02334 - Published 9/5/2024 by Sourav Raxit, Simant Bahadur Singh, Abdullah Al Redwan Newaz

Overview

This paper presents YoloTag, a real-time, vision-based UAV localization and navigation system built on fiducial markers.
The key contributions are fiducial marker detection with a lightweight YOLO v8 detector, UAV state estimation with a perspective-n-point algorithm, and a higher-order Butterworth filter that suppresses noise in the estimates.
According to the abstract, the system is evaluated through real-robot experiments in an indoor environment, where its trajectory tracking is compared against other approaches.

Plain English Explanation

The paper focuses on developing a vision-based navigation system for unmanned aerial vehicles (UAVs) that uses fiducial markers. Fiducial markers are physical markers that can be easily detected and tracked by cameras, enabling precise localization and pose estimation of the UAV.

The key idea is to use deep learning-based object detection to quickly and accurately identify the fiducial markers in the camera's field of view. This allows the UAV to determine its position and orientation relative to the markers, which can then be used for real-time navigation and control.

The authors evaluate their system, called YoloTag, through real-robot experiments in an indoor environment, comparing its trajectory tracking against other approaches. Because raw marker-based estimates are noisy and can destabilize trajectory tracking, the system also filters them to suppress that noise. This kind of marker-based localization could support applications such as infrastructure inspection and autonomous navigation in spaces shared with people.

Technical Explanation

The paper first describes the real-time fiducial marker detection component of YoloTag, which is based on YOLO (You Only Look Once), a widely used family of object detection models. The authors use a lightweight YOLO v8 detector to find fiducial markers in camera images accurately while meeting the runtime constraints that navigation requires.

The pose estimation module then uses the detected markers to calculate the UAV's position and orientation in 3D space. It applies a perspective-n-point (PnP) algorithm, which recovers the camera pose from the correspondence between the known 3D geometry of the markers and their observed 2D locations in the image.
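To illustrate the geometry that marker-based pose estimation relies on, the sketch below handles a deliberately simplified case: a square marker that faces the camera squarely, seen through a pinhole camera with known intrinsics and no lens distortion. It is not a perspective-n-point solver and not the authors' implementation; the function names, marker size, and camera parameters are all hypothetical.

```python
import math


def project_square_marker(center_xyz, side_m, fx, fy, cx, cy):
    """Project the 4 corners of a fronto-parallel square marker to pixels.

    center_xyz is the marker centre in the camera frame (metres, Z forward).
    fx, fy, cx, cy are assumed pinhole intrinsics (pixels).
    """
    x_c, y_c, z_c = center_xyz
    half = side_m / 2.0
    corners_3d = [
        (x_c - half, y_c - half, z_c),  # top-left
        (x_c + half, y_c - half, z_c),  # top-right
        (x_c + half, y_c + half, z_c),  # bottom-right
        (x_c - half, y_c + half, z_c),  # bottom-left
    ]
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in corners_3d]


def estimate_marker_position(corners_px, side_m, fx, fy, cx, cy):
    """Recover the marker centre in the camera frame from its 4 pixel corners.

    Assumes the marker plane is parallel to the image plane, so the apparent
    edge length in pixels is fx * side_m / Z.
    """
    (u0, v0), (u1, v1), (u2, v2), (u3, v3) = corners_px
    # Average the top and bottom edge lengths to reduce sensitivity to noise.
    top = math.hypot(u1 - u0, v1 - v0)
    bottom = math.hypot(u2 - u3, v2 - v3)
    width_px = (top + bottom) / 2.0
    z = fx * side_m / width_px
    # Back-project the centroid of the corners at the recovered depth.
    u_c = (u0 + u1 + u2 + u3) / 4.0
    v_c = (v0 + v1 + v2 + v3) / 4.0
    return ((u_c - cx) * z / fx, (v_c - cy) * z / fy, z)


# Round trip with made-up values: a 16 cm marker, 2 m in front of the camera.
corners = project_square_marker((0.2, -0.1, 2.0), 0.16, 600.0, 600.0, 320.0, 240.0)
position = estimate_marker_position(corners, 0.16, 600.0, 600.0, 320.0, 240.0)
```

A full PnP solver drops the fronto-parallel assumption and also recovers orientation, which is why it needs the complete set of 2D-3D corner correspondences rather than just an apparent width.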

Because this localization is noisy and destabilizes trajectory tracking, the authors design a higher-order Butterworth filter, chosen through frequency-domain analysis, to suppress the noise. They evaluate YoloTag through real-robot experiments in an indoor environment, comparing its trajectory tracking performance against other approaches using several distance metrics.
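The paper's abstract describes suppressing noise in the pose estimates with a Butterworth filter. As a rough illustration of what such a filter does, here is a minimal sketch of a second-order Butterworth low-pass filter derived with the bilinear transform. The filter order, cutoff frequency, and sample rate used by the authors are not given in this summary, so the values and function names below are assumptions for illustration only.

```python
import math


def butterworth_lowpass_coeffs(cutoff_hz, sample_hz):
    """Second-order Butterworth low-pass coefficients (bilinear transform).

    Returns ((b0, b1, b2), (a1, a2)) for the difference equation
    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2].
    """
    k = math.tan(math.pi * cutoff_hz / sample_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    return (b0, 2.0 * b0, b0), (a1, a2)


def lowpass(samples, cutoff_hz, sample_hz):
    """Filter a sequence of scalar samples, starting from a zero state."""
    (b0, b1, b2), (a1, a2) = butterworth_lowpass_coeffs(cutoff_hz, sample_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Hypothetical setup: 30 Hz pose estimates, 2 Hz cutoff.
steady = lowpass([1.0] * 200, cutoff_hz=2.0, sample_hz=30.0)
jitter = lowpass([(-1.0) ** n for n in range(200)], cutoff_hz=2.0, sample_hz=30.0)
```

In this sketch a constant position passes through unchanged once the filter settles, while sample-to-sample jitter at the Nyquist frequency is removed. In practice each pose coordinate would be filtered separately, and a low-pass filter adds some lag, so the cutoff trades smoothness against responsiveness.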

Critical Analysis

The paper addresses a practical gap in UAV navigation: marker detectors built on handcrafted features sacrifice accuracy, while earlier deep learning pipelines for marker detection are too slow for real-time use. Pairing a lightweight detector with perspective-n-point pose estimation and filtering is a pragmatic way to close that gap.

However, the evaluation is reported for an indoor environment, so the range and accuracy of pose estimation under harsher conditions, and the effect of marker placement or density on overall performance, remain open questions. The abstract says YoloTag is compared against other approaches but does not name them, so it is unclear from the abstract alone how strong those baselines are.

Further research could explore the integration of YoloTag with other sensor modalities, such as LiDAR or GPS, to enhance the overall robustness and reliability of the UAV navigation system. Additionally, investigating the scalability of the approach to larger environments or more complex marker layouts would be valuable.

Conclusion

The YoloTag system presented in this paper demonstrates a promising vision-based approach to UAV navigation using fiducial markers. A lightweight YOLO v8 detector, perspective-n-point pose estimation, and Butterworth filtering together provide real-time localization for trajectory tracking, which the authors validate on a real robot in an indoor environment.

The work addresses the trade-off between accuracy and runtime in marker-based visual navigation. Further research on the system's limitations and on integration with other sensors could enhance its capabilities and broaden its applicability.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers

Sourav Raxit, Simant Bahadur Singh, Abdullah Al Redwan Newaz

By harnessing fiducial markers as visual landmarks in the environment, Unmanned Aerial Vehicles (UAVs) can rapidly build precise maps and navigate spaces safely and efficiently, unlocking their potential for fluent collaboration and coexistence with humans. Existing fiducial marker methods rely on handcrafted feature extraction, which sacrifices accuracy. On the other hand, deep learning pipelines for marker detection fail to meet real-time runtime constraints crucial for navigation applications. In this work, we propose YoloTag, a real-time fiducial marker-based localization system. YoloTag uses a lightweight YOLO v8 object detector to accurately detect fiducial markers in images while meeting the runtime constraints needed for navigation. The detected markers are then used by an efficient perspective-n-point algorithm to estimate UAV states. However, this localization system introduces noise, causing instability in trajectory tracking. To suppress noise, we design a higher-order Butterworth filter that effectively eliminates noise through frequency domain analysis. We evaluate our algorithm through real-robot experiments in an indoor environment, comparing the trajectory tracking performance of our method against other approaches in terms of several distance metrics.

9/5/2024

Leveraging edge detection and neural networks for better UAV localization

Theo Di Piazza, Enric Meinhardt-Llopis, Gabriele Facciolo, Benedicte Bascle, Corentin Abgrall, Jean-Clement Devaux

We propose a novel method for geolocalizing Unmanned Aerial Vehicles (UAVs) in environments lacking Global Navigation Satellite Systems (GNSS). Current state-of-the-art techniques employ an offline-trained encoder to generate a vector representation (embedding) of the UAV's current view, which is then compared with pre-computed embeddings of geo-referenced images to determine the UAV's position. Here, we demonstrate that the performance of these methods can be significantly enhanced by preprocessing the images to extract their edges, which exhibit robustness to seasonal and illumination variations. Furthermore, we establish that utilizing edges enhances resilience to orientation and altitude inaccuracies. Additionally, we introduce a confidence criterion for localization. Our findings are substantiated through synthetic experiments.

6/4/2024

Long-Range Vision-Based UAV-assisted Localization for Unmanned Surface Vehicles

Waseem Akram, Siyuan Yang, Hailiang Kuang, Xiaoyu He, Muhayy Ud Din, Yihao Dong, Defu Lin, Lakmal Seneviratne, Shaoming He, Irfan Hussain

The global positioning system (GPS) has become an indispensable navigation method for field operations with unmanned surface vehicles (USVs) in marine environments. However, GPS may not always be available outdoors because it is vulnerable to natural interference and malicious jamming attacks. Thus, an alternative navigation system is required when the use of GPS is restricted or prohibited. To this end, we present a novel method that utilizes an Unmanned Aerial Vehicle (UAV) to assist in localizing USVs in GNSS-restricted marine environments. In our approach, the UAV flies along the shoreline at a consistent altitude, continuously tracking and detecting the USV using a deep learning-based approach on camera images. Subsequently, triangulation techniques are applied to estimate the USV's position relative to the UAV, utilizing geometric information and datalink range from the UAV. We propose adjusting the UAV's camera angle based on the pixel error between the USV and the image center throughout the localization process to enhance accuracy. Additionally, visual measurements are integrated into an Extended Kalman Filter (EKF) for robust state estimation. To validate our proposed method, we utilize a USV equipped with onboard sensors and a UAV equipped with a camera. A heterogeneous robotic interface is established to facilitate communication between the USV and UAV. We demonstrate the efficacy of our approach through a series of experiments conducted during the "Muhammad Bin Zayed International Robotic Challenge (MBZIRC-2024)" in real marine environments, incorporating noisy measurements and ocean disturbances. The successful outcomes indicate the potential of our method to complement GPS for USV navigation.

8/22/2024

Fiducial Tag Localization on a 3D LiDAR Prior Map

Yibo Liu, Jinjun Shan, Hunter Schofield

The LiDAR fiducial tag, akin to the well-known AprilTag used in camera applications, serves as a convenient resource to impart artificial features to the LiDAR sensor, facilitating robotics applications. Unfortunately, the existing LiDAR fiducial tag localization methods do not apply to 3D LiDAR maps while resolving this problem is beneficial to LiDAR-based relocalization and navigation. In this paper, we develop a novel approach to directly localize fiducial tags on a 3D LiDAR prior map, returning the tag poses (labeled by ID number) and vertex locations (labeled by index) w.r.t. the global coordinate system of the map. In particular, considering that fiducial tags are thin sheet objects indistinguishable from the attached planes, we design a new pipeline that gradually analyzes the 3D point cloud of the map from the intensity and geometry perspectives, extracting potential tag-containing point clusters. Then, we introduce an intermediate-plane-based method to further check if each potential cluster has a tag and compute the vertex locations and tag pose if found. We conduct both qualitative and quantitative experiments to demonstrate that our approach is the first method applicable to localize tags on a 3D LiDAR map while achieving better accuracy compared to previous methods. The open-source implementation of this work is available at: https://github.com/York-SDCNLab/Marker-Detection-General.

6/6/2024