UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization

Read original: arXiv:2405.11936 - Published 5/21/2024 by Wenjia Xu, Yaxuan Yao, Jiaqi Cao, Zhiwei Wei, Chunbo Liu, Jiuniu Wang, Mugen Peng

Overview

The paper presents a large-scale dataset called UAV-VisLoc for visual localization of unmanned aerial vehicles (UAVs)
The dataset contains 6,742 drone images and 11 satellite maps from 11 locations in China, with metadata such as latitude, longitude, altitude, and capture date
The authors benchmark several state-of-the-art visual localization methods on this dataset, providing insights into their performance and limitations

Plain English Explanation

The researchers have created a new dataset called UAV-VisLoc that can be used to train and test AI systems for visual localization on unmanned aerial vehicles (UAVs). Visual localization is the process of determining the exact position and orientation of a camera based on the images it captures.

The UAV-VisLoc dataset contains 6,742 drone images and 11 satellite maps from 11 locations in China, along with metadata such as latitude, longitude, altitude, and capture date for each image. This allows AI models to be trained to predict a UAV's location by matching its downward-looking camera images against a satellite map, which matters most when satellite navigation (GNSS) is disrupted or unreliable.

The researchers then tested several state-of-the-art visual localization methods on this dataset. This helped them understand how well these AI models perform in real-world UAV scenarios, and identify areas where the technology still has room for improvement. By making this large, diverse dataset publicly available, the researchers hope to accelerate progress in the field of UAV visual localization.

Technical Explanation

The paper introduces UAV-VisLoc, a large-scale dataset for UAV visual localization. The key aspects of the dataset and the technical evaluation are:

Data Collection: The images were captured by a variety of drones, including fixed-wing and multi-terrain drones, flying at different altitudes and orientations. The flights covered 11 locations in China with a range of topographical features.
Dataset Characteristics: The final dataset contains 6,742 drone images and 11 satellite maps, with metadata such as latitude, longitude, altitude, and capture date. The images vary in viewpoint, altitude, and scene content.
Benchmark Tasks: The researchers evaluated the performance of several state-of-the-art visual localization methods on the UAV-VisLoc dataset (related work linked from this summary: leveraging-edge-detection-neural-networks-better-uav and openstreetview-5m-many-roads-to-global-visual).
Evaluation Metrics: The localization accuracy was measured using the median position and orientation errors, as well as the percentage of poses estimated within certain error thresholds.
Insights: The benchmark results provided insights into the strengths and limitations of existing visual localization approaches for UAV applications, informing future research directions.
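The position-error metrics described above can be illustrated with a short sketch. This is not the paper's evaluation code: the function names, the thresholds, and the use of the haversine great-circle distance are all assumptions made for illustration. It covers only position error computed from latitude and longitude, not orientation error.

```python
import math
from statistics import median

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def localization_report(predicted, truth, thresholds_m=(10.0, 50.0, 100.0)):
    """Return (median error in meters, {threshold: fraction of predictions within it}).

    `predicted` and `truth` are equal-length sequences of (lat, lon) pairs.
    The default thresholds are illustrative, not taken from the paper.
    """
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("predicted and truth must be non-empty and the same length")
    errors = [haversine_m(p[0], p[1], t[0], t[1]) for p, t in zip(predicted, truth)]
    within = {thr: sum(e <= thr for e in errors) / len(errors) for thr in thresholds_m}
    return median(errors), within


if __name__ == "__main__":
    # Hypothetical predictions and ground truth for three images.
    pred = [(30.0, 120.0), (30.0, 120.0), (30.0, 120.0)]
    gt = [(30.0, 120.0), (30.0001, 120.0), (30.001, 120.0)]
    med, within = localization_report(pred, gt)
    print(f"median error: {med:.1f} m")
    for thr, frac in within.items():
        print(f"within {thr:.0f} m: {frac:.0%}")
```

Reporting the median alongside threshold fractions is a common choice because a few badly mislocalized images would dominate a mean, while the threshold fractions show how often the estimate is usable at a given tolerance.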

Critical Analysis

The UAV-VisLoc dataset represents a valuable contribution to the field of UAV visual localization. By providing a large-scale, diverse dataset with precise ground truth, the authors enable more robust evaluation and development of AI models for this task.

However, the paper acknowledges several limitations and areas for further research:

The dataset is focused on outdoor environments, and may not capture the full range of challenges posed by indoor or complex urban settings.
The benchmark only considers static localization, and does not evaluate methods for active visual localization with multi-agent collaboration or for omnidirectional (360-degree) visual localization scenarios.
The dataset does not include additional sensor modalities, such as depth information or semantic annotations, which could further improve localization performance.

Future work could explore expanding the dataset to address these limitations, as well as investigating more advanced localization methods that can leverage the rich data provided by the UAV-VisLoc collection.

Conclusion

The UAV-VisLoc dataset represents a significant step forward in the development of visual localization for unmanned aerial vehicles. By providing a large-scale, diverse dataset with precise ground truth, the authors have enabled more robust benchmarking and advancement of AI-powered UAV localization techniques.

The insights gained from the benchmark experiments can inform future research directions, helping to address the remaining challenges and limitations in this critical field. As AI-powered UAVs continue to grow in importance for applications ranging from surveying to search and rescue, datasets like UAV-VisLoc will play a crucial role in unlocking their full potential.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization

Wenjia Xu, Yaxuan Yao, Jiaqi Cao, Zhiwei Wei, Chunbo Liu, Jiuniu Wang, Mugen Peng

The application of unmanned aerial vehicles (UAV) has been widely extended recently. It is crucial to ensure accurate latitude and longitude coordinates for UAVs, especially when the global navigation satellite systems (GNSS) are disrupted and unreliable. Existing visual localization methods achieve autonomous visual localization without error accumulation by matching the ground-down view image of UAV with the ortho satellite maps. However, collecting UAV ground-down view images across diverse locations is costly, leading to a scarcity of large-scale datasets for real-world scenarios. Existing datasets for UAV visual localization are often limited to small geographic areas or are focused only on urban regions with distinct textures. To address this, we define the UAV visual localization task by determining the UAV's real position coordinates on a large-scale satellite map based on the captured ground-down view. In this paper, we present a large-scale dataset, UAV-VisLoc, to facilitate the UAV visual localization task. This dataset comprises images from diverse drones across 11 locations in China, capturing a range of topographical features. The dataset features images from fixed-wing drones and multi-terrain drones, captured at different altitudes and orientations. Our dataset includes 6,742 drone images and 11 satellite maps, with metadata such as latitude, longitude, altitude, and capture date. Our dataset is tailored to support both the training and testing of models by providing a diverse and extensive data.

5/21/2024

Game4Loc: A UAV Geo-Localization Benchmark from Game Data

Yuxiang Ji, Boyong He, Zhuoyue Tan, Liaoni Wu

The vision-based geo-localization technology for UAV, serving as a secondary source of GPS information in addition to the global navigation satellite systems (GNSS), can still operate independently in the GPS-denied environment. Recent deep learning based methods attribute this as the task of image matching and retrieval. By retrieving drone-view images in geo-tagged satellite image database, approximate localization information can be obtained. However, due to high costs and privacy concerns, it is usually difficult to obtain large quantities of drone-view images from a continuous area. Existing drone-view datasets are mostly composed of small-scale aerial photography with a strong assumption that there exists a perfect one-to-one aligned reference image for any query, leaving a significant gap from the practical localization scenario. In this work, we construct a large-range contiguous area UAV geo-localization dataset named GTA-UAV, featuring multiple flight altitudes, attitudes, scenes, and targets using modern computer games. Based on this dataset, we introduce a more practical UAV geo-localization task including partial matches of cross-view paired data, and expand the image-level retrieval to the actual localization in terms of distance (meters). For the construction of drone-view and satellite-view pairs, we adopt a weight-based contrastive learning approach, which allows for effective learning while avoiding additional post-processing matching steps. Experiments demonstrate the effectiveness of our data and training method for UAV geo-localization, as well as the generalization capabilities to real-world scenarios.

9/26/2024

Long-Range Vision-Based UAV-assisted Localization for Unmanned Surface Vehicles

Waseem Akram, Siyuan Yang, Hailiang Kuang, Xiaoyu He, Muhayy Ud Din, Yihao Dong, Defu Lin, Lakmal Seneviratne, Shaoming He, Irfan Hussain

The global positioning system (GPS) has become an indispensable navigation method for field operations with unmanned surface vehicles (USVs) in marine environments. However, GPS may not always be available outdoors because it is vulnerable to natural interference and malicious jamming attacks. Thus, an alternative navigation system is required when the use of GPS is restricted or prohibited. To this end, we present a novel method that utilizes an Unmanned Aerial Vehicle (UAV) to assist in localizing USVs in GNSS-restricted marine environments. In our approach, the UAV flies along the shoreline at a consistent altitude, continuously tracking and detecting the USV using a deep learning-based approach on camera images. Subsequently, triangulation techniques are applied to estimate the USV's position relative to the UAV, utilizing geometric information and datalink range from the UAV. We propose adjusting the UAV's camera angle based on the pixel error between the USV and the image center throughout the localization process to enhance accuracy. Additionally, visual measurements are integrated into an Extended Kalman Filter (EKF) for robust state estimation. To validate our proposed method, we utilize a USV equipped with onboard sensors and a UAV equipped with a camera. A heterogeneous robotic interface is established to facilitate communication between the USV and UAV. We demonstrate the efficacy of our approach through a series of experiments conducted during the "Muhammad Bin Zayed International Robotic Challenge (MBZIRC-2024)" in real marine environments, incorporating noisy measurements and ocean disturbances. The successful outcomes indicate the potential of our method to complement GPS for USV navigation.

8/22/2024

JointLoc: A Real-time Visual Localization Framework for Planetary UAVs Based on Joint Relative and Absolute Pose Estimation

Xubo Luo, Xue Wan, Yixing Gao, Yaolin Tian, Wei Zhang, Leizheng Shu

Unmanned aerial vehicles (UAVs) visual localization in planetary aims to estimate the absolute pose of the UAV in the world coordinate system through satellite maps and images captured by on-board cameras. However, since planetary scenes often lack significant landmarks and there are modal differences between satellite maps and UAV images, the accuracy and real-time performance of UAV positioning will be reduced. In order to accurately determine the position of the UAV in a planetary scene in the absence of the global navigation satellite system (GNSS), this paper proposes JointLoc, which estimates the real-time UAV position in the world coordinate system by adaptively fusing the absolute 2-degree-of-freedom (2-DoF) pose and the relative 6-degree-of-freedom (6-DoF) pose. Extensive comparative experiments were conducted on a proposed planetary UAV image cross-modal localization dataset, which contains three types of typical Martian topography generated via a simulation engine as well as real Martian UAV images from the Ingenuity helicopter. JointLoc achieved a root-mean-square error of 0.237m in the trajectories of up to 1,000m, compared to 0.594m and 0.557m for ORB-SLAM2 and ORB-SLAM3 respectively. The source code will be available at https://github.com/LuoXubo/JointLoc.

5/14/2024