M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and RGB Data

Read original: arXiv:2406.04230 - Published 6/7/2024 by Matthew J Allen, Francisco Dorr, Joseph Alejandro Gallego Mejia, Laura Martínez-Ferrer, Anna Jungbluth, Freddie Kalaitzis, Raúl Ramos-Pollán

Overview

This paper introduces M3LEO, a multi-modal, multi-label Earth observation dataset that combines polarimetric, interferometric, and coherence Synthetic Aperture Radar (SAR) data derived from Sentinel-1 with Sentinel-2 RGB imagery.
The dataset is designed to support research on tasks like land cover classification, object detection, and semantic segmentation in remote sensing applications.
M3LEO spans 17.5TB and contains approximately 10M data chips across six geographic regions, covering a range of climate zones and land cover types, and pairs the imagery with a suite of labelled tasks for model evaluation.

Plain English Explanation

M3LEO is a new dataset that combines two types of satellite imagery: interferometric SAR and RGB color images. Interferometric SAR compares radar signals from repeated passes over the same area to measure small changes in the Earth's surface, while RGB images capture visible light. Because radar is an active sensing technique, it keeps working at night and in adverse weather, when optical imagery is ineffective. By bringing these two modalities together, M3LEO provides a rich dataset for training AI models on remote sensing tasks such as classifying land cover types, detecting specific objects, and segmenting images into semantic regions.

The dataset covers a wide range of geographic locations, climates, and land cover types, with annotations for many different categories. This diversity and labeling make M3LEO a useful resource for developing and evaluating AI systems for Earth observation, and for studying cross-sensor alignment and transfer learning in remote sensing applications.

Technical Explanation

M3LEO is a multi-modal, multi-label Earth observation dataset that combines SAR data derived from Sentinel-1 with Sentinel-2 RGB imagery. It is accompanied by a PyTorch Lightning framework, with configuration managed through Hydra, and by tools for processing datasets hosted on platforms such as Google Earth Engine so they can be integrated with that framework.

The interferometric SAR data in M3LEO provides information about the 3D structure and changes of the Earth's surface, while the RGB imagery captures the visual appearance of the land cover. By integrating these two modalities, the dataset enables research on tasks like land cover classification, object detection, and semantic segmentation in remote sensing applications.
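
As a rough illustration of what coherence and interferometric products encode, the sketch below estimates coherence magnitude and interferometric phase from two co-registered complex SAR acquisitions using the standard sample estimator. It is not taken from the M3LEO code: the function names and toy inputs are assumptions made for this example, and real pipelines apply the estimator over sliding 2D windows of Sentinel-1 data rather than a short list of samples.

```python
import cmath
import math


def _cross_product(primary, secondary):
    """Sum of s1 * conj(s2) over a window of co-registered complex samples."""
    if len(primary) != len(secondary) or not primary:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(a * b.conjugate() for a, b in zip(primary, secondary))


def coherence(primary, secondary):
    """Coherence magnitude in [0, 1]; values near 1 mean the scene barely changed."""
    cross = _cross_product(primary, secondary)
    # Normalise by the power of each acquisition over the same window.
    power_1 = sum(abs(a) ** 2 for a in primary)
    power_2 = sum(abs(b) ** 2 for b in secondary)
    denom = math.sqrt(power_1 * power_2)
    return abs(cross) / denom if denom > 0 else 0.0


def interferometric_phase(primary, secondary):
    """Phase (radians) of the averaged interferogram s1 * conj(s2)."""
    return cmath.phase(_cross_product(primary, secondary))


if __name__ == "__main__":
    # Toy data: the second acquisition is the first with a uniform phase shift,
    # so coherence should be approximately 1.0.
    first = [1 + 2j, -0.5 + 1j, 2 - 1j, 0.3 + 0.4j]
    second = [s * cmath.exp(0.5j) for s in first]
    print(coherence(first, second), interferometric_phase(first, second))
```

Low coherence between two passes usually indicates that the surface changed or decorrelated (vegetation, water, construction), which is one reason SAR can carry information that a single RGB image does not.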

The imagery is paired with a suite of labelled tasks for model evaluation. These labels support the development of multi-task learning and cross-modal integration approaches for Earth observation tasks.

Critical Analysis

The authors of the M3LEO dataset have made a significant contribution to the field of remote sensing by providing a high-quality, multi-modal dataset that integrates interferometric SAR and RGB data. The comprehensive nature of the annotations and the diversity of the included geographic regions and land cover types make M3LEO a valuable resource for researchers and practitioners in this domain.

One potential limitation of the dataset is the availability of SAR data, which can be more challenging to obtain and process than RGB imagery. The authors acknowledge this and suggest that future work could explore ways to leverage synthetic SAR data or self-supervised pretraining techniques to address this challenge.

Additionally, while the dataset covers a wide range of land cover types, there may be some geographic regions or specific classes that are underrepresented. Researchers should be mindful of potential biases in the data and carefully evaluate the performance of their models across different subsets of the dataset.
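
One simple way to act on this advice is to report a metric per subset rather than a single global number. The sketch below computes accuracy per group for parallel lists of group names, true labels, and predictions. The function name and the region names in the usage example are hypothetical and are not M3LEO identifiers.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Return {group: accuracy} for parallel sequences of equal length."""
    if not (len(groups) == len(y_true) == len(y_pred)):
        raise ValueError("groups, y_true and y_pred must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in zip(groups, y_true, y_pred):
        total[group] += 1
        correct[group] += int(truth == pred)
    # Only groups that appear at least once are present, so no division by zero.
    return {group: correct[group] / total[group] for group in total}


if __name__ == "__main__":
    # Hypothetical region names and labels, for illustration only.
    regions = ["region_a", "region_a", "region_b", "region_b"]
    truth = ["forest", "water", "forest", "urban"]
    preds = ["forest", "forest", "forest", "urban"]
    for region, acc in accuracy_by_group(regions, truth, preds).items():
        print(f"{region}: {acc:.2f}")
```

A large gap between groups suggests that a region or class is underrepresented in training, even when the aggregate score looks acceptable.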

Conclusion

The M3LEO dataset represents a significant advancement in the field of Earth observation, providing a comprehensive, multi-modal, and multi-label dataset that can support a wide range of remote sensing applications. By integrating interferometric SAR and RGB data, the dataset enables the development of more robust and accurate AI models for tasks like land cover classification, object detection, and semantic segmentation.

The labelled tasks in M3LEO also open up research opportunities in areas like multi-task learning, cross-modal integration, and efficient model design, and they give researchers a common basis for studying cross-sensor alignment and transfer learning in remote sensing applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and RGB Data

Matthew J Allen, Francisco Dorr, Joseph Alejandro Gallego Mejia, Laura Martínez-Ferrer, Anna Jungbluth, Freddie Kalaitzis, Raúl Ramos-Pollán

Satellite-based remote sensing has revolutionised the way we address global challenges in a rapidly evolving world. Huge quantities of Earth Observation (EO) data are generated by satellite sensors daily, but processing these large datasets for use in ML pipelines is technically and computationally challenging. Specifically, different types of EO data are often hosted on a variety of platforms, with differing availability for Python preprocessing tools. In addition, spatial alignment across data sources and data tiling can present significant technical hurdles for novice users. While some preprocessed EO datasets exist, their content is often limited to optical or near-optical wavelength data, which is ineffective at night or in adverse weather conditions. Synthetic Aperture Radar (SAR), an active sensing technique based on microwave length radiation, offers a viable alternative. However, the application of machine learning to SAR has been limited due to a lack of ML-ready data and pipelines, particularly for the full diversity of SAR data, including polarimetry, coherence and interferometry. We introduce M3LEO, a multi-modal, multi-label EO dataset that includes polarimetric, interferometric, and coherence SAR data derived from Sentinel-1, alongside Sentinel-2 RGB imagery and a suite of labelled tasks for model evaluation. M3LEO spans 17.5TB and contains approximately 10M data chips across six geographic regions. The dataset is complemented by a flexible PyTorch Lightning framework, with configuration management using Hydra. We provide tools to process any dataset available on popular platforms such as Google Earth Engine for integration with our framework. Initial experiments validate the utility of our data and framework, showing that SAR imagery contains information additional to that extractable from RGB data. Data at huggingface.co/M3LEO, and code at github.com/spaceml-org/M3LEO.

6/7/2024

Unlocking the Use of Raw Multispectral Earth Observation Imagery for Onboard Artificial Intelligence

Gabriele Meoni, Roberto Del Prete, Federico Serva, Alix De Beussche, Olivier Colin, Nicolas Longépé

Nowadays, there is growing interest in applying Artificial Intelligence (AI) on board Earth Observation (EO) satellites for time-critical applications, such as natural disaster response. However, the unavailability of raw satellite data currently hinders research on lightweight pre-processing techniques and limits the exploration of end-to-end pipelines, which could offer more efficient and accurate extraction of insights directly from the source data. To fill this gap, this work presents a novel methodology to automate the creation of datasets for the detection of target events (e.g., warm thermal hotspots) or objects (e.g., vessels) from Sentinel-2 raw data and other multispectral EO pushbroom raw imagery. The presented approach first processes the raw data by applying a pipeline consisting of spatial band registration and georeferencing of the raw data pixels. Then, it detects the target events by leveraging event-specific state-of-the-art algorithms on the Level-1C products, which are mosaicked and cropped on the georeferenced correspondent raw granule area. The detected events are finally re-projected back onto the corresponding raw images. We apply the proposed methodology to realize THRawS (Thermal Hotspots in Raw Sentinel-2 data), the first dataset of Sentinel-2 raw data containing warm thermal hotspots. THRawS includes 1090 samples containing wildfires, volcanic eruptions, and 33,335 event-free acquisitions to enable thermal hotspot detection and general classification applications. This dataset and associated toolkits provide the community with both an immediately useful resource as well as a framework and methodology acting as a template for future additions. With this work, we hope to pave the way for research on energy-efficient pre-processing algorithms and AI-based end-to-end processing systems on board EO satellites.

9/11/2024

MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning

Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke, Serge Belongie, Christian Igel, Nico Lang

The volume of unlabelled Earth observation (EO) data is huge, but many important applications lack labelled training data. However, EO data offers the unique opportunity to pair data from different modalities and sensors automatically based on geographic location and time, at virtually no human labor cost. We seize this opportunity to create MMEarth, a diverse multi-modal pretraining dataset at global scale. Using this new corpus of 1.2 million locations, we propose a Multi-Pretext Masked Autoencoder (MP-MAE) approach to learn general-purpose representations for optical satellite images. Our approach builds on the ConvNeXt V2 architecture, a fully convolutional masked autoencoder (MAE). Drawing upon a suite of multi-modal pretext tasks, we demonstrate that our MP-MAE approach outperforms both MAEs pretrained on ImageNet and MAEs pretrained on domain-specific satellite images. This is shown on several downstream tasks including image classification and semantic segmentation. We find that pretraining with multi-modal pretext tasks notably improves the linear probing performance compared to pretraining on optical satellite images only. This also leads to better label efficiency and parameter efficiency which are crucial aspects in global scale applications.

7/30/2024

OmniSat: Self-Supervised Modality Fusion for Earth Observation

Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu

The diversity and complementarity of sensors available for Earth Observations (EO) calls for developing bespoke self-supervised multimodal learning approaches. However, current multimodal EO datasets and models typically focus on a single data type, either mono-date images or time series, which limits their impact. To address this issue, we introduce OmniSat, a novel architecture able to merge diverse EO modalities into expressive features without labels by exploiting their alignment. To demonstrate the advantages of our approach, we create two new multimodal datasets by augmenting existing ones with new modalities. As demonstrated for three downstream tasks -- forestry, land cover classification, and crop mapping -- OmniSat can learn rich representations without supervision, leading to state-of-the-art performances in semi- and fully supervised settings. Furthermore, our multimodal pretraining scheme improves performance even when only one modality is available for inference. The code and dataset are available at https://github.com/gastruc/OmniSat.

7/18/2024