Kuro Siwo: 33 billion $m^2$ under the water. A global multi-temporal satellite dataset for rapid flood mapping

Read original: arXiv:2311.12056 - Published 6/11/2024 by Nikolaos Ioannis Bountos, Maria Sdraka, Angelos Zavras, Ilektra Karasante, Andreas Karavias, Themistocles Herekakis, Angeliki Thanasou, Dimitrios Michail, Ioannis Papoutsis


Overview

This paper introduces Kuro Siwo, a manually annotated, multi-temporal dataset spanning 43 flood events worldwide.
Kuro Siwo maps more than 338 billion $m^2$ of land, of which 33 billion $m^2$ are designated as either flooded areas or permanent water bodies.
The dataset includes both a processed product optimized for flood mapping, based on SAR Ground Range Detected (GRD) data, and a raw SAR Single Look Complex (SLC) product with minimal preprocessing.
The paper also provides an extensive benchmark, called BlackBench, offering strong baselines for a diverse set of flood events from Europe, America, Africa, Asia, and Australia.

Plain English Explanation

Floods, exacerbated by climate change, pose a severe threat to human life, infrastructure, and the environment. Recent catastrophic events in Pakistan and New Zealand have highlighted the urgent need for accurate flood mapping to guide restoration efforts, understand vulnerabilities, and prepare for future occurrences.

Satellite-based Synthetic Aperture Radar (SAR) technology can provide day-and-night, all-weather imaging capabilities that are well-suited for flood mapping. However, the lack of large, annotated datasets has limited the application of deep learning techniques for this task. To address this, the researchers have created a new dataset called Kuro Siwo, which contains manually annotated flood data from 43 events around the world.

Kuro Siwo covers more than 338 billion square meters of land, of which 33 billion square meters are identified as either flooded areas or permanent water bodies. The dataset provides both a processed product optimized for flood mapping and a raw SAR product with minimal preprocessing, allowing researchers to explore different ways of exploiting the phase and amplitude information in the data.

To further support research in this area, the researchers have also included a large set of unlabeled SAR samples and an extensive benchmark, called BlackBench, which offers strong baselines for a diverse set of flood events from across the globe. These resources are designed to help advance the use of deep learning for flood mapping and improve our ability to respond to and prepare for the impacts of climate change.

Technical Explanation

Kuro Siwo is a manually annotated, multi-temporal dataset spanning 43 flood events globally. It maps more than 338 billion $m^2$ of land, of which 33 billion $m^2$ are designated as either flooded areas or permanent water bodies.
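To put these figures on a more familiar scale, the short sketch below converts them to square kilometers and computes the share of the mapped area labeled as water. Both inputs are the rounded values quoted above ("more than 338 billion", "33 billion"), so the results are approximate; the function and constant names are illustrative only.

```python
# Figures quoted in the summary; both are rounded ("more than 338 billion").
TOTAL_MAPPED_M2 = 338e9   # total mapped land, in square meters
WATER_M2 = 33e9           # flooded areas plus permanent water bodies

M2_PER_KM2 = 1_000_000    # 1 km^2 = 1000 m x 1000 m


def to_km2(area_m2: float) -> float:
    """Convert an area from square meters to square kilometers."""
    return area_m2 / M2_PER_KM2


def water_fraction(water_m2: float, total_m2: float) -> float:
    """Share of the mapped area that is labeled as water."""
    return water_m2 / total_m2


if __name__ == "__main__":
    print(f"mapped: {to_km2(TOTAL_MAPPED_M2):,.0f} km^2")
    print(f"water:  {to_km2(WATER_M2):,.0f} km^2")
    print(f"share:  {water_fraction(WATER_M2, TOTAL_MAPPED_M2):.1%}")
```

With the rounded inputs, this gives roughly 338,000 km² mapped, 33,000 km² of water, and a water share just under 10%. A class balance that skewed toward non-water pixels is worth keeping in mind when choosing losses and metrics.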

Kuro Siwo includes two types of SAR data products: a highly processed product optimized for flood mapping, based on SAR Ground Range Detected (GRD) data, and a primal SAR Single Look Complex (SLC) product with minimal preprocessing. The latter is designed to promote research on exploiting both the phase and amplitude information in the data and to offer maximum flexibility for downstream task preprocessing.
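The difference between the two products comes down to what each pixel stores. A complex (SLC-style) sample carries both amplitude and phase, whereas a detected product keeps only the magnitude. The sketch below illustrates this with two made-up complex values; it is not the dataset's loading code, and the helper names and sample values are hypothetical.

```python
import cmath
import math
from typing import Tuple


def amplitude_and_phase(sample: complex) -> Tuple[float, float]:
    """Split one complex sample into amplitude and phase (radians)."""
    return abs(sample), cmath.phase(sample)


def intensity_db(sample: complex, floor: float = 1e-10) -> float:
    """Intensity |z|^2 in decibels; `floor` avoids taking log10 of zero."""
    return 10.0 * math.log10(max(abs(sample) ** 2, floor))


# Hypothetical samples: b is a rotated by 90 degrees (b = a * 1j),
# so they share an amplitude but differ in phase.
a = complex(3.0, 4.0)
b = complex(-4.0, 3.0)

amp_a, phase_a = amplitude_and_phase(a)
amp_b, phase_b = amplitude_and_phase(b)

if __name__ == "__main__":
    print(f"a: amplitude={amp_a:.2f}, phase={phase_a:.3f} rad, {intensity_db(a):.2f} dB")
    print(f"b: amplitude={amp_b:.2f}, phase={phase_b:.3f} rad, {intensity_db(b):.2f} dB")
```

Both samples have the same amplitude and therefore the same intensity in dB, so an amplitude-only product cannot tell them apart. The phase difference survives only in the complex representation, which is the information the SLC product is meant to make available.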

To leverage advances in large-scale self-supervised pretraining methods for remote sensing data, the researchers have augmented Kuro Siwo with a large unlabeled set of SAR samples. This allows the dataset to be used for pretraining models, which can then be fine-tuned on the annotated flood data.

In addition to the dataset, the paper also provides an extensive benchmark, namely BlackBench, offering strong baselines for a diverse set of flood events from Europe, America, Africa, Asia, and Australia. This benchmark is designed to support the development and evaluation of flood mapping algorithms, helping to drive progress in this important area of research.
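This summary does not say which metrics BlackBench reports. As an illustration of how a flood segmentation baseline is typically scored, the sketch below computes per-class intersection over union (IoU), a common choice for segmentation tasks. The class ids, the three-class layout, and the toy masks are assumptions made for this example, not the dataset's actual encoding.

```python
from typing import Dict, Iterable, Sequence

# Hypothetical label encoding; the dataset's actual class ids may differ.
CLASS_NAMES = {0: "no water", 1: "permanent water", 2: "flood"}


def per_class_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    class_ids: Iterable[int],
) -> Dict[int, float]:
    """IoU per class: pixels in both masks / pixels in either mask."""
    ids = list(class_ids)
    intersection = {c: 0 for c in ids}
    union = {c: 0 for c in ids}
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            for c in ids:
                in_pred, in_target = p == c, t == c
                if in_pred and in_target:
                    intersection[c] += 1
                if in_pred or in_target:
                    union[c] += 1
    # A class absent from both masks has an undefined IoU; report NaN.
    return {
        c: (intersection[c] / union[c] if union[c] else float("nan"))
        for c in ids
    }


# Toy 3x4 masks: the prediction misses part of the flooded region.
target = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 0],
]
pred = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 0, 2, 0],
]

scores = per_class_iou(pred, target, CLASS_NAMES)

if __name__ == "__main__":
    for class_id, iou in scores.items():
        print(f"{CLASS_NAMES[class_id]:>16}: IoU = {iou:.3f}")
```

Reporting the score per class rather than as a single pixel accuracy keeps the rarer water classes visible. In this toy case the flood class scores lowest even though most pixels are labeled correctly.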

Critical Analysis

The Kuro Siwo dataset and BlackBench benchmark represent a significant contribution to the field of flood mapping using SAR data. By providing a large, manually annotated dataset and a comprehensive evaluation framework, the researchers have addressed a key limitation in the existing literature, which has been hampered by the lack of high-quality training data.

However, the paper does acknowledge some potential limitations of the dataset. For example, the annotation process can be subjective, and there may be some inconsistencies or errors in the labeling. Additionally, the dataset may not capture the full range of flood characteristics and environmental conditions, as it is limited to the 43 events included in the study.

Further research could explore methods for improving the annotation process, perhaps through the use of crowdsourcing or automated techniques. Combining the Kuro Siwo dataset with other flood datasets could also help to broaden the range of flood events and environmental conditions represented, providing a more comprehensive resource for the research community.

Conclusion

The Kuro Siwo dataset and BlackBench benchmark are a significant step forward for deep learning-based flood mapping. By providing a large, manually annotated dataset and a comprehensive evaluation framework, the researchers give the community a common basis for training and comparing flood mapping models.

The potential applications of this work are wide-ranging, from guiding disaster response and recovery efforts to informing long-term planning and infrastructure development. As the impacts of climate change continue to intensify, accurate and reliable flood mapping will become increasingly crucial for protecting human lives and mitigating the economic and environmental damages caused by these events.



Related Papers

UrbanSARFloods: Sentinel-1 SLC-Based Benchmark Dataset for Urban and Open-Area Flood Mapping

Jie Zhao, Zhitong Xiong, Xiao Xiang Zhu

Due to its cloud-penetrating capability and independence from solar illumination, satellite Synthetic Aperture Radar (SAR) is the preferred data source for large-scale flood mapping, providing global coverage and including various land cover classes. However, most studies on large-scale SAR-derived flood mapping using deep learning algorithms have primarily focused on flooded open areas, utilizing available open-access datasets (e.g., Sen1Floods11) and with limited attention to urban floods. To address this gap, we introduce UrbanSARFloods, a floodwater dataset featuring pre-processed Sentinel-1 intensity data and interferometric coherence imagery acquired before and during flood events. It contains 8,879 $512 \times 512$ chips covering 807,500 $km^2$ across 20 land cover classes and 5 continents, spanning 18 flood events. We used UrbanSARFloods to benchmark existing state-of-the-art convolutional neural networks (CNNs) for segmenting open and urban flood areas. Our findings indicate that prevalent approaches, including the Weighted Cross-Entropy (WCE) loss and the application of transfer learning with pretrained models, fall short in overcoming the obstacles posed by imbalanced data and the constraints of a small training dataset. Urban flood detection remains challenging. Future research should explore strategies for addressing imbalanced data challenges and investigate transfer learning's potential for SAR-based large-scale flood mapping. Besides, expanding this dataset to include additional flood events holds promise for enhancing its utility and contributing to advancements in flood mapping techniques.

6/7/2024

BlessemFlood21: Advancing Flood Analysis with a High-Resolution Georeferenced Dataset for Humanitarian Aid Support

Vladyslav Polushko, Alexander Jenal, Jens Bongartz, Immanuel Weber, Damjan Hatic, Ronald Rosch, Thomas Marz, Markus Rauhut, Andreas Weinmann

Floods are an increasingly common global threat, causing emergencies and severe damage to infrastructure. During crises, organisations such as the World Food Programme use remotely sensed imagery, typically obtained through drones, for rapid situational analysis to plan life-saving actions. Computer Vision tools are needed to support task force experts on-site in the evaluation of the imagery to improve their efficiency and to allocate resources strategically. We introduce the BlessemFlood21 dataset to stimulate research on efficient flood detection tools. The imagery was acquired during the 2021 Erftstadt-Blessem flooding event and consists of high-resolution and georeferenced RGB-NIR images. In the resulting RGB dataset, the images are supplemented with detailed water masks, obtained via a semi-supervised human-in-the-loop technique, where in particular the NIR information is leveraged to classify pixels as either water or non-water. We evaluate our dataset by training and testing established Deep Learning models for semantic segmentation. With BlessemFlood21 we provide labeled high-resolution RGB data and a baseline for further development of algorithmic solutions tailored to flood detection in RGB imagery.

7/9/2024

Enabling Quick, Accurate Crowdsourced Annotation for Elevation-Aware Flood Extent Mapping

Landon Dyken, Saugat Adhikari, Pravin Poudel, Steve Petruzza, Da Yan, Will Usher, Sidharth Kumar

In order to assess damage and properly allocate relief efforts, mapping the extent of flood events is a necessary and important aspect of disaster management. In recent years, deep learning methods have evolved as an effective tool to quickly label high-resolution imagery and provide necessary flood extent mappings. These methods, though, require large amounts of annotated training data to create models that are accurate and robust to new flooded imagery. In this work, we provide FloodTrace, an application that enables effective crowdsourcing for flooded region annotation for machine learning training data, removing the requirement for annotation to be done solely by researchers. We accomplish this through two orthogonal methods within our application, informed by requirements from domain experts. First, we utilize elevation-guided annotation tools and 3D rendering to inform user annotation decisions with digital elevation model data, improving annotation accuracy. For this purpose, we provide a unique annotation method that uses topological data analysis to outperform the state-of-the-art elevation-guided annotation tool in efficiency. Second, we provide a framework for researchers to review aggregated crowdsourced annotations and correct inaccuracies using methods inspired by uncertainty visualization. We conducted a user study to confirm the application effectiveness in which 266 graduate students annotated high-resolution aerial imagery from Hurricane Matthew in North Carolina. Experimental results show the accuracy and efficiency benefits of our application apply even for untrained users. In addition, using our aggregation and correction framework, flood detection models trained on crowdsourced annotations were able to achieve performance equal to models trained on expert-labeled annotations, while requiring a fraction of the time on the part of the researcher.

8/13/2024