Estimating optical vegetation indices and biophysical variables for temperate forests with Sentinel-1 SAR data using machine learning techniques: A case study for Czechia

Read original: arXiv:2311.07537 - Published 8/28/2024 by Daniel Paluba, Bertrand Le Saux, Přemysl Stych

Overview

Optical vegetation indices (VIs) for monitoring forests are well established, but they can be limited by atmospheric effects such as clouds.
Synthetic aperture radar (SAR) data can provide consistent forest monitoring through clouds.
This study explores using SAR data and machine learning to estimate optical VIs for forests.

Plain English Explanation

Vegetation indices are measurements derived from satellite data that track the health and characteristics of vegetation, such as forests. Common examples include the Normalized Difference Vegetation Index (NDVI) and the Enhanced Vegetation Index (EVI).
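To make these two indices concrete, here is a minimal sketch of their standard textbook formulas. It is not taken from the paper: the function names are illustrative, the inputs are assumed to be surface reflectances scaled to 0-1, and the EVI coefficients (2.5, 6, 7.5, 1) are the commonly used defaults.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        raise ValueError("NIR + red reflectance must be non-zero")
    return (nir - red) / denom


def evi(nir: float, red: float, blue: float) -> float:
    """Enhanced Vegetation Index with the commonly used default coefficients."""
    # G = 2.5, C1 = 6, C2 = 7.5, L = 1 (assumed standard values, not from the paper)
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Illustrative reflectances for a dense green canopy (made-up values)
print(round(ndvi(nir=0.5, red=0.1), 3))
print(round(evi(nir=0.5, red=0.1, blue=0.05), 3))
```

Both indices rise as near-infrared reflectance increases relative to red, which is why they depend on cloud-free optical imagery.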

However, these vegetation indices can be limited by cloud cover and other atmospheric effects that interfere with the satellite data. In contrast, radar satellites can see through clouds and take images day or night, providing more consistent data.

The researchers in this study wanted to see if they could use the radar satellite data, together with machine learning, to accurately estimate the common vegetation indices for forests. This could provide a way to monitor forests that is less affected by clouds and weather.

The researchers focused on healthy and disturbed temperate forests in Czechia, using data from the European Sentinel-1 radar satellite together with other datasets such as terrain and weather information. They were able to estimate four key vegetation indices (Leaf Area Index (LAI), Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), EVI, and NDVI) with high accuracy when compared against reference values derived from Sentinel-2 optical data.

This SAR-based approach to estimating vegetation indices could provide a powerful way to continuously monitor forest health and changes, even in cloudy conditions. The researchers were able to detect abrupt forest changes on a sub-weekly timescale, which is difficult to achieve with traditional optical satellite data.

Technical Explanation

The researchers created a paired multi-modal time series dataset in Google Earth Engine, including temporally and spatially aligned data from the Sentinel-1 radar satellite, the Sentinel-2 multispectral satellite, a digital elevation model, weather data, and land cover information. This allowed them to train and test machine learning models to estimate four key vegetation indices - LAI, FAPAR, EVI, and NDVI - using only the Sentinel-1 SAR data and auxiliary features.
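The summary does not say how the temporal alignment was carried out, so the following is only a hypothetical sketch of one simple way to pair acquisitions: each Sentinel-1 date is matched to the nearest Sentinel-2 date, and the pair is kept only if the gap is within a tolerance. The function name `pair_acquisitions` and the `max_gap_days` parameter are assumptions made for illustration, not the authors' Google Earth Engine workflow.

```python
from datetime import date
from typing import List, Tuple


def pair_acquisitions(
    s1_dates: List[date],
    s2_dates: List[date],
    max_gap_days: int = 1,  # assumed tolerance, not a value from the paper
) -> List[Tuple[date, date]]:
    """Pair each SAR date with the nearest optical date within a tolerance."""
    pairs = []
    for s1 in s1_dates:
        # Nearest optical acquisition by absolute difference in days
        nearest = min(s2_dates, key=lambda d: abs((d - s1).days), default=None)
        if nearest is not None and abs((nearest - s1).days) <= max_gap_days:
            pairs.append((s1, nearest))
    return pairs


# Made-up acquisition dates for illustration
s1 = [date(2021, 6, 1), date(2021, 6, 7), date(2021, 6, 13)]
s2 = [date(2021, 6, 2), date(2021, 6, 12)]
for sar_day, optical_day in pair_acquisitions(s1, s2):
    print(sar_day, "->", optical_day)
```

SAR acquisitions without a close optical match (for example, because of clouds) are dropped from training here, but the trained model could still produce estimates for those dates.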

They compared traditional machine learning algorithms, Random Forest Regression (RFR) and XGBoost, with an AutoML approach (auto-sklearn). The traditional models slightly outperformed the AutoML approach for all four indices, achieving R^2 values between 70% and 86% and low mean absolute errors (MAE of 0.055-0.29) in estimating the target vegetation indices.
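For readers unfamiliar with the two reported metrics, the sketch below computes R^2 and MAE from their standard definitions. The sample values are made up for illustration and are not results from the study; the function names are likewise illustrative.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - residual variance / total variance."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up NDVI-like values: optical reference vs. hypothetical SAR-based estimate
reference = [0.2, 0.4, 0.6, 0.8]
estimate = [0.25, 0.35, 0.65, 0.75]
print(round(mean_absolute_error(reference, estimate), 3))  # 0.05
print(round(r_squared(reference, estimate), 3))            # 0.95
```

An R^2 of 0.95 would correspond to 95% on the percentage scale used above, and MAE is expressed in the units of the index itself.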

The inclusion of terrain and weather features further improved the results. Using the SAR-based vegetation indices, the researchers were able to achieve up to 240 measurements per year at a 20 m spatial resolution, with high accuracy compared to the Sentinel-2 ground truth data. A key advantage is the ability to detect abrupt forest changes on a sub-weekly timescale using the SAR data.

Critical Analysis

The researchers acknowledge that their SAR-based approach to estimating optical vegetation indices is less direct than using the multispectral data itself. However, they successfully demonstrate that the SAR signal contains enough relevant information, when combined with machine learning, to accurately predict these important forest metrics.

One limitation is that the study is focused on a specific region (Czechia) and time period (2021). Further research would be needed to assess the generalizability of the approach to other forest types and geographic areas. Additionally, the researchers did not explore the potential causes or explanations for why the SAR data was able to effectively estimate the optical vegetation indices.

While the results are promising, there may be cases where the direct use of multispectral data is still preferable, especially if the goal is to understand the biophysical mechanisms driving changes in vegetation. The SAR-based approach is more of an indirect proxy that leverages the power of machine learning.

Overall, this research raises interesting scientific questions about the information content of SAR data and the potential to leverage it for forest monitoring in a way that is more resilient to atmospheric conditions. Further work could explore the limits of this approach and investigate the underlying relationships between SAR signatures and optical vegetation indices.

Conclusion

This study demonstrates that synthetic aperture radar (SAR) data, combined with machine learning, can be used to accurately estimate key optical vegetation indices for monitoring forest ecosystems. This provides a promising alternative to relying solely on multispectral satellite data, which can be limited by atmospheric effects like clouds.

The researchers were able to achieve high accuracies in estimating leaf area index, fraction of absorbed photosynthetically active radiation, the Enhanced Vegetation Index, and the Normalized Difference Vegetation Index using Sentinel-1 SAR data and auxiliary features. This approach enables more consistent, high-frequency forest monitoring that is less affected by weather conditions.

While the SAR-based approach is less direct than using multispectral data, it demonstrates the potential to leverage the unique capabilities of radar satellites for environmental applications. Further research could explore the broader applicability of this technique and investigate the underlying relationships between SAR signatures and optical vegetation indices. Overall, this work highlights an innovative way to overcome limitations in optical satellite data using advanced machine learning techniques.

Related Papers

Estimating optical vegetation indices and biophysical variables for temperate forests with Sentinel-1 SAR data using machine learning techniques: A case study for Czechia

Daniel Paluba, Bertrand Le Saux, Přemysl Stych

Current optical vegetation indices (VIs) for monitoring forest ecosystems are well established and widely used in various applications, but can be limited by atmospheric effects such as clouds. In contrast, synthetic aperture radar (SAR) data can offer insightful and systematic forest monitoring with complete time series (TS) due to signal penetration through clouds and day and night image acquisitions. This study aims to address the limitations of optical satellite data by using SAR data as an alternative for estimating optical VIs for forests through machine learning (ML). While this approach is less direct and likely only feasible through the power of ML, it raises the scientific question of whether enough relevant information is contained in the SAR signal to accurately estimate VIs. This work covers the estimation of TS of four VIs (LAI, FAPAR, EVI and NDVI) using multitemporal Sentinel-1 SAR and ancillary data. The study focused on both healthy and disturbed temperate forest areas in Czechia for the year 2021, while ground truth labels were generated from Sentinel-2 multispectral data. This was enabled by creating a paired multi-modal TS dataset in Google Earth Engine (GEE), including temporally and spatially aligned Sentinel-1, Sentinel-2, DEM, weather and land cover datasets. The inclusion of DEM-derived auxiliary features and additional meteorological information further improved the results. In the comparison of ML models, the traditional ML algorithms, RFR and XGBoost, slightly outperformed the AutoML approach, auto-sklearn, for all VIs, achieving high accuracies ($R^2$ between 70-86%) and low errors (0.055-0.29 of MAE). In general, up to 240 measurements per year and a spatial resolution of 20 m can be achieved using estimated SAR-based VIs with high accuracy. A great advantage of the SAR-based VI is the ability to detect abrupt forest changes with sub-weekly temporal accuracy.

8/28/2024

3D-SAR Tomography and Machine Learning for High-Resolution Tree Height Estimation

Grace Colverd, Jumpei Takami, Laura Schade, Karol Bot, Joseph A. Gallego-Mejia

Accurately estimating forest biomass is crucial for global carbon cycle modelling and climate change mitigation. Tree height, a key factor in biomass calculations, can be measured using Synthetic Aperture Radar (SAR) technology. This study applies machine learning to extract forest height data from two SAR products: Single Look Complex (SLC) images and tomographic cubes, in preparation for the ESA Biomass Satellite mission. We use the TomoSense dataset, containing SAR and LiDAR data from Germany's Eifel National Park, to develop and evaluate height estimation models. Our approach includes classical methods, deep learning with a 3D U-Net, and Bayesian-optimized techniques. By testing various SAR frequencies and polarimetries, we establish a baseline for future height and biomass modelling. Best-performing models predict forest height to be within 2.82m mean absolute error for canopies around 30m, advancing our ability to measure global carbon stocks and support climate action.

9/10/2024

A Novel Fusion of Optical and Radar Satellite Data for Crop Phenology Estimation using Machine Learning and Cloud Computing

Shahab Aldin Shojaeezadeh, Abdelrazek Elnashar, Tobias Karl David Weber

Crop phenology determines crop growth stages and is valuable information for decision makers to plant and adapt agricultural management strategies to enhance food security. In the era of big Earth observation data ubiquity, attempts have been made to accurately predict crop phenology based on Remote Sensing (RS) data. However, most studies either focused on large scale interpretations of phenology or developed methods which are not adequate to help crop modeler communities on leveraging the value of RS data evaluated using more accurate and confident methods. Here, we estimate phenological developments for eight major crops and 13 phenological stages across Germany at 30m scale using a novel framework which fuses Landsat and Sentinel 2 (Harmonized Landsat and Sentinel data base; HLS) and radar of Sentinel 1 with a Machine Learning (ML) model. We proposed a thorough feature fusion analysis to find the best combinations of RS data on detecting phenological developments based on the national phenology network of Germany (German Meteorological Service; DWD) between 2017 and 2021. The nation-wide predicted crop phenology at 30 m resolution showed a very high precision of R2 > 0.9 and a very low Mean Absolute Error (MAE) < 2 (days). These results indicate that our fusing strategy of optical and radar datasets is highly performant with an accuracy highly relevant for practical applications, too. The subsequent uncertainty analysis indicated that fusing optical and radar data increases the reliability of the RS predicted crop growth stages. These improvements are expected to be useful for crop model calibrations and evaluations, facilitate informed agricultural decisions, and contribute to sustainable food production to address the increasing global food demand.

9/4/2024

XAI-Guided Enhancement of Vegetation Indices for Crop Mapping

Hiba Najjar, Francisco Mena, Marlon Nuske, Andreas Dengel

Vegetation indices make it possible to efficiently monitor vegetation growth and agricultural activities. Previous generations of satellites captured a limited number of spectral bands, and a few expert-designed vegetation indices were sufficient to harness their potential. New generations of multi- and hyperspectral satellites can capture additional bands, but these are not yet efficiently exploited. In this work, we propose an explainable-AI-based method to select and design suitable vegetation indices. We first train a deep neural network using multispectral satellite data, then extract feature importance to identify the most influential bands. We subsequently select suitable existing vegetation indices or modify them to incorporate the identified bands and retrain our model. We validate our approach on a crop classification task. Our results indicate that models trained on individual indices achieve comparable results to the baseline model trained on all bands, while the combination of two indices surpasses the baseline in certain cases.

7/12/2024