Using Long Short-term Memory (LSTM) to merge precipitation data over mountainous area in Sierra Nevada

Read original: arXiv:2404.10135 - Published 4/23/2024 by Yihan Wang, Lujun Zhang

Overview

Accurately estimating precipitation is a challenging task, especially in complex mountainous terrain.
The three main precipitation measurement approaches (rain gauges, radar, and satellite sensors) each have their own strengths and limitations.
Merging precipitation data from different sources can help improve reliability, and deep learning models like Long Short-Term Memory (LSTM) have shown promise in this area.

Plain English Explanation

Knowing how much rain or snow falls in a given area is crucial for a variety of reasons, like managing water resources and understanding the local climate. However, accurately measuring precipitation is difficult, especially in mountainous regions with complex terrain.

The three main ways to measure precipitation are:

Rain gauges: Physical instruments that directly measure rainfall at specific locations. They provide precise measurements but only for a limited area.
Radar: Systems that use radio waves to detect precipitation. They can cover larger areas but have trouble with complex terrain.
Satellite sensors: Instruments on satellites that observe precipitation from space. They have broad coverage but lower resolution.

Each of these methods has pros and cons, so researchers often try to combine, or "merge," the data from these different sources to get the most accurate overall picture. With the rise of powerful deep learning models like LSTM, there are new opportunities to improve these merged precipitation estimates.

Technical Explanation

This study employed an LSTM deep learning model to merge two hourly precipitation products: a radar-based product and the satellite-based Integrated Multi-Satellite Retrievals for GPM (IMERG) product from the Global Precipitation Measurement (GPM) mission. The aim was a more reliable hourly precipitation estimate, with the widely used Multi-Radar Multi-Sensor (MRMS) reanalysis product serving as the benchmark for comparison.
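To make the merging idea concrete, here is a minimal sketch of an LSTM forward pass that reads one radar value and one IMERG value per hour and emits one merged value per hour. It is an illustration only: the hidden size, the random untrained weights, the linear output head, the clipping at zero, and the names `lstm_step`, `init_params`, and `merge_sequence` are all assumptions, since the summary does not describe the paper's architecture or training setup.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM time step.

    x: input features for this hour, e.g. [radar_mm, imerg_mm]
    h, c: previous hidden and cell state (lists of length H)
    W: gate name -> H rows of weights, each of length len(x) + H
    b: gate name -> list of H biases
    """
    z = x + h  # concatenate the input with the previous hidden state

    def gate(name, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(W[name], b[name])]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def init_params(n_in, n_hidden, seed=0):
    """Random, untrained parameters (hypothetical; a real model would be fitted to gauge data)."""
    rng = random.Random(seed)
    W = {k: [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
             for _ in range(n_hidden)] for k in "ifog"}
    b = {k: [0.0] * n_hidden for k in "ifog"}
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return W, b, w_out

def merge_sequence(radar, imerg, W, b, w_out):
    """Run the LSTM over paired hourly series and return one merged value per hour."""
    n_hidden = len(w_out)
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    merged = []
    for r, s in zip(radar, imerg):
        h, c = lstm_step([r, s], h, c, W, b)
        y = sum(w * hk for w, hk in zip(w_out, h))  # linear output head
        merged.append(max(0.0, y))  # precipitation cannot be negative
    return merged

# Made-up hourly values in mm, for illustration only
radar = [0.0, 1.2, 3.4, 0.8, 0.0]
imerg = [0.1, 0.9, 2.7, 1.1, 0.0]
W, b, w_out = init_params(n_in=2, n_hidden=4)
merged = merge_sequence(radar, imerg, W, b, w_out)
print(merged)
```

Because the weights are random, the printed numbers carry no meaning; the point is the data flow, in which the cell state carries information from earlier hours into each new estimate.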

The merged LSTM precipitation estimates were assessed against gauge observations from the California Data Exchange Center (CDEC). The LSTM-based merged product notably underestimated the gauge observations and at times failed to provide meaningful estimates, producing predominantly near-zero values.
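The kind of gauge comparison described above can be sketched with a few standard scores. The metric choices (relative bias, RMSE, and the share of wet gauge hours where the estimate is near zero), the 0.1 mm threshold, the function name `evaluate`, and the sample values are all assumptions for illustration; the summary does not state which metrics the paper used.

```python
import math

def evaluate(estimates, gauge, zero_threshold=0.1):
    """Compare hourly estimates with gauge observations (both in mm).

    zero_threshold is a hypothetical cutoff below which a value counts as "near zero".
    """
    if not gauge or len(estimates) != len(gauge):
        raise ValueError("series must be non-empty and of equal length")
    n = len(gauge)
    total_obs = sum(gauge)
    # Negative relative bias means the product underestimates total precipitation
    relative_bias = ((sum(estimates) - total_obs) / total_obs
                     if total_obs > 0 else float("nan"))
    rmse = math.sqrt(sum((e - g) ** 2 for e, g in zip(estimates, gauge)) / n)
    # Among hours where the gauge recorded precipitation, how often is the estimate near zero?
    wet = [e for e, g in zip(estimates, gauge) if g >= zero_threshold]
    missed = sum(1 for e in wet if e < zero_threshold)
    missed_wet_fraction = missed / len(wet) if wet else float("nan")
    return {
        "relative_bias": relative_bias,
        "rmse": rmse,
        "missed_wet_fraction": missed_wet_fraction,
    }

# Made-up values for illustration only
gauge = [0.0, 2.0, 4.0, 1.0, 0.0]
merged = [0.0, 0.5, 1.0, 0.0, 0.0]
metrics = evaluate(merged, gauge)
print(metrics)
```

In this toy case the relative bias is strongly negative and one of the three wet hours is missed entirely, which mirrors the two failure modes reported for the merged product.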

The researchers concluded that relying solely on the individual radar and satellite precipitation estimates, without additional meteorological input, was not sufficient to generate a reliable merged product. However, the merged results did effectively capture the temporal trends in the observations, outperforming the MRMS product in this regard. A product that follows the timing of events while underestimating their magnitude has an error that is at least partly systematic, which is why the authors suggest that bias correction techniques could improve the accuracy of the merged product.
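The reasoning behind the bias-correction suggestion can be illustrated with a toy case in which the estimate tracks the gauge perfectly in time but is too small by a constant factor. The mean-ratio (multiplicative) correction shown here is one common, simple technique and is an assumption; the summary does not say which correction method the authors have in mind. The function names and sample values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation: measures how well two series move together in time."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(vx * vy)

def mean_ratio_correction(estimates, gauge):
    """Scale estimates so that their total matches the gauge total."""
    total_est = sum(estimates)
    if total_est == 0:
        raise ValueError("cannot rescale an all-zero series")
    factor = sum(gauge) / total_est
    return factor, [e * factor for e in estimates]

# Made-up values: the estimate is exactly one quarter of the gauge value in every hour
gauge = [0.0, 2.0, 4.0, 1.0, 0.0]
merged = [0.0, 0.5, 1.0, 0.25, 0.0]

print(pearson(merged, gauge))  # timing agreement is perfect despite the low magnitudes
factor, corrected = mean_ratio_correction(merged, gauge)
print(factor, corrected)
```

Two caveats apply. In practice the factor would be fitted on a calibration period and then applied to separate data, rather than computed from the hours being evaluated. A multiplicative factor also cannot repair hours where the estimate is exactly zero, so it would not address the near-zero failures on its own.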

Critical Analysis

The study highlights the challenges of accurately estimating precipitation, especially in complex mountainous terrain. While the use of an LSTM model to merge multiple precipitation data sources is a promising approach, the results suggest that more work is needed to develop truly reliable merged precipitation products.

One key limitation is the reliance on only the radar and satellite precipitation data, without other relevant meteorological variables. The researchers found that these estimates alone were insufficient without additional meteorological input; variables such as temperature, humidity, and atmospheric pressure are plausible candidates that could improve the model's ability to estimate precipitation.

Additionally, the underestimation and frequent near-zero values in the LSTM-based merged product indicate that the model may not be fully capturing the complexity of the precipitation processes. Further research is needed to explore more sophisticated deep learning architectures or hybrid approaches that can better handle the non-linear and scale-dependent nature of precipitation.

It would also be valuable to evaluate the LSTM model's performance in other geographic regions with different terrain and climate characteristics. This could help identify any limitations or biases in the model's ability to generalize.

Despite these limitations, the study demonstrates the potential of deep learning techniques like LSTM to improve precipitation estimation. As computational capabilities grow and the volume of observational data expands, further research in this area could lead to more reliable precipitation products that benefit a wide range of applications, from water resource management to weather forecasting.

Conclusion

This study explored the use of a deep learning LSTM model to merge radar-based and satellite-based precipitation data in an effort to improve the reliability of precipitation estimation, especially in complex mountainous terrain. While the LSTM-based merged product showed some limitations, the research highlights the potential of deep learning techniques to integrate multiple precipitation data sources and capture the temporal dynamics of precipitation.

Continued advancements in deep learning models, alongside the incorporation of additional meteorological data, could lead to more accurate and reliable precipitation estimates in the future. This would have important implications for a wide range of applications, from water resource management to climate modeling and beyond.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!
