Decomposing weather forecasting into advection and convection with neural networks

arXiv:2405.06590

Published 5/13/2024 by Mengxuan Chen, Ziqi Yuan, Jinxiao Zhang, Runmin Dong, Haohuan Fu

Abstract

Operational weather forecasting models have advanced for decades through both explicit numerical solvers and empirical physical parameterization schemes. However, the high computational costs and uncertainties of these existing schemes call for improvements through alternative machine learning methods. Previous works use a unified model to learn the dynamics and physics of the atmospheric model. In contrast, we propose a simple yet effective machine learning model that learns the horizontal movement in the dynamical core and the vertical movement in the physical parameterization separately. By replacing the advection with a graph attention network and the convection with a multi-layer perceptron, our model provides a new and efficient perspective for simulating the transition of variables in atmospheric models. We also assess the model's performance over a 5-day iterative forecast. Under the same input variables and training methods, our model outperforms existing data-driven methods with a significantly reduced number of parameters at a resolution of 5.625°. Overall, this work aims to contribute to the ongoing efforts that leverage machine learning techniques for improving both the accuracy and efficiency of global weather forecasting.

Overview

Researchers propose a machine learning model that learns the horizontal and vertical movement in atmospheric models separately, using a graph attention network for advection and a multilayer perceptron for convection.
The model outperforms existing data-driven methods while using significantly fewer parameters, and it shows promise for improving the accuracy and efficiency of global weather forecasting.
The work aims to leverage machine learning techniques to address the high computational costs and uncertainties in current numerical weather prediction models.

Plain English Explanation

Predicting the weather is a complex task that relies on sophisticated computer models. These models use numerical solvers and physical parameterization schemes to simulate the movement of air and the formation of weather patterns. However, these existing models have high computational costs and uncertainties, which can limit their accuracy.

The researchers in this study propose a new machine learning approach to address these challenges. Instead of using a single model to learn the entire atmospheric system, they break it down into two parts: horizontal movement (advection) and vertical movement (convection). For the horizontal movement, they use a graph attention network, which is a type of machine learning model that can learn patterns in complex, interconnected data. For the vertical movement, they use a multilayer perceptron, a more basic type of machine learning model.

By separating the model in this way, the researchers were able to create a simpler, more efficient system that still captures the essential dynamics of the atmospheric system. When they tested their model on a 5-day weather forecasting task, it outperformed other data-driven methods while using significantly fewer parameters.

Overall, this research represents an important step towards using machine learning to improve the accuracy and efficiency of global weather forecasting. By breaking down the problem into more manageable parts and using specialized machine learning techniques, the researchers have developed a promising new approach that could help make weather prediction more reliable and accessible.

Technical Explanation

The researchers in this study propose a novel machine learning model for simulating the transition of variables in atmospheric models, which are used for global weather forecasting. Current numerical weather prediction models rely on complex numerical solvers and empirical physical parameterization schemes, which can be computationally expensive and subject to uncertainties.

To address these limitations, the researchers developed a two-part machine learning model. The first part uses a graph attention network to learn the horizontal movement (advection) in the dynamical core of the atmospheric model. The second part uses a multilayer perceptron to learn the vertical movement (convection) in the physical parameterization scheme.
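The two-part structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar per-node features, the ring-shaped toy graph, the hand-picked weights, the function names (attention_advect, mlp_convect, decomposed_step), and the choice to apply advection first and convection second are all assumptions made for illustration. The attention scoring follows the general style of a graph attention layer (a LeakyReLU score followed by a softmax over neighbors).

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_advect(values, neighbors, a_self, a_nbr):
    """Horizontal part: attention-weighted average of each node's neighbors
    on a single vertical level. values[i] is a scalar per node (assumed)."""
    out = []
    for i, nbrs in enumerate(neighbors):
        # Unnormalized attention score for every neighbor j of node i.
        scores = [leaky_relu(a_self * values[i] + a_nbr * values[j]) for j in nbrs]
        # Numerically stable softmax over the neighbors.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        out.append(sum(e / total * values[j] for e, j in zip(exps, nbrs)))
    return out

def mlp_convect(column, w1, b1, w2, b2):
    """Vertical part: a one-hidden-layer perceptron applied to one column."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, column)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]

def decomposed_step(state, neighbors, attn_params, mlp_params):
    """state[k][i] is the value at vertical level k and horizontal node i.
    Order of the two parts (advection, then convection) is an assumption."""
    advected = [attention_advect(level, neighbors, *attn_params) for level in state]
    n_levels, n_nodes = len(state), len(neighbors)
    columns = [[advected[k][i] for k in range(n_levels)] for i in range(n_nodes)]
    updated = [mlp_convect(col, *mlp_params) for col in columns]
    return [[updated[i][k] for i in range(n_nodes)] for k in range(n_levels)]

# Toy setup: 4 nodes on a ring, 2 vertical levels, hand-picked weights.
ring = [[1, 3], [0, 2], [1, 3], [2, 0]]
state = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
attn_params = (0.1, 0.3)
mlp_params = (
    [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]],  # 2 inputs -> 3 hidden units
    [0.0, 0.1, 0.0],
    [[0.6, 0.2, -0.1], [0.0, 0.5, 0.3]],     # 3 hidden units -> 2 outputs
    [0.0, 0.0],
)
next_state = decomposed_step(state, ring, attn_params, mlp_params)
```

The point of the sketch is the division of labor: the attention step only mixes information between horizontally connected nodes on the same level, while the perceptron only mixes information between levels within the same column. A trained model would learn the weights rather than fix them by hand.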

Separating the two components keeps the model compact. In a 5-day iterative forecasting test at 5.625° resolution, using the same input variables and training methods as the baselines, it outperformed existing data-driven methods with a significantly reduced number of parameters.
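Iterative forecasting means the model's own output is fed back in as the next input until the target lead time is reached. The sketch below shows that loop in isolation. The 6-hour step length is an assumption (this summary does not state the model's time step), and toy_step is a hypothetical stand-in for a trained one-step model.

```python
def rollout(step_fn, initial_state, n_steps):
    """Apply a one-step model repeatedly, feeding each output back as input."""
    states = [initial_state]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return states

HOURS_PER_STEP = 6                      # assumed step length, not from the paper
n_steps = 5 * 24 // HOURS_PER_STEP      # steps needed to cover 5 days

def toy_step(state):
    # Hypothetical placeholder for a learned one-step forecast.
    return [0.9 * x for x in state]

trajectory = rollout(toy_step, [1.0, 2.0], n_steps)
```

Because each step consumes the previous prediction, errors can accumulate along the trajectory, which is why multi-day iterative tests are a stricter check than single-step accuracy.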

The key insights from this work are:

Decoupling the horizontal and vertical movement components of atmospheric models can lead to more efficient and accurate machine learning-based simulations.
Graph attention networks are well-suited for learning the complex, interconnected patterns in the horizontal movement of air masses.
Multilayer perceptrons can effectively capture the vertical movement dynamics associated with physical parameterization schemes.

Critical Analysis

The researchers acknowledge several limitations and areas for further research in their paper. For example, they note that their model was tested at a relatively coarse resolution (5.625 degrees) and that future work should explore higher-resolution simulations.
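To give a sense of how coarse 5.625 degrees is, the arithmetic below converts the resolution into a global latitude-longitude grid size. The 0.25-degree comparison value and the rough figure of 111 km per degree at the equator are illustrative assumptions, and the simple division ignores conventions about whether grid points sit on the poles.

```python
RESOLUTION_DEG = 5.625
KM_PER_DEGREE = 111.0   # approximate length of one degree at the equator

n_lat = int(180 / RESOLUTION_DEG)   # 32 rows
n_lon = int(360 / RESOLUTION_DEG)   # 64 columns
n_points = n_lat * n_lon            # 2048 grid points per level

spacing_km = RESOLUTION_DEG * KM_PER_DEGREE   # roughly 624 km at the equator

# For comparison, a hypothetical 0.25-degree grid under the same simple division.
fine_points = int(180 / 0.25) * int(360 / 0.25)
ratio = fine_points / n_points
```

Each grid cell at this resolution spans several hundred kilometers at the equator, so individual storms and other small-scale features are not resolved.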

Additionally, the researchers did not compare their model's performance to that of traditional numerical weather prediction models, which could provide valuable insights into the tradeoffs between machine learning-based and physics-based approaches.

Another potential concern is the reliance on a single 5-day forecasting task for evaluating the model's performance. It would be beneficial to test the model on a wider range of weather conditions and forecasting scenarios to better understand its generalization capabilities.

Despite these limitations, the researchers have made a compelling case for the potential of their approach. By separating the horizontal and vertical movement components and using specialized machine learning techniques, they have demonstrated a path towards more efficient and accurate weather forecasting models.

Conclusion

This study presents a novel machine learning-based approach for simulating the transition of variables in atmospheric models used for global weather forecasting. By decoupling the horizontal and vertical movement components and using specialized machine learning architectures, the researchers have developed a model that outperforms existing data-driven methods while using significantly fewer parameters.

This work represents an important step towards leveraging machine learning to address the high computational costs and uncertainties in current numerical weather prediction models. By improving the accuracy and efficiency of weather forecasting, the researchers' approach could have significant implications for a wide range of sectors, from disaster preparedness to renewable energy planning.

As the field of machine learning in climate and weather science continues to evolve, this study offers a promising new direction for researchers and practitioners to explore. By combining domain-specific knowledge with innovative machine learning techniques, the researchers have demonstrated the potential for transformative advancements in this critical area of scientific inquiry.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards an end-to-end artificial intelligence driven global weather forecasting system

Kun Chen, Lei Bai, Fenghua Ling, Peng Ye, Tao Chen, Jing-Jia Luo, Hao Chen, Yi Xiao, Kang Chen, Tao Han, Wanli Ouyang

The weather forecasting system is important for science and society, and significant achievements have been made in applying artificial intelligence (AI) to medium-range weather forecasting. However, existing AI-based weather forecasting models rely on analysis or reanalysis products from traditional numerical weather prediction (NWP) systems as initial conditions for making predictions. Initial states are typically generated by traditional data assimilation components, which are computationally expensive and time-consuming. Here we present an AI-based data assimilation model, i.e., Adas, for global weather variables. By introducing the confidence matrix, Adas employs gated convolution to handle sparse observations and gated cross-attention for capturing the interactions between the background and observations. Further, we combine Adas with the advanced AI-based forecasting model (i.e., FengWu) to construct the first end-to-end AI-based global weather forecasting system: FengWu-Adas. We demonstrate that Adas can assimilate global observations to produce high-quality analysis, enabling the system to operate stably over the long term. Moreover, we are the first to apply the methods to real-world scenarios, which is more challenging and has considerable practical application potential. We have also achieved forecasts based on the analyses generated by AI with a skillful forecast lead time exceeding that of the IFS for the first time.

4/9/2024

cs.AI cs.LG

Forecasting the Future with Future Technologies: Advancements in Large Meteorological Models

Hailong Shu, Yue Wang, Weiwei Song, Huichuang Guo, Zhen Song

The field of meteorological forecasting has undergone a significant transformation with the integration of large models, especially those employing deep learning techniques. This paper reviews the advancements and applications of these models in weather prediction, emphasizing their role in transforming traditional forecasting methods. Models like FourCastNet, Pangu-Weather, GraphCast, ClimaX, and FengWu have made notable contributions by providing accurate, high-resolution forecasts, surpassing the capabilities of traditional Numerical Weather Prediction (NWP) models. These models utilize advanced neural network architectures, such as Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), and Transformers, to process diverse meteorological data, enhancing predictive accuracy across various time scales and spatial resolutions. The paper addresses challenges in this domain, including data acquisition and computational demands, and explores future opportunities for model optimization and hardware advancements. It underscores the integration of artificial intelligence with conventional meteorological techniques, promising improved weather prediction accuracy and a significant contribution to addressing climate-related challenges. This synergy positions large models as pivotal in the evolving landscape of meteorological forecasting.

4/11/2024

cs.LG cs.AI

Exploring the Potential of Hybrid Machine-Learning/Physics-Based Modeling for Atmospheric/Oceanic Prediction Beyond the Medium Range

Dhruvit Patel, Troy Arcomano, Brian Hunt, Istvan Szunyogh, Edward Ott

This paper explores the potential of a hybrid modeling approach that combines machine learning (ML) with conventional physics-based modeling for weather prediction beyond the medium range. It extends the work of Arcomano et al. (2022), which tested the approach for short- and medium-range weather prediction, and the work of Arcomano et al. (2023), which investigated its potential for climate modeling. The hybrid model used for the forecast experiments of the paper is based on the low-resolution, simplified parameterization atmospheric general circulation model (AGCM) SPEEDY. In addition to the hybridized prognostic variables of SPEEDY, the current version of the model has three purely ML-based prognostic variables. One of these is 6 h cumulative precipitation, another is the sea surface temperature, while the third is the heat content of the top 300 m deep layer of the ocean. The model has skill in predicting the El Niño cycle and its global teleconnections with precipitation for 3-7 months depending on the season. The model captures equatorial variability of the precipitation associated with Kelvin and Rossby waves and MJO. Predictions of the precipitation in the equatorial region have skill for 15 days in the East Pacific and 11.5 days in the West Pacific. Though the model has low spatial resolution, for these tasks it has prediction skill comparable to what has been published for high-resolution, purely physics-based, conventional operational forecast models.

5/31/2024

cs.LG

Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling

Wanghan Xu, Fenghua Ling, Wenlong Zhang, Tao Han, Hao Chen, Wanli Ouyang, Lei Bai

Data-driven artificial intelligence (AI) models have made significant advancements in weather forecasting, particularly in medium-range forecasting and nowcasting. However, most data-driven weather forecasting models are black-box systems that focus on learning data mapping rather than fine-grained physical evolution in the time dimension. Consequently, the limitations in the temporal scale of datasets prevent these models from forecasting at finer time scales. This paper proposes a physics-AI hybrid model (i.e., WeatherGFT) which Generalizes weather forecasts to Finer-grained Temporal scales beyond the training dataset. Specifically, we employ a carefully designed PDE kernel to simulate physical evolution on a small time scale (e.g., 300 seconds) and use parallel neural networks with a learnable router for bias correction. Furthermore, we introduce a lead time-aware training framework to promote the generalization of the model at different lead times. The weight analysis of physics-AI modules indicates that physics conducts major evolution while AI performs corrections adaptively. Extensive experiments show that WeatherGFT, trained on an hourly dataset, achieves state-of-the-art performance across multiple lead times and exhibits the capability to generalize to 30-minute forecasts.

5/30/2024

cs.LG cs.AI