WeatherReal: A Benchmark Based on In-Situ Observations for Evaluating Weather Models

Read original: arXiv:2409.09371 - Published 9/17/2024 by Weixin Jin, Jonathan Weyn, Pengcheng Zhao, Siqi Xiang, Jiang Bian, Zuliang Fang, Haiyu Dong, Hongyu Sun, Kit Thambiratnam, Qi Zhang

Overview

The paper introduces a new benchmark called WeatherReal for evaluating weather models.
WeatherReal is based on in-situ observations from weather stations, providing a realistic evaluation of model performance.
The benchmark covers a wide range of weather variables and geographic regions, making it a comprehensive test of model capabilities.

Plain English Explanation

The paper presents WeatherReal, a new benchmark for evaluating the performance of weather models. Most AI-based forecasting models have been trained and evaluated on reanalysis datasets such as ERA5, which are themselves products of numerical models. WeatherReal is instead built from real-world, near-surface observations recorded at weather stations around the world.

This provides a more realistic and comprehensive test of how well weather models can capture the complexities of actual weather patterns and conditions. The benchmark covers a wide range of weather variables, such as temperature, precipitation, and wind speed, across different geographic regions. By using this diverse dataset, researchers and developers can more thoroughly assess the strengths and weaknesses of their weather models.

The goal of WeatherReal is to create a standardized way to compare the performance of different weather models, similar to how machine learning models are evaluated on benchmark datasets like ImageNet or CIFAR-10. This allows for more meaningful comparisons and can help drive progress in the field of weather modeling.

Technical Explanation

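The paper introduces the WeatherReal benchmark, a dataset derived from global near-surface in-situ observations, together with a publicly accessible quality control and evaluation framework. Because reanalysis datasets like ERA5 are products of numerical models, they can diverge substantially from actual observations for some variables; evaluating against station data measures forecasts against what was actually observed.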

The dataset targets the near-surface variables of greatest public interest, including temperature, wind, precipitation, and clouds, across a diverse set of geographic regions. This allows for a comprehensive assessment of model capabilities, including how well models capture hyper-local and extreme weather.

The authors demonstrate the utility of the WeatherReal benchmark by evaluating several data-driven models and comparing them with leading numerical models. The results show that the models exhibit varying levels of performance across different weather variables and locations, highlighting the need for a robust and diverse benchmark like WeatherReal.
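As a rough illustration of what scoring a forecast against station observations can look like, the sketch below matches each station to its nearest grid point and computes a root-mean-square error. This is not the paper's evaluation framework: the function names, the nearest-grid-point matching, and the choice of RMSE are all assumptions made for the example, and longitude wrap-around is ignored for brevity.

```python
import math


def nearest_index(values, target):
    """Return the index of the entry in `values` closest to `target`."""
    return min(range(len(values)), key=lambda i: abs(values[i] - target))


def station_rmse(grid_lats, grid_lons, forecast, stations):
    """RMSE of a gridded forecast against station observations.

    Hypothetical helper, not part of WeatherReal.
    forecast[i][j] is the forecast value at (grid_lats[i], grid_lons[j]).
    stations is a list of (lat, lon, observed) tuples; observed is None
    when the station did not report, and such entries are skipped.
    """
    squared_errors = []
    for lat, lon, observed in stations:
        if observed is None:
            continue  # missing report: nothing to score against
        i = nearest_index(grid_lats, lat)
        j = nearest_index(grid_lons, lon)
        squared_errors.append((forecast[i][j] - observed) ** 2)
    if not squared_errors:
        return float("nan")  # no valid observations at all
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy 2 x 2 grid of 2 m temperature in kelvin (made-up values).
grid_lats = [0.0, 1.0]
grid_lons = [10.0, 11.0]
forecast = [[280.0, 281.0], [282.0, 283.0]]
stations = [(0.1, 10.2, 279.0), (0.9, 10.8, 285.0), (0.5, 10.5, None)]

rmse = station_rmse(grid_lats, grid_lons, forecast, stations)
print(f"RMSE: {rmse:.3f} K")
```

Computing the score per variable and per station, rather than as one global number, is what makes differences across variables and locations visible.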

Critical Analysis

The WeatherReal benchmark is a valuable contribution to the field of weather modeling, as it provides a more realistic and comprehensive evaluation of model performance. By using in-situ observations, the benchmark captures the complexities and uncertainties inherent in real-world weather data, which is crucial for understanding the practical applicability of weather models.

However, some potential limitations deserve scrutiny. Station data vary in availability and quality, and observations can carry their own biases; the abstract indicates that the paper details its data sources and processing and provides a publicly accessible quality control framework, so readers should check the full paper for how far these issues are handled. Likewise, the abstract mentions case studies on hyper-local and extreme weather, but a breakdown of model performance by weather pattern or season would provide further insight into the strengths and weaknesses of the evaluated models.

Despite these minor limitations, the WeatherReal benchmark represents a significant step forward in the evaluation of weather models and is likely to become an important tool for researchers and developers in the field.

Conclusion

The WeatherReal benchmark introduced in this paper provides a more realistic and comprehensive way to evaluate the performance of weather models. By using in-situ observations from weather stations around the world, the benchmark captures the complexities of real-world weather patterns, allowing for a more meaningful assessment of model capabilities.

The adoption of the WeatherReal benchmark has the potential to drive progress in the field of weather modeling, as it enables more meaningful comparisons between different models and approaches. This can lead to the development of more accurate and reliable weather forecasting systems, which can have significant societal and economic impacts in areas such as disaster preparedness, agriculture, and transportation.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

WeatherReal: A Benchmark Based on In-Situ Observations for Evaluating Weather Models

Weixin Jin, Jonathan Weyn, Pengcheng Zhao, Siqi Xiang, Jiang Bian, Zuliang Fang, Haiyu Dong, Hongyu Sun, Kit Thambiratnam, Qi Zhang

In recent years, AI-based weather forecasting models have matched or even outperformed numerical weather prediction systems. However, most of these models have been trained and evaluated on reanalysis datasets like ERA5. These datasets, being products of numerical models, often diverge substantially from actual observations in some crucial variables like near-surface temperature, wind, precipitation and clouds - parameters that hold significant public interest. To address this divergence, we introduce WeatherReal, a novel benchmark dataset for weather forecasting, derived from global near-surface in-situ observations. WeatherReal also features a publicly accessible quality control and evaluation framework. This paper details the sources and processing methodologies underlying the dataset, and further illustrates the advantage of in-situ observations in capturing hyper-local and extreme weather through comparative analyses and case studies. Using WeatherReal, we evaluated several data-driven models and compared them with leading numerical models. Our work aims to advance the AI-based weather forecasting research towards a more application-focused and operation-ready approach.

9/17/2024

DABench: A Benchmark Dataset for Data-Driven Weather Data Assimilation

Wuxin Wang, Weicheng Ni, Tao Han, Lei Bai, Boheng Duan, Kaijun Ren

Recent advancements in deep learning (DL) have led to the development of several Large Weather Models (LWMs) that rival state-of-the-art (SOTA) numerical weather prediction (NWP) systems. Up to now, these models still rely on traditional NWP-generated analysis fields as input and are far from being an autonomous system. While researchers are exploring data-driven data assimilation (DA) models to generate accurate initial fields for LWMs, the lack of a standard benchmark impedes the fair evaluation among different data-driven DA algorithms. Here, we introduce DABench, a benchmark dataset utilizing ERA5 data as ground truth to guide the development of end-to-end data-driven weather prediction systems. DABench contributes four standard features: (1) sparse and noisy simulated observations under the guidance of the observing system simulation experiment method; (2) a skillful pre-trained weather prediction model to generate background fields while fairly evaluating the impact of assimilation outcomes on predictions; (3) standardized evaluation metrics for model comparison; (4) a strong baseline called the DA Transformer (DaT). DaT integrates the four-dimensional variational DA prior knowledge into the Transformer model and outperforms the SOTA in physical state reconstruction, named 4DVarNet. Furthermore, we exemplify the development of an end-to-end data-driven weather prediction system by integrating DaT with the prediction model. Researchers can leverage DABench to develop their models and compare performance against established baselines, which will benefit the future advancements of data-driven weather prediction systems. The code is available on this Github repository and the dataset is available at the Baidu Drive.

8/22/2024

WEATHER-5K: A Large-scale Global Station Weather Dataset Towards Comprehensive Time-series Forecasting Benchmark

Tao Han, Song Guo, Zhenghao Chen, Wanghan Xu, Lei Bai

Global Station Weather Forecasting (GSWF) is crucial for various sectors, including aviation, agriculture, energy, and disaster preparedness. Recent advancements in deep learning have significantly improved the accuracy of weather predictions by optimizing models based on public meteorological data. However, existing public datasets for GSWF optimization and benchmarking still suffer from significant limitations, such as small sizes, limited temporal coverage, and a lack of comprehensive variables. These shortcomings prevent them from effectively reflecting the benchmarks of current forecasting methods and fail to support the real needs of operational weather forecasting. To address these challenges, we present the WEATHER-5K dataset. This dataset comprises a comprehensive collection of data from 5,672 weather stations worldwide, spanning a 10-year period with one-hour intervals. It includes multiple crucial weather elements, providing a more reliable and interpretable resource for forecasting. Furthermore, our WEATHER-5K dataset can serve as a benchmark for comprehensively evaluating existing well-known forecasting models, extending beyond GSWF methods to support future time-series research challenges and opportunities. The dataset and benchmark implementation are publicly available at: https://github.com/taohan10200/WEATHER-5K.

6/21/2024

Data driven weather forecasts trained and initialised directly from observations

Anthony McNally, Christian Lessig, Peter Lean, Eulalie Boucher, Mihai Alexe, Ewan Pinnington, Matthew Chantry, Simon Lang, Chris Burrows, Marcin Chrust, Florian Pinault, Ethel Villeneuve, Niels Bormann, Sean Healy

Skilful Machine Learned weather forecasts have challenged our approach to numerical weather prediction, demonstrating competitive performance compared to traditional physics-based approaches. Data-driven systems have been trained to forecast future weather by learning from long historical records of past weather such as the ECMWF ERA5. These datasets have been made freely available to the wider research community, including the commercial sector, which has been a major factor in the rapid rise of ML forecast systems and the levels of accuracy they have achieved. However, historical reanalyses used for training and real-time analyses used for initial conditions are produced by data assimilation, an optimal blending of observations with a physics-based forecast model. As such, many ML forecast systems have an implicit and unquantified dependence on the physics-based models they seek to challenge. Here we propose a new approach, training a neural network to predict future weather purely from historical observations with no dependence on reanalyses. We use raw observations to initialise a model of the atmosphere (in observation space) learned directly from the observations themselves. Forecasts of crucial weather parameters (such as surface temperature and wind) are obtained by predicting weather parameter observations (e.g. SYNOP surface data) at future times and arbitrary locations. We present preliminary results on forecasting observations 12-hours into the future. These already demonstrate successful learning of time evolutions of the physical processes captured in real observations. We argue that this new approach, by staying purely in observation space, avoids many of the challenges of traditional data assimilation, can exploit a wider range of observations and is readily expanded to simultaneous forecasting of the full Earth system (atmosphere, land, ocean and composition).

7/23/2024