Forecasting infectious disease prevalence with associated uncertainty using neural networks

Read original: arXiv:2409.01154 - Published 9/4/2024 by Michael Morris


Overview

Infectious diseases pose significant human and economic burdens.
Accurately forecasting disease incidence can enable public health agencies to respond effectively.
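Despite progress in the field, developing accurate forecasting models remains a significant challenge.
This thesis proposes two methodological frameworks that use neural networks (NNs) and attach uncertainty estimates to their forecasts, developed by forecasting influenza-like illness (ILI) in the United States.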

Plain English Explanation

The thesis focuses on developing better ways to forecast the spread of infectious diseases, using the flu in the United States as its test case. Accurately predicting how many people will get sick can help public health agencies prepare and respond more effectively. However, building accurate forecasting models remains difficult.

The author proposes two new methods that use neural networks, a type of machine learning model. These models not only make predictions but also estimate how certain they are about those predictions. The lack of such uncertainty estimates has so far limited the use of neural networks in epidemic forecasting.

The first method uses data from web searches related to the flu, along with historical flu case data, to train the neural network models. Averaged across four flu seasons, the best performing model reduced the mean absolute error of its forecasts by 10.3% and improved forecasting Skill by 17.1% compared to the state-of-the-art.

The second method uses a type of neural network called a "neural ordinary differential equation" to bridge the gap between traditional mathematical models of disease spread and the more flexible neural network approach. This allows the model to benefit from the physical constraints provided by the mathematical models while still leveraging the power of neural networks.

Technical Explanation

The thesis presents two methodological frameworks for forecasting infectious disease incidence using neural networks (NNs) with associated uncertainty estimates.

The first proposed method uses web search activity data in conjunction with historical influenza-like illness (ILI) rates as observations for training NN architectures. The models incorporate Bayesian layers to produce uncertainty intervals, positioning themselves as legitimate alternatives to more conventional approaches. The best performing architecture, the iterative recurrent neural network (IRNN), reduces mean absolute error by 10.3% and improves Skill by 17.1% on average in forecasting tasks across four flu seasons compared to the state-of-the-art.
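To make the idea of iterative forecasting with sampled weights concrete, here is a minimal sketch in Python. It is not the IRNN from the thesis: the one-step predictor, the Gaussian weight distribution, and every name and number below (`iterative_forecast`, `sample_forecasts`, `interval`, the toy ILI history, the weight means and standard deviations) are illustrative assumptions. It only shows how drawing weights repeatedly and feeding each prediction back in as an input can yield a forecast with an uncertainty interval.

```python
import numpy as np


def iterative_forecast(history, weights, bias, horizon):
    """Roll a one-step predictor forward, feeding each prediction back in."""
    window = list(history[-len(weights):])
    out = []
    for _ in range(horizon):
        nxt = float(np.tanh(np.dot(weights, window) + bias))
        out.append(nxt)
        window = window[1:] + [nxt]  # slide the window over the new prediction
    return np.array(out)


def sample_forecasts(history, w_mean, w_std, bias, horizon, n_samples, seed=0):
    """Draw weights from an assumed Gaussian and forecast once per draw."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_samples):
        w = rng.normal(w_mean, w_std)  # one sampled weight vector
        draws.append(iterative_forecast(history, w, bias, horizon))
    return np.stack(draws)  # shape: (n_samples, horizon)


def interval(draws, level=0.9):
    """Summarise the draws as a mean forecast and a central interval."""
    lo = np.quantile(draws, (1 - level) / 2, axis=0)
    hi = np.quantile(draws, 1 - (1 - level) / 2, axis=0)
    return draws.mean(axis=0), lo, hi


# Toy values, made up for illustration only.
ili_history = np.array([0.012, 0.015, 0.019, 0.024])
w_mean = np.array([0.1, 0.2, 0.3, 0.5])
w_std = np.full(4, 0.05)

draws = sample_forecasts(ili_history, w_mean, w_std, 0.0, horizon=3, n_samples=200)
mean, lo, hi = interval(draws)
print(mean, lo, hi)
```

In this sketch the interval widens with the forecast horizon only through the fed-back predictions; a trained Bayesian layer would instead learn the weight distribution from data.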

The second framework uses neural ordinary differential equations (neural ODEs) to bridge the gap between mechanistic compartmental models and NNs, benefiting from the physical constraints that compartmental models provide. The thesis evaluates eight neural ODE models that use a mixture of ILI rates and web search activity data to produce forecasts, and compares them with the IRNN and with IRNN0, a version of the IRNN that uses only ILI rates. The neural ODE models trained without web search activity data outperform IRNN0 by 16% in terms of Skill.
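As a rough illustration of how a compartmental model can constrain a neural network, the sketch below embeds a small network inside an SIR (susceptible, infected, recovered) system. This is an assumption-laden toy, not one of the eight models evaluated in the thesis: the choice of SIR, the network that outputs the transmission rate, the fixed recovery rate `gamma`, the forward Euler integrator, and all names (`init_params`, `transmission_rate`, `sir_rhs`, `integrate`) are hypothetical. The weights are random rather than trained.

```python
import numpy as np


def init_params(hidden=8, seed=0):
    """Random (untrained) weights for a tiny one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (hidden, 3)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.5, hidden),
        "b2": 0.0,
    }


def transmission_rate(state, p):
    """Network output passed through softplus so the rate stays positive."""
    h = np.tanh(p["W1"] @ state + p["b1"])
    return float(np.log1p(np.exp(p["W2"] @ h + p["b2"])))


def sir_rhs(state, p, gamma):
    """SIR dynamics where only the transmission rate is learned."""
    s, i, r = state
    beta = transmission_rate(state, p)
    new_infections = beta * s * i
    recoveries = gamma * i
    return np.array([-new_infections, new_infections - recoveries, recoveries])


def integrate(state0, p, gamma=0.5, dt=0.1, steps=100):
    """Forward Euler integration; returns an array of shape (steps + 1, 3)."""
    traj = [np.asarray(state0, dtype=float)]
    for _ in range(steps):
        traj.append(traj[-1] + dt * sir_rhs(traj[-1], p, gamma))
    return np.stack(traj)


params = init_params()
trajectory = integrate([0.99, 0.01, 0.0], params)
print(trajectory[-1], trajectory[-1].sum())
```

Because the three derivatives always sum to zero, the total population fraction is conserved regardless of what the network outputs. That is the kind of physical constraint a purely data-driven network would have to learn on its own.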

Critical Analysis

The paper presents promising approaches for improving infectious disease forecasting by leveraging neural networks and incorporating uncertainty estimates. The use of web search data, in addition to traditional epidemiological data, is a valuable contribution, as it can capture early signals of disease activity.

However, the paper does not address the potential limitations of web search data, such as biases in who uses search engines or potential delays in the data reflecting actual disease activity. Additionally, the evaluation is limited to influenza-like illness forecasting in the United States, and further research is needed to assess the generalizability of the methods to other infectious diseases and geographic regions.

The neural ordinary differential equation approach is an interesting theoretical framework, but its advantages are only partly demonstrated: the neural ODE models beat IRNN0 when trained on ILI rates alone, yet they do not match the best performing IRNN, which also uses web search data. Further work is needed to use web search data effectively within neural ODEs and to fully capitalize on the physical constraints that compartmental models bring to the neural network architecture.

Conclusion

This thesis proposes two innovative frameworks for infectious disease forecasting using neural networks with uncertainty estimates. The first method leverages web search data to improve forecasting performance, while the second bridges the gap between mechanistic compartmental models and neural networks.

These approaches represent significant progress in the field of epidemic forecasting and have the potential to enable more effective public health decision-making. However, further research is needed to address the limitations and expand the applicability of these methods to a wider range of infectious diseases and settings.


Related Papers


Forecasting infectious disease prevalence with associated uncertainty using neural networks

Michael Morris

Infectious diseases pose significant human and economic burdens. Accurately forecasting disease incidence can enable public health agencies to respond effectively to existing or emerging diseases. Despite progress in the field, developing accurate forecasting models remains a significant challenge. This thesis proposes two methodological frameworks using neural networks (NNs) with associated uncertainty estimates - a critical component limiting the application of NNs to epidemic forecasting thus far. We develop our frameworks by forecasting influenza-like illness (ILI) in the United States. Our first proposed method uses Web search activity data in conjunction with historical ILI rates as observations for training NN architectures. Our models incorporate Bayesian layers to produce uncertainty intervals, positioning themselves as legitimate alternatives to more conventional approaches. The best performing architecture: iterative recurrent neural network (IRNN), reduces mean absolute error by 10.3% and improves Skill by 17.1% on average in forecasting tasks across four flu seasons compared to the state-of-the-art. We build on this method by introducing IRNNs, an architecture which changes the sampling procedure in the IRNN to improve the uncertainty estimation. Our second framework uses neural ordinary differential equations to bridge the gap between mechanistic compartmental models and NNs; benefiting from the physical constraints that compartmental models provide. We evaluate eight neural ODE models utilising a mixture of ILI rates and Web search activity data to provide forecasts. These are compared with the IRNN and IRNN0 - the IRNN using only ILI rates. Models trained without Web search activity data outperform the IRNN0 by 16% in terms of Skill. Future work should focus on more effectively using neural ODEs with Web search data to compete with the best performing IRNN.

9/4/2024


COVID-19 Probability Prediction Using Machine Learning: An Infectious Approach

Mohsen Asghari Ilani, Saba Moftakhar Tehran, Ashkan Kavei, Arian Radmehr

The ongoing COVID-19 pandemic continues to pose significant challenges to global public health, despite the widespread availability of vaccines. Early detection of the disease remains paramount in curbing its transmission and mitigating its impact on public health systems. In response, this study delves into the application of advanced machine learning (ML) techniques for predicting COVID-19 infection probability. We conducted a rigorous investigation into the efficacy of various ML models, including XGBoost, LGBM, AdaBoost, Logistic Regression, Decision Tree, RandomForest, CatBoost, KNN, and Deep Neural Networks (DNN). Leveraging a dataset comprising 4000 samples, with 3200 allocated for training and 800 for testing, our experiment offers comprehensive insights into the performance of these models in COVID-19 prediction. Our findings reveal that Deep Neural Networks (DNN) emerge as the top-performing model, exhibiting superior accuracy and recall metrics. With an impressive accuracy rate of 89%, DNN demonstrates remarkable potential in early COVID-19 detection. This underscores the efficacy of deep learning approaches in leveraging complex data patterns to identify COVID-19 infections accurately. This study underscores the critical role of machine learning, particularly deep learning methodologies, in augmenting early detection efforts amidst the ongoing pandemic. The success of DNN in accurately predicting COVID-19 infection probability highlights the importance of continued research and development in leveraging advanced technologies to combat infectious diseases.

8/26/2024

Bayesian Survival Analysis by Approximate Inference of Neural Networks

Christian Marius Lillelund, Martin Magris, Christian Fischer Pedersen

Variational Inference (VI) is a commonly used technique for approximate Bayesian inference and uncertainty estimation in deep learning models, yet it comes at a computational cost, as it doubles the number of trainable parameters to represent uncertainty. This rapidly becomes challenging in high-dimensional settings and motivates the use of alternative techniques for inference, such as Monte Carlo Dropout (MCD) or Spectral-normalized Neural Gaussian Process (SNGP). However, such methods have seen little adoption in survival analysis, and VI remains the prevalent approach for training probabilistic neural networks. In this paper, we investigate how to train deep probabilistic survival models in large datasets without introducing additional overhead in model complexity. To achieve this, we adopt three probabilistic approaches, namely VI, MCD, and SNGP, and evaluate them in terms of their prediction performance, calibration performance, and model complexity. In the context of probabilistic survival analysis, we investigate whether non-VI techniques can offer comparable or possibly improved prediction performance and uncertainty calibration compared to VI. In the MIMIC-IV dataset, we find that MCD aligns with VI in terms of the concordance index (0.748 vs. 0.743) and mean absolute error (254.9 vs. 254.7) using hinge loss, while providing C-calibrated uncertainty estimates. Moreover, our SNGP implementation provides D-calibrated survival functions in all datasets compared to VI (4/4 vs. 2/4, respectively). Our work encourages the use of techniques alternative to VI for survival analysis in high-dimensional datasets, where computational efficiency and overhead are of concern.

6/21/2024

Flusion: Integrating multiple data sources for accurate influenza predictions

Evan L. Ray, Yijin Wang, Russell D. Wolfinger, Nicholas G. Reich

Over the last ten years, the US Centers for Disease Control and Prevention (CDC) has organized an annual influenza forecasting challenge with the motivation that accurate probabilistic forecasts could improve situational awareness and yield more effective public health actions. Starting with the 2021/22 influenza season, the forecasting targets for this challenge have been based on hospital admissions reported in the CDC's National Healthcare Safety Network (NHSN) surveillance system. Reporting of influenza hospital admissions through NHSN began within the last few years, and as such only a limited amount of historical data are available for this signal. To produce forecasts in the presence of limited data for the target surveillance system, we augmented these data with two signals that have a longer historical record: 1) ILI+, which estimates the proportion of outpatient doctor visits where the patient has influenza; and 2) rates of laboratory-confirmed influenza hospitalizations at a selected set of healthcare facilities. Our model, Flusion, is an ensemble that combines gradient boosting quantile regression models with a Bayesian autoregressive model. The gradient boosting models were trained on all three data signals, while the autoregressive model was trained on only the target signal; all models were trained jointly on data for multiple locations. Flusion was the top-performing model in the CDC's influenza prediction challenge for the 2023/24 season. In this article we investigate the factors contributing to Flusion's success, and we find that its strong performance was primarily driven by the use of a gradient boosting model that was trained jointly on data from multiple surveillance signals and locations. These results indicate the value of sharing information across locations and surveillance signals, especially when doing so adds to the pool of available training data.

7/30/2024