A Framework for Strategic Discovery of Credible Neural Network Surrogate Models under Uncertainty

2403.08901

Published 5/15/2024 by Pratyush Kumar Singh, Kathryn A. Farrell-Maupin, Danial Faghihi

Abstract

The widespread integration of deep neural networks in developing data-driven surrogate models for high-fidelity simulations of complex physical systems highlights the critical necessity for robust uncertainty quantification techniques and credibility assessment methodologies, ensuring the reliable deployment of surrogate models in consequential decision-making. This study presents the Occam Plausibility Algorithm for surrogate models (OPAL-surrogate), providing a systematic framework to uncover predictive neural network-based surrogate models within the large space of potential models, including various neural network classes and choices of architecture and hyperparameters. The framework is grounded in hierarchical Bayesian inferences and employs model validation tests to evaluate the credibility and prediction reliability of the surrogate models under uncertainty. Leveraging these principles, OPAL-surrogate introduces a systematic and efficient strategy for balancing the trade-off between model complexity, accuracy, and prediction uncertainty. The effectiveness of OPAL-surrogate is demonstrated through two modeling problems, including the deformation of porous materials for building insulation and turbulent combustion flow for the ablation of solid fuels within hybrid rocket motors.

Overview

This paper presents the Occam Plausibility Algorithm for surrogate models (OPAL-surrogate), a framework for discovering credible neural network surrogate models under uncertainty.
The authors propose a Bayesian approach to neural network learning that treats it as a probabilistic inference problem.
The framework aims to systematically discover reliable surrogate models that can accurately represent complex systems while quantifying the associated uncertainties.

Plain English Explanation

The paper introduces a new way to train neural networks that can be used to model complex systems, like physical processes or engineering designs. Traditional neural networks can struggle to capture all the uncertainties involved in these types of systems. The authors' approach treats the neural network training process as a probabilistic inference problem, which allows them to better account for the inherent uncertainties.

This framework guides the discovery of credible neural network surrogate models - simplified models that can accurately represent the behavior of the original complex system. By quantifying the uncertainties in the surrogate model, the framework ensures the models are reliable and can be used to make informed decisions, rather than being over-confident in their predictions.

The key insight is that treating neural network training as a probabilistic inference problem, rather than a deterministic optimization problem, allows the framework to systematically navigate the space of possible surrogate models and discover the most credible ones for a given application. This can be particularly useful in areas like power grid operations or medical diagnostics, where having a reliable surrogate model with well-quantified uncertainties is crucial.

Technical Explanation

The paper proposes a Bayesian framework for neural network learning that casts it as a probabilistic inference problem. The authors start by defining a parametric neural network model and a set of observed data. They then derive the posterior distribution of the network parameters given the observed data, using Bayes' theorem.
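In symbols, the posterior described above can be written with Bayes' theorem as follows. The notation is assumed here for illustration and is not taken from the paper: \theta denotes the network parameters, D the observed data, and M a given choice of network class and architecture.

```latex
\pi(\theta \mid D, M)
  = \frac{\pi(D \mid \theta, M)\, \pi(\theta \mid M)}{\pi(D \mid M)},
\qquad
\pi(D \mid M) = \int \pi(D \mid \theta, M)\, \pi(\theta \mid M)\, d\theta
```

The denominator \pi(D \mid M) is the model evidence. In standard Bayesian model comparison it is the quantity that trades goodness of fit against model complexity.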

This posterior distribution encapsulates the uncertainty in the network parameters, which can then be used to make predictions with well-quantified uncertainties. The authors develop a practical algorithm for approximating the posterior distribution using variational inference techniques.
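A minimal sketch of how parameter uncertainty can be turned into prediction uncertainty is given below. It assumes that an approximate posterior is already available as an independent Gaussian over the parameters; the tiny network, the function names, and the posterior mean and standard deviation are all hypothetical and chosen only for illustration. They are not the paper's model or inference method.

```python
import numpy as np

HIDDEN = 3  # hypothetical hidden-layer width


def tiny_network(x, params):
    """One-hidden-layer tanh network; params is a flat vector of length 3*HIDDEN + 1."""
    w1 = params[:HIDDEN]                # input-to-hidden weights
    b1 = params[HIDDEN:2 * HIDDEN]      # hidden biases
    w2 = params[2 * HIDDEN:3 * HIDDEN]  # hidden-to-output weights
    b2 = params[3 * HIDDEN]             # output bias
    h = np.tanh(np.outer(x, w1) + b1)   # shape (n_points, HIDDEN)
    return h @ w2 + b2                  # shape (n_points,)


def posterior_predictive(x, post_mean, post_std, n_samples=500, seed=0):
    """Monte Carlo estimate of the predictive mean and standard deviation.

    Draws parameter vectors from an assumed Gaussian approximate posterior
    and pushes each draw through the network.
    """
    rng = np.random.default_rng(seed)
    draws = rng.normal(post_mean, post_std, size=(n_samples, post_mean.size))
    preds = np.stack([tiny_network(x, d) for d in draws])  # (n_samples, n_points)
    return preds.mean(axis=0), preds.std(axis=0)


# Hypothetical approximate posterior (assumed values, not fitted to any data).
post_mean = np.linspace(-1.0, 1.0, 3 * HIDDEN + 1)
post_std = np.full(3 * HIDDEN + 1, 0.1)

x = np.linspace(-2.0, 2.0, 5)
mean, std = posterior_predictive(x, post_mean, post_std)
```

Here `mean` is the point prediction at each input and `std` is the spread induced by parameter uncertainty alone; observation noise would have to be added separately.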

The framework then systematically explores the space of possible neural network architectures and hyperparameters to discover the most credible surrogate models. This is achieved by defining appropriate prior distributions over the network architecture and hyperparameters, and then sampling from the posterior distribution to identify the most promising candidate models.
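The comparison of candidate models can be illustrated with generic Bayesian model comparison, in which each candidate receives a posterior plausibility proportional to its evidence times its prior. The sketch below is only that generic calculation: the model names and log-evidence values are made up, how the evidence is estimated is not shown, and the paper's actual selection procedure may differ.

```python
import numpy as np


def model_plausibilities(log_evidences, log_priors=None):
    """Normalize log-evidences (plus optional log-priors) into posterior plausibilities."""
    log_ev = np.asarray(log_evidences, dtype=float)
    if log_priors is None:
        log_priors = np.zeros_like(log_ev)  # uniform prior over candidate models
    log_post = log_ev + np.asarray(log_priors, dtype=float)
    log_post = log_post - log_post.max()    # shift for numerical stability
    weights = np.exp(log_post)
    return weights / weights.sum()


# Hypothetical candidates with made-up log-evidence values.
candidates = {"mlp_2x16": -120.4, "mlp_3x32": -118.9, "mlp_5x64": -125.7}

plaus = model_plausibilities(list(candidates.values()))
best = max(zip(candidates, plaus), key=lambda kv: kv[1])[0]
```

Working in log space and subtracting the maximum before exponentiating avoids underflow when the log-evidences are large and negative, which is typical in practice.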

The authors demonstrate the effectiveness of their framework on two modeling problems: the deformation of porous materials for building insulation, and turbulent combustion flow for the ablation of solid fuels within hybrid rocket motors. The results show that the proposed framework can discover reliable surrogate models that accurately capture the underlying system behavior while providing well-calibrated uncertainty estimates.
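One common way to check whether uncertainty estimates are well calibrated is to compare the nominal coverage of prediction intervals with their empirical coverage on held-out data. The sketch below assumes Gaussian predictive distributions and uses synthetic data; it is an illustrative check, not the validation test used in the paper, and the function name is hypothetical.

```python
import numpy as np


def empirical_coverage(y_true, pred_mean, pred_std, z=1.96):
    """Fraction of held-out targets inside the interval mean +/- z * std.

    With z = 1.96 the nominal coverage is about 95% under a Gaussian assumption.
    """
    y_true = np.asarray(y_true, dtype=float)
    lower = np.asarray(pred_mean, dtype=float) - z * np.asarray(pred_std, dtype=float)
    upper = np.asarray(pred_mean, dtype=float) + z * np.asarray(pred_std, dtype=float)
    inside = (y_true >= lower) & (y_true <= upper)
    return float(inside.mean())


# Synthetic held-out data drawn from the same Gaussian the "model" predicts,
# so the surrogate is calibrated by construction.
rng = np.random.default_rng(0)
pred_mean = np.zeros(2000)
pred_std = np.ones(2000)
y_holdout = rng.normal(pred_mean, pred_std)

coverage = empirical_coverage(y_holdout, pred_mean, pred_std)
```

Empirical coverage well below the nominal level suggests over-confident predictions, and coverage well above it suggests intervals that are wider than necessary.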

Critical Analysis

The paper presents a compelling and principled approach to discovering credible neural network surrogate models. The Bayesian treatment of neural network learning is a strength, as it allows the framework to systematically account for uncertainties in the model parameters and architecture.

However, the authors acknowledge that the variational inference techniques used to approximate the posterior distribution may not always provide accurate uncertainty estimates, especially in high-dimensional or complex settings. Further research is needed to explore more robust Bayesian inference methods that can scale to larger and more challenging problems.

Additionally, the framework assumes that the observed data is representative of the true underlying system. In practice, there may be biases or missing information in the available data, which could lead to inaccuracies in the discovered surrogate models. The authors could have discussed strategies for addressing such data-related challenges.

Despite these limitations, the proposed framework represents an important step towards making neural networks more reliable and trustworthy, particularly in applications where accurate uncertainty quantification is critical.

Conclusion

This paper presents a Bayesian framework for the strategic discovery of credible neural network surrogate models under uncertainty. By treating neural network learning as a probabilistic inference problem, the framework can systematically explore the space of possible models and identify the most reliable surrogates that accurately capture the underlying system behavior while providing well-quantified uncertainty estimates.

The framework's ability to discover credible surrogate models has important implications for a wide range of applications, from power grid operations to medical diagnostics, where having a reliable and transparent representation of complex systems is crucial for informed decision-making. By addressing the over-confidence and uncertainty quantification challenges in deep learning, this research represents an important step towards making neural networks more trustworthy and widely applicable.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Efficient Learning of Accurate Surrogates for Simulations of Complex Systems

A. Diaw, M. McKerns, I. Sagert, L. G. Stanton, M. S. Murillo

Machine learning methods are increasingly used to build computationally inexpensive surrogates for complex physical models. The predictive capability of these surrogates suffers when data are noisy, sparse, or time-dependent. As we are interested in finding a surrogate that provides valid predictions of any potential future model evaluations, we introduce an online learning method empowered by optimizer-driven sampling. The method has two advantages over current approaches. First, it ensures that all turning points on the model response surface are included in the training data. Second, after any new model evaluations, surrogates are tested and retrained (updated) if the score drops below a validity threshold. Tests on benchmark functions reveal that optimizer-directed sampling generally outperforms traditional sampling methods in terms of accuracy around local extrema, even when the scoring metric favors overall accuracy. We apply our method to simulations of nuclear matter to demonstrate that highly accurate surrogates for the nuclear equation of state can be reliably auto-generated from expensive calculations using a few model evaluations.

5/20/2024

cs.LG

Operational risk quantification of power grids using graph neural network surrogates of the DC OPF

Yadong Zhang, Pranav M Karve, Sankaran Mahadevan

A DC OPF surrogate modeling framework is developed for Monte Carlo (MC) sampling-based risk quantification in power grid operation. MC simulation necessitates solving a large number of DC OPF problems corresponding to the samples of stochastic grid variables (power demand and renewable generation), which is computationally prohibitive. Computationally inexpensive surrogates of OPF provide an attractive alternative for expedited MC simulation. Graph neural network (GNN) surrogates of DC OPF, which are especially suitable to graph-structured data, are employed in this work. Previously developed DC OPF surrogate models have focused on accurate operational decision-making and not on risk quantification. Here, risk quantification-specific aspects of DC OPF surrogate evaluation are the main focus. To this end, the proposed GNN surrogates are evaluated using realistic joint probability distributions, quantification of their risk estimation accuracy, and investigation of their generalizability. Four synthetic grids (Case118, Case300, Case1354pegase, and Case2848rte) are used for surrogate model performance evaluation. It is shown that the GNN surrogates are sufficiently accurate for predicting the (bus-level, branch-level and system-level) grid state and enable fast as well as accurate operational risk quantification for power grids. The article thus develops tools for fast reliability and risk quantification in real-world power grids using GNN-based surrogates.

4/23/2024

eess.SY cs.LG cs.SY

Surrogate Neural Networks Local Stability for Aircraft Predictive Maintenance

Mélanie Ducoffe, Guillaume Povéda, Audrey Galametz, Ryma Boumazouza, Marion-Cécile Martin, Julien Baris, Derk Daverschot, Eugene O'Higgins

Surrogate Neural Networks are nowadays routinely used in industry as substitutes for computationally demanding engineering simulations (e.g., in structural analysis). They allow to generate faster predictions and thus analyses in industrial applications e.g., during a product design, testing or monitoring phases. Due to their performance and time-efficiency, these surrogate models are now being developed for use in safety-critical applications. Neural network verification and in particular the assessment of their robustness (e.g., to perturbations) is the next critical step to allow their inclusion in real-life applications and certification. We assess the applicability and scalability of empirical and formal methods in the context of aircraft predictive maintenance for surrogate neural networks designed to predict the stress sustained by an aircraft part from external loads. The case study covers a high-dimensional input and output space and the verification process thus accommodates multi-objective constraints. We explore the complementarity of verification methods in assessing the local stability property of such surrogate models to input noise. We showcase the effectiveness of sequentially combining methods in one verification 'pipeline' and demonstrating the subsequent gain in runtime required to assess the targeted property.

6/6/2024

cs.LG cs.AI

Credal Wrapper of Model Averaging for Uncertainty Estimation on Out-Of-Distribution Detection

Kaizheng Wang, Fabio Cuzzolin, Keivan Shariatmadar, David Moens, Hans Hallez

This paper presents an innovative approach, called credal wrapper, to formulating a credal set representation of model averaging for Bayesian neural networks (BNNs) and deep ensembles, capable of improving uncertainty estimation in classification tasks. Given a finite collection of single distributions derived from BNNs or deep ensembles, the proposed approach extracts an upper and a lower probability bound per class, acknowledging the epistemic uncertainty due to the availability of a limited amount of sampled predictive distributions. Such probability intervals over classes can be mapped on a convex set of probabilities (a 'credal set') from which, in turn, a unique prediction can be obtained using a transformation called 'intersection probability transformation'. In this article, we conduct extensive experiments on multiple out-of-distribution (OOD) detection benchmarks, encompassing various dataset pairs (CIFAR10/100 vs SVHN/Tiny-ImageNet, CIFAR10 vs CIFAR10-C, CIFAR100 vs CIFAR100-C and ImageNet vs ImageNet-O) and using different network architectures (such as VGG16, Res18/50, EfficientNet B2, and ViT Base). Compared to BNN and deep ensemble baselines, the proposed credal representation methodology exhibits superior performance in uncertainty estimation and achieves lower expected calibration error on OOD samples.

5/27/2024

cs.LG cs.AI