Variational Bayesian surrogate modelling with application to robust design optimisation

arXiv:2404.14857

Published 4/24/2024 by Thomas A. Archbold, Ieva Kazlauskaite, Fehmi Cirak


Abstract

Surrogate models provide a quick-to-evaluate approximation to complex computational models and are essential for multi-query problems like design optimisation. The inputs of current computational models are usually high-dimensional and uncertain. We consider Bayesian inference for constructing statistical surrogates with input uncertainties and intrinsic dimensionality reduction. The surrogates are trained by fitting to data from prevalent deterministic computational models. The assumed prior probability density of the surrogate is a Gaussian process. We determine the respective posterior probability density and parameters of the posited statistical model using variational Bayes. The non-Gaussian posterior is approximated by a simpler trial density with free variational parameters and the discrepancy between them is measured using the Kullback-Leibler (KL) divergence. We employ the stochastic gradient method to compute the variational parameters and other statistical model parameters by minimising the KL divergence. We demonstrate the accuracy and versatility of the proposed reduced dimension variational Gaussian process (RDVGP) surrogate on illustrative and robust structural optimisation problems with cost functions depending on a weighted sum of the mean and standard deviation of model outputs.


Overview

Surrogate models are quick-to-evaluate approximations of complex computational models, essential for multi-query problems like design optimization.
Current computational models often have high-dimensional and uncertain inputs.
This paper proposes a Bayesian approach to constructing statistical surrogates with input uncertainties and intrinsic dimensionality reduction.
The surrogates are trained by fitting to data from prevalent deterministic computational models.
The posterior probability density of the surrogate is determined using variational Bayes.

Plain English Explanation

Computational models used in engineering and science can be very complex and time-consuming to run. Surrogate models provide a way to approximate these complex models with a simpler, faster-to-evaluate version. This is particularly useful when you need to run the model many times, such as in design optimization.

The inputs to these computational models are often high-dimensional and uncertain. This means there are many different input variables, and we're not entirely sure of the exact values of those variables. The paper proposes a Bayesian approach to building surrogate models that can handle this uncertainty.

The surrogate model is trained on data from the original computational model. It places a Gaussian process prior on the relationship between the inputs and outputs. The paper then uses a technique called variational Bayes to determine both the posterior distribution of the surrogate and the parameters of the statistical model.
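To make the idea of a Gaussian process surrogate concrete, here is a minimal sketch of standard exact GP regression in NumPy. It is not the paper's variational RDVGP: the squared-exponential kernel, the fixed hyperparameters, and the toy `expensive_model` function are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between inputs a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Exact GP posterior mean and variance at x_test (zero prior mean)."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    chol = np.linalg.cholesky(k_tt)
    # Solve (K + noise * I) alpha = y using the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

def expensive_model(x):
    """Hypothetical stand-in for a slow deterministic simulation."""
    return np.sin(3.0 * x[:, 0])

# Train on a handful of simulation runs, then predict cheaply elsewhere.
x_train = np.linspace(0.0, 2.0, 8)[:, None]
y_train = expensive_model(x_train)
x_test = np.array([[0.5], [1.3]])
mean, var = gp_posterior(x_train, y_train, x_test)
```

The returned variance is what distinguishes a statistical surrogate from a plain curve fit: it is close to zero at the training points and grows away from them.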

The key idea is to approximate the true, non-Gaussian posterior distribution of the surrogate model with a simpler, more tractable trial distribution, and to measure the mismatch between the two with the Kullback-Leibler (KL) divergence. Minimizing that mismatch with stochastic gradient methods allows the model parameters to be computed efficiently.

Technical Explanation

The paper proposes a reduced dimension variational Gaussian process (RDVGP) surrogate model that handles high-dimensional, uncertain inputs by combining input uncertainty with intrinsic dimensionality reduction. The surrogate is trained on data from an existing deterministic computational model.
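As a rough illustration of the two ingredients named above, the sketch below draws noisy copies of a nominal input and maps them to a low-dimensional space with a linear projection. The linear projection, the random orthonormal matrix `w`, the dimensions, and the noise level are all assumptions made for illustration. This summary does not specify how the RDVGP performs its reduction, and in practice the projection would be learned rather than drawn at random.

```python
import numpy as np

def orthonormal_projection(dim_in, dim_latent, rng):
    """Random matrix with orthonormal columns (stand-in for a learned projection)."""
    q, _ = np.linalg.qr(rng.standard_normal((dim_in, dim_latent)))
    return q  # shape (dim_in, dim_latent), with q.T @ q equal to the identity

def sample_uncertain_inputs(x_nominal, input_std, n_samples, rng):
    """Draw Gaussian-perturbed copies of a nominal input vector."""
    noise = rng.standard_normal((n_samples, x_nominal.size))
    return x_nominal + input_std * noise

rng = np.random.default_rng(0)
dim_in, dim_latent = 20, 2          # assumed sizes, for illustration only
w = orthonormal_projection(dim_in, dim_latent, rng)

x_nominal = np.ones(dim_in)
x_samples = sample_uncertain_inputs(x_nominal, input_std=0.1,
                                    n_samples=500, rng=rng)

# Reduced-dimension inputs: a surrogate would be fitted on z, not x.
z_samples = x_samples @ w
```

A surrogate built on `z_samples` works with 2 input dimensions instead of 20, which is the practical motivation for reducing the dimension before fitting a Gaussian process.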

The assumed prior probability density of the surrogate is a Gaussian process. The authors determine the posterior probability density of the surrogate using variational Bayes. This involves approximating the non-Gaussian posterior with a simpler trial density, and minimizing the Kullback-Leibler (KL) divergence between the two using stochastic gradient methods.
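The variational step described above can be sketched on a one-dimensional toy problem. The code fits a Gaussian trial density to a non-Gaussian target by stochastic gradient descent on the KL divergence, using reparameterised samples. The quartic target, the step size, and the batch size are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def grad_log_target(theta):
    """Gradient of the unnormalised log-density log p(theta) = -(theta - 1)^4 / 4."""
    return -(theta - 1.0) ** 3

def fit_gaussian_vi(n_steps=3000, batch=128, lr=0.02, seed=0):
    """Minimise KL(q || p) over q = N(mean, std^2) by stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    mean, log_std = 0.0, 0.0  # variational parameters
    for _ in range(n_steps):
        eps = rng.standard_normal(batch)
        std = np.exp(log_std)
        theta = mean + std * eps  # reparameterised samples from q
        g = grad_log_target(theta)
        # Up to a constant, KL(q || p) = -log_std - E_q[log p(theta)].
        grad_mean = -np.mean(g)
        grad_log_std = -np.mean(g * std * eps) - 1.0
        mean -= lr * grad_mean
        log_std -= lr * grad_log_std
    return mean, np.exp(log_std)

vi_mean, vi_std = fit_gaussian_vi()
```

For this particular target the KL-optimal Gaussian can be worked out by hand: its mean is 1 and its standard deviation is (1/3)^(1/4), roughly 0.76, so the stochastic estimates should settle near those values.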

The RDVGP surrogate is demonstrated on illustrative and robust structural optimization problems, where the cost function depends on a weighted sum of the mean and standard deviation of the model outputs. This showcases the accuracy and versatility of the proposed approach.
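A cost function of this kind can be estimated by sampling the uncertain input and combining the sample mean and standard deviation of the outputs. The sketch below assumes a hypothetical one-dimensional `surrogate` function, a Gaussian input perturbation, and a simple grid search; none of these come from the paper.

```python
import numpy as np

def surrogate(design, noisy_input):
    """Hypothetical stand-in for a trained surrogate's prediction."""
    return (design - 1.0) ** 2 + design * noisy_input

def robust_cost(design, weight, input_std=0.5, n_samples=20000, seed=0):
    """Monte Carlo estimate of mean + weight * standard deviation of the output."""
    rng = np.random.default_rng(seed)  # same seed gives common random numbers
    noisy_input = input_std * rng.standard_normal(n_samples)
    outputs = surrogate(design, noisy_input)
    return outputs.mean() + weight * outputs.std()

# Compare the design that minimises the mean alone with the robust design.
designs = np.linspace(0.0, 2.0, 41)
costs_nominal = [robust_cost(d, weight=0.0) for d in designs]
costs_robust = [robust_cost(d, weight=2.0) for d in designs]
best_nominal = designs[int(np.argmin(costs_nominal))]
best_robust = designs[int(np.argmin(costs_robust))]
```

In this toy problem the output's sensitivity to the noise grows with the design value, so putting weight on the standard deviation pulls the optimum toward a smaller design than the one that minimises the mean alone.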

Critical Analysis

The paper provides a comprehensive Bayesian framework for constructing surrogate models with high-dimensional, uncertain inputs. The use of variational Bayes to efficiently approximate the posterior distribution is a key contribution.

However, the paper does not extensively discuss the limitations of the proposed RDVGP approach. For example, the performance of the surrogate may degrade as the dimensionality of the input space increases, or if the true input-output relationship deviates significantly from the assumed Gaussian process prior.

Additionally, the paper could have provided more details on the computational complexity of the variational inference procedure, and how it scales with the size of the training dataset. This would help readers understand the practical applicability of the method.

Further research could explore alternative dimensionality reduction techniques within the variational Bayes framework, or investigate the use of more flexible prior distributions beyond the Gaussian process.

Conclusion

This paper presents a Bayesian approach to constructing surrogate models that can handle high-dimensional, uncertain inputs. By using variational Bayes to efficiently approximate the posterior distribution, the authors demonstrate an accurate and versatile surrogate modeling technique for complex computational problems, such as robust structural optimization.

The proposed RDVGP surrogate has the potential to significantly reduce the computational burden of multi-query tasks, enabling more efficient design, analysis, and decision-making in a wide range of engineering and scientific applications.


Related Papers

Approximation-Aware Bayesian Optimization

Natalie Maus, Kyurae Kim, Geoff Pleiss, David Eriksson, John P. Cunningham, Jacob R. Gardner

High-dimensional Bayesian optimization (BO) tasks such as molecular design often require 10,000 function evaluations before obtaining meaningful results. While methods like sparse variational Gaussian processes (SVGPs) reduce computational requirements in these settings, the underlying approximations result in suboptimal data acquisitions that slow the progress of optimization. In this paper we modify SVGPs to better align with the goals of BO: targeting informed data acquisition rather than global posterior fidelity. Using the framework of utility-calibrated variational inference, we unify GP approximation and data acquisition into a joint optimization problem, thereby ensuring optimal decisions under a limited computational budget. Our approach can be used with any decision-theoretic acquisition function and is compatible with trust region methods like TuRBO. We derive efficient joint objectives for the expected improvement and knowledge gradient acquisition functions in both the standard and batch BO settings. Our approach outperforms standard SVGPs on high-dimensional benchmark tasks in control and molecular design.

6/7/2024

cs.LG stat.ML


Variational inference, Mixture of Gaussians, Bayesian Machine Learning

Tom Huix, Anna Korba, Alain Durmus, Eric Moulines

Variational inference (VI) is a popular approach in Bayesian inference, that looks for the best approximation of the posterior distribution within a parametric family, minimizing a loss that is typically the (reverse) Kullback-Leibler (KL) divergence. Despite its empirical success, the theoretical properties of VI have only received attention recently, and mostly when the parametric family is the one of Gaussians. This work aims to contribute to the theoretical study of VI in the non-Gaussian case by investigating the setting of Mixture of Gaussians with fixed covariance and constant weights. In this view, VI over this specific family can be casted as the minimization of a Mollified relative entropy, i.e. the KL between the convolution (with respect to a Gaussian kernel) of an atomic measure supported on Diracs, and the target distribution. The support of the atomic measure corresponds to the localization of the Gaussian components. Hence, solving variational inference becomes equivalent to optimizing the positions of the Diracs (the particles), which can be done through gradient descent and takes the form of an interacting particle system. We study two sources of error of variational inference in this context when optimizing the mollified relative entropy. The first one is an optimization result, that is a descent lemma establishing that the algorithm decreases the objective at each iteration. The second one is an approximation error, that upper bounds the objective between an optimal finite mixture and the target distribution.

6/11/2024

stat.ML cs.LG


A Study of Bayesian Neural Network Surrogates for Bayesian Optimization

Yucen Lily Li, Tim G. J. Rudner, Andrew Gordon Wilson

Bayesian optimization is a highly efficient approach to optimizing objective functions which are expensive to query. These objectives are typically represented by Gaussian process (GP) surrogate models which are easy to optimize and support exact inference. While standard GP surrogates have been well-established in Bayesian optimization, Bayesian neural networks (BNNs) have recently become practical function approximators, with many benefits over standard GPs such as the ability to naturally handle non-stationarity and learn representations for high-dimensional data. In this paper, we study BNNs as alternatives to standard GP surrogates for optimization. We consider a variety of approximate inference procedures for finite-width BNNs, including high-quality Hamiltonian Monte Carlo, low-cost stochastic MCMC, and heuristics such as deep ensembles. We also consider infinite-width BNNs, linearized Laplace approximations, and partially stochastic models such as deep kernel learning. We evaluate this collection of surrogate models on diverse problems with varying dimensionality, number of objectives, non-stationarity, and discrete and continuous inputs. We find: (i) the ranking of methods is highly problem dependent, suggesting the need for tailored inductive biases; (ii) HMC is the most successful approximate inference procedure for fully stochastic BNNs; (iii) full stochasticity may be unnecessary as deep kernel learning is relatively competitive; (iv) deep ensembles perform relatively poorly; (v) infinite-width BNNs are particularly promising, especially in high dimensions.

5/9/2024

cs.LG stat.ML


Pseudo-Bayesian Optimization

Haoxian Chen, Henry Lam

Bayesian Optimization is a popular approach for optimizing expensive black-box functions. Its key idea is to use a surrogate model to approximate the objective and, importantly, quantify the associated uncertainty that allows a sequential search of query points that balance exploitation-exploration. Gaussian process (GP) has been a primary candidate for the surrogate model, thanks to its Bayesian-principled uncertainty quantification power and modeling flexibility. However, its challenges have also spurred an array of alternatives whose convergence properties could be more opaque. Motivated by these, we study in this paper an axiomatic framework that elicits the minimal requirements to guarantee black-box optimization convergence that could apply beyond GP-based methods. Moreover, we leverage the design freedom in our framework, which we call Pseudo-Bayesian Optimization, to construct empirically superior algorithms. In particular, we show how using simple local regression, and a suitable randomized prior construction to quantify uncertainty, not only guarantees convergence but also consistently outperforms state-of-the-art benchmarks in examples ranging from high-dimensional synthetic experiments to realistic hyperparameter tuning and robotic applications.

6/21/2024

stat.ML cs.LG