Online Calibrated and Conformal Prediction Improves Bayesian Optimization

arXiv:2112.04620

Published 6/27/2024 by Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

Abstract

Accurate uncertainty estimates are important in sequential model-based decision-making tasks such as Bayesian optimization. However, these estimates can be imperfect if the data violates assumptions made by the model (e.g., Gaussianity). This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization, and argues that uncertainties can benefit from calibration -- i.e., an 80% predictive interval should contain the true outcome 80% of the time. Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions. We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data, and we show how to integrate these algorithms in Bayesian optimization with minimal overhead. Empirically, we find that calibrated Bayesian optimization converges to better optima in fewer steps, and we demonstrate improved performance on standard benchmark functions and hyperparameter optimization tasks.

Overview

Accurate uncertainty estimates are crucial for sequential decision-making tasks like Bayesian optimization.
However, these estimates can be imperfect if the data does not meet the model's assumptions (e.g., Gaussianity).
This paper explores the uncertainties needed for model-based decision-making and Bayesian optimization, and argues that uncertainties should be calibrated - meaning an 80% predictive interval should contain the true outcome 80% of the time.
Maintaining calibration can be challenging when the data is non-stationary and depends on the actions taken.
The authors propose using simple online learning algorithms to provably maintain calibration on non-i.i.d. data, and integrate these algorithms into Bayesian optimization with minimal overhead.

Plain English Explanation

When making decisions based on a model's predictions, it's important to have accurate estimates of the uncertainty around those predictions. This helps ensure the decisions are well-informed and account for the potential for error.

However, the standard assumptions made by many models, like the data being normally distributed, don't always hold true in the real world. This can lead to inaccurate uncertainty estimates that undermine the decision-making process.

This paper looks at what kind of uncertainty information is needed for model-based decision-making, particularly in the context of Bayesian optimization. The key idea is that the uncertainty estimates should be "calibrated": if the model reports an 80% predictive interval, the true value should fall inside that interval about 80% of the time.
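
The calibration property described above can be checked empirically by counting how often outcomes land inside the model's stated interval. The following is a minimal sketch, assuming a model that reports a Gaussian mean and standard deviation for each prediction; the function name `empirical_coverage` and the synthetic data are illustrative and do not come from the paper.

```python
import numpy as np
from statistics import NormalDist

def empirical_coverage(y, mean, std, level=0.8):
    """Fraction of outcomes y inside the central `level` Gaussian interval."""
    # Half-width multiplier of the central interval (about 1.28 for level=0.8).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = mean - z * std
    upper = mean + z * std
    return float(np.mean((y >= lower) & (y <= upper)))

# Synthetic outcomes drawn from N(0, 1) (an assumption for illustration).
rng = np.random.default_rng(0)
n = 20000
y = rng.normal(0.0, 1.0, size=n)
zeros = np.zeros(n)

# A model that reports the correct spread is calibrated at the 80% level.
well_specified = empirical_coverage(y, zeros, np.ones(n))
# A model that reports half the true spread is overconfident:
# its "80%" intervals contain the outcome far less often than 80%.
overconfident = empirical_coverage(y, zeros, 0.5 * np.ones(n))

print(f"well specified: {well_specified:.3f}, overconfident: {overconfident:.3f}")
```

A gap between the stated level and the observed coverage, as in the second case, is what recalibration is meant to close.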

Maintaining this calibration can be tricky, especially when the data distribution shifts over time (a phenomenon known as "non-stationarity") and depends on the decisions being made.

The researchers propose using some simple online learning techniques to keep the uncertainty estimates well-calibrated, even as the data shifts over time. They show how to integrate these calibration-aware methods into Bayesian optimization with minimal additional complexity.

In their experiments, the calibrated Bayesian optimization approach was able to find better optimal solutions in fewer steps compared to standard methods. This suggests that properly accounting for uncertainty can lead to smarter, more efficient decision-making.

Technical Explanation

The paper first establishes the importance of accurate uncertainty estimates in sequential decision-making tasks like Bayesian optimization. Inaccurate uncertainty can lead to poor decisions, especially when the data violates the model's assumptions (e.g., Gaussianity).

The authors then argue that the key requirement for useful uncertainties is calibration, meaning an 80% predictive interval should contain the true outcome 80% of the time. However, maintaining calibration is challenging when the data is non-stationary and depends on the actions taken.
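
One common way to formalize this notion for sequential forecasts is quantile calibration. The statement below is a sketch in notation assumed here rather than taken from the paper: $F_t$ is the model's predictive CDF at step $t$, $y_t$ is the observed outcome, and $p$ is a probability level.

```latex
% Notation (assumed): F_t = predictive CDF at step t, y_t = observed outcome,
% T = number of steps so far, p = probability level.
\[
  \frac{1}{T} \sum_{t=1}^{T} \mathbb{1}\{\, y_t \le F_t^{-1}(p) \,\}
  \;\longrightarrow\; p
  \quad \text{as } T \to \infty,
  \qquad \text{for all } p \in [0, 1].
\]
% Applying this at p = 0.9 and p = 0.1, the central interval
% [F_t^{-1}(0.1), F_t^{-1}(0.9)] contains y_t about 80% of the time.
```

Note that the average runs over the sequence of observed outcomes, so the statement does not require the data to be i.i.d.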

To address this, the researchers propose using simple online learning algorithms to provably maintain calibration on non-i.i.d. data. These algorithms are designed to adaptively correct the model's uncertainty estimates as the data changes.
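
To give a feel for how an online correction of this kind can work, here is a minimal sketch of a simple online quantile-tracking update. It is a swapped-in illustration, not the authors' algorithm: it adjusts a single corrected probability level after each observation so that the running coverage approaches the target. The function name `track_level`, the learning rate `lr`, and the synthetic data are all assumptions.

```python
import numpy as np
from statistics import NormalDist

def track_level(pit_values, target=0.8, lr=0.01):
    """Online quantile tracking on PIT values.

    pit_values[t] is F_t(y_t): the model's predictive CDF evaluated at the
    observed outcome. Returns the corrected level used at each step, which
    is always chosen before y_t is seen.
    """
    level = target  # start by trusting the model's nominal level
    used = np.empty(len(pit_values))
    for t, u in enumerate(pit_values):
        used[t] = level
        covered = 1.0 if u <= level else 0.0
        # Raise the level after a miss, lower it slightly after a hit.
        level += lr * (target - covered)
    return used

# Illustrative mismatch: the model predicts N(0, 1), outcomes come from N(0, 2^2).
rng = np.random.default_rng(0)
T = 5000
y = rng.normal(0.0, 2.0, size=T)
std_normal = NormalDist()
pit = np.array([std_normal.cdf(v) for v in y])

levels = track_level(pit, target=0.8, lr=0.01)
# How often y_t falls below the model's nominal 80% quantile.
raw_coverage = float(np.mean(pit <= 0.8))
# How often y_t falls below the corrected quantile F_t^{-1}(level_t).
corrected_coverage = float(np.mean(pit <= levels))

print(f"raw: {raw_coverage:.3f}, corrected: {corrected_coverage:.3f}")
```

Because each update moves the level by `lr * (target - covered)`, the updates telescope: for this particular rule the gap between running coverage and the target is at most `(1 + lr) / (lr * T)` for any sequence of outcomes, with no distributional assumption. The algorithms analyzed in the paper are different and come with their own guarantees.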

The authors demonstrate how to integrate these calibration-aware techniques into Bayesian optimization with minimal overhead. Empirically, they show this "calibrated Bayesian optimization" converges to better optima in fewer steps compared to standard methods, on both benchmark functions and real-world hyperparameter tuning tasks.
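
The integration step can be pictured with a toy loop. The sketch below is a hypothetical illustration under strong simplifying assumptions, not the paper's method: a one-dimensional objective, a small hand-written Gaussian process with a fixed RBF kernel, an upper-confidence-bound acquisition, and the single-level online update as the recalibrator. All function names and parameter values are illustrative.

```python
import numpy as np
from statistics import NormalDist

STD_NORMAL = NormalDist()

def rbf_kernel(a, b, length=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and std of a zero-mean, unit-variance GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query)
    mean = Ks.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def calibrated_ucb_bo(objective, n_iter=12, target=0.9, lr=0.05):
    """Maximize `objective` on [0, 1] with a UCB rule whose quantile level
    is corrected online from the model's past coverage."""
    grid = np.linspace(0.0, 1.0, 101)
    xs = [0.0, 1.0]
    ys = [objective(x) for x in xs]
    level = target
    for _ in range(n_iter):
        mean, std = gp_posterior(np.array(xs), np.array(ys), grid)
        # Clip only for inversion so inv_cdf always gets a value in (0, 1).
        z = STD_NORMAL.inv_cdf(min(max(level, 1e-3), 1.0 - 1e-3))
        idx = int(np.argmax(mean + z * std))
        x_next = float(grid[idx])
        y_next = objective(x_next)
        # Where the outcome fell in the model's predictive CDF at x_next.
        pit = STD_NORMAL.cdf((y_next - mean[idx]) / std[idx])
        # Recalibration step: one cheap update per observation.
        level += lr * (target - (1.0 if pit <= level else 0.0))
        xs.append(x_next)
        ys.append(y_next)
    best = int(np.argmax(ys))
    return xs[best], ys[best], level

def objective(x):
    """Toy objective (an assumption): maximum value 0 at x = 0.3."""
    return -(x - 0.3) ** 2

x_best, y_best, final_level = calibrated_ucb_bo(objective)
print(f"best x: {x_best:.2f}, best y: {y_best:.4f}, level: {final_level:.3f}")
```

The recalibrator adds one comparison and one addition per iteration, which is the sense in which such a correction can be attached to an existing optimization loop with little overhead. The paper recalibrates the model's predictive distribution more generally; correcting a single quantile level is a deliberate simplification here.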

Critical Analysis

The paper makes a compelling case for the importance of calibrated uncertainty estimates in model-based decision-making. The proposed online learning algorithms for maintaining calibration, even in the face of non-stationary data, are a clever and principled solution.

That said, the authors acknowledge several limitations and directions for future work. For example, the current methods assume the data follows some unknown, but fixed, distribution over time. Extending the techniques to handle more complex, evolving data distributions could further broaden their applicability.

Additionally, while the empirical results on benchmark problems are promising, real-world deployment may surface other challenges not captured in these idealized settings. Validating the approach on a wider range of applications would help assess its practical utility.

Overall, this research represents an important step forward in addressing a fundamental challenge in sequential decision-making. By maintaining well-calibrated uncertainty estimates, the techniques developed here have the potential to significantly improve the performance and reliability of model-based optimization methods like Bayesian optimization.

Conclusion

This paper tackles the critical issue of maintaining accurate uncertainty estimates in sequential decision-making tasks, such as Bayesian optimization.

The key insight is that uncertainty estimates should be "calibrated" - meaning the model's reported confidence levels (e.g., 80% intervals) actually reflect the true likelihood of the outcomes. The authors propose practical online learning algorithms to provably maintain this calibration, even as the data changes in complex, non-stationary ways.

Integrating these calibration-aware techniques into Bayesian optimization, the researchers demonstrate improved performance in finding optimal solutions with fewer iterations. This work represents an important advance in making model-based decision-making more robust and reliable, with potential applications across a wide range of fields.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Calibrated Regression Against An Adversary Without Regret

Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

We are interested in probabilistic prediction in online settings in which data does not follow a probability distribution. Our work seeks to achieve two goals: (1) producing valid probabilities that accurately reflect model confidence; and (2) ensuring that traditional notions of performance (e.g., high accuracy) still hold. We introduce online algorithms guaranteed to achieve these goals on arbitrary streams of data points, including data chosen by an adversary. Specifically, our algorithms produce forecasts that are (1) calibrated -- i.e., an 80% confidence interval contains the true outcome 80% of the time -- and (2) have low regret relative to a user-specified baseline model. We implement a post-hoc recalibration strategy that provably achieves these goals in regression; previous algorithms applied to classification or achieved (1) but not (2). In the context of Bayesian optimization, an online model-based decision-making task in which the data distribution shifts over time, our method yields accelerated convergence to improved optima.

6/6/2024

cs.LG

Bayesian Adaptive Calibration and Optimal Design

Rafael Oliveira, Dino Sejdinovic, David Howard, Edwin Bonilla

The process of calibrating computer models of natural phenomena is essential for applications in the physical sciences, where plenty of domain knowledge can be embedded into simulations and then calibrated against real observations. Current machine learning approaches, however, mostly rely on rerunning simulations over a fixed set of designs available in the observed data, potentially neglecting informative correlations across the design space and requiring a large amount of simulations. Instead, we consider the calibration process from the perspective of Bayesian adaptive experimental design and propose a data-efficient algorithm to run maximally informative simulations within a batch-sequential process. At each round, the algorithm jointly estimates the parameters of the posterior distribution and optimal designs by maximising a variational lower bound of the expected information gain. The simulator is modelled as a sample from a Gaussian process, which allows us to correlate simulations and observed data with the unknown calibration parameters. We show the benefits of our method when compared to related approaches across synthetic and real-data problems.

5/24/2024

cs.LG stat.ML

Calibration-Aware Bayesian Learning

Jiayi Huang, Sangwoo Park, Osvaldo Simeone

Deep learning models, including modern systems like large language models, are well known to offer unreliable estimates of the uncertainty of their decisions. In order to improve the quality of the confidence levels, also known as calibration, of a model, common approaches entail the addition of either data-dependent or data-independent regularization terms to the training loss. Data-dependent regularizers have been recently introduced in the context of conventional frequentist learning to penalize deviations between confidence and accuracy. In contrast, data-independent regularizers are at the core of Bayesian learning, enforcing adherence of the variational distribution in the model parameter space to a prior density. The former approach is unable to quantify epistemic uncertainty, while the latter is severely affected by model misspecification. In light of the limitations of both methods, this paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs), that applies both regularizers while optimizing over a variational distribution as in Bayesian learning. Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.

4/15/2024

cs.LG eess.SP

Stochastic Online Conformal Prediction with Semi-Bandit Feedback

Haosen Ge, Hamsa Bastani, Osbert Bastani

Conformal prediction has emerged as an effective strategy for uncertainty quantification by modifying a model to output sets of labels instead of a single label. These prediction sets come with the guarantee that they contain the true label with high probability. However, conformal prediction typically requires a large calibration dataset of i.i.d. examples. We consider the online learning setting, where examples arrive over time, and the goal is to construct prediction sets dynamically. Departing from existing work, we assume semi-bandit feedback, where we only observe the true label if it is contained in the prediction set. For instance, consider calibrating a document retrieval model to a new domain; in this setting, a user would only be able to provide the true label if the target document is in the prediction set of retrieved documents. We propose a novel conformal prediction algorithm targeted at this setting, and prove that it obtains sublinear regret compared to the optimal conformal predictor. We evaluate our algorithm on a retrieval task and an image classification task, and demonstrate that it empirically achieves good performance.

5/24/2024

cs.LG