Stochastic Online Conformal Prediction with Semi-Bandit Feedback

2405.13268

Published 5/24/2024 by Haosen Ge, Hamsa Bastani, Osbert Bastani

🔮

Abstract

Conformal prediction has emerged as an effective strategy for uncertainty quantification by modifying a model to output sets of labels instead of a single label. These prediction sets come with the guarantee that they contain the true label with high probability. However, conformal prediction typically requires a large calibration dataset of i.i.d. examples. We consider the online learning setting, where examples arrive over time, and the goal is to construct prediction sets dynamically. Departing from existing work, we assume semi-bandit feedback, where we only observe the true label if it is contained in the prediction set. For instance, consider calibrating a document retrieval model to a new domain; in this setting, a user would only be able to provide the true label if the target document is in the prediction set of retrieved documents. We propose a novel conformal prediction algorithm targeted at this setting, and prove that it obtains sublinear regret compared to the optimal conformal predictor. We evaluate our algorithm on a retrieval task and an image classification task, and demonstrate that it empirically achieves good performance.

Create account to get full access

Overview

The paper introduces a new conformal prediction algorithm for the online learning setting with semi-bandit feedback.
Conformal prediction is a technique that outputs prediction sets instead of single labels, with a guarantee that the true label is contained in the set with high probability.
The semi-bandit feedback setting is where the true label is only revealed if it is contained in the prediction set, e.g. when calibrating a document retrieval model to a new domain.
The proposed algorithm is shown to achieve sublinear regret compared to the optimal conformal predictor in this setting.

Plain English Explanation

In standard machine learning, models are typically trained to output a single predicted label for each input. However, this can be uncertain, especially when applying a model to a new domain. Conformal prediction is a technique that modifies models to instead output a

set

of possible labels. These prediction sets are designed to contain the true label with high probability.

This paper focuses on the

online learning

setting, where examples arrive over time, and the goal is to construct these prediction sets dynamically. The researchers assume a

semi-bandit feedback

scenario, where the true label is only revealed if it is contained in the prediction set. This is like calibrating a document retrieval model - the user can only provide feedback if the target document is in the set of retrieved documents.

The researchers propose a new conformal prediction algorithm tailored to this semi-bandit setting. They prove that their algorithm achieves

sublinear regret

, meaning it performs nearly as well as the best possible conformal predictor over time. They also demonstrate good empirical performance on retrieval and image classification tasks.

Technical Explanation

The key technical contribution of this paper is a novel conformal prediction algorithm designed for the online learning setting with

semi-bandit feedback

. In this setting, examples arrive sequentially, and the true label is only revealed if it is contained in the prediction set output by the model.

The algorithm works as follows: for each new example, it computes a

nonconformity score

that measures how different the example is from the previous examples. It then uses these scores to construct a

prediction set

that is guaranteed to contain the true label with high probability, as in standard conformal prediction.

Crucially, the algorithm updates its internal model

adaptively

based on the semi-bandit feedback received. That is, it only updates the model when the true label is revealed (i.e., when it is contained in the prediction set). The researchers prove that this approach achieves

sublinear regret

compared to the optimal conformal predictor, meaning it converges to the same performance as the optimal predictor over time.

The authors evaluate their algorithm on a document retrieval task and an image classification task. They show that it outperforms baselines that do not utilize the semi-bandit feedback, demonstrating the benefits of their adaptive approach.

Critical Analysis

The key strength of this work is its ability to handle the semi-bandit feedback setting, where the true label is only revealed when it is contained in the model's prediction set. This is a realistic scenario in many applications, such as calibrating a document retrieval model to a new domain.

That said, the paper does not address some important limitations. For example, the analysis assumes that the data is i.i.d. (independent and identically distributed), which may not hold in many real-world settings. Additionally, the algorithm requires storing all past examples, which may not be feasible for large-scale applications.

Furthermore, the paper does not discuss the

computational complexity

of the proposed algorithm, which is an important practical consideration. It would be helpful to understand the scalability of this approach compared to other conformal prediction methods.

Finally, the paper could benefit from a more thorough discussion of the implications of this work for the broader field of machine learning and uncertainty quantification. The potential applications and limitations of this approach could be explored in more depth.

Conclusion

This paper presents a novel conformal prediction algorithm for the online learning setting with semi-bandit feedback. By adaptively updating its internal model based on the limited feedback available, the algorithm is able to achieve strong performance guarantees compared to the optimal conformal predictor.

While the paper has some limitations, it represents an important step forward in developing effective uncertainty quantification techniques for real-world machine learning applications where full feedback may not be available. The researchers' work highlights the value of tailoring conformal prediction methods to specific feedback scenarios, and their algorithm could serve as a foundation for future developments in this area.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

🏷️

Efficient Online Set-valued Classification with Bandit Feedback

Zhou Wang, Xingye Qiao

Conformal prediction is a distribution-free method that wraps a given machine learning model and returns a set of plausible labels that contain the true label with a prescribed coverage rate. In practice, the empirical coverage achieved highly relies on fully observed label information from data both in the training phase for model fitting and the calibration phase for quantile estimation. This dependency poses a challenge in the context of online learning with bandit feedback, where a learner only has access to the correctness of actions (i.e., pulled an arm) but not the full information of the true label. In particular, when the pulled arm is incorrect, the learner only knows that the pulled one is not the true class label, but does not know which label is true. Additionally, bandit feedback further results in a smaller labeled dataset for calibration, limited to instances with correct actions, thereby affecting the accuracy of quantile estimation. To address these limitations, we propose Bandit Class-specific Conformal Prediction (BCCP), offering coverage guarantees on a class-specific granularity. Using an unbiased estimation of an estimand involving the true label, BCCP trains the model and makes set-valued inferences through stochastic gradient descent. Our approach overcomes the challenges of sparsely labeled data in each iteration and generalizes the reliability and applicability of conformal prediction to online decision-making environments.

5/8/2024

stat.ML cs.LG

Conformal online model aggregation

Matteo Gasparin, Aaditya Ramdas

Conformal prediction equips machine learning models with a reasonable notion of uncertainty quantification without making strong distributional assumptions. It wraps around any black-box prediction model and converts point predictions into set predictions that have a predefined marginal coverage guarantee. However, conformal prediction only works if we fix the underlying machine learning model in advance. A relatively unaddressed issue in conformal prediction is that of model selection and/or aggregation: for a given problem, which of the plethora of prediction methods (random forests, neural nets, regularized linear models, etc.) should we conformalize? This paper proposes a new approach towards conformal model aggregation in online settings that is based on combining the prediction sets from several algorithms by voting, where weights on the models are adapted over time based on past performance.

5/3/2024

stat.ML cs.LG

🔮

Online Calibrated and Conformal Prediction Improves Bayesian Optimization

Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

Accurate uncertainty estimates are important in sequential model-based decision-making tasks such as Bayesian optimization. However, these estimates can be imperfect if the data violates assumptions made by the model (e.g., Gaussianity). This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization, and argues that uncertainties can benefit from calibration -- i.e., an 80% predictive interval should contain the true outcome 80% of the time. Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions. We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data, and we show how to integrate these algorithms in Bayesian optimization with minimal overhead. Empirically, we find that calibrated Bayesian optimization converges to better optima in fewer steps, and we demonstrate improved performance on standard benchmark functions and hyperparameter optimization tasks.

6/27/2024

cs.LG stat.ML

🔮

Self-Consistent Conformal Prediction

Lars van der Laan, Ahmed M. Alaa

In decision-making guided by machine learning, decision-makers may take identical actions in contexts with identical predicted outcomes. Conformal prediction helps decision-makers quantify uncertainty in point predictions of outcomes, allowing for better risk management for actions. Motivated by this perspective, we introduce textit{Self-Consistent Conformal Prediction} for regression, which combines two post-hoc approaches -- Venn-Abers calibration and conformal prediction -- to provide calibrated point predictions and compatible prediction intervals that are valid conditional on model predictions. Our procedure can be applied post-hoc to any black-box model to provide predictions and inferences with finite-sample prediction-conditional guarantees. Numerical experiments show our approach strikes a balance between interval efficiency and conditional validity.

4/23/2024

stat.ML cs.LG