Verifiably Robust Conformal Prediction

2405.18942

Published 6/7/2024 by Linus Jeary, Tom Kuipers, Mehran Hosseini, Nicola Paoletti

Abstract

Conformal Prediction (CP) is a popular uncertainty quantification method that provides distribution-free, statistically valid prediction sets, assuming that training and test data are exchangeable. In such a case, CP's prediction sets are guaranteed to cover the (unknown) true test output with a user-specified probability. Nevertheless, this guarantee is violated when the data is subjected to adversarial attacks, which often result in a significant loss of coverage. Recently, several approaches have been put forward to recover CP guarantees in this setting. These approaches leverage variations of randomised smoothing to produce conservative sets which account for the effect of the adversarial perturbations. They are, however, limited in that they only support $\ell^2$-bounded perturbations and classification tasks. This paper introduces VRCP (Verifiably Robust Conformal Prediction), a new framework that leverages recent neural network verification methods to recover coverage guarantees under adversarial attacks. Our VRCP method is the first to support perturbations bounded by arbitrary norms including $\ell^1$, $\ell^2$, and $\ell^\infty$, as well as regression tasks. We evaluate and compare our approach on image classification tasks (CIFAR10, CIFAR100, and TinyImageNet) and regression tasks for deep reinforcement learning environments. In every case, VRCP achieves above nominal coverage and yields significantly more efficient and informative prediction regions than the SotA.

Overview

Introduces a new approach for conformal prediction, a method for constructing prediction sets with guaranteed validity
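Proposes a framework called "verifiably robust conformal prediction" (VRCP) that uses neural network verification to recover coverage guarantees under adversarial attacks, supporting perturbations bounded by arbitrary norms as well as regression tasks
Evaluates the approach on image classification (CIFAR10, CIFAR100, and TinyImageNet) and on regression tasks for deep reinforcement learning environments, reporting above-nominal coverage and more efficient prediction regions than the state of the art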

Plain English Explanation

Conformal prediction is a statistical technique used to construct prediction sets with guaranteed validity. This means that the prediction sets produced will contain the true outcome a certain percentage of the time, even if the underlying data or model is not perfectly accurate.
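To make the idea concrete, here is a minimal sketch of standard (split) conformal prediction for classification. It is an illustration only, not code from the paper: the synthetic `fake_model` stands in for a real classifier, the score `1 - p(true class)` is one common choice among many, and all names are hypothetical.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the threshold score
    if k > n:
        return np.inf  # too few calibration points: every label is included
    return np.sort(scores)[k - 1]

def prediction_sets(probs, q_hat):
    """Boolean mask (n_test, n_classes): True where a label is in the set."""
    return (1.0 - probs) <= q_hat

# Synthetic stand-in for a classifier's softmax outputs (an assumption).
rng = np.random.default_rng(0)
n_cal, n_test, n_classes, alpha = 1000, 2000, 5, 0.1

def fake_model(n):
    labels = rng.integers(0, n_classes, size=n)
    logits = rng.normal(size=(n, n_classes))
    logits[np.arange(n), labels] += 2.0  # make the true class more likely
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return probs, labels

cal_probs, cal_labels = fake_model(n_cal)
test_probs, test_labels = fake_model(n_test)

# Calibrate: score each held-out point by 1 - probability of its true label.
cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]
q_hat = conformal_quantile(cal_scores, alpha)

# Predict: keep every label whose score does not exceed the threshold.
sets = prediction_sets(test_probs, q_hat)
coverage = sets[np.arange(n_test), test_labels].mean()
avg_size = sets.sum(axis=1).mean()
print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

With `alpha = 0.1`, the sets should contain the true label roughly 90% of the time on average, as long as the calibration and test data are exchangeable.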

The paper introduces a new approach called "verifiably robust conformal prediction" (VRCP) that builds on this idea. Standard conformal prediction loses its guarantee when an attacker makes small, deliberate changes to the inputs, so coverage can drop well below the promised level. VRCP restores the guarantee under such adversarial attacks by using neural network verification methods to account for the effect of any perturbation within a given bound.

The authors evaluate VRCP on image classification and regression tasks and report that it achieves above-nominal coverage while producing significantly smaller prediction regions than earlier robust conformal prediction methods. This means you keep the guaranteed validity under attack, but with smaller and more informative prediction regions.

Earlier robust approaches rely on variations of randomised smoothing and only support $\ell^2$-bounded perturbations and classification tasks. According to the abstract, VRCP is the first method to support perturbations bounded by arbitrary norms, as well as regression tasks.

Technical Explanation

The paper introduces a conformal prediction framework called "verifiably robust conformal prediction" (VRCP). Standard conformal prediction constructs prediction sets that are guaranteed to contain the true outcome with a pre-specified probability (e.g. 95% of the time) for any underlying data distribution or model, provided the training and test data are exchangeable. VRCP aims to keep this guarantee when the test inputs are adversarially perturbed, a setting in which exchangeability no longer holds.
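The guarantee can be written out as follows. The notation is an assumption made for illustration and is not taken from the paper: $C(\cdot)$ is the prediction set, $1 - \alpha$ the target coverage, $(X_{n+1}, Y_{n+1})$ the test point, $\tilde{X}_{n+1}$ its perturbed version, and $\epsilon$ the perturbation budget in some $\ell^p$ norm. Standard conformal prediction ensures

$$\mathbb{P}\left(Y_{n+1} \in C(X_{n+1})\right) \ge 1 - \alpha,$$

whereas a robust method targets the same bound when only the perturbed input is observed:

$$\mathbb{P}\left(Y_{n+1} \in C_{\epsilon}(\tilde{X}_{n+1})\right) \ge 1 - \alpha \quad \text{for every } \tilde{X}_{n+1} \text{ with } \|\tilde{X}_{n+1} - X_{n+1}\|_p \le \epsilon.$$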

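The key innovation in VRCP is how it recovers this guarantee. Previous robust methods rely on variations of randomised smoothing to produce conservative sets, which restricts them to $\ell^2$-bounded perturbations and classification tasks. VRCP instead leverages recent neural network verification methods, which allows it to support perturbations bounded by arbitrary norms (including $\ell^1$, $\ell^2$, and $\ell^\infty$) as well as regression tasks.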
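The following toy sketch shows one way verified output bounds could be used to keep coverage under $\ell^\infty$-bounded attacks in a regression setting. It is not the paper's algorithm: a linear model is assumed so that the output bounds have an exact closed form, whereas the paper applies neural network verifiers to deep networks. The data, the "trained" weights, the attack, and all names are hypothetical.

```python
import numpy as np

def output_bounds_linf(w, b, x, eps):
    """Exact range of w @ x' + b over all x' with ||x' - x||_inf <= eps."""
    center = x @ w + b
    radius = eps * np.abs(w).sum()  # the dual norm of l_inf is l_1
    return center - radius, center + radius

rng = np.random.default_rng(1)
d, n_cal, n_test, alpha, eps = 10, 2000, 2000, 0.1, 0.05
w_true = rng.normal(size=d)
w, b = w_true + 0.1 * rng.normal(size=d), 0.0  # hypothetical "trained" model

def sample(n):
    x = rng.normal(size=(n, d))
    y = x @ w_true + 0.5 * rng.normal(size=n)
    return x, y

x_cal, y_cal = sample(n_cal)
x_test, y_test = sample(n_test)

# Calibrate on clean data with the absolute-residual score.
cal_scores = np.abs(y_cal - (x_cal @ w + b))
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q_hat = np.sort(cal_scores)[k - 1]

# Worst-case l_inf attack for a linear model: push each prediction
# away from the true label by the full budget eps.
residual_sign = np.sign(y_test - (x_test @ w + b))
x_adv = x_test - eps * residual_sign[:, None] * np.sign(w)[None, :]

# Vanilla interval around the attacked prediction.
pred_adv = x_adv @ w + b
vanilla_cov = np.mean(np.abs(y_test - pred_adv) <= q_hat)

# Robust interval: widen by the verified output range over the eps-ball,
# which is guaranteed to contain the prediction on the clean input.
lo, hi = output_bounds_linf(w, b, x_adv, eps)
robust_cov = np.mean((y_test >= lo - q_hat) & (y_test <= hi + q_hat))

print(f"vanilla coverage under attack: {vanilla_cov:.3f}")
print(f"robust coverage under attack:  {robust_cov:.3f}")
```

Because the clean input always lies inside the verified ball around the attacked input, the widened interval contains the label whenever the clean interval would have, so coverage is preserved. The price is a wider interval, which is why tight bounds matter for the efficiency of the resulting prediction regions.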

The authors report that VRCP achieves above-nominal coverage in every case while yielding significantly more efficient and informative prediction regions than state-of-the-art approaches. In other words, accounting for adversarial perturbations through verification produces tighter sets than smoothing-based methods, without sacrificing validity.

The empirical evaluation covers image classification tasks (CIFAR10, CIFAR100, and TinyImageNet) and regression tasks for deep reinforcement learning environments. Across these benchmarks, the authors show that VRCP consistently outperforms state-of-the-art robust methods in terms of the size of the prediction regions, while maintaining coverage above the nominal level.

Critical Analysis

The paper presents a compelling technical contribution to the field of conformal prediction. The proposed VRCP algorithm addresses important limitations of existing methods, providing a way to construct more efficient and robust prediction sets.

That said, there are caveats and areas for further research. Neural network verification can be computationally expensive, and verified bounds tend to become looser as networks grow, which could be challenging for large-scale applications. Additionally, the guarantee covers only perturbations that stay within the specified norm bound.

It would also be interesting to see how VRCP performs on larger models and datasets, or on structured data, beyond the image classification and reinforcement learning benchmarks considered in the paper. Exploring the connections to other robust or adversarially-trained machine learning techniques could also yield additional insights.

Overall, the paper makes a strong contribution, but there remain opportunities to further refine and generalize the VRCP approach. Readers are encouraged to think critically about the assumptions, limitations, and potential real-world implications of this research.

Conclusion

This paper introduces a conformal prediction framework called "verifiably robust conformal prediction" (VRCP) that recovers coverage guarantees under adversarial attacks. Through empirical evaluation on classification and regression tasks, the authors demonstrate that VRCP achieves above-nominal coverage and constructs tighter prediction regions than state-of-the-art robust methods.

The key innovation in VRCP is the use of neural network verification, in place of randomised smoothing, to account for adversarial perturbations. This extends robust conformal prediction to perturbations bounded by arbitrary norms and to regression tasks, and results in more efficient prediction sets that remain reliable under attack.

The insights and techniques presented in this work have the potential to significantly advance the field of conformal prediction, with applications in areas like uncertainty quantification, anomaly detection, and safety-critical decision-making. By making conformal prediction more practical and robust, VRCP represents an important step towards enabling the widespread adoption of this powerful statistical framework.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Provably Robust Conformal Prediction with Improved Efficiency

Ge Yan, Yaniv Romano, Tsui-Wei Weng

Conformal prediction is a powerful tool to generate uncertainty sets with guaranteed coverage using any predictive model, under the assumption that the training and test data are i.i.d. Recently, it has been shown that adversarial examples are able to manipulate conformal methods to construct prediction sets with invalid coverage rates, as the i.i.d. assumption is violated. To address this issue, a recent work, Randomized Smoothed Conformal Prediction (RSCP), was first proposed to certify the robustness of conformal prediction methods to adversarial noise. However, RSCP has two major limitations: (i) its robustness guarantee is flawed when used in practice and (ii) it tends to produce large uncertainty sets. To address these limitations, we first propose a novel framework called RSCP+ to provide provable robustness guarantee in evaluation, which fixes the issues in the original RSCP method. Next, we propose two novel methods, Post-Training Transformation (PTT) and Robust Conformal Training (RCT), to effectively reduce prediction set size with little computation overhead. Experimental results in CIFAR10, CIFAR100, and ImageNet suggest the baseline method only yields trivial predictions including full label set, while our methods could boost the efficiency by up to $4.36\times$, $5.46\times$, and $16.9\times$ respectively and provide practical robustness guarantee. Our codes are available at https://github.com/Trustworthy-ML-Lab/Provably-Robust-Conformal-Prediction.

5/1/2024

cs.LG cs.AI cs.CV

A Conformal Prediction Score that is Robust to Label Noise

Coby Penso, Jacob Goldberger

Conformal Prediction (CP) quantifies network uncertainty by building a small prediction set with a pre-defined probability that the correct class is within this set. In this study we tackle the problem of CP calibration based on a validation set with noisy labels. We introduce a conformal score that is robust to label noise. The noise-free conformal score is estimated using the noisy labeled data and the noise level. In the test phase the noise-free score is used to form the prediction set. We applied the proposed algorithm to several standard medical imaging classification datasets. We show that our method outperforms current methods by a large margin, in terms of the average size of the prediction set, while maintaining the required coverage.

5/22/2024

cs.LG cs.AI cs.CV

Conformal Prediction for Class-wise Coverage via Augmented Label Rank Calibration

Yuanjie Shi, Subhankar Ghosh, Taha Belkhouja, Janardhan Rao Doppa, Yan Yan

Conformal prediction (CP) is an emerging uncertainty quantification framework that allows us to construct a prediction set to cover the true label with a pre-specified marginal or conditional probability. Although the valid coverage guarantee has been extensively studied for classification problems, CP often produces large prediction sets which may not be practically useful. This issue is exacerbated for the setting of class-conditional coverage on imbalanced classification tasks. This paper proposes the Rank Calibrated Class-conditional CP (RC3P) algorithm to reduce the prediction set sizes to achieve class-conditional coverage, where the valid coverage holds for each class. In contrast to the standard class-conditional CP (CCP) method that uniformly thresholds the class-wise conformity score for each class, the augmented label rank calibration step allows RC3P to selectively iterate this class-wise thresholding subroutine only for a subset of classes whose class-wise top-k error is small. We prove that agnostic to the classifier and data distribution, RC3P achieves class-wise coverage. We also show that RC3P reduces the size of prediction sets compared to the CCP method. Comprehensive experiments on multiple real-world datasets demonstrate that RC3P achieves class-wise coverage and 26.25% reduction in prediction set sizes on average.

6/12/2024

cs.LG

An Information Theoretic Perspective on Conformal Prediction

Alvaro H. C. Correia, Fabio Valerio Massoli, Christos Louizos, Arash Behboodi

Conformal Prediction (CP) is a distribution-free uncertainty estimation framework that constructs prediction sets guaranteed to contain the true answer with a user-specified probability. Intuitively, the size of the prediction set encodes a general notion of uncertainty, with larger sets associated with higher degrees of uncertainty. In this work, we leverage information theory to connect conformal prediction to other notions of uncertainty. More precisely, we prove three different ways to upper bound the intrinsic uncertainty, as described by the conditional entropy of the target variable given the inputs, by combining CP with information theoretical inequalities. Moreover, we demonstrate two direct and useful applications of such connection between conformal prediction and information theory: (i) more principled and effective conformal training objectives that generalize previous approaches and enable end-to-end training of machine learning models from scratch, and (ii) a natural mechanism to incorporate side information into conformal prediction. We empirically validate both applications in centralized and federated learning settings, showing our theoretical results translate to lower inefficiency (average prediction set size) for popular CP methods.

6/27/2024

cs.LG cs.IT stat.ML