Conformal Prediction for Deep Classifier via Label Ranking

arXiv:2310.06430

Published 6/7/2024 by Jianguo Huang, Huajun Xi, Linjun Zhang, Huaxiu Yao, Yue Qiu, Hongxin Wei

Abstract

Conformal prediction is a statistical framework that generates prediction sets containing ground-truth labels with a desired coverage guarantee. The predicted probabilities produced by machine learning models are generally miscalibrated, leading to large prediction sets in conformal prediction. To address this issue, we propose a novel algorithm named $\textit{Sorted Adaptive Prediction Sets}$ (SAPS), which discards all the probability values except for the maximum softmax probability. The key idea behind SAPS is to minimize the dependence of the non-conformity score on the probability values while retaining the uncertainty information. In this manner, SAPS can produce compact prediction sets and communicate instance-wise uncertainty. Extensive experiments validate that SAPS not only lessens the prediction sets but also broadly enhances the conditional coverage rate of prediction sets.

Motivation and method

Motivation

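The researchers set out to produce smaller prediction sets for deep classifiers without giving up the coverage guarantee. Conformal prediction wraps a trained model and returns a set of labels that contains the true label with a user-chosen probability (for example, 90%), and this guarantee holds even when the underlying model is biased or overconfident. Miscalibration is paid for in set size instead: as the abstract notes, miscalibrated softmax probabilities lead to large prediction sets.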

The researchers saw an opportunity to base the non-conformity score on label ranking, that is, on the order in which the classifier ranks the class labels for a given input, rather than on the full vector of predicted probabilities. Their method, SAPS, keeps only the maximum softmax probability, discards the remaining probability values, and relies on the ranking instead. This reduces the score's dependence on miscalibrated probabilities while retaining the uncertainty information.

Method

The key steps in the researchers' approach are:

Take the softmax output of an already trained deep classifier and rank the class labels from most to least probable. The classifier itself is not retrained.
Compute a non-conformity score that uses only the maximum softmax probability and each label's rank, then calibrate a threshold on held-out data so that the prediction sets have a valid coverage guarantee.
Evaluate the resulting conformal predictor on benchmark datasets, comparing its set size and conditional coverage with other conformal prediction methods for deep learning.

The researchers hypothesized that reducing the score's dependence on probability values would yield compact prediction sets that still communicate instance-wise uncertainty, outperforming previous approaches.
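
One way to write down a score with these properties is shown here. This summary does not give the formula, so the exact form, the rank weight $\lambda$, and the uniform random variable $u$ should be read as assumptions recalled from the SAPS paper rather than as something stated above. Here $\hat{\pi}(x)$ is the softmax vector, $\hat{\pi}_{\max}(x)$ is its largest entry, and $o(y, \hat{\pi}(x))$ is the rank of label $y$ (1 for the most probable label).

$$
S(x, y, u) =
\begin{cases}
u \cdot \hat{\pi}_{\max}(x), & \text{if } o(y, \hat{\pi}(x)) = 1, \\
\hat{\pi}_{\max}(x) + \bigl(o(y, \hat{\pi}(x)) - 2 + u\bigr)\,\lambda, & \text{otherwise,}
\end{cases}
\qquad u \sim \mathrm{Uniform}[0, 1].
$$

Under standard split conformal calibration, with scores $s_i = S(x_i, y_i, u_i)$ on $n$ held-out examples, the threshold $\tau$ is the $\lceil (n+1)(1-\alpha) \rceil$-th smallest $s_i$, and the prediction set is

$$
\mathcal{C}(x) = \{\, y : S(x, y, u) \le \tau \,\}, \qquad \Pr\bigl(Y \in \mathcal{C}(X)\bigr) \ge 1 - \alpha,
$$

where the coverage statement assumes the calibration and test examples are exchangeable.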

Technical Explanation

SAPS starts from the softmax output of a trained deep neural network and sorts the class labels from most to least probable, rather than using only the single most likely class or the full set of probability values. This label ranking reflects the model's beliefs about the relative plausibility of each possible class.

They then define the non-conformity score of Sorted Adaptive Prediction Sets (SAPS), which keeps the maximum softmax probability and replaces every other probability value with information derived from the label's rank. As with other conformal prediction techniques, the resulting prediction sets have valid coverage guarantees, even when the underlying model is biased or overconfident.

The key steps in their approach were:

Rank the class labels for each input by the classifier's softmax probabilities.
Compute a non-conformity score for each candidate label from the maximum softmax probability and the label's rank.
Use the scores of the true labels on a held-out calibration set to choose a threshold, and include in each prediction set every label whose score does not exceed it, so that the sets reach the desired coverage level (e.g., 90% or 95%).
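
The steps above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' code: the function names `saps_scores`, `calibrate`, and `predict_sets` are hypothetical, the score formula with rank weight `lam` and random variable `u` is recalled from the SAPS paper rather than taken from this summary, and the default `lam=0.1` is an arbitrary placeholder.

```python
import numpy as np


def saps_scores(probs, u, lam=0.1):
    """Non-conformity score of every label; probs is (n, K), u is (n,)."""
    n, k = probs.shape
    p_max = probs.max(axis=1, keepdims=True)
    # rank 1 = most probable label, rank K = least probable
    order = np.argsort(-probs, axis=1, kind="stable")
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(1, k + 1)
    u = np.asarray(u)[:, None]
    top_score = u * p_max                        # label ranked first
    other_score = p_max + (ranks - 2 + u) * lam  # labels ranked 2..K
    return np.where(ranks == 1, top_score, other_score)


def calibrate(cal_probs, cal_labels, alpha=0.1, lam=0.1, seed=0):
    """Threshold tau from true-label scores on a held-out calibration set."""
    n = len(cal_labels)
    u = np.random.default_rng(seed).uniform(size=n)
    scores = saps_scores(cal_probs, u, lam)[np.arange(n), cal_labels]
    # finite-sample correction: ceil((n + 1)(1 - alpha))-th smallest score
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else np.sort(scores)[k - 1]


def predict_sets(test_probs, tau, lam=0.1, seed=1):
    """Boolean (n, K) matrix; True marks labels kept in the prediction set."""
    u = np.random.default_rng(seed).uniform(size=test_probs.shape[0])
    return saps_scores(test_probs, u, lam) <= tau
```

Only the largest softmax value enters the score; every other label is scored by its rank, which is how the sketch discards the remaining probability values. In practice `lam` would be tuned on held-out data.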

The researchers evaluated their method on several benchmark image classification datasets, comparing it to other conformal prediction approaches for deep learning. They report that SAPS produces smaller prediction sets and broadly improves the conditional coverage rate relative to the alternatives.
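
Prediction sets are commonly judged on empirical coverage (how often the true label is in the set) and average set size (smaller is more informative at the same coverage). The sketch below computes both for a boolean set matrix. The function name `coverage_and_size` and the toy data are hypothetical, and this is not the paper's evaluation code.

```python
import numpy as np


def coverage_and_size(pred_sets, labels):
    """Return (empirical coverage, average set size).

    pred_sets: boolean (n, K) matrix, True where a label is in the set.
    labels:    integer (n,) array of true labels.
    """
    pred_sets = np.asarray(pred_sets, dtype=bool)
    labels = np.asarray(labels)
    covered = pred_sets[np.arange(len(labels)), labels]
    return covered.mean(), pred_sets.sum(axis=1).mean()


# Toy example: 4 inputs, 3 classes
sets = np.array([[1, 1, 0],
                 [0, 1, 0],
                 [1, 0, 1],
                 [0, 0, 1]], dtype=bool)
labels = np.array([0, 1, 1, 2])
cov, size = coverage_and_size(sets, labels)
print(cov, size)  # 0.75 1.5
```

Conditional coverage, which the abstract also mentions, can be estimated in the same way by applying the function to subgroups of the test set, for example one class at a time.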

Critical Analysis

The researchers acknowledge several limitations and potential areas for future work:

The computational overhead of the conformal prediction step may limit the scalability of the approach, especially for large-scale deep learning models.
The method relies on the assumption that the label rankings produced by the deep classifier are reliable and informative. If the deep model has significant biases or weaknesses, this could undermine the quality of the conformal predictions.
The researchers only evaluated their method on image classification tasks. It would be valuable to explore its performance on other types of deep learning problems, such as language modeling or reinforcement learning.

Additionally, one could question whether the benefits of conformal prediction outweigh the added complexity and computational cost, especially for applications where high-confidence predictions are less critical. The tradeoffs between predictive performance, calibration, and efficiency are an important consideration.

Conclusion

This research presents a new non-conformity score for applying conformal prediction to deep classifiers. By keeping only the maximum softmax probability and relying on label ranking for the rest, SAPS produces compact prediction sets that communicate instance-wise uncertainty, and the reported experiments show smaller sets and broadly better conditional coverage than previous methods.

The work contributes to the growing body of research on verifiably robust conformal prediction and self-consistent conformal prediction, which aim to make deep learning models more reliable and trustworthy. As deep learning becomes more ubiquitous, techniques like this that can quantify and improve the uncertainty of predictions will be increasingly important.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

A Conformal Prediction Score that is Robust to Label Noise

Coby Penso, Jacob Goldberger

Conformal Prediction (CP) quantifies network uncertainty by building a small prediction set with a pre-defined probability that the correct class is within this set. In this study we tackle the problem of CP calibration based on a validation set with noisy labels. We introduce a conformal score that is robust to label noise. The noise-free conformal score is estimated using the noisy labeled data and the noise level. In the test phase the noise-free score is used to form the prediction set. We applied the proposed algorithm to several standard medical imaging classification datasets. We show that our method outperforms current methods by a large margin, in terms of the average size of the prediction set, while maintaining the required coverage.

5/22/2024

cs.LG cs.AI cs.CV

Conformal Prediction Sets Improve Human Decision Making

Jesse C. Cresswell, Yi Sui, Bhargava Kumar, Noel Vouitsis

In response to everyday queries, humans explicitly signal uncertainty and offer alternative answers when they are unsure. Machine learning models that output calibrated prediction sets through conformal prediction mimic this human behaviour; larger sets signal greater uncertainty while providing alternatives. In this work, we study the usefulness of conformal prediction sets as an aid for human decision making by conducting a pre-registered randomized controlled trial with conformal prediction sets provided to human subjects. With statistical significance, we find that when humans are given conformal prediction sets their accuracy on tasks improves compared to fixed-size prediction sets with the same coverage guarantee. The results show that quantifying model uncertainty with conformal prediction is helpful for human-in-the-loop decision making and human-AI teams.

6/11/2024

cs.LG cs.HC stat.ML

Evidential Uncertainty Sets in Deep Classifiers Using Conformal Prediction

Hamed Karimi, Reza Samavi

In this paper, we propose Evidential Conformal Prediction (ECP) method for image classifiers to generate the conformal prediction sets. Our method is designed based on a non-conformity score function that has its roots in Evidential Deep Learning (EDL) as a method of quantifying model (epistemic) uncertainty in DNN classifiers. We use evidence that are derived from the logit values of target labels to compute the components of our non-conformity score function: the heuristic notion of uncertainty in CP, uncertainty surprisal, and expected utility. Our extensive experimental evaluation demonstrates that ECP outperforms three state-of-the-art methods for generating CP sets, in terms of their set sizes and adaptivity while maintaining the coverage of true labels.

6/18/2024

cs.LG cs.AI cs.CV stat.ML

Towards Human-AI Complementarity with Predictions Sets

Giovanni De Toni, Nastaran Okati, Suhas Thejaswi, Eleni Straitouri, Manuel Gomez-Rodriguez

Decision support systems based on prediction sets have proven to be effective at helping human experts solve classification tasks. Rather than providing single-label predictions, these systems provide sets of label predictions constructed using conformal prediction, namely prediction sets, and ask human experts to predict label values from these sets. In this paper, we first show that the prediction sets constructed using conformal prediction are, in general, suboptimal in terms of average accuracy. Then, we show that the problem of finding the optimal prediction sets under which the human experts achieve the highest average accuracy is NP-hard. More strongly, unless P = NP, we show that the problem is hard to approximate to any factor less than the size of the label set. However, we introduce a simple and efficient greedy algorithm that, for a large class of expert models and non-conformity scores, is guaranteed to find prediction sets that provably offer equal or greater performance than those constructed using conformal prediction. Further, using a simulation study with both synthetic and real expert predictions, we demonstrate that, in practice, our greedy algorithm finds near-optimal prediction sets offering greater performance than conformal prediction.

5/29/2024

cs.LG cs.CY cs.HC