Just rotate it! Uncertainty estimation in closed-source models via multiple queries

2405.13864

Published 5/24/2024 by Konstantinos Pitas, Julyan Arbel

Abstract

We propose a simple and effective method to estimate the uncertainty of closed-source deep neural network image classification models. Given a base image, our method creates multiple transformed versions and uses them to query the top-1 prediction of the closed-source model. We demonstrate significant improvements in the calibration of uncertainty estimates compared to the naive baseline of assigning 100% confidence to all predictions. While we initially explore Gaussian perturbations, our empirical findings indicate that natural transformations, such as rotations and elastic deformations, yield even better-calibrated predictions. Furthermore, through empirical results and a straightforward theoretical analysis, we elucidate the reasons behind the superior performance of natural transformations over Gaussian noise. Leveraging these insights, we propose a transfer learning approach that further improves our calibration results.

Overview

Proposes a simple and effective method to estimate the uncertainty of closed-source deep neural network image classification models
Creates multiple transformed versions of a base image and uses them to query the top-1 prediction of the closed-source model
Demonstrates significant improvements in the calibration of uncertainty estimates compared to a naive baseline
Explores Gaussian perturbations and natural transformations like rotations and elastic deformations
Provides a theoretical analysis to explain the superior performance of natural transformations over Gaussian noise
Introduces a transfer learning approach to further improve calibration results

Plain English Explanation

The researchers developed a straightforward technique to estimate the uncertainty of image classification models whose internals are not accessible and that return only a single top prediction. They start with a base image and create several altered versions of it, for example by rotating or stretching the image. They then feed these transformed images into the closed-source model and record the top prediction for each one.

By analyzing how the model's top prediction changes across the transformed images, the researchers can get a sense of how confident the model is in its classification. If the top prediction stays the same no matter how the image is altered, the model is likely very confident. But if the top prediction flip-flops a lot, the model is probably more uncertain.
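The idea of reading confidence off the stability of the top prediction can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the example labels are hypothetical, and the confidence is simply taken to be the fraction of queries that agree with the most frequent label.

```python
from collections import Counter

def vote_distribution(top1_labels):
    """Turn a list of top-1 labels into an empirical class distribution."""
    counts = Counter(top1_labels)
    total = len(top1_labels)
    return {label: n / total for label, n in counts.items()}

def predict_with_confidence(top1_labels):
    """Return the most frequent label and the fraction of queries voting for it."""
    dist = vote_distribution(top1_labels)
    label = max(dist, key=dist.get)
    return label, dist[label]

# Hypothetical top-1 answers from ten transformed copies of one image.
stable = ["cat"] * 10
flipping = ["cat"] * 5 + ["dog"] * 3 + ["fox"] * 2

print(predict_with_confidence(stable))
print(predict_with_confidence(flipping))
```

In this sketch, a prediction that never changes keeps full confidence, while one that flips between classes receives a proportionally lower confidence.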

The researchers found that this technique outperforms simply assuming the model is 100% certain about all its predictions. They also discovered that using natural transformations, like rotations and distortions, works better than just adding random noise to the images.

Through their analysis, the researchers explain why natural transformations are more effective than Gaussian noise for probing a model's uncertainty. Building on this insight, they developed a way to further improve the calibration of the uncertainty estimates by leveraging transfer learning.

Technical Explanation

The researchers propose a simple and effective method to estimate the uncertainty of closed-source deep neural network image classification models. Their approach involves creating multiple transformed versions of a base image and using them to query the top-1 prediction of the closed-source model.

They initially explore the use of Gaussian perturbations to generate the transformed images, but their empirical findings indicate that natural transformations, such as rotations and elastic deformations, yield even better-calibrated predictions. The researchers provide a straightforward theoretical analysis to elucidate the reasons behind the superior performance of natural transformations over Gaussian noise.
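The two families of transformations and the query loop can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: `toy_top1` is a hypothetical stand-in for the closed-source model, the rotation is a simple nearest-neighbour implementation in NumPy, and the angles and noise level are arbitrary choices rather than values from the paper.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng):
    """Return the image plus i.i.d. Gaussian noise, clipped to [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

def rotate(image, degrees):
    """Rotate a 2-D image about its centre using nearest-neighbour sampling."""
    h, w = image.shape
    theta = np.deg2rad(degrees)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    # Inverse mapping: for each output pixel, locate the source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)  # pixels mapped from outside the image stay 0
    out[inside] = image[src_y[inside], src_x[inside]]
    return out

def query_votes(image, transforms, query_top1):
    """Query the black box once per transformed copy; return vote fractions."""
    labels = [query_top1(t(image)) for t in transforms]
    classes, counts = np.unique(labels, return_counts=True)
    return dict(zip(classes.tolist(), (counts / counts.sum()).tolist()))

def toy_top1(image):
    """Hypothetical stand-in for the closed-source API: returns a label only."""
    h, w = image.shape
    return int(image[:, : w // 2].mean() > image[:, w // 2 :].mean())

rng = np.random.default_rng(0)
base = np.zeros((8, 8))
base[:, :4] = 1.0  # toy image with a bright left half

rotations = [lambda im, a=a: rotate(im, a) for a in (-15, -5, 0, 5, 15)]
noises = [lambda im: add_gaussian_noise(im, 0.5, rng) for _ in range(5)]

print("rotation votes:", query_votes(base, rotations, toy_top1))
print("gaussian votes:", query_votes(base, noises, toy_top1))
```

The only information used from the model is its top-1 label, which matches the closed-source setting described above. Elastic deformations could be added as further entries in the list of transforms.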

Specifically, the researchers demonstrate that natural transformations preserve the semantics of the input image, while Gaussian noise can introduce spurious features that mislead the model. This insight leads the researchers to propose a transfer learning approach that further improves the calibration of their uncertainty estimates.

The researchers validate their method through extensive experiments, showing significant improvements in the calibration of uncertainty estimates compared to a naive baseline of assigning 100% confidence to all predictions. Their work adds to the growing body of research on uncertainty awareness in classification, and it relates to findings that modern neural networks still struggle with small, realistic image transformations.
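To make "calibration" concrete, the sketch below computes the expected calibration error (ECE), a common calibration metric. The summary does not state which metric the authors report, so the choice of ECE, the equal-width binning, and the example numbers are assumptions made for illustration only.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between accuracy and mean confidence per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bins are (0, 0.1], (0.1, 0.2], ..., so confidence 1.0 lands in the last bin.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the share of samples in the bin
    return ece

# Hypothetical outcomes: three of four predictions are correct.
correct = [1, 1, 1, 0]
naive = [1.0, 1.0, 1.0, 1.0]      # baseline: always 100% confident
voted = [0.75, 0.75, 0.75, 0.75]  # confidences that match the accuracy

print(expected_calibration_error(naive, correct))
print(expected_calibration_error(voted, correct))
```

Under this metric the always-100% baseline is penalized by exactly its error rate, which is why any confidence estimate that tracks accuracy can improve on it.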

Critical Analysis

The researchers provide a thorough and well-designed study, but there are a few potential limitations and areas for further exploration:

The experiments are limited to image classification tasks, and it is unclear how well the proposed method would generalize to other domains or tasks, such as predicting 3D rotational dynamics.
The researchers focus on a closed-source model scenario, but it would be valuable to understand how their method compares to techniques that have access to the model's internal workings, and to related query-based approaches such as rephrasing queries to estimate uncertainty in closed-source language models.
While the transfer learning approach improves calibration, the researchers do not provide a clear recipe for determining the optimal source and target tasks for this transfer. Further investigation into the most effective transfer learning strategies would be beneficial.

Overall, the researchers present a compelling and practical approach to estimating the uncertainty of closed-source deep neural network models, with promising results and interesting theoretical insights. However, there are opportunities for further research to expand the applicability and robustness of the proposed method.

Conclusion

The researchers have developed a simple and effective technique to estimate the uncertainty of closed-source deep neural network image classification models. By creating multiple transformed versions of a base image and analyzing the model's top predictions, they can obtain significantly better-calibrated uncertainty estimates compared to a naive baseline.

The key insight is that natural transformations, such as rotations and elastic deformations, are more effective than Gaussian noise for probing a model's uncertainty. The researchers provide a theoretical explanation for this finding, which then informs their development of a transfer learning approach to further improve the calibration of the uncertainty estimates.

This work contributes to the ongoing effort to better understand and quantify the uncertainty of deep learning models, particularly in scenarios where the model's internal workings are not accessible. The proposed method could have practical applications in areas like safety-critical decision-making, where reliable uncertainty estimates are crucial. The researchers have also opened up new avenues for future research, exploring the generalizability of their approach and the most effective transfer learning strategies.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Just rephrase it! Uncertainty estimation in closed-source language models via multiple rephrased queries

Adam Yang, Chen Chen, Konstantinos Pitas

State-of-the-art large language models are sometimes distributed as open-source software but are also increasingly provided as a closed-source service. These closed-source large language models typically see the widest usage by the public; however, they often do not provide an estimate of their uncertainty when responding to queries. As even the best models are prone to "hallucinating" false information with high confidence, a lack of a reliable estimate of uncertainty limits the applicability of these models in critical settings. We explore estimating the uncertainty of closed-source LLMs via multiple rephrasings of an original base query. Specifically, we ask the model multiple rephrased questions and use the similarity of the answers as an estimate of uncertainty. We diverge from previous work in i) providing rules for rephrasing that are simple to memorize and use in practice and ii) proposing a theoretical framework for why multiple rephrased queries obtain calibrated uncertainty estimates. Our method demonstrates significant improvements in the calibration of uncertainty estimates compared to the baseline and provides intuition as to how query strategies should be designed for optimal test calibration.

6/18/2024

cs.CL cs.AI

Awareness of uncertainty in classification using a multivariate model and multi-views

Alexey Kornaev, Elena Kornaeva, Oleg Ivanov, Ilya Pershin, Danis Alukaev

One of the ways to make artificial intelligence more natural is to give it some room for doubt. Two main questions should be resolved in that way. First, how to train a model to estimate uncertainties of its own predictions? And then, what to do with the uncertain predictions if they appear? First, we proposed an uncertainty-aware negative log-likelihood loss for the case of N-dimensional multivariate normal distribution with spherical variance matrix to the solution of N-classes classification tasks. The loss is similar to the heteroscedastic regression loss. The proposed model regularizes uncertain predictions, and trains to calculate both the predictions and their uncertainty estimations. The model fits well with the label smoothing technique. Second, we expanded the limits of data augmentation at the training and test stages, and made the trained model to give multiple predictions for a given number of augmented versions of each test sample. Given the multi-view predictions together with their uncertainties and confidences, we proposed several methods to calculate final predictions, including mode values and bin counts with soft and hard weights. For the latter method, we formalized the model tuning task in the form of multimodal optimization with non-differentiable criteria of maximum accuracy, and applied particle swarm optimization to solve the tuning task. The proposed methodology was tested using CIFAR-10 dataset with clean and noisy labels and demonstrated good results in comparison with other uncertainty estimation methods related to sample selection, co-teaching, and label smoothing.

4/17/2024

cs.CV cs.LG

Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations

Ofir Shifman, Yair Weiss

Deep neural networks that achieve remarkable performance in image classification have previously been shown to be easily fooled by tiny transformations such as a one-pixel translation of the input image. In order to address this problem, two approaches have been proposed in recent years. The first approach suggests using huge datasets together with data augmentation in the hope that a highly varied training set will teach the network to learn to be invariant. The second approach suggests using architectural modifications based on sampling theory to deal explicitly with image translations. In this paper, we show that these approaches still fall short in robustly handling 'natural' image translations that simulate a subtle change in camera orientation. Our findings reveal that a mere one-pixel translation can result in a significant change in the predicted image representation for approximately 40% of the test images in state-of-the-art models (e.g. open-CLIP trained on LAION-2B or DINO-v2), while models that are explicitly constructed to be robust to cyclic translations can still be fooled with 1-pixel realistic (non-cyclic) translations 11% of the time. We present Robust Inference by Crop Selection: a simple method that can be proven to achieve any desired level of consistency, although with a modest tradeoff with the model's accuracy. Importantly, we demonstrate how employing this method reduces the ability to fool state-of-the-art models with a 1-pixel translation to less than 5% while suffering from only a 1% drop in classification accuracy. Additionally, we show that our method can be easily adjusted to deal with circular shifts as well. In such a case we achieve 100% robustness to integer shifts with state-of-the-art accuracy, and with no need for any further training.

4/11/2024

cs.CV

ClaudesLens: Uncertainty Quantification in Computer Vision Models

Mohamad Al Shaar, Nils Ekstrom, Gustav Gille, Reza Rezvan, Ivan Wely

In a world where more decisions are made using artificial intelligence, it is of utmost importance to ensure these decisions are well-grounded. Neural networks are the modern building blocks for artificial intelligence. Modern neural network-based computer vision models are often used for object classification tasks. Correctly classifying objects with certainty has become of great importance in recent times. However, quantifying the inherent uncertainty of the output from neural networks is a challenging task. Here we show a possible method to quantify and evaluate the uncertainty of the output of different computer vision models based on Shannon entropy. By adding perturbation of different levels, on different parts, ranging from the input to the parameters of the network, one introduces entropy to the system. By quantifying and evaluating the perturbed models on the proposed PI and PSI metrics, we can conclude that our theoretical framework can grant insight into the uncertainty of predictions of computer vision models. We believe that this theoretical framework can be applied to different applications for neural networks. We believe that Shannon entropy may eventually have a bigger role in the SOTA (State-of-the-art) methods to quantify uncertainty in artificial intelligence. One day we might be able to apply Shannon entropy to our neural systems.

6/21/2024

cs.CV cs.AI