Epistemic Uncertainty-Weighted Loss for Visual Bias Mitigation

2204.09389

Published 6/5/2024 by Rebecca S Stone, Nishant Ravikumar, Andrew J Bulpitt, David C Hogg

Abstract

Deep neural networks are highly susceptible to learning biases in visual data. While various methods have been proposed to mitigate such bias, the majority require explicit knowledge of the biases present in the training data in order to mitigate. We argue the relevance of exploring methods which are completely ignorant of the presence of any bias, but are capable of identifying and mitigating them. Furthermore, we propose using Bayesian neural networks with a predictive uncertainty-weighted loss function to dynamically identify potential bias in individual training samples and to weight them during training. We find a positive correlation between samples subject to bias and higher epistemic uncertainties. Finally, we show the method has potential to mitigate visual bias on a bias benchmark dataset and on a real-world face detection problem, and we consider the merits and weaknesses of our approach.

Overview

Deep neural networks are prone to learning biases present in the visual data used for training.
While existing methods try to mitigate these biases, they often require explicit knowledge of the biases in the training data.
This paper proposes a novel approach that does not require prior knowledge of biases, but can still identify and mitigate them.
The method uses Bayesian neural networks with a predictive uncertainty-weighted loss function to dynamically identify potentially biased training samples and reweight them during training.

Plain English Explanation

Deep learning models, like the ones used for image recognition, can sometimes learn unwanted biases that are present in the data they are trained on. For example, a model trained on images that depict certain demographics more than others may end up performing worse on images of underrepresented groups.

Most existing techniques to address this issue require the researchers to first identify the specific biases in the training data. However, this paper proposes a method that doesn't need that prior knowledge. Instead, it uses a special type of neural network called a Bayesian neural network to automatically detect when a training sample might be biased.

The key insight is that samples affected by bias tend to have higher "epistemic uncertainty", meaning uncertainty that comes from the model's limited knowledge rather than from noise in the data. By tracking this uncertainty during training and using it to reweight each sample's contribution to the loss, the model can become more robust to the biases in the data without needing to know what those biases are ahead of time.

The paper reports that this approach has potential to mitigate bias on a bias benchmark dataset and on a real-world face detection problem. The authors also discuss the merits and weaknesses of the method.

Technical Explanation

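The core of the proposed approach is to leverage Bayesian neural networks and their ability to quantify epistemic uncertainty: the uncertainty that arises from the model's limited knowledge of its own parameters, as opposed to aleatoric uncertainty, which comes from noise inherent in the data.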
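To make the notion of epistemic uncertainty concrete, here is a minimal NumPy sketch of one common estimator: the mutual information between the prediction and the model parameters, computed from class probabilities produced by several posterior samples of a Bayesian network. The summary does not say which estimator the authors use, so the choice of mutual information, the function name `epistemic_uncertainty`, and the toy probabilities are all assumptions made for illustration.

```python
import numpy as np

def epistemic_uncertainty(probs, eps=1e-12):
    """Mutual-information estimate of epistemic uncertainty (an assumed estimator).

    probs: array of shape (T, N, C) holding class probabilities for N inputs
           from T posterior samples (e.g. T stochastic forward passes).
    Returns an array of shape (N,).
    """
    probs = np.asarray(probs, dtype=np.float64)
    mean_probs = probs.mean(axis=0)  # (N, C) averaged prediction
    # Total uncertainty: entropy of the averaged prediction.
    total = -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)
    # Aleatoric part: average entropy of each individual prediction.
    expected = -np.sum(probs * np.log(probs + eps), axis=-1).mean(axis=0)
    # What remains reflects disagreement between posterior samples.
    return total - expected

# Toy input: 2 posterior samples, 2 inputs, 2 classes.
probs = np.array([
    [[0.9, 0.1], [0.9, 0.1]],  # posterior sample 1
    [[0.9, 0.1], [0.1, 0.9]],  # posterior sample 2
])
mi = epistemic_uncertainty(probs)
print(mi)  # first input: samples agree; second input: samples disagree
```

In this toy case the posterior samples agree on the first input and disagree on the second, so only the second input receives a noticeably positive score.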

The authors hypothesize that training samples affected by bias will tend to have higher epistemic uncertainty, and they report a positive correlation between the two. By incorporating this uncertainty information into the loss function used to train the model, they can dynamically identify and reweight these samples during the training process.
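On a benchmark where bias annotations are available for evaluation, the relationship between bias and uncertainty could be checked with a simple correlation. The sketch below computes a Pearson correlation between per-sample uncertainty and a binary "bias-affected" flag. The helper name `bias_uncertainty_correlation` is hypothetical and the toy values are invented purely for illustration; they are not results from the paper.

```python
import numpy as np

def bias_uncertainty_correlation(uncertainty, is_bias_affected):
    """Pearson correlation between uncertainty scores and a 0/1 bias flag."""
    u = np.asarray(uncertainty, dtype=np.float64)
    b = np.asarray(is_bias_affected, dtype=np.float64)
    return float(np.corrcoef(u, b)[0, 1])

# Illustrative toy values only (not data from the paper).
toy_uncertainty = [0.05, 0.10, 0.08, 0.40, 0.55, 0.07]
toy_flags = [0, 0, 0, 1, 1, 0]  # 1 marks a sample assumed to be bias-affected

r = bias_uncertainty_correlation(toy_uncertainty, toy_flags)
print(f"correlation: {r:.3f}")
```

The flags are used only for this after-the-fact check; the training method itself is described as needing no bias labels.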

Specifically, the loss function is modified so that each sample's contribution is scaled by a weight computed from its predictive uncertainty. Because the weights are recomputed as training progresses, the emphasis placed on potentially bias-affected samples adapts dynamically, with no bias labels required.
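A per-sample uncertainty weighting of the loss can be sketched as follows. The summary does not give the exact functional form, so this example assumes a weight of `(1 + u) ** kappa` applied to the cross-entropy of each sample, where `u` is that sample's uncertainty and `kappa` is a hypothetical tuning parameter. Under this assumed form, a positive `kappa` enlarges the contribution of uncertain samples, a negative `kappa` shrinks it, and `kappa = 0` recovers the ordinary loss.

```python
import numpy as np

def uncertainty_weighted_ce(probs, labels, uncertainty, kappa=1.0, eps=1e-12):
    """Cross-entropy with per-sample weights (1 + u) ** kappa (assumed form).

    probs:       (N, C) predicted class probabilities
    labels:      (N,) integer class labels
    uncertainty: (N,) non-negative per-sample uncertainty scores
    Returns (mean weighted loss, per-sample weights).
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    u = np.asarray(uncertainty, dtype=np.float64)
    # Per-sample cross-entropy: negative log-probability of the true class.
    ce = -np.log(probs[np.arange(len(labels)), labels] + eps)
    # The shift by 1 keeps zero-uncertainty samples at weight 1.
    weights = (1.0 + u) ** kappa
    return float(np.mean(weights * ce)), weights

probs = np.array([[0.8, 0.2], [0.4, 0.6]])
labels = np.array([0, 1])
uncertainty = np.array([0.0, 1.0])  # second sample is the uncertain one

loss_plain, w_plain = uncertainty_weighted_ce(probs, labels, uncertainty, kappa=0.0)
loss_up, w_up = uncertainty_weighted_ce(probs, labels, uncertainty, kappa=1.0)
loss_down, w_down = uncertainty_weighted_ce(probs, labels, uncertainty, kappa=-1.0)
print(loss_plain, loss_up, loss_down)
```

In an automatic-differentiation framework, the weights would typically be treated as constants so that gradients flow only through the cross-entropy term; that detail is also an assumption here.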

The authors evaluate their approach on a bias benchmark dataset and on a real-world face detection problem. They report that the method shows potential to mitigate visual bias in both settings.

Critical Analysis

One key strength of the proposed approach is that it does not require explicit knowledge of the biases present in the training data. This makes it more broadly applicable than many existing bias mitigation methods.

However, the authors acknowledge that their technique relies on accurately estimating the epistemic uncertainty of the model. In practice, this can be challenging, especially for complex deep learning models. Inaccurate uncertainty estimates could lead to suboptimal bias mitigation.

Additionally, the paper does not provide a thorough investigation of the types of biases the method is effective against. It's possible that the approach may work better for some types of biases (e.g., demographic biases) than others.

Further research is needed to better understand the strengths, limitations, and optimal application of this uncertainty-based bias mitigation technique. Exploring ways to improve the reliability of epistemic uncertainty estimates could also help strengthen the overall approach.

Conclusion

This paper presents a novel method for mitigating biases in deep neural networks that does not require prior knowledge of the biases present in the training data. By leveraging Bayesian neural networks and the concept of epistemic uncertainty, the approach can dynamically identify and reweight potentially biased training samples during the learning process.

While the results are promising, the authors acknowledge the need for further research to address the challenges of accurate uncertainty estimation and to better understand the types of biases the method can effectively mitigate. Nonetheless, this work represents an interesting and important step towards developing more robust and unbiased deep learning models.

This summary was produced with help from an AI and may contain inaccuracies.

Related Papers

Epistemic Uncertainty Quantification For Pre-trained Neural Network

Hanjing Wang, Qiang Ji

Epistemic uncertainty quantification (UQ) identifies where models lack knowledge. Traditional UQ methods, often based on Bayesian neural networks, are not suitable for pre-trained non-Bayesian models. Our study addresses quantifying epistemic uncertainty for any pre-trained model, which does not need the original training data or model modifications and can ensure broad applicability regardless of network architectures or training techniques. Specifically, we propose a gradient-based approach to assess epistemic uncertainty, analyzing the gradients of outputs relative to model parameters, and thereby indicating necessary model adjustments to accurately represent the inputs. We first explore theoretical guarantees of gradient-based methods for epistemic UQ, questioning the view that this uncertainty is only calculable through differences between multiple models. We further improve gradient-driven UQ by using class-specific weights for integrating gradients and emphasizing distinct contributions from neural network layers. Additionally, we enhance UQ accuracy by combining gradient and perturbation methods to refine the gradients. We evaluate our approach on out-of-distribution detection, uncertainty calibration, and active learning, demonstrating its superiority over current state-of-the-art UQ methods for pre-trained models.

4/17/2024

cs.LG cs.CV

Language-guided Detection and Mitigation of Unknown Dataset Bias

Zaiying Zhao, Soichiro Kumano, Toshihiko Yamasaki

Dataset bias is a significant problem in training fair classifiers. When attributes unrelated to classification exhibit strong biases towards certain classes, classifiers trained on such dataset may overfit to these bias attributes, substantially reducing the accuracy for minority groups. Mitigation techniques can be categorized according to the availability of bias information (ie, prior knowledge). Although scenarios with unknown biases are better suited for real-world settings, previous work in this field often suffers from a lack of interpretability regarding biases and lower performance. In this study, we propose a framework to identify potential biases as keywords without prior knowledge based on the partial occurrence in the captions. We further propose two debiasing methods: (a) handing over to an existing debiasing approach which requires prior knowledge by assigning pseudo-labels, and (b) employing data augmentation via text-to-image generative models, using acquired bias keywords as prompts. Despite its simplicity, experimental results show that our framework not only outperforms existing methods without prior knowledge, but also is even comparable with a method that assumes prior knowledge.

6/6/2024

cs.CV

Visual Analysis of Prediction Uncertainty in Neural Networks for Deep Image Synthesis

Soumya Dutta, Faheem Nizar, Ahmad Amaan, Ayan Acharya

Ubiquitous applications of Deep neural networks (DNNs) in different artificial intelligence systems have led to their adoption in solving challenging visualization problems in recent years. While sophisticated DNNs offer an impressive generalization, it is imperative to comprehend the quality, confidence, robustness, and uncertainty associated with their prediction. A thorough understanding of these quantities produces actionable insights that help application scientists make informed decisions. Unfortunately, the intrinsic design principles of the DNNs cannot beget prediction uncertainty, necessitating separate formulations for robust uncertainty-aware models for diverse visualization applications. To that end, this contribution demonstrates how the prediction uncertainty and sensitivity of DNNs can be estimated efficiently using various methods and then interactively compared and contrasted for deep image synthesis tasks. Our inspection suggests that uncertainty-aware deep visualization models generate illustrations of informative and superior quality and diversity. Furthermore, prediction uncertainty improves the robustness and interpretability of deep visualization models, making them practical and convenient for various scientific domains that thrive on visual analyses.

6/28/2024

cs.CV cs.LG

Awareness of uncertainty in classification using a multivariate model and multi-views

Alexey Kornaev, Elena Kornaeva, Oleg Ivanov, Ilya Pershin, Danis Alukaev

One of the ways to make artificial intelligence more natural is to give it some room for doubt. Two main questions should be resolved in that way. First, how to train a model to estimate uncertainties of its own predictions? And then, what to do with the uncertain predictions if they appear? First, we proposed an uncertainty-aware negative log-likelihood loss for the case of N-dimensional multivariate normal distribution with spherical variance matrix to the solution of N-classes classification tasks. The loss is similar to the heteroscedastic regression loss. The proposed model regularizes uncertain predictions, and trains to calculate both the predictions and their uncertainty estimations. The model fits well with the label smoothing technique. Second, we expanded the limits of data augmentation at the training and test stages, and made the trained model to give multiple predictions for a given number of augmented versions of each test sample. Given the multi-view predictions together with their uncertainties and confidences, we proposed several methods to calculate final predictions, including mode values and bin counts with soft and hard weights. For the latter method, we formalized the model tuning task in the form of multimodal optimization with non-differentiable criteria of maximum accuracy, and applied particle swarm optimization to solve the tuning task. The proposed methodology was tested using CIFAR-10 dataset with clean and noisy labels and demonstrated good results in comparison with other uncertainty estimation methods related to sample selection, co-teaching, and label smoothing.

4/17/2024

cs.CV cs.LG