A Structured Review of Literature on Uncertainty in Machine Learning & Deep Learning

2406.00332

Published 6/4/2024 by Fahimeh Fakour, Ali Mosleh, Ramin Ramezani

Abstract

The adaptation and use of Machine Learning (ML) in our daily lives has led to concerns in lack of transparency, privacy, reliability, among others. As a result, we are seeing research in niche areas such as interpretability, causality, bias and fairness, and reliability. In this survey paper, we focus on a critical concern for adaptation of ML in risk-sensitive applications, namely understanding and quantifying uncertainty. Our paper approaches this topic in a structured way, providing a review of the literature in the various facets that uncertainty is enveloped in the ML process. We begin by defining uncertainty and its categories (e.g., aleatoric and epistemic), understanding sources of uncertainty (e.g., data and model), and how uncertainty can be assessed in terms of uncertainty quantification techniques (Ensembles, Bayesian Neural Networks, etc.). As part of our assessment and understanding of uncertainty in the ML realm, we cover metrics for uncertainty quantification for a single sample, dataset, and metrics for accuracy of the uncertainty estimation itself. This is followed by discussions on calibration (model and uncertainty), and decision making under uncertainty. Thus, we provide a more complete treatment of uncertainty: from the sources of uncertainty to the decision-making process. We have focused the review of uncertainty quantification methods on Deep Learning (DL), while providing the necessary background for uncertainty discussion within ML in general. Key contributions in this review are broadening the scope of uncertainty discussion, as well as an updated review of uncertainty quantification methods in DL.

Overview

Machine Learning (ML) is being increasingly used in our daily lives, but this has led to concerns about transparency, privacy, reliability, and other issues.
Researchers are exploring niche areas like interpretability, causality, bias and fairness, and reliability to address these concerns.
This survey paper focuses on the critical issue of understanding and quantifying uncertainty in the context of using ML in risk-sensitive applications.

Plain English Explanation

Machine learning (ML) is a type of artificial intelligence that allows computers to learn and make predictions from data, without being explicitly programmed. As ML becomes more widespread in our daily lives, there are growing concerns about its transparency, privacy, and reliability.

Researchers are investigating ways to make ML systems more trustworthy and accountable. This includes studying how to interpret and explain the inner workings of ML models, how to understand the causal relationships behind a model's decisions, and how to ensure that models are fair and unbiased.

One critical issue is understanding and quantifying the uncertainty inherent in ML systems, especially when they are used in high-stakes or risk-sensitive applications. This survey paper provides a comprehensive review of the various aspects of uncertainty in ML, with a focus on deep learning models.

The paper starts by defining different types of uncertainty, such as aleatoric (inherent randomness) and epistemic (due to lack of knowledge). It then explores the sources of uncertainty, such as the data and the model itself. The paper also covers techniques for quantifying uncertainty, including methods like ensembles and Bayesian neural networks.

Additionally, the paper discusses how to assess the accuracy of uncertainty estimates, how to calibrate models, and how to make decisions under uncertainty. The goal is to provide a holistic understanding of uncertainty in ML, from its origins to its implications for real-world applications.

Technical Explanation

The paper begins by defining the various types of uncertainty in machine learning, such as aleatoric uncertainty (inherent randomness in the data) and epistemic uncertainty (uncertainty due to lack of knowledge or information). It then explores the different sources of uncertainty, including the data used to train the model and the model architecture itself.
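The split between aleatoric and epistemic uncertainty can be made concrete with a small sketch. The example below uses one common entropy-based decomposition for classification, which is an assumption here and not necessarily the formulation the paper adopts: total uncertainty is the entropy of the averaged prediction, the aleatoric part is the average entropy of each individual prediction, and the epistemic part is the difference between the two. The function name `decompose_uncertainty` and the toy probability arrays are hypothetical.

```python
import numpy as np

def decompose_uncertainty(member_probs, eps=1e-12):
    """Entropy-based split of predictive uncertainty for one input.

    member_probs has shape (M, K): class probabilities from M ensemble
    members (or posterior samples) over K classes. Illustrative only.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    mean_probs = member_probs.mean(axis=0)
    # Total uncertainty: entropy of the averaged prediction.
    total = -np.sum(mean_probs * np.log(mean_probs + eps))
    # Aleatoric part: average entropy of the individual predictions.
    aleatoric = -np.mean(np.sum(member_probs * np.log(member_probs + eps), axis=1))
    # Epistemic part: what remains, driven by disagreement between members.
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# All members agree that the input is ambiguous: uncertainty is aleatoric.
agree = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
# Members contradict each other: a large share of the uncertainty is epistemic.
disagree = [[0.99, 0.01], [0.01, 0.99], [0.5, 0.5]]

total_a, aleatoric_a, epistemic_a = decompose_uncertainty(agree)
total_d, aleatoric_d, epistemic_d = decompose_uncertainty(disagree)
```

Both toy inputs have the same averaged prediction, so their total uncertainty is the same; only the split between the two categories differs.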

The core of the paper focuses on techniques for quantifying uncertainty in machine learning, particularly in the context of deep learning models. The authors review methods like ensemble models, Bayesian neural networks, and other approaches that can provide estimates of the uncertainty associated with the model's predictions.
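As a rough illustration of the ensemble idea, the sketch below trains several models on bootstrap resamples of the data and treats the spread of their predictions as an uncertainty signal. It deliberately swaps in small polynomial regressors for neural networks so that it runs with NumPy alone; deep ensembles in the literature typically vary random initialization of networks instead. The function names, the synthetic data, and the polynomial degree are all assumptions made for the example.

```python
import numpy as np

def fit_bootstrap_ensemble(x, y, n_members=20, degree=3, seed=0):
    """Fit one polynomial per bootstrap resample of the training data."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        # Sample training indices with replacement.
        idx = rng.integers(0, len(x), size=len(x))
        members.append(np.polyfit(x[idx], y[idx], deg=degree))
    return members

def ensemble_predict(members, x_new):
    """Return the ensemble mean and the member-to-member standard deviation."""
    preds = np.stack([np.polyval(coeffs, x_new) for coeffs in members])
    return preds.mean(axis=0), preds.std(axis=0)

# Synthetic training data on [-1, 1] (hypothetical).
rng = np.random.default_rng(42)
x_train = rng.uniform(-1.0, 1.0, size=200)
y_train = np.sin(3.0 * x_train) + rng.normal(0.0, 0.1, size=200)

members = fit_bootstrap_ensemble(x_train, y_train)
# Query one point inside the training range and one far outside it.
mean, spread = ensemble_predict(members, np.array([0.0, 3.0]))
```

The members should disagree far more at `x = 3.0`, outside the training range, than at `x = 0.0`, which is the behavior an epistemic uncertainty estimate is meant to capture.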

The paper also covers metrics for uncertainty quantification at three levels: a single sample, a whole dataset, and the accuracy of the uncertainty estimates themselves. Additionally, the authors discuss calibration of both the model and its uncertainty (ensuring that confidence estimates align with actual performance) and decision-making under uncertainty.
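To make the notion of calibration concrete, here is a minimal sketch of Expected Calibration Error (ECE), a widely used calibration metric. Whether the paper uses this exact binning scheme is not stated in this summary, so the equal-width bins and the function name `expected_calibration_error` should be read as assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each confidence to a bin index in [0, n_bins - 1]; bins are right-inclusive.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            # Weight the gap by the fraction of samples falling in this bin.
            ece += mask.mean() * gap
    return ece

# A model that reports 90% confidence and is right 9 times out of 10.
ece_calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
# A model that reports 90% confidence but is right only 5 times out of 10.
ece_overconfident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

In this toy case the first model has an ECE of essentially zero, while the second has an ECE of about 0.4, the gap between its stated confidence and its observed accuracy.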

Throughout the paper, the authors provide a comprehensive and up-to-date review of the state of the art in uncertainty quantification for deep learning, drawing insights from the broader machine learning literature. The key contributions of the paper are its broad scope in covering the various facets of uncertainty, as well as its detailed review of the latest developments in deep learning-specific uncertainty quantification methods.

Critical Analysis

The paper provides a thorough and well-structured review of the topic of uncertainty quantification in machine learning, with a particular focus on deep learning models. The authors have done an admirable job of covering the various aspects of uncertainty, from its definition and sources to the techniques for quantifying and assessing it.

One potential limitation of the paper is that, while it covers a wide range of uncertainty quantification methods, the depth of the discussion on any individual technique may be limited. The authors have opted for a broad, survey-style approach, which means that readers looking for a more in-depth treatment of a specific uncertainty quantification method may need to consult additional resources.

Additionally, the paper does not delve deeply into the practical challenges and trade-offs involved in deploying uncertainty-aware ML systems in real-world, risk-sensitive applications. The authors touch on decision-making under uncertainty, but a more extensive discussion of the practical implications and potential pitfalls would have been valuable.
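One simple example of the kind of trade-off involved is selective prediction, where a system abstains (for instance, by deferring to a human) whenever its confidence falls below a threshold. The sketch below is an illustration of that general idea and is not taken from the paper; the function name, the threshold, and the toy data are hypothetical.

```python
import numpy as np

def selective_accuracy(confidences, correct, threshold):
    """Coverage and accuracy when predictions below the threshold are rejected."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    accepted = confidences >= threshold
    # Coverage: fraction of inputs the model is allowed to answer.
    coverage = accepted.mean()
    # Accuracy on the accepted inputs only (NaN if everything was rejected).
    accuracy = correct[accepted].mean() if accepted.any() else float("nan")
    return coverage, accuracy

# Toy confidences and correctness flags (hypothetical).
conf = np.array([0.95, 0.90, 0.85, 0.60, 0.55, 0.51])
hit = np.array([True, True, True, False, True, False])

cov_all, acc_all = selective_accuracy(conf, hit, threshold=0.0)
cov_sel, acc_sel = selective_accuracy(conf, hit, threshold=0.8)
```

Raising the threshold here improves accuracy on the answered inputs but halves coverage, so the remaining cases must be handled some other way. The benefit also depends on the confidence scores being well calibrated in the first place.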

That said, the paper's comprehensive coverage of the uncertainty landscape in machine learning, combined with its focus on deep learning, makes it a valuable resource for researchers and practitioners working in this important and rapidly evolving field. By highlighting the critical importance of understanding and quantifying uncertainty, the authors have made a valuable contribution to the ongoing efforts to enhance the trustworthiness and reliability of AI-powered systems.

Conclusion

This survey paper reviews uncertainty quantification in machine learning, with a particular emphasis on deep learning models. It follows uncertainty from its definition and sources through to the techniques for measuring and assessing it.

As machine learning is adopted in more risk-sensitive applications, understanding and quantifying uncertainty becomes more important, and the paper underscores the need for further research and development in this area. Its broad coverage of the current state of the art makes it a valuable resource for researchers and practitioners working to enhance the trustworthiness and reliability of AI-powered systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

A Comprehensive Survey on Uncertainty Quantification for Deep Learning

Wenchong He, Zhe Jiang

Deep neural networks (DNNs) have achieved tremendous success in making accurate predictions for computer vision, natural language processing, as well as science and engineering domains. However, it is also well-recognized that DNNs sometimes make unexpected, incorrect, but overconfident predictions. This can cause serious consequences in high-stake applications, such as autonomous driving, medical diagnosis, and disaster response. Uncertainty quantification (UQ) aims to estimate the confidence of DNN predictions beyond prediction accuracy. In recent years, many UQ methods have been developed for DNNs. It is of great practical value to systematically categorize these UQ methods and compare their advantages and disadvantages. However, existing surveys mostly focus on categorizing UQ methodologies from a neural network architecture perspective or a Bayesian perspective and ignore the source of uncertainty that each methodology can incorporate, making it difficult to select an appropriate UQ method in practice. To fill the gap, this paper presents a systematic taxonomy of UQ methods for DNNs based on the types of uncertainty sources (data uncertainty versus model uncertainty). We summarize the advantages and disadvantages of methods in each category. We show how our taxonomy of UQ methodologies can potentially help guide the choice of UQ method in different machine learning problems (e.g., active learning, robustness, and reinforcement learning). We also identify current research gaps and propose several future research directions.

4/11/2024

cs.LG stat.ML

Uncertainty Quantification for Deep Learning

Peter Jan van Leeuwen, J. Christine Chiu, C. Kevin Yang

A complete and statistically consistent uncertainty quantification for deep learning is provided, including the sources of uncertainty arising from (1) the new input data, (2) the training and testing data (3) the weight vectors of the neural network, and (4) the neural network because it is not a perfect predictor. Using Bayes Theorem and conditional probability densities, we demonstrate how each uncertainty source can be systematically quantified. We also introduce a fast and practical way to incorporate and combine all sources of errors for the first time. For illustration, the new method is applied to quantify errors in cloud autoconversion rates, predicted from an artificial neural network that was trained by aircraft cloud probe measurements in the Azores and the stochastic collection equation formulated as a two-moment bin model. For this specific example, the output uncertainty arising from uncertainty in the training and testing data is dominant, followed by uncertainty in the input data, in the trained neural network, and uncertainty in the weights. We discuss the usefulness of the methodology for machine learning practice, and how, through inclusion of uncertainty in the training data, the new methodology is less sensitive to input data that falls outside of the training data set.

6/3/2024

cs.LG stat.ML

Enhancing Trustworthiness in ML-Based Network Intrusion Detection with Uncertainty Quantification

Jacopo Talpini, Fabio Sartori, Marco Savi

The evolution of Internet and its related communication technologies have consistently increased the risk of cyber-attacks. In this context, a crucial role is played by Intrusion Detection Systems (IDSs), which are security devices designed to identify and mitigate attacks to modern networks. Data-driven approaches based on Machine Learning (ML) have gained more and more popularity for executing the classification tasks required by signature-based IDSs. However, typical ML models adopted for this purpose do not properly take into account the uncertainty associated with their prediction. This poses significant challenges, as they tend to produce misleadingly high classification scores for both misclassified inputs and inputs belonging to unknown classes (e.g. novel attacks), limiting the trustworthiness of existing ML-based solutions. In this paper, we argue that ML-based IDSs should always provide accurate uncertainty quantification to avoid overconfident predictions. In fact, an uncertainty-aware classification would be beneficial to enhance closed-set classification performance, would make it possible to carry out Active Learning, and would help recognize inputs of unknown classes as truly unknowns, unlocking open-set classification capabilities and Out-of-Distribution (OoD) detection. To verify it, we compare various ML-based methods for uncertainty quantification and for open-set classification, either specifically designed for or tailored to the domain of network intrusion detection. Moreover, we develop a custom model based on Bayesian Neural Networks to ensure reliable uncertainty estimates and improve the OoD detection capabilities, thus showing how proper uncertainty quantification can be exploited to significantly enhance the trustworthiness of ML-based IDSs.

4/10/2024

cs.CR cs.LG

Between Randomness and Arbitrariness: Some Lessons for Reliable Machine Learning at Scale

A. Feder Cooper

To develop rigorous knowledge about ML models -- and the systems in which they are embedded -- we need reliable measurements. But reliable measurement is fundamentally challenging, and touches on issues of reproducibility, scalability, uncertainty quantification, epistemology, and more. This dissertation addresses criteria needed to take reliability seriously: both criteria for designing meaningful metrics, and for methodologies that ensure that we can dependably and efficiently measure these metrics at scale and in practice. In doing so, this dissertation articulates a research vision for a new field of scholarship at the intersection of machine learning, law, and policy. Within this frame, we cover topics that fit under three different themes: (1) quantifying and mitigating sources of arbitrariness in ML, (2) taming randomness in uncertainty estimation and optimization algorithms, in order to achieve scalability without sacrificing reliability, and (3) providing methods for evaluating generative-AI systems, with specific focuses on quantifying memorization in language models and training latent diffusion models on open-licensed data. By making contributions in these three themes, this dissertation serves as an empirical proof by example that research on reliable measurement for machine learning is intimately and inescapably bound up with research in law and policy. These different disciplines pose similar research questions about reliable measurement in machine learning. They are, in fact, two complementary sides of the same research vision, which, broadly construed, aims to construct machine-learning systems that cohere with broader societal values.

6/17/2024

cs.LG cs.AI cs.CY stat.ML