Explainable Anomaly Detection in Images and Videos: A Survey

Read original: arXiv:2302.06670 - Published 4/11/2024 by Yizhou Wang, Dongliang Guo, Sheng Li, Octavia Camps, Yun Fu

Overview

This paper provides a comprehensive survey of explainable visual anomaly detection methods for both images and videos.
The authors introduce the background of image-level and video-level anomaly detection, then focus on reviewing explainable approaches for these two modalities.
The paper also analyzes why some explainable methods can be applied to both images and videos, while others are limited to a single modality.
Additionally, the authors summarize current 2D visual anomaly detection datasets and evaluation metrics.
Finally, the paper discusses promising future research directions and open problems in the field of explainable visual anomaly detection.

Plain English Explanation

Anomaly detection in visual data, such as images and videos, is an important task in both academic research and real-world applications. Anomaly detection involves identifying elements in the data that are unusual or different from the norm.

Despite the rapid development of visual anomaly detection techniques, the reasons why these models can distinguish anomalies are often not well-explained. This paper aims to address this gap by providing a comprehensive review of "explainable" visual anomaly detection methods.

The authors first give an overview of basic image-level and video-level anomaly detection. Then, they dive into a detailed review of explainable anomaly detection approaches for both images and videos. They analyze why some of these explainable methods can be applied to both images and videos, while others are limited to a single modality.

The paper also summarizes existing datasets and evaluation metrics used in 2D visual anomaly detection research. Finally, the authors discuss promising future directions and open problems in this field, such as exploring new ways to make visual anomaly detection models more interpretable.

Overall, this survey provides a valuable resource for researchers and practitioners interested in understanding and improving the explainability of visual anomaly detection systems.

Technical Explanation

The paper begins by introducing the importance of anomaly detection and localization in visual data, including images and videos. Despite recent advancements in visual anomaly detection techniques, the authors note that the interpretability and explainability of these "black-box" models are often lacking.
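To make "detection and localization" concrete, here is a minimal sketch of one common formulation, reconstruction-based scoring. The summary does not confirm that this is how any surveyed method works. It assumes some model has already produced a reconstruction of the input, and the function names (`anomaly_map`, `image_score`, `most_anomalous_pixel`) are hypothetical. The per-pixel error map gives an image-level score (detection) and shows where the score comes from (localization, a simple form of explanation).

```python
from typing import List, Tuple

# Assumption: a grayscale image stored as rows of pixel intensities in [0, 1].
Image = List[List[float]]


def anomaly_map(image: Image, reconstruction: Image) -> Image:
    """Per-pixel squared error between an input and its reconstruction."""
    return [
        [(p - q) ** 2 for p, q in zip(row, rec_row)]
        for row, rec_row in zip(image, reconstruction)
    ]


def image_score(amap: Image) -> float:
    """Image-level anomaly score: the largest per-pixel error."""
    return max(max(row) for row in amap)


def most_anomalous_pixel(amap: Image) -> Tuple[int, int]:
    """(row, column) of the highest error, a crude form of localization."""
    best = max(
        (value, r, c)
        for r, row in enumerate(amap)
        for c, value in enumerate(row)
    )
    return best[1], best[2]


# Hypothetical input and reconstruction; only the bottom-right pixel differs.
image = [[0.1, 0.1], [0.1, 0.9]]
reconstruction = [[0.1, 0.1], [0.1, 0.1]]

amap = anomaly_map(image, reconstruction)
print(image_score(amap), most_anomalous_pixel(amap))
```

Taking the maximum error as the image-level score is only one choice. A mean or a top-k average are equally plausible, and which one a given method uses is not stated in this summary.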

The core of the paper is a comprehensive literature review of explainable anomaly detection methods for both images and videos. The authors then analyze why some of these explainable approaches can be applied to both modalities, while others are limited to a single modality.

In addition, the paper summarizes current 2D visual anomaly detection datasets and evaluation metrics used in this field. This provides a useful overview of the resources and benchmarks available for researchers.
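This summary does not list which evaluation metrics the survey covers. As an illustration only, the sketch below computes AUROC (area under the ROC curve), a metric widely used to evaluate anomaly detectors, in its pairwise-ranking form. The function name `auroc` and the sample scores are assumptions made for this example.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random anomaly outscores a random normal sample.

    labels use 1 for anomalous and 0 for normal; tied scores count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores for four samples, two normal and two anomalous.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration. A sort-based computation would be the usual choice for benchmark-sized data.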

Finally, the authors discuss several promising future research directions and open problems in explainable 2D visual anomaly detection, such as new ways to interpret the decision-making process of anomaly detection models.

Critical Analysis

The paper provides a thorough and well-structured survey of explainable visual anomaly detection methods, which is a valuable contribution to the field. The authors' analysis of the differences between approaches that can be applied to both images and videos, versus those limited to a single modality, offers useful insights.

However, the paper does not delve deeply into the specific strengths, weaknesses, and limitations of the reviewed explainable anomaly detection techniques. A more critical assessment of the trade-offs and potential issues with these methods would have been beneficial for readers to better understand the current state of the art and areas for improvement.

Additionally, the paper does not discuss the real-world applicability and practical implications of the reviewed explainable anomaly detection approaches. Exploring how these methods could be deployed in various domains, such as healthcare, surveillance, or manufacturing, and the potential challenges or ethical considerations involved, would have further enriched the discussion.

Overall, the paper serves as a comprehensive reference for researchers and practitioners in the field of explainable visual anomaly detection. However, a more in-depth critical analysis and a discussion of the practical relevance of the reviewed techniques would have strengthened the survey.

Conclusion

This paper provides a valuable survey of explainable visual anomaly detection methods for both images and videos. The authors have done an extensive literature review and analysis of the key differences between approaches that can be applied to multiple modalities versus those limited to a single modality.

The survey also covers current 2D visual anomaly detection datasets and evaluation metrics, as well as promising future research directions and open problems in this field. This information can serve as a useful resource for researchers and practitioners looking to further advance the state of explainable visual anomaly detection.

While the paper could have benefited from a more critical assessment of the reviewed techniques, it nonetheless offers a comprehensive and informative overview of this important area of machine learning research. The insights and discussions presented in this survey can help drive the development of more transparent and interpretable visual anomaly detection systems, with potential applications across a wide range of real-world domains.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Explainable Anomaly Detection in Images and Videos: A Survey

Yizhou Wang, Dongliang Guo, Sheng Li, Octavia Camps, Yun Fu

Anomaly detection and localization of visual data, including images and videos, are of great significance in both machine learning academia and applied real-world scenarios. Despite the rapid development of visual anomaly detection techniques in recent years, the interpretations of these black-box models and reasonable explanations of why anomalies can be distinguished out are scarce. This paper provides the first survey concentrated on explainable visual anomaly detection methods. We first introduce the basic background of image-level and video-level anomaly detection. Then, as the main content of this survey, a comprehensive and exhaustive literature review of explainable anomaly detection methods for both images and videos is presented. Next, we analyze why some explainable anomaly detection methods can be applied to both images and videos and why others can be only applied to one modality. Additionally, we provide summaries of current 2D visual anomaly detection datasets and evaluation metrics. Finally, we discuss several promising future directions and open problems to explore the explainability of 2D visual anomaly detection. The related resource collection is given at https://github.com/wyzjack/Awesome-XAD.

4/11/2024

Video Anomaly Detection in 10 Years: A Survey and Outlook

Moshira Abdalla, Sajid Javed, Muaz Al Radi, Anwaar Ulhaq, Naoufel Werghi

Video anomaly detection (VAD) holds immense importance across diverse domains such as surveillance, healthcare, and environmental monitoring. While numerous surveys focus on conventional VAD methods, they often lack depth in exploring specific approaches and emerging trends. This survey explores deep learning-based VAD, expanding beyond traditional supervised training paradigms to encompass emerging weakly supervised, self-supervised, and unsupervised approaches. A prominent feature of this review is the investigation of core challenges within the VAD paradigms including large-scale datasets, features extraction, learning methods, loss functions, regularization, and anomaly score prediction. Moreover, this review also investigates the vision language models (VLMs) as potent feature extractors for VAD. VLMs integrate visual data with textual descriptions or spoken language from videos, enabling a nuanced understanding of scenes crucial for anomaly detection. By addressing these challenges and proposing future research directions, this review aims to foster the development of robust and efficient VAD systems leveraging the capabilities of VLMs for enhanced anomaly detection in complex real-world scenarios. This comprehensive analysis seeks to bridge existing knowledge gaps, provide researchers with valuable insights, and contribute to shaping the future of VAD research.

7/2/2024

Multi-Image Visual Question Answering for Unsupervised Anomaly Detection

Jun Li, Su Hwan Kim, Philip Muller, Lina Felsner, Daniel Rueckert, Benedikt Wiestler, Julia A. Schnabel, Cosmin I. Bercea

This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves a 60.81% accuracy on closed questions, covering disease classification and severity across 15 different classes. For open questions, KQ-Former demonstrates a 70% improvement over the baseline with a BLEU-4 score of 0.41, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps results in an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model's generalizability to previously unseen medical conditions. The code and dataset are available at https://github.com/compai-lab/miccai-2024-junli?tab=readme-ov-file

7/24/2024

Can I trust my anomaly detection system? A case study based on explainable AI

Muhammad Rashid, Elvio Amparore, Enrico Ferrari, Damiano Verda

Generative models based on variational autoencoders are a popular technique for detecting anomalies in images in a semi-supervised context. A common approach employs the anomaly score to detect the presence of anomalies, and it is known to reach high level of accuracy on benchmark datasets. However, since anomaly scores are computed from reconstruction disparities, they often obscure the detection of various spurious features, raising concerns regarding their actual efficacy. This case study explores the robustness of an anomaly detection system based on variational autoencoder generative models through the use of eXplainable AI methods. The goal is to get a different perspective on the real performances of anomaly detectors that use reconstruction differences. In our case study we discovered that, in many cases, samples are detected as anomalous for the wrong or misleading factors.

7/30/2024