Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark

Read original: arXiv:2404.10760 - Published 4/17/2024 by Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong Liu, Guansong Pang, Dacheng Tao

Overview

This paper addresses the challenge of anomaly detection (AD) in industrial quality inspection and medical lesion examination.
It constructs a large-scale COCO-AD dataset to enable fair evaluation and sustainable development of different AD methods.
It proposes several new threshold-dependent AD-specific metrics to provide a more comprehensive evaluation of AD methods.
It introduces the InvAD framework, which leverages GAN inversion's high-quality reconstruction capability to improve the effectiveness of reconstruction-based AD methods.

Plain English Explanation

Anomaly detection is a crucial task in industries like manufacturing and healthcare, where it's used to identify defects or abnormalities in products or medical images. However, the datasets used for anomaly detection are often small, and the evaluation metrics are not as well-developed as those used for other computer vision tasks like object detection and semantic segmentation.

To address these issues, the researchers in this paper first created a large-scale dataset called COCO-AD by extending the popular COCO dataset to the anomaly detection domain. This allows for more comprehensive and fair evaluation of different anomaly detection methods.

Additionally, the researchers noted that existing metrics like AU-ROC (Area Under the Receiver Operating Characteristic curve) have nearly saturated on simple datasets, so they no longer clearly separate one strong method from another. To provide a more nuanced evaluation, they proposed several new threshold-dependent metrics that better capture the performance of anomaly detection algorithms.
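To illustrate why a threshold-free metric can hide differences that a threshold-dependent one exposes, here is a minimal toy sketch. The scores, labels, and the helper names `au_roc` and `f1_at` are made up for illustration and are not taken from the paper or its code. AU-ROC only looks at how samples are ranked, so two detectors with the same ranking get the same value even if one of them never pushes anomalous scores above a usable threshold.

```python
def au_roc(scores, labels):
    """AU-ROC as the probability that a random anomalous sample
    outscores a random normal one (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at(scores, labels, threshold):
    """F1 score after binarizing the scores at a fixed threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical data: 0 = normal, 1 = anomalous.
labels = [0, 0, 0, 1, 1]
well_spread = [0.05, 0.10, 0.15, 0.85, 0.90]  # anomalies clearly separated
compressed = [0.05, 0.10, 0.15, 0.16, 0.17]   # same ranking, tiny margin

for name, scores in [("well_spread", well_spread), ("compressed", compressed)]:
    print(name, au_roc(scores, labels), f1_at(scores, labels, 0.5))
```

Both toy detectors rank every anomaly above every normal sample, so their AU-ROC is identical, while the F1 score at a threshold of 0.5 differs sharply.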

Finally, the researchers introduced a new framework called InvAD, which borrows the high-quality reconstruction capability of a technique called GAN inversion to improve reconstruction-based anomaly detection methods. Such methods flag an input as anomalous where its reconstruction differs from the original, so a more faithful reconstruction of normal content helps the model identify anomalies.

Overall, this paper makes important contributions to the field of anomaly detection by addressing key challenges in dataset size, evaluation metrics, and model performance.

Technical Explanation

The paper first highlights the limitations of current anomaly detection (AD) research, noting that the data scale is relatively small compared to classic vision tasks, and the evaluation metrics are still deficient. To address these gaps, the researchers construct a large-scale COCO-AD dataset by extending the popular COCO dataset to the AD field. This enables fair evaluation and sustainable development of different AD methods on a challenging benchmark.

Moreover, the paper observes that current metrics like AU-ROC have nearly reached saturation on simple datasets, limiting a comprehensive evaluation of different AD methods. Inspired by segmentation metrics, the researchers propose several more practical threshold-dependent AD-specific metrics, such as m$F_1$$^{.2}_{.8}$, mAcc$^{.2}_{.8}$, mIoU$^{.2}_{.8}$, and mIoU-max.
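One plausible reading of the notation is that a metric such as mIoU$^{.2}_{.8}$ averages IoU over binarization thresholds from 0.2 to 0.8, while mIoU-max reports the best IoU over thresholds. The sketch below follows that reading under stated assumptions: anomaly scores are already normalized to [0, 1], the threshold step is 0.1, and the maximum is taken over the same swept thresholds. None of these details is confirmed by this summary, and the function names are hypothetical, so the paper and its released code remain the reference for the exact definitions.

```python
def iou_at(scores, mask, threshold):
    """IoU between the thresholded score map and the ground-truth mask."""
    pred = [s >= threshold for s in scores]
    inter = sum(1 for p, m in zip(pred, mask) if p and m)
    union = sum(1 for p, m in zip(pred, mask) if p or m)
    return inter / union if union else 1.0


def threshold_sweep(scores, mask, lo=0.2, hi=0.8, step=0.1):
    """Return (mean IoU, max IoU) over thresholds lo, lo + step, ..., hi.

    The range and step are assumptions made for this sketch.
    """
    n = round((hi - lo) / step)
    thresholds = [round(lo + i * step, 10) for i in range(n + 1)]
    ious = [iou_at(scores, mask, t) for t in thresholds]
    return sum(ious) / len(ious), max(ious)


# Hypothetical flattened pixel scores and ground-truth mask (1 = anomalous).
scores = [0.05, 0.35, 0.55, 0.75, 0.95, 0.25]
mask = [0, 0, 1, 1, 1, 0]

mean_iou, max_iou = threshold_sweep(scores, mask)
print(f"mean IoU over thresholds: {mean_iou:.4f}, max IoU: {max_iou:.4f}")
```

The same sweep could be reused for F1 or accuracy by swapping `iou_at` for another per-threshold function.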

Motivated by the high-quality reconstruction capability of GAN inversion, the paper introduces the InvAD framework to achieve improved feature reconstruction for AD. The method is evaluated on popular benchmarks like MVTec AD, VisA, and the newly proposed COCO-AD dataset, demonstrating its effectiveness in a multi-class unsupervised setting where a single detection model is trained to identify anomalies from different classes.
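As a rough illustration of how reconstruction-based methods turn feature reconstruction into an anomaly signal, the sketch below compares each spatial feature vector with its reconstruction. It is not the InvAD architecture: the cosine distance, the max-pooled image score, and the names `cosine_distance`, `anomaly_map`, and `image_score` are assumptions chosen for this example, and the feature values are made up.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; treated as 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_map(features, reconstructed):
    """Per-location distance between original and reconstructed features.

    features[i][j] is the feature vector at spatial location (i, j).
    """
    return [
        [cosine_distance(f, r) for f, r in zip(f_row, r_row)]
        for f_row, r_row in zip(features, reconstructed)
    ]


def image_score(amap):
    """Image-level score as the largest per-location distance (an assumption)."""
    return max(max(row) for row in amap)


# Hypothetical 2x2 feature grid with 2-dimensional features.
features = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 0.0]]]
# The reconstruction matches everywhere except the bottom-right location.
reconstructed = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 1.0]]]

amap = anomaly_map(features, reconstructed)
print(amap, image_score(amap))
```

A model trained only on normal data tends to reconstruct normal features well, so locations with a large distance are candidates for anomalies.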

Critical Analysis

The paper addresses important challenges in the field of anomaly detection, such as the limited size of datasets and the lack of comprehensive evaluation metrics. The construction of the COCO-AD dataset and the proposed new metrics are valuable contributions that can help drive the development of more robust and reliable anomaly detection methods.

However, the paper does not provide a detailed analysis of the limitations or potential drawbacks of the InvAD framework. For example, it would be helpful to understand the computational complexity of the approach and how it compares to other state-of-the-art anomaly detection methods in terms of performance and efficiency.

Additionally, while the InvAD framework is shown to be effective on the evaluated datasets, it would be interesting to see how it performs on a wider range of anomaly detection scenarios, including long-tailed anomaly detection or medical image anomaly detection. Further research could also explore the potential for video anomaly detection using the InvAD approach.

Conclusion

This paper makes important contributions to the field of anomaly detection by addressing key challenges in dataset size, evaluation metrics, and model performance. The construction of the COCO-AD dataset and the proposed new threshold-dependent metrics can significantly improve the ability to fairly evaluate and compare different anomaly detection methods.

The introduction of the InvAD framework, which leverages GAN inversion for high-quality feature reconstruction, represents a promising direction for improving the effectiveness of reconstruction-based anomaly detection approaches. As the field of anomaly detection continues to advance, this work provides a valuable foundation for future research and development.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark

Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong Liu, Guansong Pang, Dacheng Tao

Anomaly detection (AD) is often focused on detecting anomaly areas for industrial quality inspection and medical lesion examination. However, due to the specific scenario targets, the data scale for AD is relatively small, and evaluation metrics are still deficient compared to classic vision tasks, such as object detection and semantic segmentation. To fill these gaps, this work first constructs a large-scale and general-purpose COCO-AD dataset by extending COCO to the AD field. This enables fair evaluation and sustainable development for different methods on this challenging benchmark. Moreover, current metrics such as AU-ROC have nearly reached saturation on simple datasets, which prevents a comprehensive evaluation of different methods. Inspired by the metrics in the segmentation field, we further propose several more practical threshold-dependent AD-specific metrics, i.e., m$F_1$$^{.2}_{.8}$, mAcc$^{.2}_{.8}$, mIoU$^{.2}_{.8}$, and mIoU-max. Motivated by GAN inversion's high-quality reconstruction capability, we propose a simple but more powerful InvAD framework to achieve high-quality feature reconstruction. Our method improves the effectiveness of reconstruction-based methods on popular MVTec AD, VisA, and our newly proposed COCO-AD datasets under a multi-class unsupervised setting, where only a single detection model is trained to detect anomalies from different classes. Extensive ablation experiments have demonstrated the effectiveness of each component of our InvAD. Full code and models are available at https://github.com/zhangzjn/ader.

4/17/2024

ToCoAD: Two-Stage Contrastive Learning for Industrial Anomaly Detection

Yun Liang, Zhiguang Hu, Junjie Huang, Donglin Di, Anyang Su, Lei Fan

Current unsupervised anomaly detection approaches perform well on public datasets but struggle with specific anomaly types due to the domain gap between pre-trained feature extractors and target-specific domains. To tackle this issue, this paper presents a two-stage training strategy, called ToCoAD. In the first stage, a discriminative network is trained by using synthetic anomalies in a self-supervised learning manner. This network is then utilized in the second stage to provide a negative feature guide, aiding in the training of the feature extractor through bootstrap contrastive learning. This approach enables the model to progressively learn the distribution of anomalies specific to industrial datasets, effectively enhancing its generalizability to various types of anomalies. Extensive experiments are conducted to demonstrate the effectiveness of our proposed two-stage training strategy, and our model produces competitive performance, achieving pixel-level AUROC scores of 98.21%, 98.43% and 97.70% on MVTec AD, VisA and BTAD respectively.

7/2/2024

Supervised Anomaly Detection for Complex Industrial Images

Aimira Baitieva, David Hurych, Victor Besnier, Olivier Bernard

Automating visual inspection in industrial production lines is essential for increasing product quality across various industries. Anomaly detection (AD) methods serve as robust tools for this purpose. However, existing public datasets primarily consist of images without anomalies, limiting the practical application of AD methods in production settings. To address this challenge, we present (1) the Valeo Anomaly Dataset (VAD), a novel real-world industrial dataset comprising 5000 images, including 2000 instances of challenging real defects across more than 20 subclasses. Acknowledging that traditional AD methods struggle with this dataset, we introduce (2) Segmentation-based Anomaly Detector (SegAD). First, SegAD leverages anomaly maps as well as segmentation maps to compute local statistics. Next, SegAD uses these statistics and an optional supervised classifier score as input features for a Boosted Random Forest (BRF) classifier, yielding the final anomaly score. Our SegAD achieves state-of-the-art performance on both VAD (+2.1% AUROC) and the VisA dataset (+0.4% AUROC). The code and the models are publicly available.

5/14/2024

Learning Multi-view Anomaly Detection

Haoyang He, Jiangning Zhang, Guanzhong Tian, Chengjie Wang, Lei Xie

This study explores the recently proposed challenging multi-view Anomaly Detection (AD) task. Single-view tasks would encounter blind spots from other perspectives, resulting in inaccuracies in sample-level prediction. Therefore, we introduce the Multi-View Anomaly Detection (MVAD) framework, which learns and integrates features from multi-views. Specifically, we propose a Multi-View Adaptive Selection (MVAS) algorithm for feature learning and fusion across multiple views. The feature maps are divided into neighbourhood attention windows to calculate a semantic correlation matrix between single-view windows and all other views, and an attention mechanism is then conducted between each single-view window and the top-K most correlated multi-view windows. Adjusting the window sizes and top-K can minimise the computational complexity to linear. Extensive experiments on the Real-IAD dataset for cross-setting (multi/single-class) validate the effectiveness of our approach, achieving state-of-the-art performance at the sample (4.1%$\uparrow$), image (5.6%$\uparrow$), and pixel (6.7%$\uparrow$) levels across a total of ten metrics, with only 18M parameters and less GPU memory and training time.

7/17/2024