DualAnoDiff: Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Original: arXiv:2408.13509. Published 8/29/2024 by Ying Jin, Jinlong Peng, Qingdong He, Teng Hu, Hao Chen, Jiafu Wu, Wenbing Zhu, Mingmin Chi, Jun Liu, Yabiao Wang, and Chengjie Wang

Overview

The paper proposes DualAnoDiff, a diffusion-based model for generating anomaly images in a few-shot setting.
It combines two interrelated diffusion models: one generates the whole image, while the other generates the corresponding anomaly part.
The approach aims to improve downstream anomaly detection, localization, and classification by generating diverse and realistic anomaly images.

Plain English Explanation

The paper introduces a machine learning model called DualAnoDiff that generates images of anomalies, such as product defects, from a small number of examples. The extra images help train inspection systems in settings like industrial manufacturing, where real examples of anomalies are scarce.

The key idea behind DualAnoDiff is to use two interconnected diffusion models. Diffusion models are generative models that learn to create images by reversing a process that gradually adds random noise. In DualAnoDiff, one diffusion model generates the whole image, while the other generates only the anomalous part of it. Because the two models work together, the system can produce a wider variety of anomaly images, with each anomaly blended into its surroundings, from just a few examples.
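The noise-adding and noise-removing idea can be made concrete with a small sketch. The code below assumes the standard DDPM-style formulation with a linear noise schedule; the schedule values, the function names, and the use of the true noise in place of a trained network's prediction are illustrative assumptions, not details taken from the DualAnoDiff paper.

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear noise schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: blend the clean image x0 with Gaussian noise at step t."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return x_t, noise

def recover_x0(x_t, t, alpha_bars, predicted_noise):
    """Invert the forward step given a noise estimate.

    In a real diffusion model a trained network supplies predicted_noise.
    """
    return (x_t - np.sqrt(1.0 - alpha_bars[t]) * predicted_noise) / np.sqrt(alpha_bars[t])

rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))  # stand-in for a small grayscale image
x_t, noise = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)

# Using the true noise as an "oracle" prediction recovers the clean image exactly;
# a trained model only approximates this, which is why sampling takes many small steps.
x0_hat = recover_x0(x_t, 500, alpha_bars, predicted_noise=noise)
```

Later steps retain less of the original image, so `alpha_bars` shrinks toward zero as the step index grows.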

This is important because many real-world anomaly detection systems struggle when they only have a small number of anomaly examples to learn from. By generating more diverse anomaly images, DualAnoDiff aims to help these systems become more accurate and robust, even when faced with limited training data.

Technical Explanation

The core of DualAnoDiff is a dual-interrelated diffusion model. According to the paper's abstract, one diffusion model generates the whole image, while the other generates the corresponding anomaly part, so each sample is a pair: an overall image and its anomaly region.

The two models are interrelated rather than independent, which is intended to keep the generated anomaly consistent with the image that contains it and to blend it seamlessly with the original content. The authors also extract background and shape information to mitigate the distortion and blurriness that commonly affect few-shot image generation.
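This summary does not specify how the two models exchange information. One plausible mechanism, shown here purely as an assumption, is to let the feature tokens of both branches attend to each other in a single self-attention pass. The function `joint_self_attention` and the branch names are hypothetical, and learned projection matrices are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def joint_self_attention(tokens_a, tokens_b):
    """Let two branches attend over each other's tokens (hypothetical interaction).

    tokens_a, tokens_b: (num_tokens, dim) feature arrays, one per branch.
    Returns the updated tokens for each branch and the attention weights.
    """
    n_a = tokens_a.shape[0]
    joint = np.concatenate([tokens_a, tokens_b], axis=0)   # (n_a + n_b, dim)
    scores = joint @ joint.T / np.sqrt(joint.shape[1])     # scaled dot-product scores
    weights = softmax(scores, axis=-1)                     # each row sums to 1
    mixed = weights @ joint                                # every token mixes in both branches
    return mixed[:n_a], mixed[n_a:], weights

rng = np.random.default_rng(0)
feats_a = rng.standard_normal((4, 16))  # tokens from the first branch
feats_b = rng.standard_normal((4, 16))  # tokens from the second branch
out_a, out_b, attn = joint_self_attention(feats_a, feats_b)
```

The off-diagonal blocks of `attn` hold the cross-branch weights, which is where one branch's features would influence the other in this sketch.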

By leveraging this dual-interrelated approach, DualAnoDiff generates a diverse set of anomaly images from just a few examples. The authors report that it outperforms state-of-the-art methods in both realism and diversity, and that the generated images improve downstream anomaly detection, anomaly localization, and anomaly classification.
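The paper's abstract describes generating a pair: the overall image and the corresponding anomaly part. A minimal sketch of how such a pair could feed a downstream localization model is shown below. It assumes the anomaly-part image is close to zero outside the anomaly, so a simple threshold yields a pixel mask; the function names and the threshold value are hypothetical and are not taken from the paper.

```python
import numpy as np

def mask_from_anomaly_part(anomaly_part, threshold=0.1):
    """Binary mask of pixels where the anomaly-part image is non-background.

    anomaly_part: (H, W, 3) float array in [0, 1], assumed near zero outside the anomaly.
    """
    intensity = anomaly_part.max(axis=-1)          # strongest channel per pixel
    return (intensity > threshold).astype(np.uint8)

def build_training_pairs(images, anomaly_parts, threshold=0.1):
    """Pair each generated image with a mask derived from its anomaly part."""
    return [
        (image, mask_from_anomaly_part(part, threshold))
        for image, part in zip(images, anomaly_parts)
    ]

# Toy stand-ins for one generated pair: a gray image with a bright 2x4 defect.
image = np.full((6, 6, 3), 0.5)
part = np.zeros((6, 6, 3))
part[2:4, 1:5] = 0.8
image[2:4, 1:5] = 0.8

pairs = build_training_pairs([image], [part])
mask = pairs[0][1]
```

The resulting `(image, mask)` pairs could then be added to the scarce real anomaly data when training a detection or localization model.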

Critical Analysis

The paper provides a compelling solution to the challenge of few-shot anomaly detection. By generating diverse anomaly images, DualAnoDiff has the potential to significantly improve the performance of anomaly detection systems, especially in domains where anomaly data is scarce.

However, the paper does not address some potential limitations of the approach. For example, the generated anomaly images may not fully capture the complexity and variability of real-world anomalies, which could limit the model's effectiveness in certain applications. Additionally, the computational cost of training the dual-interrelated diffusion models may be higher than other anomaly generation methods.

Further research could explore ways to address these limitations, such as investigating more efficient training strategies or incorporating additional mechanisms to ensure the generated anomalies are truly representative of real-world data. Evaluating the model's performance on a wider range of anomaly detection tasks and real-world scenarios would also be valuable.

Conclusion

The DualAnoDiff model proposed in this paper represents an important step forward in the field of few-shot anomaly detection. By leveraging the power of dual-interrelated diffusion models, the approach can generate diverse and realistic anomaly images, which can then be used to improve the performance of anomaly detection systems.

While the approach has some limitations, the core idea of pairing a whole-image diffusion model with an anomaly-part diffusion model is promising and could inspire further advances in this area. As anomaly detection becomes increasingly important across industries, tools like DualAnoDiff could help enable more robust and reliable systems, even when training data is limited.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DualAnoDiff: Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Ying Jin, Jinlong Peng, Qingdong He, Teng Hu, Hao Chen, Jiafu Wu, Wenbing Zhu, Mingmin Chi, Jun Liu, Yabiao Wang, Chengjie Wang

The performance of anomaly inspection in industrial manufacturing is constrained by the scarcity of anomaly data. To overcome this challenge, researchers have started employing anomaly generation approaches to augment the anomaly dataset. However, existing anomaly generation methods suffer from limited diversity in the generated anomalies and struggle to achieve a seamless blending of this anomaly with the original image. In this paper, we overcome these challenges from a new perspective, simultaneously generating a pair of the overall image and the corresponding anomaly part. We propose DualAnoDiff, a novel diffusion-based few-shot anomaly image generation model, which can generate diverse and realistic anomaly images by using a dual-interrelated diffusion model, where one of them is employed to generate the whole image while the other one generates the anomaly part. Moreover, we extract background and shape information to mitigate the distortion and blurriness phenomenon in few-shot image generation. Extensive experiments demonstrate the superiority of our proposed model over state-of-the-art methods in terms of both realism and diversity. Overall, our approach significantly improves the performance of downstream anomaly detection tasks, including anomaly detection, anomaly localization, and anomaly classification tasks.

8/29/2024

Few-shot Defect Image Generation based on Consistency Modeling

Qingfeng Shi, Jing Wei, Fei Shen, Zhengtao Zhang

Image generation can solve insufficient labeled data issues in defect detection. Most defect generation methods are only trained on a single product without considering the consistencies among multiple products, leading to poor quality and diversity of generated results. To address these issues, we propose DefectDiffu, a novel text-guided diffusion method to model both intra-product background consistency and inter-product defect consistency across multiple products and modulate the consistency perturbation directions to control product type and defect strength, achieving diversified defect image generation. Firstly, we leverage a text encoder to separately provide consistency prompts for background, defect, and fusion parts of the disentangled integrated architecture, thereby disentangling defects and normal backgrounds. Secondly, we propose the double-free strategy to generate defect images through two-stage perturbation of consistency direction, thereby controlling product type and defect strength by adjusting the perturbation scale. Besides, DefectDiffu can generate defect mask annotations utilizing cross-attention maps from the defect part. Finally, to improve the generation quality of small defects and masks, we propose the adaptive attention-enhance loss to increase the attention to defects. Experimental results demonstrate that DefectDiffu surpasses state-of-the-art methods in terms of generation quality and diversity, thus effectively improving downstream detection performance. Moreover, defect perturbation directions can be transferred among various products to achieve zero-shot defect generation, which is highly beneficial for addressing insufficient data issues. The code is available at https://github.com/FFDD-diffusion/DefectDiffu.

8/2/2024

AnomalyXFusion: Multi-modal Anomaly Synthesis with Diffusion

Jie Hu, Yawen Huang, Yilin Lu, Guoyang Xie, Guannan Jiang, Yefeng Zheng, Zhichao Lu

Anomaly synthesis is one of the effective methods to augment abnormal samples for training. However, current anomaly synthesis methods predominantly rely on texture information as input, which limits the fidelity of synthesized abnormal samples, because texture information is insufficient to correctly depict the pattern of anomalies, especially for logical anomalies. To surmount this obstacle, we present the AnomalyXFusion framework, designed to harness multi-modality information to enhance the quality of synthesized abnormal samples. The AnomalyXFusion framework comprises two distinct yet synergistic modules: the Multi-modal In-Fusion (MIF) module and the Dynamic Dif-Fusion (DDF) module. The MIF module refines modality alignment by aggregating and integrating various modality features into a unified embedding space, termed X-embedding, which includes image, text, and mask features. Concurrently, the DDF module facilitates controlled generation through an adaptive adjustment of X-embedding conditioned on the diffusion steps. In addition, to reveal the multi-modality representational power of AnomalyXFusion, we propose a new dataset, called MVTec Caption. More precisely, MVTec Caption extends 2.2k accurate image-mask-text annotations for the MVTec AD and LOCO datasets. Comprehensive evaluations demonstrate the effectiveness of AnomalyXFusion, especially regarding the fidelity and diversity for logical anomalies. Project page: http://github.com/hujiecpp/MVTec-Caption

5/3/2024

GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection

Hang Yao, Ming Liu, Haolin Wang, Zhicun Yin, Zifei Yan, Xiaopeng Hong, Wangmeng Zuo

Diffusion models have shown superior performance on unsupervised anomaly detection tasks. Since trained with normal data only, diffusion models tend to reconstruct normal counterparts of test images with certain noises added. However, these methods treat all potential anomalies equally, which may cause two main problems. From the global perspective, the difficulty of reconstructing images with different anomalies is uneven. Therefore, instead of utilizing the same setting for all samples, we propose to predict a particular denoising step for each sample by evaluating the difference between image contents and the priors extracted from diffusion models. From the local perspective, reconstructing abnormal regions differs from normal areas even in the same image. Theoretically, the diffusion model predicts a noise for each step, typically following a standard Gaussian distribution. However, due to the difference between the anomaly and its potential normal counterpart, the predicted noise in abnormal regions will inevitably deviate from the standard Gaussian distribution. To this end, we propose introducing synthetic abnormal samples in training to encourage the diffusion models to break through the limitation of standard Gaussian distribution, and a spatial-adaptive feature fusion scheme is utilized during inference. With the above modifications, we propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection, which introduces appealing flexibility and achieves anomaly-free reconstruction while retaining as much normal information as possible. Extensive experiments are conducted on three commonly used anomaly detection datasets (MVTec-AD, MPDD, and VisA) and a printed circuit board dataset (PCB-Bank) we integrated, showing the effectiveness of the proposed method.

9/10/2024