Learning Transferable Negative Prompts for Out-of-Distribution Detection

2404.03248

Published 4/5/2024 by Tianqi Li, Guansong Pang, Xiao Bai, Wenjun Miao, Jin Zheng

Abstract

Existing prompt learning methods have shown certain capabilities in Out-of-Distribution (OOD) detection, but the lack of OOD images in the target dataset in their training can lead to mismatches between OOD images and In-Distribution (ID) categories, resulting in a high false positive rate. To address this issue, we introduce a novel OOD detection method, named 'NegPrompt', to learn a set of negative prompts, each representing a negative connotation of a given class label, for delineating the boundaries between ID and OOD images. It learns such negative prompts with ID data only, without any reliance on external outlier data. Further, current methods assume the availability of samples of all ID classes, rendering them ineffective in open-vocabulary learning scenarios where the inference stage can contain novel ID classes not present during training. In contrast, our learned negative prompts are transferable to novel class labels. Experiments on various ImageNet benchmarks show that NegPrompt surpasses state-of-the-art prompt-learning-based OOD detection methods and maintains a consistent lead in hard OOD detection in closed- and open-vocabulary classification scenarios. Code is available at https://github.com/mala-lab/negprompt.

Overview

This paper introduces NegPrompt, an approach for detecting out-of-distribution (OOD) samples using pre-trained vision-language models.
The key idea is to learn "negative prompts", each representing a negative connotation of a class label, that mark the boundary between in-distribution (ID) and OOD images.
According to the abstract, the method surpasses state-of-the-art prompt-learning-based OOD detection methods on various ImageNet benchmarks.

Plain English Explanation

The paper focuses on the problem of detecting images that are "out-of-distribution" (OOD) - meaning they are different from the types of images a machine learning model was trained on. This is an important challenge, as models can sometimes make mistakes or behave unexpectedly when presented with images they weren't prepared for.

The researchers developed a new approach that uses pre-trained vision-language models, which are neural networks trained on large datasets of images and text. The core innovation is to learn "negative prompts": learned text prompts that each capture a negative connotation of a class label. These negative prompts act as a kind of filter, allowing the model to more reliably recognize when an input image does not belong to any of the classes it was trained on.

The key benefit of this technique is that the negative prompts are learned from in-distribution data alone, with no external outlier images, and can be transferred to novel class labels rather than being relearned whenever new classes appear. The experiments show that this approach outperforms other state-of-the-art prompt-learning-based OOD detection methods.

Technical Explanation

The paper proposes a method called NegPrompt for out-of-distribution (OOD) detection using pre-trained vision-language models. The core idea is to learn a set of negative prompts, each representing a negative connotation of a given class label, and to use them to delineate the boundary between in-distribution (ID) and OOD images.

Based on the abstract, the NegPrompt approach can be summarized in three main steps:

Learning negative prompts from ID training data only, without any reliance on external outlier data.
Using the learned negative prompts alongside the ID class labels to separate ID images from OOD images.
Transferring the learned negative prompts to novel class labels that were not present during training.
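
To make the role of negative prompts concrete, here is a minimal sketch of how an image could be scored at inference time. It is not taken from the paper: the scoring rule (softmax mass on the ID prompts), the function names, the temperature of 0.01, and the toy 3-dimensional embeddings are all assumptions chosen for illustration. In practice the embeddings would come from a vision-language model's image and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def id_score(image_emb, positive_embs, negative_embs, temperature=0.01):
    """Softmax mass placed on the ID (positive) prompts; higher means more ID-like.

    Assumed scoring rule for illustration, not the paper's exact formula.
    """
    pos_logits = [cosine(image_emb, t) / temperature for t in positive_embs]
    neg_logits = [cosine(image_emb, t) / temperature for t in negative_embs]
    m = max(pos_logits + neg_logits)  # subtract the max for numerical stability
    pos_mass = sum(math.exp(l - m) for l in pos_logits)
    neg_mass = sum(math.exp(l - m) for l in neg_logits)
    return pos_mass / (pos_mass + neg_mass)

# Toy text embeddings (hypothetical): two ID class prompts and two negative prompts.
positive_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
negative_embs = [[0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]

id_image = [0.9, 0.1, 0.1]   # lies close to the first ID class prompt
ood_image = [0.3, 0.1, 0.9]  # lies closer to the negative prompts

threshold = 0.5  # hypothetical decision threshold
id_image_score = id_score(id_image, positive_embs, negative_embs)
ood_image_score = id_score(ood_image, positive_embs, negative_embs)
is_ood = [s < threshold for s in (id_image_score, ood_image_score)]
```

An image whose embedding sits nearer to a negative prompt than to any ID class prompt receives a low score and is flagged as OOD, which is the sense in which negative prompts mark the ID/OOD boundary.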

The experiments evaluate NegPrompt on various ImageNet benchmarks. According to the abstract, NegPrompt surpasses state-of-the-art prompt-learning-based OOD detection methods and maintains a consistent lead in hard OOD detection in both closed- and open-vocabulary classification scenarios. Detection quality in this setting is typically summarized by metrics such as the area under the ROC curve (AUROC).
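
For readers unfamiliar with the metrics, the sketch below shows how AUROC and the false positive rate at a fixed true positive rate can be computed from per-image scores. It assumes that higher scores mean "more in-distribution" and that ID is treated as the positive class; conventions vary between papers, and the scores here are made-up toy values, not results from the paper.

```python
import math

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample (ties count half)."""
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted as ID at the threshold that keeps `tpr` of ID samples."""
    ranked = sorted(id_scores, reverse=True)
    k = math.ceil(tpr * len(ranked))  # number of ID samples that must be accepted
    threshold = ranked[k - 1]
    return sum(s >= threshold for s in ood_scores) / len(ood_scores)

# Toy scores (hypothetical): one OOD image scores above the weakest ID image.
toy_id_scores = [0.9, 0.8, 0.7, 0.6]
toy_ood_scores = [0.65, 0.3, 0.2, 0.1]

toy_auroc = auroc(toy_id_scores, toy_ood_scores)      # 15 of 16 pairs ranked correctly
toy_fpr95 = fpr_at_tpr(toy_id_scores, toy_ood_scores)  # 1 of 4 OOD samples accepted
```

The pairwise AUROC loop is quadratic in the number of samples, which is fine for illustration; a rank-based implementation would be preferable on full benchmarks.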

A key advantage of NegPrompt is that its learned negative prompts are transferable to novel class labels. This matters in open-vocabulary scenarios, where the inference stage can contain ID classes that were not present during training and for which no training samples are available.
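
One way to picture this kind of transfer is sketched below, under the assumption that the learned prompt context is class-agnostic and is simply combined with a class name. This is an illustrative guess at the mechanism, not a description of the paper's implementation: the placeholder tokens such as `<neg0_a>` stand in for learned context vectors, and `build_prompts` is a hypothetical helper.

```python
def build_prompts(class_name, positive_ctx, negative_ctxs):
    """Combine shared context tokens with a class name to form one positive
    prompt and one negative prompt per negative context."""
    positive = positive_ctx + [class_name]
    negatives = [ctx + [class_name] for ctx in negative_ctxs]
    return positive, negatives

# Hypothetical contexts: in a real system these would be learned embeddings.
positive_ctx = ["a", "photo", "of", "a"]
negative_ctxs = [["<neg0_a>", "<neg0_b>"], ["<neg1_a>", "<neg1_b>"]]

# Classes assumed to be seen during training.
seen = {name: build_prompts(name, positive_ctx, negative_ctxs)
        for name in ["zebra", "horse"]}

# A novel class at inference time reuses the same contexts with no retraining.
novel_positive, novel_negatives = build_prompts("okapi", positive_ctx, negative_ctxs)
```

Because the contexts do not depend on any particular class in this sketch, adding a new label only requires composing new prompts, which is the property that makes open-vocabulary use possible.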

Critical Analysis

The paper provides a compelling technical solution to the important problem of out-of-distribution detection. The authors' idea of learning transferable negative prompts is novel and shows strong empirical results. However, a few caveats and limitations are worth noting:

Because the negative prompts are learned from ID data only, without real OOD images, it remains an open question how well they cover OOD inputs that differ from the ID classes in ways the training data does not reveal.
The experiments focus on image classification tasks, but it's unclear how well the approach would generalize to other modalities like video or audio.
The summary available here does not cover the failure modes of NegPrompt or potential edge cases where the negative prompts may not transfer as effectively.

Additionally, while the technical contributions are significant, the broader implications for real-world deployment of such OOD detection systems are not extensively discussed. Factors like computational efficiency, robustness to adversarial attacks, and alignment with human notions of anomalous inputs could be valuable areas for further research.

Overall, this paper represents an important step forward in addressing the challenge of out-of-distribution detection. The transferable negative prompt concept is a clever and effective approach that merits further exploration and refinement.

Conclusion

In summary, this paper introduces a novel technique called NegPrompt for detecting out-of-distribution (OOD) samples using pre-trained vision-language models. The key innovation is to learn negative prompts, using in-distribution data only, that discriminate between in-distribution and OOD inputs, and then to transfer these prompts to novel class labels.

The experimental results on ImageNet benchmarks indicate that NegPrompt outperforms prior state-of-the-art prompt-learning-based OOD detection methods, highlighting its potential as a powerful and generalizable solution to this important problem. While some limitations remain, the underlying concept of using transferable negative prompts represents an exciting new direction for improving the robustness and reliability of machine learning systems deployed in the real world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Negative Label Guided OOD Detection with Pretrained Vision-Language Models

Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, Bo Han

Out-of-distribution (OOD) detection aims at identifying samples from unknown classes, playing a crucial role in trustworthy models against errors on unexpected inputs. Extensive research has been dedicated to exploring OOD detection in the vision modality. Vision-language models (VLMs) can leverage both textual and visual information for various multi-modal applications, whereas few OOD detection methods take into account information from the text modality. In this paper, we propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases. We design a novel scheme for the OOD score collaborated with negative labels. Theoretical analysis helps to understand the mechanism of negative labels. Extensive experiments demonstrate that our method NegLabel achieves state-of-the-art performance on various OOD detection benchmarks and generalizes well on multiple VLM architectures. Furthermore, our method NegLabel exhibits remarkable robustness against diverse domain shifts. The codes are available at https://github.com/tmlr-group/NegLabel.

4/1/2024

cs.CV cs.LG

Enhancing Near OOD Detection in Prompt Learning: Maximum Gains, Minimal Costs

Myong Chol Jung, He Zhao, Joanna Dipnall, Belinda Gabbe, Lan Du

Prompt learning has shown to be an efficient and effective fine-tuning method for vision-language models like CLIP. While numerous studies have focused on the generalisation of these models in few-shot classification, their capability in near out-of-distribution (OOD) detection has been overlooked. A few recent works have highlighted the promising performance of prompt learning in far OOD detection. However, the more challenging task of few-shot near OOD detection has not yet been addressed. In this study, we investigate the near OOD detection capabilities of prompt learning models and observe that commonly used OOD scores have limited performance in near OOD detection. To enhance the performance, we propose a fast and simple post-hoc method that complements existing logit-based scores, improving near OOD detection AUROC by up to 11.67% with minimal computational cost. Our method can be easily applied to any prompt learning model without change in architecture or re-training the models. Comprehensive empirical evaluations across 13 datasets and 8 models demonstrate the effectiveness and adaptability of our method.

5/28/2024

cs.CV

Dual-Adapter: Training-free Dual Adaptation for Few-shot Out-of-Distribution Detection

Xinyi Chen, Yaohui Li, Haoxing Chen

We study the problem of few-shot out-of-distribution (OOD) detection, which aims to detect OOD samples from unseen categories during inference time with only a few labeled in-domain (ID) samples. Existing methods mainly focus on training task-aware prompts for OOD detection. However, training on few-shot data may cause severe overfitting and textual prompts alone may not be enough for effective detection. To tackle these problems, we propose a prior-based Training-free Dual Adaptation method (Dual-Adapter) to detect OOD samples from both textual and visual perspectives. Specifically, Dual-Adapter first extracts the most significant channels as positive features and designates the remaining less relevant channels as negative features. Then, it constructs both a positive adapter and a negative adapter from a dual perspective, thereby better leveraging previously outlooked or interfering features in the training dataset. In this way, Dual-Adapter can inherit the advantages of CLIP not having to train, but also excels in distinguishing between ID and OOD samples. Extensive experimental results on four benchmark datasets demonstrate the superiority of Dual-Adapter.

5/28/2024

cs.CV

PromptAD: Learning Prompts with only Normal Samples for Few-Shot Anomaly Detection

Xiaofan Li, Zhizhong Zhang, Xin Tan, Chengwei Chen, Yanyun Qu, Yuan Xie, Lizhuang Ma

The vision-language model has brought great improvement to few-shot industrial anomaly detection, which usually needs to design of hundreds of prompts through prompt engineering. For automated scenarios, we first use conventional prompt learning with many-class paradigm as the baseline to automatically learn prompts but found that it can not work well in one-class anomaly detection. To address the above problem, this paper proposes a one-class prompt learning method for few-shot anomaly detection, termed PromptAD. First, we propose semantic concatenation which can transpose normal prompts into anomaly prompts by concatenating normal prompts with anomaly suffixes, thus constructing a large number of negative samples used to guide prompt learning in one-class setting. Furthermore, to mitigate the training challenge caused by the absence of anomaly images, we introduce the concept of explicit anomaly margin, which is used to explicitly control the margin between normal prompt features and anomaly prompt features through a hyper-parameter. For image-level/pixel-level anomaly detection, PromptAD achieves first place in 11/12 few-shot settings on MVTec and VisA.

4/9/2024

cs.CV