Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Read original: arXiv:2407.21794 - Published 8/1/2024 by Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Yueqian Lin, Qing Yu, Go Irie, Shafiq Joty, Yixuan Li, Hai Li and 3 others


Overview

Anomaly detection, novelty detection, open set recognition, and out-of-distribution (OOD) detection are closely related problems whose boundaries have shifted in the vision-language model (VLM) era.
This survey reviews recent developments and open challenges in these areas.

Plain English Explanation

The paper discusses a set of closely related topics that are becoming increasingly important as large vision-language models become more prevalent. These include:

Anomaly detection - Identifying data points that are significantly different from the norm
Novelty detection - Identifying new or previously unseen types of data
Open set recognition - Classifying data into known categories while also recognizing unknown categories
Out-of-distribution (OOD) detection - Identifying data that is very different from the training data

These capabilities are increasingly important as large vision-language models are deployed in the real world, where they may encounter data that is significantly different from what they were trained on. The survey examines the latest research advances and challenges in these areas.
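
To make the idea of OOD detection concrete, here is a minimal sketch of one widely used baseline, the maximum softmax probability score. It is not taken from the survey; the logits, the function names, and the threshold of 0.7 are all hypothetical values chosen for illustration.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    # Maximum softmax probability: higher suggests in-distribution input.
    return max(softmax(logits))

def is_ood(logits, threshold=0.7):
    # The threshold is an assumption; in practice it is tuned on held-out data.
    return msp_score(logits) < threshold

# Hypothetical classifier outputs for two inputs.
confident_logits = [8.0, 0.5, -1.0]   # one class clearly dominates
ambiguous_logits = [0.4, 0.5, 0.3]    # no class stands out

print(is_ood(confident_logits))  # False
print(is_ood(ambiguous_logits))  # True
```

An input is flagged when the classifier is not confident about any known class. A single threshold on confidence is a crude proxy, which is part of why the field has developed more specialized scores.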

Technical Explanation

The paper provides a comprehensive review of recent advancements in anomaly detection, novelty detection, open set recognition, and out-of-distribution (OOD) detection in the context of large vision-language models like CLIP.

The paper covers the key technical approaches and insights from recent research in these areas, including:

Techniques for learning robust representations that can better distinguish in-distribution and out-of-distribution data
Methods for open set recognition that can identify unknown classes while still accurately classifying known classes
Strategies for continual learning and unsupervised OOD detection to handle distribution shifts over time
Challenges and limitations of current OOD detection approaches, particularly when applied to large, pre-trained vision-language models

The survey aims to provide a comprehensive overview of the state-of-the-art in these important research areas and identify promising directions for future work.
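
As a rough illustration of how a CLIP-style model could score inputs without any training on in-distribution data, the sketch below compares an image embedding to text embeddings of the known class labels and takes the largest softmax probability. It rests on several assumptions: the embeddings are hand-written toy vectors rather than real encoder outputs, the function names are hypothetical, the temperature of 0.1 is arbitrary, and the scoring rule is a generic one rather than a specific method from the survey.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors of equal length.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_id_score(image_emb, label_embs, temperature=0.1):
    # Softmax over temperature-scaled similarities to each known label;
    # the largest probability serves as the in-distribution score.
    sims = [cosine(image_emb, t) / temperature for t in label_embs]
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    return max(exps) / sum(exps)

# Toy stand-ins for text embeddings of two known labels (assumed values).
label_embs = [
    [1.0, 0.0, 0.0],  # e.g. "a photo of a cat"
    [0.0, 1.0, 0.0],  # e.g. "a photo of a dog"
]

cat_like = [0.9, 0.1, 0.0]    # close to the first label
unrelated = [0.1, 0.1, 1.0]   # equally far from both labels

print(zero_shot_id_score(cat_like, label_embs))
print(zero_shot_id_score(unrelated, label_embs))
```

The first input scores close to 1 because it matches one label strongly, while the second sits at 0.5 because it matches neither label better than the other. With real encoders the gap is rarely this clean, which is where the challenges listed above come in.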

Critical Analysis

The paper provides a thorough and well-structured review of the recent advancements in anomaly detection, novelty detection, open set recognition, and OOD detection in the context of large vision-language models.

One limitation is that these research areas move quickly, so the survey may already lag behind the most recent developments. The paper also focuses primarily on technical aspects and could benefit from more discussion of practical implications and real-world applications.

Overall, the paper provides a valuable resource for researchers and practitioners working in these domains, and highlights important directions for future work, such as addressing the challenges of applying OOD detection to large, pre-trained models.

Conclusion

This survey paper provides a comprehensive overview of the recent advancements and challenges in anomaly detection, novelty detection, open set recognition, and OOD detection in the context of large vision-language models. These capabilities are crucial as these models become more widely deployed, and the paper highlights promising research directions to address the key challenges in these areas.



Related Papers

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Yueqian Lin, Qing Yu, Go Irie, Shafiq Joty, Yixuan Li, Hai Li, Ziwei Liu, Toshihiko Yamasaki, Kiyoharu Aizawa

Detecting out-of-distribution (OOD) samples is crucial for ensuring the safety of machine learning systems and has shaped the field of OOD detection. Meanwhile, several other problems are closely related to OOD detection, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). To unify these problems, a generalized OOD detection framework was proposed, taxonomically categorizing these five problems. However, Vision Language Models (VLMs) such as CLIP have significantly changed the paradigm and blurred the boundaries between these fields, again confusing researchers. In this survey, we first present a generalized OOD detection v2, encapsulating the evolution of AD, ND, OSR, OOD detection, and OD in the VLM era. Our framework reveals that, with some field inactivity and integration, the demanding challenges have become OOD detection and AD. In addition, we also highlight the significant shift in the definition, problem settings, and benchmarks; we thus feature a comprehensive review of the methodology for OOD detection, including the discussion over other related tasks to clarify their relationship to OOD detection. Finally, we explore the advancements in the emerging Large Vision Language Model (LVLM) era, such as GPT-4V. We conclude this survey with open challenges and future directions.

8/1/2024

Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey

Ruiyao Xu, Kaize Ding

Detecting anomalies or out-of-distribution (OOD) samples is critical for maintaining the reliability and trustworthiness of machine learning systems. Recently, Large Language Models (LLMs) have demonstrated their effectiveness not only in natural language processing but also in broader applications due to their advanced comprehension and generative capabilities. The integration of LLMs into anomaly and OOD detection marks a significant shift from the traditional paradigm in the field. This survey focuses on the problem of anomaly and OOD detection under the context of LLMs. We propose a new taxonomy to categorize existing approaches into three classes based on the role played by LLMs. Following our proposed taxonomy, we further discuss the related work under each of the categories and finally discuss potential challenges and directions for future research in this field. We also provide an up-to-date reading list of relevant papers.

9/4/2024

Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection

Chentao Cao, Zhun Zhong, Zhanke Zhou, Yang Liu, Tongliang Liu, Bo Han

Detecting out-of-distribution (OOD) samples is essential when deploying machine learning models in open-world scenarios. Zero-shot OOD detection, requiring no training on in-distribution (ID) data, has been possible with the advent of vision-language models like CLIP. Existing methods build a text-based classifier with only closed-set labels. However, this largely restricts the inherent capability of CLIP to recognize samples from large and open label space. In this paper, we propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLM) to Envision potential Outlier Exposure, termed EOE, without access to any actual OOD data. Owing to better adaptation to open-world scenarios, EOE can be generalized to different tasks, including far, near, and fine-grained OOD detection. Technically, we design (1) LLM prompts based on visual similarity to generate potential outlier class labels specialized for OOD detection, as well as (2) a new score function based on potential outlier penalty to distinguish hard OOD samples effectively. Empirically, EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset. The code is publicly available at: https://github.com/tmlr-group/EOE.

6/4/2024

VI-OOD: A Unified Representation Learning Framework for Textual Out-of-distribution Detection

Li-Ming Zhan, Bo Liu, Xiao-Ming Wu

Out-of-distribution (OOD) detection plays a crucial role in ensuring the safety and reliability of deep neural networks in various applications. While there has been a growing focus on OOD detection in visual data, the field of textual OOD detection has received less attention. Only a few attempts have been made to directly apply general OOD detection methods to natural language processing (NLP) tasks, without adequately considering the characteristics of textual data. In this paper, we delve into textual OOD detection with Transformers. We first identify a key problem prevalent in existing OOD detection methods: the biased representation learned through the maximization of the conditional likelihood $p(y \mid x)$ can potentially result in subpar performance. We then propose a novel variational inference framework for OOD detection (VI-OOD), which maximizes the likelihood of the joint distribution $p(x, y)$ instead of $p(y \mid x)$. VI-OOD is tailored for textual OOD detection by efficiently exploiting the representations of pre-trained Transformers. Through comprehensive experiments on various text classification tasks, VI-OOD demonstrates its effectiveness and wide applicability. Our code has been released at https://github.com/liam0949/LLM-OOD.

4/10/2024