Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection

2405.17816

Published 5/29/2024 by Yingwen Wu, Ruiji Yu, Xinwen Cheng, Zhengbao He, Xiaolin Huang

Abstract

In the open world, detecting out-of-distribution (OOD) data, whose labels are disjoint with those of in-distribution (ID) samples, is important for reliable deep neural networks (DNNs). To achieve better detection performance, one type of approach proposes to fine-tune the model with auxiliary OOD datasets to amplify the difference between ID and OOD data through a separation loss defined on model outputs. However, none of these studies consider enlarging the feature disparity, which should be more effective compared to outputs. The main difficulty lies in the diversity of OOD samples, which makes it hard to describe their feature distribution, let alone design losses to separate them from ID features. In this paper, we neatly fence off the problem based on an aggregation property of ID features named Neural Collapse (NC). NC means that the penultimate features of ID samples within a class are nearly identical to the last layer weight of the corresponding class. Based on this property, we propose a simple but effective loss called OrthLoss, which binds the features of OOD data in a subspace orthogonal to the principal subspace of ID features formed by NC. In this way, the features of ID and OOD samples are separated by different dimensions. By optimizing the feature separation loss rather than purely enlarging output differences, our detection achieves SOTA performance on CIFAR benchmarks without any additional data augmentation or sampling, demonstrating the importance of feature separation in OOD detection. The code will be published.

Overview

The paper uses Neural Collapse (NC), the tendency of a class's penultimate-layer features to concentrate around that class's last-layer weight vector, to improve out-of-distribution (OOD) detection.
OOD detection is the task of identifying samples that come from a different distribution than the one the model was trained on, which is crucial for robust and reliable AI systems.
The researchers propose OrthLoss, a feature separation loss that confines the features of OOD data to a subspace orthogonal to the principal subspace of in-distribution (ID) features.

Plain English Explanation

Neural networks, the powerful algorithms behind many modern AI systems, can sometimes struggle to identify when they're being presented with data that is very different from what they were trained on. This can be a significant problem, as it can lead to unreliable or even dangerous behavior from the AI system.

The researchers in this paper propose a new approach to this task, which is known as "out-of-distribution (OOD) detection." The goal of OOD detection is to allow an AI system to recognize when it's being shown data that is substantially different from its training data, so that it can either refuse to make a prediction or handle the situation more cautiously.
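The "refuse to make a prediction" behavior can be sketched with a simple confidence threshold. The example below uses the maximum softmax probability as the score, which is a common baseline and not the method proposed in this paper; the function name predict_or_reject and the threshold of 0.7 are assumptions chosen for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict_or_reject(logits, threshold=0.7):
    """Return the predicted class per row, or -1 when confidence is below threshold."""
    probs = softmax(np.asarray(logits, dtype=float))
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    # -1 is a hypothetical "rejected as possibly OOD" marker.
    return np.where(confidence >= threshold, labels, -1)

logits = np.array([[6.0, 0.5, 0.2],   # one class dominates: keep the prediction
                   [1.0, 1.1, 0.9]])  # nearly uniform: reject
decisions = predict_or_reject(logits)
```

In practice the threshold would be tuned on held-out data rather than fixed by hand.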

The key insight the researchers had is that as neural networks are trained, the features they learn tend to "collapse" or converge in a particular way. By understanding this "neural collapse" phenomenon, the researchers developed a method to better separate the features of in-distribution and out-of-distribution data. This improved separation then allows the neural network to more accurately identify when it's being shown something that is out-of-distribution.

The researchers report experiments on CIFAR benchmarks in which their approach achieves state-of-the-art OOD detection performance. This work represents an important step forward in making AI systems more robust and reliable, by giving them a better ability to recognize when they're being asked to operate outside of their comfort zone.

Technical Explanation

The core idea behind the researchers' approach is to use Neural Collapse (NC) to improve out-of-distribution (OOD) detection. As the abstract describes it, NC means that the penultimate-layer features of in-distribution (ID) samples within a class become nearly identical to the last-layer weight vector of that class, so ID features concentrate in a low-dimensional principal subspace.
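A rough way to check for this kind of collapse is to compare each class's mean penultimate feature with the classifier weight of that class. The sketch below does this with cosine similarity on synthetic data; the function name class_mean_weight_alignment, the cosine measure, and the toy features are assumptions for illustration, not a procedure taken from the paper.

```python
import numpy as np

def class_mean_weight_alignment(features, labels, weights):
    """Cosine similarity between each class's mean feature and its classifier weight row."""
    sims = []
    for c in range(weights.shape[0]):
        mean_c = features[labels == c].mean(axis=0)
        w_c = weights[c]
        sims.append(mean_c @ w_c / (np.linalg.norm(mean_c) * np.linalg.norm(w_c)))
    return np.array(sims)

rng = np.random.default_rng(0)
num_classes, dim, per_class = 3, 16, 50
weights = rng.normal(size=(num_classes, dim))          # stand-in last-layer weights
labels = np.repeat(np.arange(num_classes), per_class)
# Synthetic "collapsed" features: each sample is its class weight plus small noise.
features = weights[labels] + 0.05 * rng.normal(size=(labels.size, dim))
alignment = class_mean_weight_alignment(features, labels, weights)
```

Values close to 1 for every class indicate that the class means point in nearly the same direction as the corresponding weights.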

Earlier fine-tuning approaches use auxiliary OOD datasets together with a separation loss defined on model outputs. The researchers argue that separating features should be more effective, but the diversity of OOD samples makes their feature distribution hard to describe, let alone separate from ID features. NC sidesteps this difficulty: instead of modeling OOD features, the method relies on the well-structured subspace occupied by ID features.

Specifically, the researchers introduce a loss called OrthLoss, which binds the features of auxiliary OOD samples to a subspace orthogonal to the principal subspace of ID features formed by NC. ID and OOD features are thereby separated along different dimensions. The model is fine-tuned with this feature separation loss rather than only enlarging differences between outputs.
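One way to picture a loss that keeps OOD features orthogonal to the ID feature subspace is sketched below. It is not the paper's exact formulation: it assumes the ID principal subspace is the span of the class weight vectors (motivated by NC), assumes those vectors are linearly independent, and measures what fraction of each OOD feature's squared norm falls inside that span. The name orthogonality_penalty is hypothetical.

```python
import numpy as np

def orthogonality_penalty(ood_features, class_weights, eps=1e-12):
    """Mean fraction of each OOD feature's squared norm lying in span(class_weights).

    0 means every OOD feature is orthogonal to the subspace; 1 means fully inside it.
    """
    # Orthonormal basis of the subspace spanned by the rows of class_weights,
    # taken from a reduced QR decomposition of the transpose.
    q, _ = np.linalg.qr(class_weights.T)            # shape (dim, num_classes)
    projected = ood_features @ q                    # coordinates inside the subspace
    inside = (projected ** 2).sum(axis=1)
    total = (ood_features ** 2).sum(axis=1) + eps   # eps guards against zero vectors
    return float(np.mean(inside / total))

# Toy setup: 2 classes in a 4-D feature space, weights along the first two axes.
class_weights = np.array([[1.0, 0.0, 0.0, 0.0],
                          [0.0, 2.0, 0.0, 0.0]])
ood_orthogonal = np.array([[0.0, 0.0, 3.0, 1.0]])   # lives in the last two axes
ood_overlapping = np.array([[1.0, 1.0, 0.0, 0.0]])  # lies inside the ID subspace

penalty_orthogonal = orthogonality_penalty(ood_orthogonal, class_weights)
penalty_overlapping = orthogonality_penalty(ood_overlapping, class_weights)
```

During fine-tuning, a term of this kind would be written in an automatic-differentiation framework and minimized on auxiliary OOD samples alongside the usual objective on ID data; how the terms are weighted is not specified here.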

In experiments on CIFAR benchmarks, the researchers report state-of-the-art detection performance without any additional data augmentation or sampling. They take this as evidence that feature separation, and not only output separation, matters for OOD detection.
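Comparisons between OOD detectors are often reported with threshold-free metrics such as AUROC. The text above does not say which metrics the paper uses, so the sketch below is only an illustration of how such a score can be computed from detector outputs; the function name auroc and the example scores are made up, and higher scores are assumed to mean "more likely ID".

```python
import numpy as np

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores higher than a random OOD sample.

    Ties count as half a win. Compares every ID/OOD pair, so it suits small arrays.
    """
    id_col = np.asarray(id_scores, dtype=float)[:, None]
    ood_row = np.asarray(ood_scores, dtype=float)[None, :]
    wins = (id_col > ood_row).mean()
    ties = (id_col == ood_row).mean()
    return float(wins + 0.5 * ties)

id_scores = [0.9, 0.8, 0.7, 0.4]   # hypothetical detector scores on ID data
ood_scores = [0.5, 0.3, 0.2]       # hypothetical detector scores on OOD data
value = auroc(id_scores, ood_scores)   # 11 of the 12 ID/OOD pairs are ordered correctly
```

A value of 1.0 would mean the scores separate the two groups perfectly, while 0.5 corresponds to chance.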

Critical Analysis

The researchers' feature separation approach based on neural collapse is a promising direction for improving out-of-distribution detection in neural networks. By explicitly encouraging the network to learn more separable representations for in-distribution and out-of-distribution data, the method seems to provide a principled way to enhance OOD detection performance.

That said, the paper does not address some potential limitations and avenues for further research. For example, the reported results are on CIFAR benchmarks, so it's unclear how the method would scale to larger datasets or to more complex, real-world OOD scenarios. The method also depends on an auxiliary OOD dataset for fine-tuning. Additionally, the researchers do not explore the interpretability or explainability of the learned feature representations, which could be an important consideration for deploying these models in high-stakes applications.

Further research could also investigate ways to combine the feature separation approach with other OOD detection techniques, such as subspace projection or feature density estimation, to potentially achieve even stronger performance. Overall, the paper presents a compelling method that merits further exploration and development.

Conclusion

This paper introduces an approach to out-of-distribution detection that builds on Neural Collapse. By constraining OOD features to a subspace orthogonal to the one occupied by ID features, the researchers' OrthLoss achieves improved OOD detection performance on CIFAR benchmarks.

The work represents an important step forward in making AI systems more robust and reliable, by enhancing their ability to recognize when they are being presented with data that is substantially different from their training distribution. As AI continues to be deployed in high-stakes applications, advances in OOD detection will be crucial for ensuring the safety and trustworthiness of these systems.

While the paper provides a strong foundation, there remain opportunities for further research to address potential limitations and explore synergies with other OOD detection techniques. By continuing to push the boundaries of this critical capability, the AI research community can help unlock the full potential of machine learning to positively transform our world.

This summary was produced with help from an AI and may contain inaccuracies; consult the original paper for details.

Related Papers

Detecting Out-of-Distribution Through the Lens of Neural Collapse

Litian Liu, Yao Qin

Efficient and versatile Out-of-Distribution (OOD) detection is essential for the safe deployment of AI yet remains challenging for existing algorithms. Inspired by Neural Collapse, we discover that features of in-distribution (ID) samples cluster closer to the weight vectors compared to features of OOD samples. In addition, we reveal that ID features tend to expand in space to structure a simplex Equiangular Tight Framework, which nicely explains the prevalent observation that ID features reside further from the origin than OOD features. Taking both insights from Neural Collapse into consideration, we propose to leverage feature proximity to weight vectors for OOD detection and further complement this perspective by using feature norms to filter OOD samples. Extensive experiments on off-the-shelf models demonstrate the efficiency and effectiveness of our method across diverse classification tasks and model architectures, enhancing the generalization capability of OOD detection.

6/3/2024

cs.LG eess.IV

Out-of-distribution detection based on subspace projection of high-dimensional features output by the last convolutional layer

Qiuyu Zhu, Yiwei He

Out-of-distribution (OOD) detection, crucial for reliable pattern classification, discerns whether a sample originates outside the training distribution. This paper concentrates on the high-dimensional features output by the final convolutional layer, which contain rich image features. Our key idea is to project these high-dimensional features into two specific feature subspaces, leveraging the dimensionality reduction capacity of the network's linear layers, trained with Predefined Evenly-Distribution Class Centroids (PEDCC)-Loss. This involves calculating the cosines of three projection angles and the norm values of features, thereby identifying distinctive information for in-distribution (ID) and OOD data, which assists in OOD detection. Building upon this, we have modified the batch normalization (BN) and ReLU layer preceding the fully connected layer, diminishing their impact on the output feature distributions and thereby widening the distribution gap between ID and OOD data features. Our method requires only the training of the classification network model, eschewing any need for input pre-processing or specific OOD data pre-tuning. Extensive experiments on several benchmark datasets demonstrate that our approach delivers state-of-the-art performance. Our code is available at https://github.com/Hewell0/ProjOOD.

5/6/2024

cs.CV

Fast Decision Boundary based Out-of-Distribution Detector

Litian Liu, Yao Qin

Efficient and effective Out-of-Distribution (OOD) detection is essential for the safe deployment of AI systems. Existing feature space methods, while effective, often incur significant computational overhead due to their reliance on auxiliary models built from training features. In this paper, we propose a computationally-efficient OOD detector without using auxiliary models while still leveraging the rich information embedded in the feature space. Specifically, we detect OOD samples based on their feature distances to decision boundaries. To minimize computational cost, we introduce an efficient closed-form estimation, analytically proven to tightly lower bound the distance. Based on our estimation, we discover that In-Distribution (ID) features tend to be further from decision boundaries than OOD features. Additionally, ID and OOD samples are better separated when compared at equal deviation levels from the mean of training features. By regularizing the distances to decision boundaries based on feature deviation from the mean, we develop a hyperparameter-free, auxiliary model-free OOD detector. Our method matches or surpasses the effectiveness of state-of-the-art methods in extensive experiments while incurring negligible overhead in inference latency. Overall, our approach significantly improves the efficiency-effectiveness trade-off in OOD detection. Code is available at: https://github.com/litianliu/fDBD-OOD.

6/5/2024

cs.LG eess.IV

Continual Unsupervised Out-of-Distribution Detection

Lars Doorenbos, Raphael Sznitman, Pablo Márquez-Neila

Deep learning models excel when the data distribution during training aligns with testing data. Yet, their performance diminishes when faced with out-of-distribution (OOD) samples, leading to great interest in the field of OOD detection. Current approaches typically assume that OOD samples originate from an unconcentrated distribution complementary to the training distribution. While this assumption is appropriate in the traditional unsupervised OOD (U-OOD) setting, it proves inadequate when considering the place of deployment of the underlying deep learning model. To better reflect this real-world scenario, we introduce the novel setting of continual U-OOD detection. To tackle this new setting, we propose a method that starts from a U-OOD detector, which is agnostic to the OOD distribution, and slowly updates during deployment to account for the actual OOD distribution. Our method uses a new U-OOD scoring function that combines the Mahalanobis distance with a nearest-neighbor approach. Furthermore, we design a confidence-scaled few-shot OOD detector that outperforms previous methods. We show our method greatly improves upon strong baselines from related fields.

6/5/2024

cs.CV cs.LG