MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation

Read original: arXiv:2404.19644 - Published 5/1/2024 by Min Zhang, Haoxuan Li, Fei Wu, Kun Kuang

Overview

Out-of-distribution (OOD) problems occur when novel classes from testing distributions differ from base classes in training distributions, degrading deep learning model performance.
OOD problems in few-shot classification (FSC) include: (a) cross-domain few-shot classification (CD-FSC) and (b) spurious-correlation few-shot classification (SC-FSC).
CD-FSC occurs when a classifier struggles to transfer knowledge from base classes to novel classes from unseen distributions.
SC-FSC arises when a classifier relies on non-causal features correlated with labels in base classes, but these relationships don't hold during deployment.
To address the lack of SC-FSC evaluation benchmarks, this paper introduces MetaCoCo, a benchmark with spurious-correlation shifts from real-world scenarios.
The paper also proposes a metric to quantify the extent of spurious-correlation shifts using the CLIP vision-language model.

Plain English Explanation

In machine learning, a common challenge is dealing with "out-of-distribution" (OOD) problems. This means that the data used to test a model can be quite different from the data the model was trained on. This can significantly degrade the model's performance when deployed in real-world applications.

The paper focuses on OOD problems in a specific setting called "few-shot classification" (FSC). In FSC, the model is trained on a set of "base" classes, but then has to recognize "novel" classes it hasn't seen before, often with only a few labeled examples per class.
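
To make the few-shot setup concrete, here is a minimal sketch of how one evaluation task (often called an "episode") might be drawn. The function name `sample_episode` and the parameters `n_way`, `k_shot`, and `n_query` are illustrative assumptions and are not taken from the paper or its code release.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, n_query=15, seed=None):
    """Draw one N-way K-shot episode from a list of per-example labels.

    Returns (support, query), each a list of (example_index, label) pairs.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough examples for support + query are eligible.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with sufficient examples")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        chosen = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in chosen[:k_shot]]   # few labeled examples
        query += [(i, c) for i in chosen[k_shot:]]     # examples to classify
    return support, query
```

The model sees only the support examples of each sampled class and is scored on the query examples, so the support and query index sets never overlap.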

The authors identify two main types of OOD problems in FSC:

Cross-domain few-shot classification (CD-FSC): This occurs when the model has to recognize novel classes from a completely different domain or distribution than the base classes it was trained on. For example, the model might be trained on images of domestic animals, but then has to recognize wild animals.
Spurious-correlation few-shot classification (SC-FSC): This happens when the model learns to rely on features or contexts that are coincidentally correlated with the labels in the base classes, but those correlations don't hold true for the novel classes. For instance, the model might learn that a particular background texture is associated with a certain animal class, but then fails when that texture doesn't indicate the same class in the novel examples.
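
The spurious-correlation failure mode can be illustrated with a small synthetic experiment. This is a toy sketch on made-up two-dimensional data, not the MetaCoCo benchmark or any method from the paper: one feature is weakly but genuinely tied to the label, the other is a strong "background" cue whose relationship to the label flips at test time. The classifier is a plain nearest-centroid rule, and all names and constants are assumptions chosen for illustration.

```python
import numpy as np

def make_data(n, spurious_sign, rng):
    """Two features: a noisy causal one and a clean but spurious one."""
    y = rng.integers(0, 2, size=n)
    s = 2 * y - 1  # map labels {0, 1} to {-1, +1}
    causal = s + rng.normal(0.0, 1.0, size=n)
    # spurious_sign = +1: cue agrees with the label; -1: cue is reversed.
    spurious = 3.0 * spurious_sign * s + rng.normal(0.0, 0.5, size=n)
    return np.stack([causal, spurious], axis=1), y

def fit_centroids(x, y):
    """Mean feature vector of each class."""
    return np.stack([x[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, x):
    """Assign each point to the class with the nearest centroid."""
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
x_train, y_train = make_data(500, +1, rng)
centroids = fit_centroids(x_train, y_train)

x_same, y_same = make_data(500, +1, rng)     # correlation still holds
x_shift, y_shift = make_data(500, -1, rng)   # correlation is reversed

acc_same = (predict(centroids, x_same) == y_same).mean()
acc_shift = (predict(centroids, x_shift) == y_shift).mean()
print(f"accuracy, same correlation:     {acc_same:.2f}")
print(f"accuracy, reversed correlation: {acc_shift:.2f}")
```

Because the spurious feature is the larger and cleaner of the two, it dominates the distance computation, so accuracy is high while the correlation holds and drops sharply once it is reversed, even though the causal feature is unchanged.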

To better study the SC-FSC problem, the researchers created a new benchmark dataset called MetaCoCo, which contains examples of these spurious-correlation shifts from real-world scenarios. They also developed a way to measure the extent of these shifts using a pre-trained vision-language model called CLIP.

By evaluating state-of-the-art FSC methods on this new benchmark, the authors found that existing techniques struggle significantly when faced with spurious-correlation problems. This highlights the importance of developing more robust few-shot learning approaches that can handle these types of OOD challenges.

Technical Explanation

The paper focuses on two key types of out-of-distribution (OOD) problems in few-shot classification (FSC): (a) cross-domain few-shot classification (CD-FSC) and (b) spurious-correlation few-shot classification (SC-FSC).

CD-FSC occurs when a classifier learns to transfer knowledge from base classes drawn from seen training distributions, but then struggles to recognize novel classes sampled from unseen testing distributions. In contrast, SC-FSC arises when a classifier relies on non-causal features or contextual cues that happen to be correlated with the labels (or concepts) in the base classes, but these relationships no longer hold true when the model is deployed.

To address the lack of evaluation benchmarks for SC-FSC, the authors present MetaCoCo, a new benchmark that contains examples of spurious-correlation shifts collected from real-world scenarios. Additionally, they propose a metric to quantify the extent of these spurious-correlation shifts using the pre-trained CLIP vision-language model.
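
This summary does not give the formula for the CLIP-based metric, so the sketch below should be read only as an illustrative proxy, not as the authors' method. It assumes image and text embeddings have already been computed by a vision-language model such as CLIP, assigns each image to its most similar context prompt (for example "grass" or "snow"), and compares how often each context appears in the training and testing splits. The function names and the choice of total variation distance are assumptions.

```python
import numpy as np

def zero_shot_assign(image_emb, text_emb):
    """Index of the text prompt with the highest cosine similarity per image."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return (img @ txt.T).argmax(axis=1)

def context_shift(train_img, test_img, context_txt):
    """Total variation distance between train and test context histograms.

    0.0 means both splits show the same mix of contexts;
    1.0 means the contexts seen in the two splits do not overlap at all.
    """
    k = context_txt.shape[0]
    p = np.bincount(zero_shot_assign(train_img, context_txt), minlength=k)
    q = np.bincount(zero_shot_assign(test_img, context_txt), minlength=k)
    p = p / len(train_img)
    q = q / len(test_img)
    return 0.5 * np.abs(p - q).sum()
```

A per-class version of the same comparison would be closer in spirit to a spurious-correlation measure, since what matters is how the context paired with each concept changes between splits.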

Extensive experiments on the MetaCoCo benchmark are performed to evaluate the performance of state-of-the-art methods in FSC, cross-domain shifts, and self-supervised learning. The results show that the existing techniques degrade significantly in the presence of spurious-correlation shifts, highlighting the importance of developing more robust few-shot learning approaches that can handle these types of OOD challenges.
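
Few-shot results are commonly reported as mean accuracy over many sampled episodes together with a 95% confidence interval. Whether the paper follows exactly this protocol is not stated in this summary, so the sketch below is an assumption about a typical setup. The helper `mean_and_ci95` is a hypothetical name, and it uses the normal approximation (1.96 standard errors).

```python
import math
import statistics

def mean_and_ci95(accuracies):
    """Mean accuracy and 95% confidence half-width over per-episode accuracies."""
    n = len(accuracies)
    if n < 2:
        raise ValueError("need at least two episodes to estimate an interval")
    mean = statistics.fmean(accuracies)
    # Standard error of the mean, using the sample standard deviation.
    std_err = statistics.stdev(accuracies) / math.sqrt(n)
    return mean, 1.96 * std_err

# Usage: report as "mean +/- half_width".
mean, half_width = mean_and_ci95([0.52, 0.48, 0.55, 0.45, 0.50])
print(f"{100 * mean:.2f} +/- {100 * half_width:.2f} %")
```

Comparing such intervals for a benchmark with and without spurious-correlation shift is one simple way to check whether an observed drop is larger than episode-to-episode noise.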

Critical Analysis

The paper's focus on spurious-correlation shifts in few-shot classification is a valuable contribution, as this problem has been relatively understudied compared to cross-domain shifts. The introduction of the MetaCoCo benchmark and the proposed metric for quantifying the extent of spurious-correlation shifts are particularly notable.

One potential limitation of the research is the reliance on CLIP as the pre-trained model for measuring spurious-correlation shifts. While CLIP is a powerful vision-language model, it may not capture all the nuances of the shifts present in the MetaCoCo dataset. It could be worthwhile to explore other ways of quantifying these shifts, for example by drawing on simple semantic-aided few-shot learning or on vision-language approaches to few-shot object detection.

Additionally, the paper focuses primarily on evaluating the performance of existing methods on the MetaCoCo benchmark, but does not propose any novel techniques for addressing spurious-correlation shifts in few-shot classification. Future research could explore new architectural designs, new training strategies, or the calibration of higher-order statistics for few-shot class recognition to improve robustness to these types of OOD challenges.

Overall, the paper makes a valuable contribution by highlighting the importance of the understudied SC-FSC problem and providing a benchmark and quantitative metric to facilitate further research in this area.

Conclusion

This paper addresses a critical challenge in few-shot classification: out-of-distribution (OOD) problems, particularly those related to spurious-correlation shifts. The authors introduce a new benchmark called MetaCoCo, which contains examples of these spurious-correlation shifts from real-world scenarios, and propose a metric to quantify the extent of the shifts using the CLIP vision-language model.

Evaluating state-of-the-art few-shot learning methods on the MetaCoCo benchmark reveals that existing techniques struggle significantly in the presence of spurious-correlation shifts, highlighting the need for more robust few-shot learning approaches. This research opens up new avenues for improving the performance and reliability of few-shot classification models in real-world applications, where OOD challenges are often encountered.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation

Min Zhang, Haoxuan Li, Fei Wu, Kun Kuang

Out-of-distribution (OOD) problems in few-shot classification (FSC) occur when novel classes sampled from testing distributions differ from base classes drawn from training distributions, which considerably degrades the performance of deep learning models deployed in real-world applications. Recent studies suggest that the OOD problems in FSC mainly include: (a) cross-domain few-shot classification (CD-FSC) and (b) spurious-correlation few-shot classification (SC-FSC). Specifically, CD-FSC occurs when a classifier learns to transfer knowledge from base classes drawn from seen training distributions but must recognize novel classes sampled from unseen testing distributions. In contrast, SC-FSC arises when a classifier relies on non-causal features (or contexts) that happen to be correlated with the labels (or concepts) in base classes, but such relationships no longer hold during model deployment. Although CD-FSC has been extensively studied, SC-FSC remains understudied due to the lack of corresponding evaluation benchmarks. To this end, we present Meta Concept Context (MetaCoCo), a benchmark with spurious-correlation shifts collected from real-world scenarios. Moreover, to quantify the extent of spurious-correlation shifts in MetaCoCo, we further propose a metric that uses CLIP as a pre-trained vision-language model. Extensive experiments on the proposed benchmark are performed to evaluate the state-of-the-art methods in FSC, cross-domain shifts, and self-supervised learning. The experimental results show that the performance of the existing methods degrades significantly in the presence of spurious-correlation shifts. We open-source all the code of our benchmark and hope that the proposed MetaCoCo can facilitate future research on spurious-correlation shift problems in FSC. The code is available at: https://github.com/remiMZ/MetaCoCo-ICLR24.

5/1/2024

Bayesian Evidential Learning for Few-Shot Classification

Xiongkun Linghu, Yan Bai, Yihang Lou, Shengsen Wu, Jinze Li, Jianzhong He, Tao Bai

Few-Shot Classification (FSC) aims to generalize from base classes to novel classes given very limited labeled samples, which is an important step on the path toward human-like machine learning. State-of-the-art solutions involve learning to find a good metric and representation space to compute the distance between samples. Despite the promising accuracy, how to model uncertainty effectively for metric-based FSC methods is still a challenge. To model uncertainty, we place a distribution over class probability based on the theory of evidence. As a result, uncertainty modeling and metric learning can be decoupled. To reduce the uncertainty of classification, we propose a Bayesian evidence fusion theorem. Given observed samples, the network learns to get posterior distribution parameters given the prior parameters produced by the pre-trained network. Detailed gradient analysis shows that our method provides a smooth optimization target and can capture the uncertainty. The proposed method is agnostic to metric learning strategies and can be implemented as a plug-and-play module. We integrate our method into several of the newest FSC methods and demonstrate improved accuracy and uncertainty quantification on standard FSC benchmarks.

9/5/2024

Spuriousness-Aware Meta-Learning for Learning Robust Classifiers

Guangtao Zheng, Wenqian Ye, Aidong Zhang

Spurious correlations are brittle associations between certain attributes of inputs and target variables, such as the correlation between an image background and an object class. Deep image classifiers often leverage them for predictions, leading to poor generalization on the data where the correlations do not hold. Mitigating the impact of spurious correlations is crucial towards robust model generalization, but it often requires annotations of the spurious correlations in data -- a strong assumption in practice. In this paper, we propose a novel learning framework based on meta-learning, termed SPUME -- SPUriousness-aware MEta-learning, to train an image classifier to be robust to spurious correlations. We design the framework to iteratively detect and mitigate the spurious correlations that the classifier excessively relies on for predictions. To achieve this, we first propose to utilize a pre-trained vision-language model to extract text-format attributes from images. These attributes enable us to curate data with various class-attribute correlations, and we formulate a novel metric to measure the degree of these correlations' spuriousness. Then, to mitigate the reliance on spurious correlations, we propose a meta-learning strategy in which the support (training) sets and query (test) sets in tasks are curated with different spurious correlations that have high degrees of spuriousness. By meta-training the classifier on these spuriousness-aware meta-learning tasks, our classifier can learn to be invariant to the spurious correlations. We demonstrate that our method is robust to spurious correlations without knowing them a priori and achieves the best on five benchmark datasets with different robustness measures.

6/18/2024

COSCO: A Sharpness-Aware Training Framework for Few-shot Multivariate Time Series Classification

Jesus Barreda, Ashley Gomez, Ruben Puga, Kaixiong Zhou, Li Zhang

Multivariate time series classification is an important task with widespread domains of applications. Recently, deep neural networks (DNN) have achieved state-of-the-art performance in time series classification. However, they often require large expert-labeled training datasets which can be infeasible in practice. In few-shot settings, i.e. only a limited number of samples per class are available in training data, DNNs show a significant drop in testing accuracy and poor generalization ability. In this paper, we propose to address these problems from an optimization and a loss function perspective. Specifically, we propose a new learning framework named COSCO consisting of a sharpness-aware minimization (SAM) optimization and a Prototypical loss function to improve the generalization ability of DNN for multivariate time series classification problems under few-shot setting. Our experiments demonstrate our proposed method outperforms the existing baseline methods. Our source code is available at: https://github.com/JRB9/COSCO.

9/17/2024