Learning Prompt with Distribution-Based Feature Replay for Few-Shot Class-Incremental Learning

Read original: arXiv:2401.01598 - Published 4/8/2024 by Zitong Huang, Ze Chen, Zhixing Chen, Erjin Zhou, Xinxing Xu, Rick Siow Mong Goh, Yong Liu, Wangmeng Zuo, Chunmei Feng


Overview

The paper introduces an approach called "Learning Prompt with Distribution-based Feature Replay" (LP-DiF) for few-shot class-incremental learning (FSCIL).
The method addresses the challenge of learning new classes sequentially from very limited training data while retaining knowledge of previously learned classes.
It builds on a pre-trained vision-language model (e.g., CLIP), adapts it with prompt tuning, and protects old knowledge with distribution-based feature replay.

Plain English Explanation

The researchers have developed a new technique to help AI systems learn new classes of objects or tasks in a step-by-step fashion, while still remembering what they've learned before. This is an important challenge in the field of machine learning, known as "class-incremental learning."

The key ideas behind their approach are:

Pre-trained Vision-Language Model: The researchers start from a large pre-trained vision-language model such as CLIP. According to the paper's abstract, simply using CLIP for zero-shot evaluation already substantially outperforms the most influential earlier methods.
Distribution-Based Feature Replay: For each previously learned class, the system keeps a compact statistical summary of that class's image features: a Gaussian distribution with a diagonal covariance matrix. When learning new classes, it samples pseudo-features from these distributions and "replays" them alongside the new data, helping to retain the old knowledge.
Learning Prompt: The researchers apply prompt tuning, in which a learnable prompt is optimized in each session. This lets the model continually capture the specific knowledge of that session and integrate new classes with what it already knows.

By combining these three elements, the researchers have created a system that can learn new classes of objects or tasks in a step-by-step fashion, while still maintaining its knowledge of what it had learned before. This is an important advancement in the field of class-incremental learning, which is crucial for building AI systems that can continue to learn and adapt over time.

Technical Explanation

The proposed method, "Learning Prompt with Distribution-based Feature Replay" (abbreviated LP-DiF in the paper), starts from a pre-trained vision-language model such as CLIP and uses prompt tuning to adapt it to each incremental session.
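To make the starting point concrete, here is a minimal sketch of how a CLIP-style model scores an image against class prompts: the image feature is compared with one text feature per class by cosine similarity, and a temperature-scaled softmax turns the similarities into probabilities. The feature vectors, the class names, the temperature value, and the function names cosine and classify are all illustrative assumptions; in a real system the features would come from the model's image and text encoders. This is not the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(image_feature, text_features, temperature=0.01):
    """Return (predicted class, per-class probabilities).

    text_features maps a class name to the feature of its text prompt.
    The temperature is an assumed value chosen for illustration.
    """
    names = list(text_features)
    logits = [cosine(image_feature, text_features[n]) / temperature for n in names]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = {n: e / total for n, e in zip(names, exps)}
    return max(probs, key=probs.get), probs


# Toy 3-dimensional features standing in for encoder outputs (hypothetical).
text_features = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "bird": [0.0, 0.0, 1.0],
}
image_feature = [0.1, 0.9, 0.2]

label, probs = classify(image_feature, text_features)
print(label, round(probs[label], 4))
```

With prompt tuning, the text features would be produced from learnable prompt vectors instead of fixed, hand-written prompts, while the scoring step stays the same.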

The key components of LP-DiF are:

Distribution-Based Feature Replay: The system preserves each old class as a feature-level Gaussian distribution with a diagonal covariance matrix. The distribution is estimated from the image features of the class's training images together with synthesized features generated by a VAE. When a new session begins, pseudo-features sampled from the old-class distributions are combined with the current session's training images to optimize the prompt, so the model learns new knowledge while retaining the old.
Learning Prompt: Prompt tuning improves the model's adaptation ability and lets it continually capture the specific knowledge of each session. The learnable prompt is the component that the replayed pseudo-features protect from forgetting.
Pre-trained Vision-Language Model: The system starts from a pre-trained vision-language model such as CLIP, which lets it leverage the general-purpose knowledge that model has already learned.
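The replay idea in the list above can be sketched in a few lines. The example below fits a per-class Gaussian with a diagonal covariance (one mean and one variance per feature dimension), samples pseudo-features from it, and mixes them with new-session samples into one training batch. It is a simplified illustration under stated assumptions, not the authors' implementation: the toy two-dimensional features, the function names, and the per_class count are hypothetical, new-session data is represented directly as features for brevity, and the VAE-synthesized features that the paper also uses when estimating the distributions are omitted.

```python
import math
import random


def fit_diag_gaussian(features):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dim)]
    return mean, var


def sample_pseudo_features(mean, var, count, rng):
    """Draw `count` vectors; each dimension is sampled independently (diagonal covariance)."""
    return [
        [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]
        for _ in range(count)
    ]


def build_replay_batch(old_stats, new_samples, per_class, rng):
    """Mix new-session (feature, label) pairs with pseudo-features of old classes."""
    batch = list(new_samples)
    for label, (mean, var) in old_stats.items():
        for feature in sample_pseudo_features(mean, var, per_class, rng):
            batch.append((feature, label))
    rng.shuffle(batch)
    return batch


rng = random.Random(0)

# Features of old classes seen in an earlier session (toy values).
old_features = {
    "cat": [[1.0, 0.0], [1.2, 0.2], [0.8, -0.2]],
    "dog": [[0.0, 1.0], [0.2, 1.2], [-0.2, 0.8]],
}
# Only these statistics need to be kept once the session ends.
old_stats = {label: fit_diag_gaussian(feats) for label, feats in old_features.items()}

# A new session brings a few samples of a new class.
new_samples = [([2.0, 2.0], "bird"), ([2.1, 1.9], "bird")]

batch = build_replay_batch(old_stats, new_samples, per_class=4, rng=rng)
print(len(batch), sorted({label for _, label in batch}))
```

A diagonal covariance keeps the stored summary small: each class needs only two vectors of the feature dimension, rather than a full covariance matrix.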

The researchers evaluate their approach on three prevalent FSCIL benchmarks (CIFAR100, mini-ImageNet, and CUB-200) and on two more challenging benchmarks proposed in the paper (SUN-397 and CUB-200*). They report that LP-DiF achieves a new state of the art in FSCIL.

Critical Analysis

The paper presents a well-designed and carefully evaluated approach to the challenging problem of few-shot class-incremental learning. The authors acknowledge several limitations and avenues for further research:

Computational Complexity: The use of a large pre-trained vision-language model and distribution-based feature replay can increase the computational cost of the system, particularly as the number of learned classes grows. The authors note that further optimizations may be needed to scale the approach to larger problems.
Dependence on Pre-training: The performance of LP-DiF relies heavily on the quality and generalization capabilities of the pre-trained vision-language model used as a starting point. If the pre-trained model has biases or limitations, these may be reflected in the final system.
Sensitivity to Hyperparameters: The authors mention that the approach can be sensitive to the choice of hyperparameters, such as the size of the learning prompt and the specific distribution-based feature replay mechanism. More work may be needed to make the system more robust to these choices.
Potential Bias and Fairness Issues: As with many machine learning systems, there may be concerns about the potential for LP-DiF to exhibit biases or fairness issues, particularly when applied to real-world tasks. Further analysis and testing would be needed to address these concerns.

Overall, the paper presents a promising approach to the challenging problem of few-shot class-incremental learning, and the authors have taken care to acknowledge the limitations and areas for future research. As the field of class-incremental learning continues to advance, techniques like LP-DiF may play an important role in developing AI systems that can continually learn and adapt over time.

Conclusion

The researchers have proposed a method called "Learning Prompt with Distribution-based Feature Replay" (LP-DiF) to address the challenge of few-shot class-incremental learning. By combining a pre-trained vision-language model (e.g., CLIP), prompt tuning, and distribution-based feature replay, LP-DiF demonstrates strong performance in learning new classes while retaining knowledge of previously learned classes.

While the approach has some limitations, such as computational complexity and sensitivity to hyperparameters, it represents an important advancement in the field of class-incremental learning. As AI systems become more ubiquitous, the ability to continually learn and adapt will be crucial for their real-world deployment. Techniques like LP-DiF may help pave the way for more flexible and capable AI systems that can continuously expand their knowledge and skills over time.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers

Learning Prompt with Distribution-Based Feature Replay for Few-Shot Class-Incremental Learning

Zitong Huang, Ze Chen, Zhixing Chen, Erjin Zhou, Xinxing Xu, Rick Siow Mong Goh, Yong Liu, Wangmeng Zuo, Chunmei Feng

Few-shot Class-Incremental Learning (FSCIL) aims to continuously learn new classes based on very limited training data without forgetting the old ones encountered. Existing studies solely relied on pure visual networks, while in this paper we solved FSCIL by leveraging the Vision-Language model (e.g., CLIP) and propose a simple yet effective framework, named Learning Prompt with Distribution-based Feature Replay (LP-DiF). We observe that simply using CLIP for zero-shot evaluation can substantially outperform the most influential methods. Then, prompt tuning technique is involved to further improve its adaptation ability, allowing the model to continually capture specific knowledge from each session. To prevent the learnable prompt from forgetting old knowledge in the new session, we propose a pseudo-feature replay approach. Specifically, we preserve the old knowledge of each class by maintaining a feature-level Gaussian distribution with a diagonal covariance matrix, which is estimated by the image features of training images and synthesized features generated from a VAE. When progressing to a new session, pseudo-features are sampled from old-class distributions combined with training images of the current session to optimize the prompt, thus enabling the model to learn new knowledge while retaining old knowledge. Experiments on three prevalent benchmarks, i.e., CIFAR100, mini-ImageNet, CUB-200, and two more challenging benchmarks, i.e., SUN-397 and CUB-200$^*$ proposed in this paper showcase the superiority of LP-DiF, achieving new state-of-the-art (SOTA) in FSCIL. Code is publicly available at https://github.com/1170300714/LP-DiF.

4/8/2024


Few-Shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt

Chenxi Liu, Zhenyi Wang, Tianyi Xiong, Ruibo Chen, Yihan Wu, Junfeng Guo, Heng Huang

Few-Shot Class-Incremental Learning (FSCIL) models aim to incrementally learn new classes with scarce samples while preserving knowledge of old ones. Existing FSCIL methods usually fine-tune the entire backbone, leading to overfitting and hindering the potential to learn new classes. On the other hand, recent prompt-based CIL approaches alleviate forgetting by training prompts with sufficient data in each task. In this work, we propose a novel framework named Attention-aware Self-adaptive Prompt (ASP). ASP encourages task-invariant prompts to capture shared knowledge by reducing specific information from the attention aspect. Additionally, self-adaptive task-specific prompts in ASP provide specific information and transfer knowledge from old classes to new classes with an Information Bottleneck learning objective. In summary, ASP prevents overfitting on base task and does not require enormous data in few-shot incremental tasks. Extensive experiments on three benchmark datasets validate that ASP consistently outperforms state-of-the-art FSCIL and prompt-based CIL methods in terms of both learning new classes and mitigating forgetting.

7/18/2024

Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners

Keon-Hee Park, Kyungwoo Song, Gyeong-Moon Park

Few-Shot Class Incremental Learning (FSCIL) is a task that requires a model to learn new classes incrementally without forgetting when only a few samples for each class are given. FSCIL encounters two significant challenges: catastrophic forgetting and overfitting, and these challenges have driven prior studies to primarily rely on shallow models, such as ResNet-18. Even though their limited capacity can mitigate both forgetting and overfitting issues, it leads to inadequate knowledge transfer during few-shot incremental sessions. In this paper, we argue that large models such as vision and language transformers pre-trained on large datasets can be excellent few-shot incremental learners. To this end, we propose a novel FSCIL framework called PriViLege, Pre-trained Vision and Language transformers with prompting functions and knowledge distillation. Our framework effectively addresses the challenges of catastrophic forgetting and overfitting in large models through new pre-trained knowledge tuning (PKT) and two losses: entropy-based divergence loss and semantic knowledge distillation loss. Experimental results show that the proposed PriViLege significantly outperforms the existing state-of-the-art methods with a large margin, e.g., +9.38% in CUB200, +20.58% in CIFAR-100, and +13.36% in miniImageNet. Our implementation code is available at https://github.com/KHU-AGI/PriViLege.

4/3/2024

Few Shot Class Incremental Learning using Vision-Language models

Anurag Kumar, Chinmay Bharti, Saikat Dutta, Srikrishna Karanam, Biplab Banerjee

Recent advancements in deep learning have demonstrated remarkable performance comparable to human capabilities across various supervised computer vision tasks. However, the prevalent assumption of having an extensive pool of training data encompassing all classes prior to model training often diverges from real-world scenarios, where limited data availability for novel classes is the norm. The challenge emerges in seamlessly integrating new classes with few samples into the training data, demanding the model to adeptly accommodate these additions without compromising its performance on base classes. To address this exigency, the research community has introduced several solutions under the realm of few-shot class incremental learning (FSCIL). In this study, we introduce an innovative FSCIL framework that utilizes language regularizer and subspace regularizer. During base training, the language regularizer helps incorporate semantic information extracted from a Vision-Language model. The subspace regularizer helps in facilitating the model's acquisition of nuanced connections between image and text semantics inherent to base classes during incremental training. Our proposed framework not only empowers the model to embrace novel classes with limited data, but also ensures the preservation of performance on base classes. To substantiate the efficacy of our approach, we conduct comprehensive experiments on three distinct FSCIL benchmarks, where our framework attains state-of-the-art performance.

8/16/2024