Concept-1K: A Novel Benchmark for Instance Incremental Learning

Read original: arXiv:2402.08526 - Published 6/19/2024 by Junhao Zheng, Shengjie Qiu, Qianli Ma

Overview

The paper proposes a new incremental learning (IL) scenario, instance-incremental learning (IIL), and a new dataset, Concept-1K, to better assess catastrophic forgetting in large language models (LLMs).
Experiments on Concept-1K show that billion-parameter LLMs still suffer from catastrophic forgetting, and that the forgetting depends on model scale, pretraining, and buffer size.
Existing IL methods and LoRA, a popular finetuning technique, do not mitigate forgetting in LLMs to a satisfactory degree.

Plain English Explanation

Incremental learning (IL) is essential for developing human-level intelligence in neural networks. However, the current IL scenarios and datasets do not adequately capture the problem of catastrophic forgetting in large language models (LLMs). Catastrophic forgetting refers to the tendency of neural networks to quickly forget previously learned information when trained on new tasks.

To address this, the researchers propose a new IL scenario called instance-incremental learning (IIL) and a dataset called Concept-1K. This scenario and dataset are designed to be more challenging and support larger IL steps, which are more representative of real-world learning.

The experiments on Concept-1K reveal that even LLMs with billions of parameters still struggle with catastrophic forgetting. The researchers found that the degree of forgetting depends on the model's scale, its pretraining, and the size of the buffer used to store previously seen data.

The researchers also found that existing IL methods and LoRA, a popular finetuning technique, do not effectively mitigate forgetting in LLMs. This suggests that stronger techniques are needed to address catastrophic forgetting in these large models.

Technical Explanation

The paper introduces a novel instance-incremental learning (IIL) scenario and a dataset called Concept-1K to better assess catastrophic forgetting in large language models (LLMs). In the IIL scenario, the model incrementally learns new instances of concepts rather than entirely new concepts, which the authors consider more representative of real-world learning.
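To make "assessing catastrophic forgetting" concrete, here is a minimal sketch of an average-forgetting score that is common in the continual-learning literature. This summary does not state which metric the paper reports, so treat the definition, the function name `average_forgetting`, and the accuracy numbers as assumptions for illustration only.

```python
from typing import List


def average_forgetting(acc: List[List[float]]) -> float:
    """Average forgetting over incremental steps.

    acc[i][j] is the accuracy on data from step j, measured after
    training on step i (so row i has i + 1 entries).
    For every step j except the last, forgetting is the best accuracy
    reached on step j before the final step minus the final accuracy.
    """
    num_steps = len(acc)
    if num_steps < 2:
        return 0.0  # nothing can be forgotten after a single step
    final = acc[-1]
    drops = []
    for j in range(num_steps - 1):
        best_earlier = max(acc[i][j] for i in range(j, num_steps - 1))
        drops.append(best_earlier - final[j])
    return sum(drops) / len(drops)


# Hypothetical accuracies for three incremental steps (made-up numbers).
acc_matrix = [
    [0.90],
    [0.60, 0.88],
    [0.45, 0.70, 0.91],
]
print(round(average_forgetting(acc_matrix), 3))
```

A score near zero means earlier steps are retained; a large positive score means accuracy on earlier data dropped after later training.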

The Concept-1K dataset is designed to support IL steps an order of magnitude larger than those of existing datasets, allowing the researchers to study forgetting in LLMs more effectively. According to the paper's abstract, the dataset comprises 1,023 recently emerged concepts across diverse domains, each a discrete, interpretable unit of knowledge.

The researchers run experiments on Concept-1K with billion-parameter LLMs. The results show that these large-scale models still suffer from catastrophic forgetting, and that the degree of forgetting depends on the model's scale, its pretraining, and the size of the buffer used to store previously seen data.
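The role of the buffer can be illustrated with a small replay sketch. The summary does not say how the paper fills or samples its buffer, so the reservoir-sampling policy below is a swapped-in, commonly used choice. The class name `ReservoirBuffer`, the example strings, and the sizes are all hypothetical.

```python
import random
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class ReservoirBuffer(Generic[T]):
    """Fixed-size buffer that keeps a uniform random sample of all items seen."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        if capacity < 0:
            raise ValueError("capacity must be non-negative")
        self.capacity = capacity
        self.items: List[T] = []
        self.seen = 0  # total number of items offered so far
        self._rng = random.Random(seed)

    def add(self, item: T) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
            return
        # Reservoir sampling: the new item replaces a stored one
        # with probability capacity / seen.
        slot = self._rng.randrange(self.seen)
        if slot < self.capacity:
            self.items[slot] = item

    def sample(self, k: int) -> List[T]:
        """Draw up to k stored items without replacement."""
        return self._rng.sample(self.items, min(k, len(self.items)))


# Assumed usage: mix a few replayed examples into each incremental step.
buffer = ReservoirBuffer(capacity=4, seed=0)
for step in range(3):
    new_examples = [f"step{step}-example{i}" for i in range(10)]
    replayed = buffer.sample(2)  # empty on the first step
    training_batch = new_examples + replayed  # would be passed to the trainer
    for example in new_examples:
        buffer.add(example)

print(len(buffer.items), buffer.seen)
```

A larger `capacity` lets more earlier examples be replayed at each step, which is the knob the buffer-size finding refers to.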

The paper also explores the performance of existing incremental learning methods, such as Bayesian Learning Driven Prototypical Contrastive Loss and IncPrompt, as well as the popular finetuning technique, LoRA, in mitigating the forgetting problem. However, the researchers find that these approaches fail to achieve satisfactory performance on the Concept-1K dataset.
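For readers unfamiliar with LoRA, its standard formulation is sketched below. It comes from the original LoRA work rather than from this summary, and the layer dimensions in the worked count are illustrative assumptions, not values used in the paper. The pretrained weight $W_0$ stays frozen and only the low-rank factors $A$ and $B$ are trained.

```latex
% LoRA forward pass for one linear layer
h = W_0 x + \frac{\alpha}{r} B A x,
\qquad W_0 \in \mathbb{R}^{d \times k},\;
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)

% Trainable parameters per layer: full finetuning vs. LoRA
N_{\text{full}} = d\,k, \qquad N_{\text{LoRA}} = r\,(d + k)

% Illustrative example: d = k = 4096, r = 8
N_{\text{full}} = 16{,}777{,}216, \qquad
N_{\text{LoRA}} = 65{,}536 \approx 0.39\% \text{ of } N_{\text{full}}
```

Training far fewer parameters does not by itself prevent forgetting: the update $BA$ still changes the effective weights used for earlier data.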

Critical Analysis

The paper presents a novel and challenging IL scenario and dataset that effectively reveal the catastrophic forgetting problem in large language models. The use of the Concept-1K dataset, which supports larger IL steps, is a significant improvement over existing benchmarks and provides a more realistic assessment of forgetting in LLMs.

However, the paper does not delve into the potential reasons why existing IL methods and finetuning techniques fail to address the forgetting problem in LLMs. Further investigation into the underlying mechanisms and limitations of these approaches could provide valuable insights for the development of more effective solutions.

Additionally, the paper could have explored the impact of different architectural choices, such as the use of memory modules or continual learning mechanisms, on the forgetting problem. Studying the effectiveness of these techniques in the context of the proposed IIL scenario and Concept-1K dataset could lead to a better understanding of how to mitigate catastrophic forgetting in LLMs.

Overall, the paper makes a significant contribution by introducing a novel IL scenario and dataset that expose the challenge of catastrophic forgetting in large language models. The findings highlight the need for stronger techniques to reduce forgetting in these systems.

Conclusion

The paper presents a novel instance-incremental learning (IIL) scenario and the Concept-1K dataset to better assess the catastrophic forgetting problem in large language models (LLMs). The experiments reveal that even billion-parameter LLMs still suffer from catastrophic forgetting, and the degree of forgetting is influenced by model scale, pretraining, and buffer size.

Existing incremental learning methods and finetuning techniques such as LoRA do not perform satisfactorily in addressing forgetting in LLMs. This suggests that stronger techniques are needed to reduce catastrophic forgetting in these systems.

The study provides a valuable benchmark for future research in this area and encourages the exploration of more effective solutions to the catastrophic forgetting challenge in large language models.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Concept-1K: A Novel Benchmark for Instance Incremental Learning

Junhao Zheng, Shengjie Qiu, Qianli Ma

Large Language Models (LLMs) have achieved remarkable success across various tasks, yet their ability to learn incrementally without forgetting remains underexplored. Incremental learning (IL) is crucial as it enables models to acquire new knowledge while retaining previously learned information, akin to human learning. Existing benchmarks for IL are insufficient due to data leakage issues and the overqualification of LLMs. To address these challenges, we introduce Concept-1K, a novel dataset comprising 1,023 recently emerged concepts across diverse domains. The concepts in Concept-1K are discrete, interpretable units of knowledge that allow for fine-grained analysis of learning and forgetting processes. Using Concept-1K as a testbed, we aim to answer the question: "Can LLMs learn new concepts incrementally without forgetting like humans?" Our investigation reveals that LLMs still suffer from catastrophic forgetting and that LoRA, despite fine-tuning fewer parameters, may lead to more forgetting on training data. Additionally, we explore the roles of in-context learning, model scale, buffer size, and pretraining in IL performance. These findings highlight the strengths and limitations of LLMs in IL scenarios and provide a robust benchmark for future research.

6/19/2024

Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models

Junhao Zheng, Shengjie Qiu, Qianli Ma

Incremental Learning (IL) has been a long-standing problem in both vision and Natural Language Processing (NLP) communities. In recent years, as Pre-trained Language Models (PLMs) have achieved remarkable progress in various NLP downstream tasks, utilizing PLMs as backbones has become a common practice in recent research of IL in NLP. Most assume that catastrophic forgetting is the biggest obstacle to achieving superior IL performance and propose various techniques to overcome this issue. However, we find that this assumption is problematic. Specifically, we revisit more than 20 methods on four classification tasks (Text Classification, Intent Classification, Relation Extraction, and Named Entity Recognition) under the two most popular IL settings (Class-Incremental and Task-Incremental) and reveal that most of them severely underestimate the inherent anti-forgetting ability of PLMs. Based on the observation, we propose a frustratingly easy method called SEQ* for IL with PLMs. The results show that SEQ* has competitive or superior performance compared to state-of-the-art (SOTA) IL methods and requires considerably less trainable parameters and training time. These findings urge us to revisit the IL with PLMs and encourage future studies to have a fundamental understanding of the catastrophic forgetting in PLMs. The data, code and scripts are publicly available at https://github.com/zzz47zzz/codebase-for-incremental-learning-with-llm.

5/28/2024

Class-Incremental Learning: A Survey

Da-Wei Zhou, Qi-Wei Wang, Zhi-Hong Qi, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu

Deep models, e.g., CNNs and Vision Transformers, have achieved impressive achievements in many vision tasks in the closed world. However, novel classes emerge from time to time in our ever-changing world, requiring a learning system to acquire new knowledge continually. Class-Incremental Learning (CIL) enables the learner to incorporate the knowledge of new classes incrementally and build a universal classifier among all seen classes. Correspondingly, when directly training the model with new class instances, a fatal problem occurs -- the model tends to catastrophically forget the characteristics of former ones, and its performance drastically degrades. There have been numerous efforts to tackle catastrophic forgetting in the machine learning community. In this paper, we survey comprehensively recent advances in class-incremental learning and summarize these methods from several aspects. We also provide a rigorous and unified evaluation of 17 methods in benchmark image classification tasks to find out the characteristics of different algorithms empirically. Furthermore, we notice that the current comparison protocol ignores the influence of memory budget in model storage, which may result in unfair comparison and biased results. Hence, we advocate fair comparison by aligning the memory budget in evaluation, as well as several memory-agnostic performance measures. The source code is available at https://github.com/zhoudw-zdw/CIL_Survey/

7/16/2024

Decision Boundary-aware Knowledge Consolidation Generates Better Instance-Incremental Learner

Qiang Nie, Weifu Fu, Yuhuan Lin, Jialin Li, Yifeng Zhou, Yong Liu, Lei Zhu, Chengjie Wang

Instance-incremental learning (IIL) focuses on learning continually with data of the same classes. Compared to class-incremental learning (CIL), the IIL is seldom explored because IIL suffers less from catastrophic forgetting (CF). However, besides retaining knowledge, in real-world deployment scenarios where the class space is always predefined, continual and cost-effective model promotion with the potential unavailability of previous data is a more essential demand. Therefore, we first define a new and more practical IIL setting as promoting the model's performance besides resisting CF with only new observations. Two issues have to be tackled in the new IIL setting: 1) the notorious catastrophic forgetting because of no access to old data, and 2) broadening the existing decision boundary to new observations because of concept drift. To tackle these problems, our key insight is to moderately broaden the decision boundary to fail cases while retain old boundary. Hence, we propose a novel decision boundary-aware distillation method with consolidating knowledge to teacher to ease the student learning new knowledge. We also establish the benchmarks on existing datasets Cifar-100 and ImageNet. Notably, extensive experiments demonstrate that the teacher model can be a better incremental learner than the student model, which overturns previous knowledge distillation-based methods treating student as the main role.

6/6/2024