DiffClass: Diffusion-Based Class Incremental Learning

Read original: arXiv:2403.05016 - Published 7/23/2024 by Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

Overview

The paper proposes a class incremental learning (CIL) method called "DiffClass" that uses a diffusion model to synthesize training images for past classes instead of storing real ones.
Replaying these synthetic images lets the model learn new classes without forgetting old ones, the central challenge in CIL.
The paper reports that DiffClass achieves state-of-the-art performance among exemplar-free methods on several CIL benchmarks.

Plain English Explanation

Class Incremental Learning (CIL) is a machine learning setting in which a model is trained to recognize new classes of objects over time without forgetting how to recognize old ones. This is hard because training on new data tends to overwrite what the model learned before, a problem known as catastrophic forgetting.

The DiffClass method proposed in this paper uses a diffusion model to generate synthetic examples, or "exemplars", of past classes. These synthetic images are mixed into the training data whenever the model learns new classes, so it keeps seeing the old classes without any real past images being stored.

Diffusion models work by starting with random noise and refining it into a realistic-looking image over many small denoising steps. DiffClass uses this capability to generate high-quality exemplars that help the model retain its knowledge of previous classes.
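To make the "start from noise, refine step by step" idea concrete, here is a minimal one-dimensional sketch of a DDPM-style sampling loop. It is not the paper's model: the function names, the linear noise schedule, and the `oracle_noise` function (which stands in for a trained denoising network by assuming every data point equals a single `target` value) are all assumptions made for illustration.

```python
import math
import random


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule: per-step betas and cumulative products of (1 - beta)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bars.append(prod)
    return betas, alpha_bars


def oracle_noise(x_t, alpha_bar, target):
    """Hypothetical stand-in for a trained denoiser.

    If all training data equals `target`, the noise contained in x_t is known exactly.
    """
    return (x_t - math.sqrt(alpha_bar) * target) / math.sqrt(1.0 - alpha_bar)


def sample(target, num_steps=1000, seed=0):
    """Run the reverse process from pure noise down to a clean sample."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = rng.gauss(0.0, 1.0)  # start from pure Gaussian noise
    trajectory = [x]
    for t in reversed(range(num_steps)):
        eps_hat = oracle_noise(x, alpha_bars[t], target)
        # Remove the predicted noise, then rescale (DDPM posterior mean).
        mean = (x - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps_hat)
        mean /= math.sqrt(1.0 - betas[t])
        # Re-inject a little fresh noise on every step except the last.
        noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0
        x = mean + math.sqrt(betas[t]) * noise
        trajectory.append(x)
    return x, trajectory
```

Calling `sample(3.0)` returns a final value at the target together with the full trajectory, which starts at random noise. In a real image model the oracle would be replaced by a neural network trained to predict the noise, and `x` would be an image tensor rather than a single number.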

With this diffusion-based approach, DiffClass achieves better performance on CIL benchmarks than earlier methods that struggle with the "forgetting" problem.

Technical Explanation

The key innovation of DiffClass is the use of a diffusion model to generate exemplars for past classes.

Specifically, the authors train a diffusion model on the training data for each class as it is introduced. This allows the diffusion model to learn the underlying distribution of each class. When a new class is presented, the diffusion model can then generate high-quality synthetic exemplars of the past classes.

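These synthetic images are then combined with the new class data to train the main classification model. Because no real samples from earlier tasks are stored, the method counts as "exemplar-free" even though it replays generated stand-ins. This lets the model keep learning new classes without catastrophically forgetting old ones, a common issue in class incremental learning. According to the paper's abstract, the full method has three components: multi-distribution matching (MDM) diffusion models to narrow the domain gap between real and synthetic data, selective synthetic image augmentation (SSIA) to broaden the training distribution, and multi-domain adaptation (MDA), which recasts exemplar-free CIL as a domain adaptation problem.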
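The replay idea described here can be sketched as a toy loop: at each step, train on real data for the new classes plus generated data for the old ones, then discard the real data. Everything in this sketch is a simplification assumed for illustration. `GaussianGenerator` is a hypothetical stand-in for a diffusion model, the "images" are single numbers, and the classifier is a nearest-centroid rule. It does not reproduce the paper's MDM, SSIA, or MDA components.

```python
import random
from statistics import mean, pstdev


class GaussianGenerator:
    """Hypothetical stand-in for a per-class generative model."""

    def __init__(self, samples):
        self.mu = mean(samples)
        self.sigma = pstdev(samples)

    def generate(self, n, rng):
        return [rng.gauss(self.mu, self.sigma) for _ in range(n)]


def build_training_set(new_data, generators, n_synthetic, rng):
    """Mix real data for new classes with synthetic data for past classes."""
    dataset = [(x, label) for label, xs in new_data.items() for x in xs]
    for label, gen in generators.items():
        dataset += [(x, label) for x in gen.generate(n_synthetic, rng)]
    return dataset


def fit_centroids(dataset):
    """Toy classifier: one centroid per class."""
    by_label = {}
    for x, label in dataset:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def run_incremental(tasks, n_synthetic=50, seed=0):
    """Process tasks in order; real data is only visible during its own task."""
    rng = random.Random(seed)
    generators, centroids = {}, {}
    for new_data in tasks:
        dataset = build_training_set(new_data, generators, n_synthetic, rng)
        centroids = fit_centroids(dataset)  # new real data + old synthetic data
        for label, xs in new_data.items():
            generators[label] = GaussianGenerator(xs)  # keep a generator, not the data
    return centroids


tasks = [
    {"cat": [-0.1, 0.0, 0.1, 0.2], "dog": [4.8, 5.0, 5.1, 5.3]},
    {"bird": [9.8, 10.0, 10.1]},
]
centroids = run_incremental(tasks)
```

After the second task, `centroids` still contains entries for "cat" and "dog" even though their real data was no longer available, because the generators supplied replacements. Without the replay step, the second call to `fit_centroids` would see only "bird".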

The authors evaluate DiffClass on several standard CIL benchmarks, including CIFAR-100 and ImageNet-Subset. They report that DiffClass outperforms previous state-of-the-art exemplar-free CIL methods by a significant margin.

Critical Analysis

The authors do a thorough job of evaluating DiffClass and comparing it to other prominent CIL approaches. The results demonstrate the effectiveness of using a diffusion model to generate exemplars for past classes.

However, one potential limitation is the computational cost of training the diffusion model for each new class. This could make DiffClass challenging to scale to domains with a very large number of classes.

Additionally, the paper does not explore the model's performance under more extreme class incremental scenarios, such as learning a single new class at a time. Further research could investigate the robustness of DiffClass in these more challenging settings.
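To illustrate what different incremental scenarios look like, the sketch below splits a set of class IDs into an initial task followed by fixed-size increments. The function name and the example numbers are hypothetical and are not taken from the paper's experimental settings.

```python
def split_classes(num_classes, base_classes, increment):
    """Return one list of class IDs per task: a base task, then fixed-size increments."""
    if not 0 < base_classes <= num_classes:
        raise ValueError("base_classes must be between 1 and num_classes")
    if increment <= 0:
        raise ValueError("increment must be positive")
    tasks = [list(range(base_classes))]
    for start in range(base_classes, num_classes, increment):
        end = min(start + increment, num_classes)
        tasks.append(list(range(start, end)))
    return tasks


# Hypothetical settings for a 100-class dataset.
coarse = split_classes(100, 50, 10)  # base task, then 10 new classes per step
extreme = split_classes(100, 50, 1)  # base task, then one new class per step
```

The single-class setting produces far more training steps than the coarse one (51 tasks versus 6 here), so any per-step forgetting or generation error has many more opportunities to accumulate.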

Overall, DiffClass represents a promising advance in class incremental learning by leveraging the powerful capabilities of diffusion models. The findings suggest this approach is worth further exploration and refinement.

Conclusion

This paper introduces DiffClass, a class incremental learning method that uses a diffusion model to generate exemplars of past classes. By keeping a memory of previous classes in the form of these synthetic exemplars, DiffClass outperforms other exemplar-free CIL approaches on several benchmark datasets.

The diffusion-based approach is a clever solution to the "forgetting" problem that plagues many CIL models. While there are some potential scalability concerns, the strong empirical results suggest DiffClass is a valuable contribution to the field of continual learning.

As AI systems become more ubiquitous, the ability to learn new skills and knowledge over time without forgetting the old will be critical. Methods like DiffClass represent important steps toward more robust and adaptable machine learning models.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DiffClass: Diffusion-Based Class Incremental Learning

Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

Class Incremental Learning (CIL) is challenging due to catastrophic forgetting. On top of that, Exemplar-free Class Incremental Learning is even more challenging due to forbidden access to previous task data. Recent exemplar-free CIL methods attempt to mitigate catastrophic forgetting by synthesizing previous task data. However, they fail to overcome the catastrophic forgetting due to the inability to deal with the significant domain gap between real and synthetic data. To overcome these issues, we propose a novel exemplar-free CIL method. Our method adopts multi-distribution matching (MDM) diffusion models to unify quality and bridge domain gaps among all domains of training data. Moreover, our approach integrates selective synthetic image augmentation (SSIA) to expand the distribution of the training data, thereby improving the model's plasticity and reinforcing the performance of our method's ultimate component, multi-domain adaptation (MDA). With the proposed integrations, our method then reformulates exemplar-free CIL into a multi-domain adaptation problem to implicitly address the domain gap problem to enhance model stability during incremental training. Extensive experiments on benchmark class incremental datasets and settings demonstrate that our method excels previous exemplar-free CIL methods and achieves state-of-the-art performance.

7/23/2024

Class-Incremental Learning: A Survey

Da-Wei Zhou, Qi-Wei Wang, Zhi-Hong Qi, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu

Deep models, e.g., CNNs and Vision Transformers, have achieved impressive achievements in many vision tasks in the closed world. However, novel classes emerge from time to time in our ever-changing world, requiring a learning system to acquire new knowledge continually. Class-Incremental Learning (CIL) enables the learner to incorporate the knowledge of new classes incrementally and build a universal classifier among all seen classes. Correspondingly, when directly training the model with new class instances, a fatal problem occurs -- the model tends to catastrophically forget the characteristics of former ones, and its performance drastically degrades. There have been numerous efforts to tackle catastrophic forgetting in the machine learning community. In this paper, we survey comprehensively recent advances in class-incremental learning and summarize these methods from several aspects. We also provide a rigorous and unified evaluation of 17 methods in benchmark image classification tasks to find out the characteristics of different algorithms empirically. Furthermore, we notice that the current comparison protocol ignores the influence of memory budget in model storage, which may result in unfair comparison and biased results. Hence, we advocate fair comparison by aligning the memory budget in evaluation, as well as several memory-agnostic performance measures. The source code is available at https://github.com/zhoudw-zdw/CIL_Survey/

7/16/2024

PASS++: A Dual Bias Reduction Framework for Non-Exemplar Class-Incremental Learning

Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu

Class-incremental learning (CIL) aims to recognize new classes incrementally while maintaining the discriminability of old classes. Most existing CIL methods are exemplar-based, i.e., storing a part of old data for retraining. Without relearning old data, those methods suffer from catastrophic forgetting. In this paper, we figure out two inherent problems in CIL, i.e., representation bias and classifier bias, that cause catastrophic forgetting of old knowledge. To address these two biases, we present a simple and novel dual bias reduction framework that employs self-supervised transformation (SST) in input space and prototype augmentation (protoAug) in deep feature space. On the one hand, SST alleviates the representation bias by learning generic and diverse representations that can transfer across different tasks. On the other hand, protoAug overcomes the classifier bias by explicitly or implicitly augmenting prototypes of old classes in the deep feature space, which poses tighter constraints to maintain previously learned decision boundaries. We further propose hardness-aware prototype augmentation and multi-view ensemble strategies, leading to significant improvements. The proposed framework can be easily integrated with pre-trained models. Without storing any samples of old classes, our method can perform comparably with state-of-the-art exemplar-based approaches which store plenty of old data. We hope to draw the attention of researchers back to non-exemplar CIL by rethinking the necessity of storing old samples in CIL.

7/22/2024

Data-Free Federated Class Incremental Learning with Diffusion-Based Generative Memory

Naibo Wang, Yuchen Deng, Wenjie Feng, Jianwei Yin, See-Kiong Ng

Federated Class Incremental Learning (FCIL) is a critical yet largely underexplored issue that deals with the dynamic incorporation of new classes within federated learning (FL). Existing methods often employ generative adversarial networks (GANs) to produce synthetic images to address privacy concerns in FL. However, GANs exhibit inherent instability and high sensitivity, compromising the effectiveness of these methods. In this paper, we introduce a novel data-free federated class incremental learning framework with diffusion-based generative memory (DFedDGM) to mitigate catastrophic forgetting by generating stable, high-quality images through diffusion models. We design a new balanced sampler to help train the diffusion models to alleviate the common non-IID problem in FL, and introduce an entropy-based sample filtering technique from an information theory perspective to enhance the quality of generative samples. Finally, we integrate knowledge distillation with a feature-based regularization term for better knowledge transfer. Our framework does not incur additional communication costs compared to the baseline FedAvg method. Extensive experiments across multiple datasets demonstrate that our method significantly outperforms existing baselines, e.g., over a 4% improvement in average accuracy on the Tiny-ImageNet dataset.

5/29/2024