Overcoming Catastrophic Forgetting in Federated Class-Incremental Learning via Federated Global Twin Generator

Read original: arXiv:2407.11078 - Published 7/17/2024 by Thinh Nguyen, Khoa D Doan, Binh T. Nguyen, Danh Le-Phuoc, Kok-Seng Wong

Overview

This paper presents Federated Global Twin Generator (FedGTG), a framework that addresses catastrophic forgetting in federated class-incremental learning.
Federated learning lets multiple clients collaboratively train a shared model without sharing their local data. Class-incremental learning expands the model's capabilities by adding new classes over time.
The key idea of FedGTG is to generate synthetic samples of previously seen classes with generators trained on the server, without access to client data, so that the model retains knowledge of earlier classes while it learns new ones.
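
To make the collaborative training step concrete, here is a minimal sketch of FedAvg-style aggregation, the conventional baseline in this setting. It is an illustration only, not the paper's code: the `fedavg` function, the dictionary-of-lists weight format, and the toy numbers are all assumptions made for this example.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # hypothetical format: parameter name -> flat values


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Toy example: two clients, the second holding three times as much data.
clients = [
    {"layer": [1.0, 2.0]},
    {"layer": [3.0, 6.0]},
]
global_weights = fedavg(clients, client_sizes=[1, 3])
```

Only parameters travel to the server; the raw client data never does. Plain averaging like this has no mechanism for remembering earlier classes, which is the gap generator-based methods try to close.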

Plain English Explanation

The paper tackles the problem of "catastrophic forgetting" in federated class-incremental learning. <a href="https://aimodels.fyi/papers/arxiv/data-free-federated-class-incremental-learning-diffusion">Federated learning</a> allows multiple devices or organizations to train a shared machine learning model without sharing their private data. <a href="https://aimodels.fyi/papers/arxiv/reducing-bias-federated-class-incremental-learning-hierarchical">Class-incremental learning</a> is the ability to continuously expand the model's capabilities by adding new classes of information over time, without forgetting what it has learned before.

The key innovation in this paper is a method called the Federated Global Twin Generator (FedGTG). The idea is to use a type of neural network called a "generator" to create synthetic samples of the past classes. Training on these samples alongside the new data helps the model retain knowledge of the previous classes while it learns the new ones. According to the paper's abstract, the generators are trained on the server without access to client data, and the synthetic data is then sent to the clients, so the private data never leaves the devices or organizations that hold it.

By using this synthetic data generation approach, the model can keep expanding its knowledge while limiting catastrophic forgetting, the sharp drop in accuracy on earlier classes that occurs when a model is trained only on new ones. This matters for real-world applications where models need to adapt and grow over time without losing critical information.
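
The replay idea described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's procedure: `toy_generator` is a hypothetical stand-in for a trained generator, and `build_replay_batch` is a name invented for this example.

```python
import random
from typing import Callable, List, Tuple

Sample = Tuple[List[float], int]  # (features, class label)


def toy_generator(label: int, rng: random.Random) -> List[float]:
    """Hypothetical stand-in for a trained generator: noisy points near the label."""
    return [label + rng.gauss(0.0, 0.1) for _ in range(2)]


def build_replay_batch(
    new_samples: List[Sample],
    old_classes: List[int],
    generate: Callable[[int, random.Random], List[float]],
    n_synthetic: int,
    rng: random.Random,
) -> List[Sample]:
    """Mix real samples of new classes with synthetic samples of old classes."""
    if not old_classes:  # first task: nothing to replay yet
        return list(new_samples)
    synthetic = []
    for _ in range(n_synthetic):
        label = rng.choice(old_classes)
        synthetic.append((generate(label, rng), label))
    batch = list(new_samples) + synthetic
    rng.shuffle(batch)  # avoid presenting old and new classes in separate blocks
    return batch


rng = random.Random(0)
new_data = [([2.1, 1.9], 2), ([3.0, 3.2], 3)]  # real samples of new classes 2 and 3
batch = build_replay_batch(new_data, [0, 1], toy_generator, 4, rng)
```

Because every training batch contains examples of old classes, the gradient updates no longer push the model exclusively toward the new ones.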

Technical Explanation

The paper proposes Federated Global Twin Generator (FedGTG) to address the problem of catastrophic forgetting in <a href="https://aimodels.fyi/papers/arxiv/federated-continual-learning-goes-online-leveraging-uncertainty">federated class-incremental learning</a>.

The key components of FedGTG are:

Server-Side Generators: The server trains a data generator and a feature generator, without accessing client data, to produce synthetic information about all seen classes. This helps the main classification model retain knowledge of previously learned classes.
Global Twin Model: A copy of the main classification model that is stored on the server and updated with the generated samples. This global twin model guides the local model updates on each client.
Cross-Entropy Distillation Loss: This loss encourages each local model to match the predictions of the global twin model on the synthetic samples, which mitigates catastrophic forgetting.
Partial Fine-Tuning: The local models are fine-tuned only on the new classes, while the shared global model receives knowledge of all classes through the synthetic samples.
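
A distillation loss of the kind described in the list above can be sketched as a cross-entropy between the teacher's and the student's softened predictions. This is a generic knowledge-distillation formulation, assumed here for illustration; the function names and the `temperature` parameter are not taken from the paper, whose exact loss may differ.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    temperature: float = 2.0,  # assumed value; softens both distributions
) -> float:
    """Cross-entropy of the student's prediction against the teacher's soft labels."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    # Clamp to avoid log(0) if a student probability underflows.
    return -sum(t * math.log(max(s, 1e-12)) for t, s in zip(teacher, student))


teacher_logits = [2.0, 0.5, -1.0]  # teacher favors class 0 on a synthetic sample
agree = distillation_loss([2.0, 0.5, -1.0], teacher_logits)
disagree = distillation_loss([-1.0, 0.5, 2.0], teacher_logits)
```

The loss is smallest when the student reproduces the teacher's distribution, so minimizing it on synthetic samples of old classes pulls the local model back toward what the earlier model predicted.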

The experiments show that FedGTG improves accuracy and forgetting measures over previous federated class-incremental learning frameworks on CIFAR-10, CIFAR-100, and tiny-ImageNet. The authors also analyze FedGTG's robustness on natural images, its tendency to converge to flat local minima, and the calibration of its predictions.
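
Class-incremental results are commonly reported as a final average accuracy together with a forgetting measure. The sketch below uses one widely used definition of forgetting (best earlier accuracy on a task minus its final accuracy); it is an assumption that the paper's metric is similar, and the accuracy numbers are made up for illustration.

```python
from typing import List

# acc[t][j] = accuracy on task j measured after training on task t (j <= t).
AccuracyTable = List[List[float]]


def average_accuracy(acc: AccuracyTable) -> float:
    """Mean accuracy over all tasks after the final training stage."""
    final = acc[-1]
    return sum(final) / len(final)


def average_forgetting(acc: AccuracyTable) -> float:
    """Mean drop from each old task's best earlier accuracy to its final accuracy."""
    last = len(acc) - 1
    if last == 0:
        return 0.0  # a single task cannot be forgotten
    drops = []
    for j in range(last):
        best_earlier = max(acc[t][j] for t in range(j, last))
        drops.append(best_earlier - acc[last][j])
    return sum(drops) / len(drops)


# Made-up accuracies for three sequential tasks.
acc = [
    [0.90],
    [0.70, 0.88],
    [0.60, 0.75, 0.85],
]
final_accuracy = average_accuracy(acc)
forgetting = average_forgetting(acc)
```

A method with good stability keeps the forgetting value low, while good plasticity shows up as high accuracy on the most recent task.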

Critical Analysis

The paper addresses an important challenge, catastrophic forgetting in federated class-incremental learning. The proposed FedGTG approach is well-designed and empirically validated on several datasets.

One potential limitation is that the method relies on the ability to generate high-quality synthetic samples of past classes, which can be challenging in practice. The authors acknowledge this and suggest further research into more advanced generative models to improve sample quality.

Additionally, the privacy implications of the generators deserve closer examination. The generators are trained on the server without accessing client data, but synthetic samples that resemble past classes could still raise privacy concerns in certain applications. <a href="https://aimodels.fyi/papers/arxiv/fedprok-trustworthy-federated-class-incremental-learning-via">Further research into privacy-preserving federated generative models</a> could be valuable.

Overall, the paper presents a promising solution to a critical problem in federated class-incremental learning, and FedGTG could matter for real-world applications that require models to keep expanding their capabilities without forgetting past knowledge.

Conclusion

This paper introduces Federated Global Twin Generator (FedGTG) to address catastrophic forgetting in federated class-incremental learning. The key idea is to have server-side generators create synthetic samples of past classes, which then guide the updates of the main classification model so that it retains previous knowledge.

The experiments show FedGTG improving accuracy and forgetting measures over previous frameworks on CIFAR-10, CIFAR-100, and tiny-ImageNet. This is a useful advance for federated learning, since it lets models keep adding classes while losing less of what they learned earlier, and it points toward more robust and adaptable models in real-world applications.

Limitations and open questions remain, such as improving the quality of the generated samples and addressing potential privacy concerns. Even so, FedGTG shows how federated learning and generative modeling can be combined to tackle one of the key challenges in class-incremental learning.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Overcoming Catastrophic Forgetting in Federated Class-Incremental Learning via Federated Global Twin Generator

Thinh Nguyen, Khoa D Doan, Binh T. Nguyen, Danh Le-Phuoc, Kok-Seng Wong

Federated Class-Incremental Learning (FCIL) increasingly becomes important in the decentralized setting, where it enables multiple participants to collaboratively train a global model to perform well on a sequence of tasks without sharing their private data. In FCIL, conventional Federated Learning algorithms such as FedAVG often suffer from catastrophic forgetting, resulting in significant performance declines on earlier tasks. Recent works, based on generative models, produce synthetic images to help mitigate this issue across all classes, but these approaches' testing accuracy on previous classes is still much lower than recent classes, i.e., having better plasticity than stability. To overcome these issues, this paper presents Federated Global Twin Generator (FedGTG), an FCIL framework that exploits privacy-preserving generative-model training on the global side without accessing client data. Specifically, the server trains a data generator and a feature generator to create two types of information from all seen classes, and then it sends the synthetic data to the client side. The clients then use feature-direction-controlling losses to make the local models retain knowledge and learn new tasks well. We extensively analyze the robustness of FedGTG on natural images, as well as its ability to converge to flat local minima and achieve better-predicting confidence (calibration). Experimental results on CIFAR-10, CIFAR-100, and tiny-ImageNet demonstrate the improvements in accuracy and forgetting measures of FedGTG compared to previous frameworks.

7/17/2024

Data-Free Federated Class Incremental Learning with Diffusion-Based Generative Memory

Naibo Wang, Yuchen Deng, Wenjie Feng, Jianwei Yin, See-Kiong Ng

Federated Class Incremental Learning (FCIL) is a critical yet largely underexplored issue that deals with the dynamic incorporation of new classes within federated learning (FL). Existing methods often employ generative adversarial networks (GANs) to produce synthetic images to address privacy concerns in FL. However, GANs exhibit inherent instability and high sensitivity, compromising the effectiveness of these methods. In this paper, we introduce a novel data-free federated class incremental learning framework with diffusion-based generative memory (DFedDGM) to mitigate catastrophic forgetting by generating stable, high-quality images through diffusion models. We design a new balanced sampler to help train the diffusion models to alleviate the common non-IID problem in FL, and introduce an entropy-based sample filtering technique from an information theory perspective to enhance the quality of generative samples. Finally, we integrate knowledge distillation with a feature-based regularization term for better knowledge transfer. Our framework does not incur additional communication costs compared to the baseline FedAvg method. Extensive experiments across multiple datasets demonstrate that our method significantly outperforms existing baselines, e.g., over a 4% improvement in average accuracy on the Tiny-ImageNet dataset.

5/29/2024

Reducing Bias in Federated Class-Incremental Learning with Hierarchical Generative Prototypes

Riccardo Salami, Pietro Buzzega, Matteo Mosconi, Mattia Verasani, Simone Calderara

Federated Learning (FL) aims at unburdening the training of deep models by distributing computation across multiple devices (clients) while safeguarding data privacy. On top of that, Federated Continual Learning (FCL) also accounts for data distribution evolving over time, mirroring the dynamic nature of real-world environments. In this work, we shed light on the Incremental and Federated biases that naturally emerge in FCL. While the former is a known problem in Continual Learning, stemming from the prioritization of recently introduced classes, the latter (i.e., the bias towards local distributions) remains relatively unexplored. Our proposal constrains both biases in the last layer by efficiently fine-tuning a pre-trained backbone using learnable prompts, resulting in clients that produce less biased representations and more biased classifiers. Therefore, instead of solely relying on parameter aggregation, we also leverage generative prototypes to effectively balance the predictions of the global model. Our method improves on the current State Of The Art, providing an average increase of +7.9% in accuracy.

6/5/2024

Federated Continual Learning Goes Online: Leveraging Uncertainty for Modality-Agnostic Class-Incremental Learning

Giuseppe Serra, Florian Buettner

Given the ability to model more realistic and dynamic problems, Federated Continual Learning (FCL) has been increasingly investigated recently. A well-known problem encountered in this setting is the so-called catastrophic forgetting, for which the learning model is inclined to focus on more recent tasks while forgetting the previously learned knowledge. The majority of the current approaches in FCL propose generative-based solutions to solve said problem. However, this setting requires multiple training epochs over the data, implying an offline setting where datasets are stored locally and remain unchanged over time. Furthermore, the proposed solutions are tailored for vision tasks solely. To overcome these limitations, we propose a new modality-agnostic approach to deal with the online scenario where new data arrive in streams of mini-batches that can only be processed once. To solve catastrophic forgetting, we propose an uncertainty-aware memory-based approach. In particular, we suggest using an estimator based on the Bregman Information (BI) to compute the model's variance at the sample level. Through measures of predictive uncertainty, we retrieve samples with specific characteristics, and - by retraining the model on such samples - we demonstrate the potential of this approach to reduce the forgetting effect in realistic settings.

7/4/2024