ISFL: Federated Learning for Non-i.i.d. Data with Local Importance Sampling

Read original: arXiv:2210.02119 - Published 5/14/2024 by Zheqi Zhu, Yuchen Shi, Pingyi Fan, Chenghui Peng, Khaled B. Letaief

Overview

Federated learning (FL) is a promising learning paradigm that combines computation and communication, where local training is performed on distributed clients, and the results are periodically shared.
Because the data held by clients is typically non-i.i.d. (not independent and identically distributed), FL models can suffer from gradient diversity across clients, degraded performance, and poor convergence.
This work aims to address this key issue by adopting importance sampling (IS) for local training, proposing an importance sampling federated learning (ISFL) framework with theoretical guarantees.
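To make the non-i.i.d. setting concrete, the following is a minimal sketch in plain Python of a label-skew split, where each client only sees a few classes. The function name `label_skew_partition`, its parameters, and the toy labels are illustrative assumptions; this summary does not say how the paper partitions CIFAR-10.

```python
import random
from collections import Counter

def label_skew_partition(labels, num_clients, classes_per_client, seed=0):
    """Assign sample indices to clients so each client sees only a few classes."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    by_class = {c: [i for i, y in enumerate(labels) if y == c] for c in classes}
    for idxs in by_class.values():
        rng.shuffle(idxs)
    # Client k gets a block of consecutive classes (wrapping around if needed).
    client_classes = [
        [classes[(k * classes_per_client + j) % len(classes)]
         for j in range(classes_per_client)]
        for k in range(num_clients)
    ]
    # Count how many clients hold each class so that class is split evenly.
    holders = Counter(c for cs in client_classes for c in cs)
    cursor = {c: 0 for c in classes}
    parts = []
    for cs in client_classes:
        part = []
        for c in cs:
            share = len(by_class[c]) // holders[c]
            part.extend(by_class[c][cursor[c]:cursor[c] + share])
            cursor[c] += share
        parts.append(part)
    return parts

labels = [i % 10 for i in range(1000)]  # toy labels: 10 balanced classes
parts = label_skew_partition(labels, num_clients=5, classes_per_client=2)
histograms = [Counter(labels[i] for i in part) for part in parts]
```

In this toy split every client holds exactly two of the ten classes, so no single client's data resembles the global label distribution. That mismatch is the kind of heterogeneity the overview refers to.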

Plain English Explanation

Federated learning is a way of training machine learning models that involves multiple devices or clients, like smartphones or tablets, working together. Instead of sending all the data to a central server, each client trains a model on their local data and then shares the results with the server. This allows the model to be trained without the clients having to share their private data.
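The train-locally-then-share loop described above can be sketched in a few lines of plain Python. This is a toy version of size-weighted model averaging (often called FedAvg) on a one-parameter linear model; the choice of FedAvg, the function names, the learning rate, and the data are assumptions for illustration, not details taken from the paper.

```python
def local_train(w, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one client's (x, y) pairs."""
    for _ in range(epochs):
        # Gradient of the mean squared error of the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def fedavg_round(w_global, client_datasets):
    """One communication round: local training, then size-weighted averaging."""
    total = sum(len(d) for d in client_datasets)
    local_ws = [local_train(w_global, d) for d in client_datasets]
    return sum(len(d) * w for d, w in zip(client_datasets, local_ws)) / total

# Toy data: every client observes y = 3x, but over a different range of x.
clients = [
    [(x / 10, 3 * x / 10) for x in range(1, 6)],
    [(x / 10, 3 * x / 10) for x in range(6, 11)],
]
w = 0.0
for _ in range(50):
    w = fedavg_round(w, clients)
```

Only the scalar `w` travels between the clients and the server here; the `(x, y)` pairs never leave the client that owns them.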

However, the data on different clients may follow different distributions, which can cause problems for the model's performance and training. The research paper proposes a solution to this issue by using an "importance sampling" technique during the local training on each client. Instead of drawing training examples uniformly at random, the client draws the data points that are more important or influential for the model's training with higher probability.
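As a generic illustration of importance sampling in gradient-based training, the sketch below draws a mini-batch with non-uniform probabilities and reweights each draw so that the gradient estimate stays unbiased. It is the textbook technique, not the ISFL weighting rule: the per-sample gradients, the probabilities proportional to gradient magnitude, and the helper name `is_gradient_estimate` are all assumptions, and this summary does not state whether or how ISFL reweights its samples.

```python
import random

def is_gradient_estimate(grads, probs, batch_size, rng):
    """Estimate the mean gradient from a batch drawn with probabilities `probs`."""
    n = len(grads)
    idxs = rng.choices(range(n), weights=probs, k=batch_size)
    # Dividing by n * p_i undoes the sampling bias, keeping the estimate unbiased.
    return sum(grads[i] / (n * probs[i]) for i in idxs) / batch_size

grads = [0.1, 0.2, 0.1, 4.0]             # hypothetical per-sample gradients
total = sum(abs(g) for g in grads)
probs = [abs(g) / total for g in grads]  # sample large gradients more often

true_mean = sum(grads) / len(grads)
# Exact expectation of a single reweighted draw, computed without sampling.
expected = sum(p * g / (len(grads) * p) for g, p in zip(grads, probs))
estimate = is_gradient_estimate(grads, probs, batch_size=8, rng=random.Random(0))
```

Because the toy gradients are positive scalars and the probabilities are proportional to them, every reweighted draw equals the true mean, so the estimate has zero variance. Real gradients are vectors and such ideal probabilities are not available in advance, which is why a method for choosing the weights is needed.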

The paper provides a mathematical framework to explain how this importance sampling approach, called ISFL, can improve the performance and efficiency of federated learning, especially when the data on the clients is very different. The researchers also develop algorithms to calculate the optimal importance weights for the local training.

Technical Explanation

The paper first derives a convergence theorem for ISFL that accounts for the effects of local importance sampling. It then formulates the selection of optimal importance sampling (IS) weights as an optimization problem and obtains theoretical solutions. Finally, it employs a water-filling method to calculate the IS weights and develops the ISFL algorithms around it.
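For readers unfamiliar with water-filling, here is a minimal sketch of the textbook form of the method: given per-item "floor" levels and a fixed budget, find a common water level so that the amounts above each floor add up to the budget. The floors, the budget, and the bisection solver are assumptions for illustration. The paper's actual objective and the quantities that play the role of floors are not given in this summary, so this should not be read as the ISFL weight computation.

```python
def water_filling(floors, budget, iters=100):
    """Return (alloc, mu) with alloc[i] = max(mu - floors[i], 0) and sum(alloc) == budget."""
    # The water level mu lies between the lowest floor and the highest floor plus budget.
    lo, hi = min(floors), max(floors) + budget
    for _ in range(iters):
        mu = (lo + hi) / 2
        used = sum(max(mu - f, 0.0) for f in floors)
        if used > budget:
            hi = mu   # level too high: the allocation overspends the budget
        else:
            lo = mu   # level too low (or exact): raise it
    mu = (lo + hi) / 2
    return [max(mu - f, 0.0) for f in floors], mu

# Hypothetical floors; a budget of 1.0 makes the allocation sum to one,
# as a set of sampling probabilities would.
alloc, level = water_filling([0.1, 0.5, 2.0], budget=1.0)
```

In this example the water level settles at 0.8, so the first two items receive 0.7 and 0.3 while the third, whose floor is above the water level, receives nothing. That "some entries get zero" behavior is characteristic of water-filling solutions.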

The experimental results on the CIFAR-10 dataset fit the proposed theorems well and show that ISFL improves on standard federated learning in model performance, sampling efficiency, and explainability on non-i.i.d. data. The authors claim that, to the best of their knowledge, ISFL is the first non-i.i.d. FL solution from the local sampling aspect that exhibits theoretical compatibility with neural network models.

Furthermore, as a local sampling approach, ISFL can be easily integrated into other emerging federated learning frameworks, such as multi-confederated learning, adaptive heterogeneous client sampling, and robust model aggregation.

Critical Analysis

The paper provides a solid theoretical foundation for the ISFL framework and demonstrates its empirical performance improvements. However, the authors do not discuss the potential computational overhead or the impact of the importance sampling on the overall communication cost of the federated learning process.

Additionally, the paper does not explore the robustness of the ISFL approach to various types of non-i.i.d. data distributions or the potential for the importance sampling to introduce bias in the model training.

Further research could investigate the scalability of ISFL to larger and more complex datasets, as well as its integration with other federated learning techniques, such as adaptive client sampling or multi-confederated learning.

Conclusion

The proposed importance sampling federated learning (ISFL) framework offers a promising solution to the non-i.i.d. data distribution problem in federated learning. By leveraging importance sampling during local training, ISFL can improve the performance, sampling efficiency, and explainability of federated learning models, especially in scenarios with heterogeneous client data.

The theoretical guarantees and experimental results presented in the paper suggest that ISFL could be a valuable contribution to the ongoing research in federated learning. As a local sampling approach, ISFL could also be easily integrated into other emerging federated learning techniques, further expanding its potential impact on the field.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

ISFL: Federated Learning for Non-i.i.d. Data with Local Importance Sampling

Zheqi Zhu, Yuchen Shi, Pingyi Fan, Chenghui Peng, Khaled B. Letaief

As a promising learning paradigm integrating computation and communication, federated learning (FL) proceeds the local training and the periodic sharing from distributed clients. Due to the non-i.i.d. data distribution on clients, FL model suffers from the gradient diversity, poor performance, bad convergence, etc. In this work, we aim to tackle this key issue by adopting importance sampling (IS) for local training. We propose importance sampling federated learning (ISFL), an explicit framework with theoretical guarantees. Firstly, we derive the convergence theorem of ISFL to involve the effects of local importance sampling. Then, we formulate the problem of selecting optimal IS weights and obtain the theoretical solutions. We also employ a water-filling method to calculate the IS weights and develop the ISFL algorithms. The experimental results on CIFAR-10 fit the proposed theorems well and verify that ISFL reaps better performance, sampling efficiency, as well as explainability on non-i.i.d. data. To the best of our knowledge, ISFL is the first non-i.i.d. FL solution from the local sampling aspect which exhibits theoretical compatibility with neural network models. Furthermore, as a local sampling approach, ISFL can be easily migrated into other emerging FL frameworks.

5/14/2024

One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity

Naibo Wang, Yuchen Deng, Wenjie Feng, Shichen Fan, Jianwei Yin, See-Kiong Ng

Traditional federated learning mainly focuses on parallel settings (PFL), which can suffer significant communication and computation costs. In contrast, one-shot and sequential federated learning (SFL) have emerged as innovative paradigms to alleviate these costs. However, the issue of non-IID (Independent and Identically Distributed) data persists as a significant challenge in one-shot and SFL settings, exacerbated by the restricted communication between clients. In this paper, we improve the one-shot sequential federated learning for non-IID data by proposing a local model diversity-enhancing strategy. Specifically, to leverage the potential of local model diversity for improving model performance, we introduce a local model pool for each client that comprises diverse models generated during local training, and propose two distance measurements to further enhance the model diversity and mitigate the effect of non-IID data. Consequently, our proposed framework can improve the global model performance while maintaining low communication costs. Extensive experiments demonstrate that our method exhibits superior performance to existing one-shot PFL methods and achieves better accuracy compared with state-of-the-art one-shot SFL methods on both label-skew and domain-shift tasks (e.g., 6%+ accuracy improvement on the CIFAR-10 dataset).

4/19/2024

Adaptive Federated Learning in Heterogeneous Wireless Networks with Independent Sampling

Jiaxiang Geng, Yanzhao Hou, Xiaofeng Tao, Juncheng Wang, Bing Luo

Federated Learning (FL) algorithms commonly sample a random subset of clients to address the straggler issue and improve communication efficiency. While recent works have proposed various client sampling methods, they have limitations in joint system and data heterogeneity design, which may not align with practical heterogeneous wireless networks. In this work, we advocate a new independent client sampling strategy to minimize the wall-clock training time of FL, while considering data heterogeneity and system heterogeneity in both communication and computation. We first derive a new convergence bound for non-convex loss functions with independent client sampling and then propose an adaptive bandwidth allocation scheme. Furthermore, we propose an efficient independent client sampling algorithm based on the upper bounds on the convergence rounds and the expected per-round training time, to minimize the wall-clock time of FL, while considering both the data and system heterogeneity. Experimental results under practical wireless network settings with real-world prototype demonstrate that the proposed independent sampling scheme substantially outperforms the current best sampling schemes under various training models and datasets.

5/15/2024

SemiSFL: Split Federated Learning on Unlabeled and Non-IID Data

Yang Xu, Yunming Liao, Hongli Xu, Zhipeng Sun, Liusheng Huang, Chunming Qiao

Federated Learning (FL) has emerged to allow multiple clients to collaboratively train machine learning models on their private data at the network edge. However, training and deploying large-scale models on resource-constrained devices is challenging. Fortunately, Split Federated Learning (SFL) offers a feasible solution by alleviating the computation and/or communication burden on clients. However, existing SFL works often assume sufficient labeled data on clients, which is usually impractical. Besides, data non-IIDness poses another challenge to ensure efficient model training. To our best knowledge, the above two issues have not been simultaneously addressed in SFL. Herein, we propose a novel Semi-supervised SFL system, termed SemiSFL, which incorporates clustering regularization to perform SFL with unlabeled and non-IID client data. Moreover, our theoretical and experimental investigations into model convergence reveal that the inconsistent training processes on labeled and unlabeled data have an influence on the effectiveness of clustering regularization. To mitigate the training inconsistency, we develop an algorithm for dynamically adjusting the global updating frequency, so as to improve training performance. Extensive experiments on benchmark models and datasets show that our system provides a 3.8x speed-up in training time, reduces the communication cost by about 70.3% while reaching the target accuracy, and achieves up to 5.8% improvement in accuracy under non-IID scenarios compared to the state-of-the-art baselines.

8/6/2024