SemiSFL: Split Federated Learning on Unlabeled and Non-IID Data

Read original: arXiv:2307.15870 - Published 8/6/2024 by Yang Xu, Yunming Liao, Hongli Xu, Zhipeng Sun, Liusheng Huang, Chunming Qiao

Overview

Federated Learning (FL) allows multiple clients to collaboratively train machine learning models on their private data at the network edge.
Training and deploying large-scale models on resource-constrained devices is challenging.
Split Federated Learning (SFL) alleviates the computation and/or communication burden on clients.
Existing SFL works often assume sufficient labeled data on clients, which is usually impractical.
Non-IID data (client data that is not independent and identically distributed) poses another challenge to efficient model training.
The issues of insufficient labeled data and non-IID data have not been simultaneously addressed in SFL.

Plain English Explanation

Federated Learning (FL) is a way for multiple devices, like smartphones or sensors, to work together to train a machine learning model without sharing their private data. This is useful because it allows the model to be trained on a wider range of data without compromising the privacy of the users.

However, training large, complex models on these resource-constrained devices can be difficult. Split Federated Learning (SFL) helps by dividing the model between the device and a central server, reducing the computational and communication burden on the individual devices.
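As a rough illustration of the split described above, the sketch below divides a tiny two-layer network into a client part and a server part. The function names, layer sizes, and random weights are made up for this example and are not taken from the paper. It only shows that the raw sample stays on the device while the intermediate activations are what gets uploaded.

```python
import random

def client_forward(x, w_bottom):
    """Bottom layers, run on the device: a linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_bottom]

def server_forward(features, w_top):
    """Top layers, run on the server: a linear layer producing class logits."""
    return [sum(w * f for w, f in zip(row, features)) for row in w_top]

random.seed(0)
# Hypothetical sizes: 4 inputs -> 3 intermediate features -> 2 class logits.
w_bottom = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
w_top = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

x = [0.5, -1.0, 2.0, 0.1]                # private sample, stays on the device
features = client_forward(x, w_bottom)   # only these activations are uploaded
logits = server_forward(features, w_top) # the server finishes the forward pass
print(len(features), len(logits))
```

During training the server would also send the gradient with respect to `features` back to the device, so the client can update its own layers without ever holding the full model.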

But existing SFL approaches often assume that the devices have enough labeled data to train the model, which is not always the case. They also struggle when the data on different devices follows different distributions (non-IID data), for instance when each device mostly holds examples of only a few classes. These two issues, insufficient labeled data and non-IID data, have not been addressed together in SFL before.
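To make the non-IID idea concrete, here is a minimal sketch that splits a labeled dataset across clients with label skew. It uses a Dirichlet-based partition, which is a common way to simulate non-IID clients in federated learning experiments. Whether the paper uses exactly this scheme is an assumption, and the function name and parameters are hypothetical.

```python
import random
from collections import Counter

def dirichlet_label_split(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients with label skew controlled by alpha.

    For each class, client shares are drawn from a symmetric Dirichlet(alpha);
    a small alpha concentrates a class on few clients (strongly non-IID).
    """
    rng = random.Random(seed)
    clients = [[] for _ in range(num_clients)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for y, idxs in sorted(by_class.items()):
        rng.shuffle(idxs)
        # A Dirichlet sample is a set of normalized independent Gamma draws.
        g = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(g)
        start = 0
        for c in range(num_clients):
            # The last client takes the remainder so each index is used once.
            if c == num_clients - 1:
                end = len(idxs)
            else:
                end = start + int(len(idxs) * g[c] / total)
            clients[c].extend(idxs[start:end])
            start = end
    return clients

labels = [i % 4 for i in range(400)]  # toy dataset: 4 classes, 100 samples each
skewed = dirichlet_label_split(labels, num_clients=5, alpha=0.3)
for c, idxs in enumerate(skewed):
    print(c, sorted(Counter(labels[i] for i in idxs).items()))
```

Raising `alpha` (for example to 100) makes the per-client class counts close to uniform, which approximates the IID case.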

Technical Explanation

To address these challenges, the researchers propose a novel system called SemiSFL, which incorporates clustering regularization to perform SFL with unlabeled and non-IID client data. This means that the system can still train the model effectively even when the clients have little or no labeled data and the data distribution differs from device to device.
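One generic way to picture a clustering regularizer is as a penalty that pulls the features of unlabeled samples toward the nearest class centroid, while skipping samples whose assignment is ambiguous. The sketch below follows that idea. It is not the formulation used in the paper, and the function names, the `margin` parameter, and the toy data are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def clustering_regularizer(features, centroids, margin=0.5):
    """Mean squared distance of confidently assigned features to their nearest centroid.

    A feature counts as confident when its nearest centroid is closer than the
    runner-up by at least `margin` (in squared distance); others are skipped.
    """
    total, used = 0.0, 0
    for f in features:
        d = sorted(sq_dist(f, c) for c in centroids)
        if len(d) > 1 and d[1] - d[0] < margin:
            continue  # ambiguous sample: do not force it into a cluster
        total += d[0]
        used += 1
    return total / used if used else 0.0

centroids = [[0.0, 0.0], [4.0, 4.0]]                 # e.g. estimated from labeled data
unlabeled = [[0.1, 0.2], [3.9, 4.1], [2.0, 2.0]]     # the last point is equidistant
reg = clustering_regularizer(unlabeled, centroids)
print(reg)
```

In a training loop, a term like this would typically be added to the supervised loss with a weighting factor, so that unlabeled client data shapes the feature space without needing labels.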

Furthermore, the researchers found that the inconsistent training processes on labeled and unlabeled data can impact the effectiveness of the clustering regularization. To mitigate this, they developed an algorithm to dynamically adjust the global updating frequency, which improves the overall training performance.
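The summary does not spell out the adjustment rule, so the following is only a hypothetical feedback schedule, not the paper's algorithm. It assumes the "global updating frequency" can be treated as an integer count of global updates per round, and that the supervised and unsupervised losses are a usable signal of how far the two training processes have drifted apart. The function name, thresholds, and bounds are invented for this sketch.

```python
def adjust_update_frequency(freq, sup_loss, unsup_loss, lo=1, hi=64, tol=0.1):
    """Return a new number of global updates per round (hypothetical rule).

    If the supervised loss exceeds the unsupervised loss by more than `tol`
    (relative), double the frequency; if it is lower by more than `tol`,
    halve it; otherwise keep the current value. The result is clamped
    to the range [lo, hi].
    """
    gap = (sup_loss - unsup_loss) / max(unsup_loss, 1e-12)
    if gap > tol:
        freq *= 2
    elif gap < -tol:
        freq //= 2
    return max(lo, min(hi, freq))

freq = 8
for sup, unsup in [(1.0, 0.5), (0.9, 0.9), (0.3, 0.8)]:
    freq = adjust_update_frequency(freq, sup, unsup)
    print(freq)
```

The point of the sketch is the control loop itself: measure the mismatch between the two training processes each round and nudge the update frequency accordingly, rather than fixing it in advance.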

The researchers conducted extensive experiments on benchmark models and datasets, and found that their SemiSFL system provides a 3.8x speed-up in training time, reduces the communication cost by about 70.3% while reaching the target accuracy, and achieves up to 5.8% improvement in accuracy under non-IID scenarios compared to other state-of-the-art approaches.
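To put the reported figures in more familiar terms, the short calculation below converts the speed-up and the communication reduction into fractions of the baseline. Only the two reported numbers come from the summary; the variable names are illustrative.

```python
speedup = 3.8            # reported training-time speed-up
comm_reduction = 0.703   # reported reduction in communication cost

time_fraction = 1 / speedup          # share of baseline training time still needed
comm_fraction = 1 - comm_reduction   # share of baseline traffic still sent

print(f"training time: {time_fraction:.1%} of baseline")  # training time: 26.3% of baseline
print(f"communication: {comm_fraction:.1%} of baseline")  # communication: 29.7% of baseline
```

In other words, reaching the target accuracy takes roughly a quarter of the baseline training time and a little under a third of the baseline traffic.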

Critical Analysis

The paper addresses important practical challenges in deploying large-scale machine learning models on resource-constrained devices, such as smartphones or sensors, using Federated Learning. The proposed SemiSFL system is a promising solution that can effectively handle the issues of insufficient labeled data and non-IID data distribution, which are common in real-world scenarios.

However, the paper does not discuss the privacy implications of the clustering regularization technique, which could reveal information about the underlying data distribution. Additionally, the experiments were conducted on relatively simple benchmark datasets, and it would be valuable to see how the system performs on more complex, real-world datasets.

Furthermore, the paper could have explored the trade-offs between the various optimization objectives, such as training time, communication cost, and model accuracy, and how these might be balanced in different application domains.

Conclusion

The SemiSFL system proposed in this paper represents a significant advance in the field of Federated Learning, addressing key challenges that have hindered the widespread adoption of large-scale machine learning models on resource-constrained devices. By incorporating clustering regularization and dynamic global update frequency adjustment, the system can effectively train models with limited labeled data and non-IID distributions, while also reducing the computational and communication burden on the clients.

The promising results demonstrated in the experiments suggest that SemiSFL could have a transformative impact on a wide range of applications, from smart home devices to industrial IoT, where privacy-preserving and efficient machine learning at the edge is crucial. As the researchers continue to refine and expand the system, it will be exciting to see how it evolves and the new possibilities it unlocks for the future of decentralized, edge-based machine learning.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

SemiSFL: Split Federated Learning on Unlabeled and Non-IID Data

Yang Xu, Yunming Liao, Hongli Xu, Zhipeng Sun, Liusheng Huang, Chunming Qiao

Federated Learning (FL) has emerged to allow multiple clients to collaboratively train machine learning models on their private data at the network edge. However, training and deploying large-scale models on resource-constrained devices is challenging. Fortunately, Split Federated Learning (SFL) offers a feasible solution by alleviating the computation and/or communication burden on clients. However, existing SFL works often assume sufficient labeled data on clients, which is usually impractical. Besides, data non-IIDness poses another challenge to ensure efficient model training. To our best knowledge, the above two issues have not been simultaneously addressed in SFL. Herein, we propose a novel Semi-supervised SFL system, termed SemiSFL, which incorporates clustering regularization to perform SFL with unlabeled and non-IID client data. Moreover, our theoretical and experimental investigations into model convergence reveal that the inconsistent training processes on labeled and unlabeled data have an influence on the effectiveness of clustering regularization. To mitigate the training inconsistency, we develop an algorithm for dynamically adjusting the global updating frequency, so as to improve training performance. Extensive experiments on benchmark models and datasets show that our system provides a 3.8x speed-up in training time, reduces the communication cost by about 70.3% while reaching the target accuracy, and achieves up to 5.8% improvement in accuracy under non-IID scenarios compared to the state-of-the-art baselines.

8/6/2024

One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity

Naibo Wang, Yuchen Deng, Wenjie Feng, Shichen Fan, Jianwei Yin, See-Kiong Ng

Traditional federated learning mainly focuses on parallel settings (PFL), which can suffer significant communication and computation costs. In contrast, one-shot and sequential federated learning (SFL) have emerged as innovative paradigms to alleviate these costs. However, the issue of non-IID (non-independent and identically distributed) data persists as a significant challenge in one-shot and SFL settings, exacerbated by the restricted communication between clients. In this paper, we improve the one-shot sequential federated learning for non-IID data by proposing a local model diversity-enhancing strategy. Specifically, to leverage the potential of local model diversity for improving model performance, we introduce a local model pool for each client that comprises diverse models generated during local training, and propose two distance measurements to further enhance the model diversity and mitigate the effect of non-IID data. Consequently, our proposed framework can improve the global model performance while maintaining low communication costs. Extensive experiments demonstrate that our method exhibits superior performance to existing one-shot PFL methods and achieves better accuracy compared with state-of-the-art one-shot SFL methods on both label-skew and domain-shift tasks (e.g., 6%+ accuracy improvement on the CIFAR-10 dataset).

4/19/2024

Exploring One-shot Semi-supervised Federated Learning with A Pre-trained Diffusion Model

Mingzhao Yang, Shangchao Su, Bin Li, Xiangyang Xue

Recently, semi-supervised federated learning (semi-FL) has been proposed to handle the commonly seen real-world scenarios with labeled data on the server and unlabeled data on the clients. However, existing methods face several challenges such as communication costs, data heterogeneity, and training pressure on client devices. To address these challenges, we introduce the powerful diffusion models (DM) into semi-FL and propose FedDISC, a Federated Diffusion-Inspired Semi-supervised Co-training method. Specifically, we first extract prototypes of the labeled server data and use these prototypes to predict pseudo-labels of the client data. For each category, we compute the cluster centroids and domain-specific representations to signify the semantic and stylistic information of their distributions. After adding noise, these representations are sent back to the server, which uses the pre-trained DM to generate synthetic datasets complying with the client distributions and train a global model on it. With the assistance of vast knowledge within DM, the synthetic datasets have comparable quality and diversity to the client images, subsequently enabling the training of global models that achieve performance equivalent to or even surpassing the ceiling of supervised centralized training. FedDISC works within one communication round, does not require any local training, and involves very minimal information uploading, greatly enhancing its practicality. Extensive experiments on three large-scale datasets demonstrate that FedDISC effectively addresses the semi-FL problem on non-IID clients and outperforms the compared SOTA methods. Sufficient visualization experiments also illustrate that the synthetic dataset generated by FedDISC exhibits comparable diversity and quality to the original client dataset, with a negligible possibility of leaking privacy-sensitive information of the clients.

6/13/2024

Have Your Cake and Eat It Too: Toward Efficient and Accurate Split Federated Learning

Dengke Yan, Ming Hu, Zeke Xia, Yanxin Yang, Jun Xia, Xiaofei Xie, Mingsong Chen

Due to its advantages in resource-constrained scenarios, Split Federated Learning (SFL) is promising in AIoT systems. However, due to data heterogeneity and stragglers, SFL suffers from the challenges of low inference accuracy and low efficiency. To address these issues, this paper presents a novel SFL approach, named Sliding Split Federated Learning (S$^2$FL), which adopts an adaptive sliding model split strategy and a data balance-based training mechanism. By dynamically dispatching different model portions to AIoT devices according to their computing capability, S$^2$FL can alleviate the low training efficiency caused by stragglers. By combining features uploaded by devices with different data distributions to generate multiple larger batches with a uniform distribution for back-propagation, S$^2$FL can alleviate the performance degradation caused by data heterogeneity. Experimental results demonstrate that, compared to conventional SFL, S$^2$FL can achieve up to 16.5% inference accuracy improvement and 3.54X training acceleration.

4/9/2024