Secure Aggregation is Not Private Against Membership Inference Attacks

Read original: arXiv:2403.17775 - Published 7/16/2024 by Khac-Hoang Ngo, Johan Ostman, Giuseppe Durisi, Alexandre Graell i Amat

Overview

The paper examines the privacy implications of a privacy-enhancing mechanism called Secure Aggregation (SecAgg) used in federated learning.
Federated learning is a machine learning technique where multiple devices or clients collaborate to train a shared model without sharing their individual data.
SecAgg is meant to protect the confidentiality of individual model updates by allowing the server to access only the aggregate of the updates, not the individual updates.
However, the paper argues that SecAgg's privacy-preserving capabilities have not been formally analyzed, casting doubt on the claims made about its privacy guarantees.

Plain English Explanation

The paper looks at a privacy feature called Secure Aggregation (SecAgg) that is commonly used in federated learning. Federated learning is a way for multiple devices or computers to work together to train a shared machine learning model without each device having to share its private data.

SecAgg is supposed to protect the privacy of the individual model updates that the devices send to the central server. Instead of the server seeing each device's individual update, SecAgg allows the server to only see the combined or "aggregated" update from all the devices. This is meant to keep the individual updates confidential.
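To make the idea concrete, here is a minimal sketch of one common way to get this behaviour, pairwise additive masking: each pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel only in the sum. The paper summarized here does not specify this construction. The modulus, the function names, and the toy integer updates are all assumptions for illustration, and a real protocol would derive the masks from key agreement and handle client dropouts.

```python
import random

MODULUS = 2**32  # hypothetical ring size; all arithmetic is done modulo this value


def pairwise_masks(num_clients, dim, seed=0):
    """Build one mask per client such that all masks sum to zero (mod MODULUS)."""
    rng = random.Random(seed)
    masks = [[0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            # Clients i and j share this vector: i adds it, j subtracts it.
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def mask_update(update, mask):
    """What a client sends: its update plus its mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates):
    """What the server computes: the coordinate-wise sum of the masked updates."""
    dim = len(masked_updates[0])
    return [sum(m[k] for m in masked_updates) % MODULUS for k in range(dim)]


# Toy example: three clients, three-dimensional integer updates.
updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
masks = pairwise_masks(len(updates), dim=3)
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
total = aggregate(masked)
print(total)  # [111, 222, 333]
```

Each masked vector looks random on its own, but the server still learns the exact sum. That exact sum is the quantity whose privacy the paper questions.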

However, the paper argues that the privacy claims around SecAgg have not been formally analyzed. The authors design a simple attack in which the server tries to tell which of two candidate updates a particular device submitted, even with SecAgg in place. Their results show that SecAgg offers only weak protection against this attack, especially when the model updates are high-dimensional.

The paper suggests that additional privacy-enhancing techniques, like adding noise to the updates, may be necessary to truly protect user privacy in federated learning systems that use SecAgg.

Technical Explanation

The paper treats SecAgg as a local differential privacy (LDP) mechanism applied to each client's local update: the other clients' updates, which are summed with it, play the role of the noise that is supposed to hide it. The authors design a simple attack in which an adversarial server tries to determine which of two possible update vectors a client submitted in a single round of federated learning with SecAgg.
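For reference, the standard definition of local differential privacy can be written as below. The symbols (M for the mechanism, x and x' for two candidate updates, n for the number of clients) are notation chosen here for illustration and are not necessarily the paper's.

```latex
% (\varepsilon, \delta)-LDP: for every pair of inputs x, x' and every set S of outputs,
\Pr[M(x) \in S] \le e^{\varepsilon} \, \Pr[M(x') \in S] + \delta .

% SecAgg viewed as a mechanism acting on client 1's update x_1:
% the only "noise" is the sum of the other clients' updates.
M(x_1) = x_1 + \sum_{i=2}^{n} x_i .
```

A small ε means the output distributions for x and x' are nearly indistinguishable. The attack described above fixes one pair (x, x') and tests how well the server can tell them apart from the aggregate.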

Through privacy auditing, the authors assess the success probability of this attack and quantify the LDP guarantees provided by SecAgg. Their numerical results show that, contrary to common claims, SecAgg offers weak privacy protection against membership inference attacks even in a single training round.
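As a rough illustration of how an attack's error rates translate into a privacy statement, the sketch below uses the standard hypothesis-testing view of differential privacy: any (ε, δ)-DP mechanism forces FPR + e^ε · FNR ≥ 1 − δ and FNR + e^ε · FPR ≥ 1 − δ on every distinguishing test. Rearranging gives a lower bound on ε. The function name is hypothetical, and this is a plain point estimate. A proper audit, presumably including the paper's, would replace the empirical rates with confidence bounds.

```python
import math


def epsilon_lower_bound(fpr, fnr, delta=0.0):
    """Lower bound on epsilon implied by an attack's false positive/negative rates.

    Assumes the standard (epsilon, delta)-DP trade-off between the two error
    rates; uses raw rates, with no confidence intervals.
    """
    best = 0.0
    for a, b in ((fpr, fnr), (fnr, fpr)):
        numerator = 1.0 - delta - a
        if numerator <= 0:
            continue  # this inequality gives no information
        if b == 0:
            return math.inf  # a perfect attack rules out any finite epsilon
        best = max(best, math.log(numerator / b))
    return best


# An attack that guesses at random implies nothing (bound of 0),
# while an attack with 10% error in each direction implies epsilon >= ln(9).
print(epsilon_lower_bound(0.5, 0.5))
print(epsilon_lower_bound(0.1, 0.1))
```

The more accurate the attack, the larger the ε that the mechanism must have, and a large ε corresponds to a weak privacy guarantee.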

The authors explain that it is difficult to "hide" an individual update by adding it to other independent updates, especially when the updates are high-dimensional. This undermines SecAgg's ability to truly protect the confidentiality of the individual updates.
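The effect of dimension can be seen in a toy simulation. The sketch below makes strong simplifying assumptions that are not taken from the paper: the other clients' updates are i.i.d. standard Gaussian in every coordinate, and the two candidate updates are the all-zeros vector and a vector whose entries all equal `gap`. Under these assumptions, picking the candidate closest to the aggregate is the likelihood-ratio test, and its accuracy has a closed form. All function names and parameter values are hypothetical.

```python
import math
import random


def attack_accuracy(dim, num_clients, trials=300, gap=1.0, seed=0):
    """Empirical accuracy of a nearest-candidate attack on the aggregate."""
    rng = random.Random(seed)
    # The sum of (num_clients - 1) unit-variance updates has this std per coordinate.
    noise_std = math.sqrt(num_clients - 1)
    correct = 0
    for _ in range(trials):
        truth = rng.randrange(2)  # 0: client sent zeros, 1: client sent gap * ones
        score = 0.0
        for _ in range(dim):
            y = truth * gap + rng.gauss(0.0, noise_std)  # one aggregate coordinate
            score += y - gap / 2
        # score > 0 exactly when the aggregate is closer to gap * ones than to zeros.
        guess = 1 if score > 0 else 0
        correct += guess == truth
    return correct / trials


def theoretical_accuracy(dim, num_clients, gap=1.0):
    """Closed-form accuracy of the same test under the Gaussian assumption."""
    z = gap * math.sqrt(dim) / (2.0 * math.sqrt(num_clients - 1))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for dim in (1, 100, 2500):
    print(dim, attack_accuracy(dim, num_clients=100), theoretical_accuracy(dim, 100))
```

In this toy model the distance between the two candidates grows like the square root of the dimension, while the per-coordinate noise from the other clients stays fixed. The attack therefore moves from near coin-flipping in one dimension to near certainty in a few thousand dimensions, even with 100 clients.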

The paper's findings emphasize the need for additional privacy-enhancing mechanisms, such as injecting noise into the updates, to strengthen the privacy guarantees in federated learning systems that rely on SecAgg.
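One widely used form of noise injection is to clip each update to a bounded L2 norm and then add Gaussian noise before it enters the aggregation. The sketch below shows that generic recipe. It is not the paper's specific proposal, the function names and parameter values are hypothetical, and choosing `noise_std` to meet a target privacy level requires a separate calibration step that is not shown.

```python
import math
import random


def clip_l2(update, max_norm):
    """Scale the update down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [u * scale for u in update]


def privatize(update, max_norm, noise_std, rng):
    """Clip the update, then add independent Gaussian noise to each coordinate."""
    clipped = clip_l2(update, max_norm)
    return [u + rng.gauss(0.0, noise_std) for u in clipped]


rng = random.Random(0)
raw_update = [3.0, 4.0]                      # L2 norm 5
clipped = clip_l2(raw_update, max_norm=1.0)  # rescaled to norm 1
noisy = privatize(raw_update, max_norm=1.0, noise_std=0.5, rng=rng)
print(clipped, noisy)
```

Clipping bounds how far apart any two candidate updates can be, and the added noise does not depend on what the other clients send, so the protection no longer relies on their updates alone.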

Critical Analysis

The paper provides a rigorous and novel analysis of the privacy properties of the SecAgg mechanism, which is widely used in federated learning but has not been formally studied before.

One limitation of the attack model is that it assumes the adversarial server already knows two candidate update vectors for a client and only has to decide between them. This two-candidate setting is standard when auditing differential privacy, but a practical adversary may have to choose among a larger set of possible updates, and the analysis does not show how the attack's success changes in that case.

The paper also does not explore the implications of combining SecAgg with other privacy-preserving techniques, such as adding noise to the updates or using robust aggregation methods. These combined approaches may offer stronger privacy protections that are worth investigating.

Additionally, the authors only consider a single-round attack. Extending the analysis to multiple rounds of federated learning could provide a more comprehensive understanding of SecAgg's long-term privacy implications.

Overall, this paper makes an important contribution by highlighting the need for a more rigorous analysis of the privacy properties of SecAgg and other privacy-enhancing mechanisms used in federated learning. Its findings call for further research to develop more robust federated learning frameworks that can truly protect user privacy.

Conclusion

This paper challenges the widespread assumption that SecAgg, a commonly used privacy-enhancing mechanism in federated learning, can adequately protect the confidentiality of individual model updates. Through a formal privacy analysis, the authors demonstrate that SecAgg offers weak privacy guarantees, particularly against membership inference attacks.

The paper's key finding is that it is difficult to "hide" an individual high-dimensional update by adding it to other independent updates, undermining SecAgg's ability to preserve privacy. This underscores the need for additional privacy-preserving techniques, such as noise injection, to be incorporated into federated learning systems that rely on SecAgg.

The insights from this research matter for the development of federated learning frameworks that can truly safeguard user privacy. As the use of federated learning continues to grow, robust privacy protections will be essential for building trust in these distributed machine learning systems and encouraging their adoption.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Secure Aggregation is Not Private Against Membership Inference Attacks

Khac-Hoang Ngo, Johan Ostman, Giuseppe Durisi, Alexandre Graell i Amat

Secure aggregation (SecAgg) is a commonly-used privacy-enhancing mechanism in federated learning, affording the server access only to the aggregate of model updates while safeguarding the confidentiality of individual updates. Despite widespread claims regarding SecAgg's privacy-preserving capabilities, a formal analysis of its privacy is lacking, making such presumptions unjustified. In this paper, we delve into the privacy implications of SecAgg by treating it as a local differential privacy (LDP) mechanism for each local update. We design a simple attack wherein an adversarial server seeks to discern which update vector a client submitted, out of two possible ones, in a single training round of federated learning under SecAgg. By conducting privacy auditing, we assess the success probability of this attack and quantify the LDP guarantees provided by SecAgg. Our numerical results unveil that, contrary to prevailing claims, SecAgg offers weak privacy against membership inference attacks even in a single training round. Indeed, it is difficult to hide a local update by adding other independent local updates when the updates are of high dimension. Our findings underscore the imperative for additional privacy-enhancing mechanisms, such as noise injection, in federated learning.

7/16/2024

Secure Aggregation Meets Sparsification in Decentralized Learning

Sayan Biswas, Anne-Marie Kermarrec, Rafael Pires, Rishi Sharma, Milos Vujasinovic

Decentralized learning (DL) faces increased vulnerability to privacy breaches due to sophisticated attacks on machine learning (ML) models. Secure aggregation is a computationally efficient cryptographic technique that enables multiple parties to compute an aggregate of their private data while keeping their individual inputs concealed from each other and from any central aggregator. To enhance communication efficiency in DL, sparsification techniques are used, selectively sharing only the most crucial parameters or gradients in a model, thereby maintaining efficiency without notably compromising accuracy. However, applying secure aggregation to sparsified models in DL is challenging due to the transmission of disjoint parameter sets by distinct nodes, which can prevent masks from canceling out effectively. This paper introduces CESAR, a novel secure aggregation protocol for DL designed to be compatible with existing sparsification mechanisms. CESAR provably defends against honest-but-curious adversaries and can be formally adapted to counteract collusion between them. We provide a foundational understanding of the interaction between the sparsification carried out by the nodes and the proportion of the parameters shared under CESAR in both colluding and non-colluding environments, offering analytical insight into the working and applicability of the protocol. Experiments on a network with 48 nodes in a 3-regular topology show that with random subsampling, CESAR is always within 0.5% accuracy of decentralized parallel stochastic gradient descent (D-PSGD), while adding only 11% of data overhead. Moreover, it surpasses the accuracy on TopK by up to 0.3% on independent and identically distributed (IID) data.

5/15/2024

On Joint Noise Scaling in Differentially Private Federated Learning with Multiple Local Steps

Mikko A. Heikkila

Federated learning is a distributed learning setting where the main aim is to train machine learning models without having to share raw data but only what is required for learning. To guarantee training data privacy and high-utility models, differential privacy and secure aggregation techniques are often combined with federated learning. However, with fine-grained protection granularities the currently existing techniques require the parties to communicate for each local optimisation step, if they want to fully benefit from the secure aggregation in terms of the resulting formal privacy guarantees. In this paper, we show how a simple new analysis allows the parties to perform multiple local optimisation steps while still benefiting from joint noise scaling when using secure aggregation. We show that our analysis enables higher utility models with guaranteed privacy protection under limited number of communication rounds.

7/30/2024

An Efficient and Multi-private Key Secure Aggregation for Federated Learning

Xue Yang, Zifeng Liu, Xiaohu Tang, Rongxing Lu, Bo Liu

With the emergence of privacy leaks in federated learning, secure aggregation protocols that mainly adopt either homomorphic encryption or threshold secret sharing have been widely developed for federated learning to protect the privacy of the local training data of each client. However, these existing protocols suffer from many shortcomings, such as the dependence on a trusted third party, the vulnerability to clients being corrupted, low efficiency, the trade-off between security and fault tolerance, etc. To solve these disadvantages, we propose an efficient and multi-private key secure aggregation scheme for federated learning. Specifically, we skillfully modify the variant ElGamal encryption technique to achieve homomorphic addition operation, which has two important advantages: 1) The server and each client can freely select public and private keys without introducing a trust third party and 2) Compared to the variant ElGamal encryption, the plaintext space is relatively large, which is more suitable for the deep model. Besides, for the high dimensional deep model parameter, we introduce a super-increasing sequence to compress multi-dimensional data into 1-D, which can greatly reduce encryption and decryption times as well as communication for ciphertext transmission. Detailed security analyses show that our proposed scheme achieves the semantic security of both individual local gradients and the aggregated result while achieving optimal robustness in tolerating both client collusion and dropped clients. Extensive simulations demonstrate that the accuracy of our scheme is almost the same as the non-private approach, while the efficiency of our scheme is much better than the state-of-the-art homomorphic encryption-based secure aggregation schemes. More importantly, the efficiency advantages of our scheme will become increasingly prominent as the number of model parameters increases.

6/3/2024