MPC-Pipe: an Efficient Pipeline Scheme for Secure Multi-party Machine Learning Inference

Read original: arXiv:2209.13643 - Published 8/28/2024 by Yongqin Wang, Rachit Rajat, Murali Annavaram

Overview

MPC (Multi-party Computing) has been gaining popularity as a secure computing model.
Prior works have shown that MPC protocols pay substantial performance penalties compared to plaintext, especially for ML algorithms.
The overhead is due to the added computation and communication costs.
Most MPC protocols today perform communication and computation sequentially; the paper argues that this serialization is unnecessary, particularly in the context of ML computations.

Plain English Explanation

Multi-party Computing (MPC) is a way for multiple parties to securely compute on data without anyone revealing their private information. However, previous studies have found that MPC protocols often run much slower than simply doing the computations in plaintext, especially when applied to machine learning (ML) algorithms.
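As a rough illustration of how parties can compute on data none of them sees in full, here is a minimal sketch of additive secret sharing, one common building block of MPC. The ring size, the three-party default, and the function names are assumptions made for this example; the summary does not specify the exact sharing scheme MPC-Pipe uses.

```python
import secrets

MODULUS = 2**64  # ring size; an illustrative choice, not taken from the paper


def share(secret, n_parties=3):
    """Split an integer into n additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover the secret by summing every party's share."""
    return sum(shares) % MODULUS


def add_shared(xs, ys):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(x + y) % MODULUS for x, y in zip(xs, ys)]
```

Any single share is a uniformly random number, so it reveals nothing on its own. Addition is free of communication in this scheme; multiplication is where parties have to exchange messages, which is the cost the paper targets.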

The reason for this slowdown is that MPC protocols require extra computation and communication steps to keep the data secure. Typically, the parties must first compute on their own secret shares, then exchange data so that new shares can be distributed before the next computation step can begin. The authors show that this serialization is unnecessary, particularly for ML computations in both convolutional neural networks and Transformer-based models.
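To make the cost of serialization concrete, here is a small timing model, offered as a sketch rather than the paper's own analysis. It assumes the work can be split into independent items (for example, separate inputs in a batch), each with a compute time and a communication time; the function names and the numbers in the usage note are hypothetical.

```python
def sequential_time(steps):
    """Total time when each item's communication waits for its compute,
    and the next item's compute waits for that communication.

    steps: list of (compute_seconds, comm_seconds) pairs.
    """
    return sum(compute + comm for compute, comm in steps)


def overlapped_time(steps):
    """Finish time for a two-stage pipeline: one compute stage and one
    communication stage, each handling one item at a time, in order."""
    compute_done = 0.0
    comm_done = 0.0
    for compute, comm in steps:
        compute_done += compute
        # Communication starts once its data is ready and the link is free.
        comm_done = max(comm_done, compute_done) + comm
    return comm_done
```

With four items that each take 2 time units to compute and 1 to communicate, the sequential schedule takes 12 units while the overlapped one takes 9, because all but the last communication is hidden behind later compute.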

Technical Explanation

In this work, the researchers propose MPC-Pipe, an efficient MPC system for both training and inference of ML workloads that pipelines the computation and communication steps during the online phase. MPC-Pipe introduces three pipeline schemes to optimize the online phase of ML in the semi-honest majority adversary setting.
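The general idea of overlapping computation with communication can be sketched with a background thread, as below. This is not one of MPC-Pipe's three pipeline schemes, which the summary does not describe in detail; it is a generic illustration that assumes independent chunks of work, and `local_compute` and `exchange_with_peers` are hypothetical stand-ins that sleep instead of doing real share arithmetic or networking.

```python
import queue
import threading
import time


def local_compute(chunk):
    time.sleep(0.01)  # stand-in for computing on local secret shares
    return chunk * 2


def exchange_with_peers(value):
    time.sleep(0.01)  # stand-in for a network round trip
    return value + 1


def run_pipelined(chunks, compute, communicate):
    """Overlap communicate(result_i) with compute(chunk_{i+1})."""
    outbox = queue.Queue()
    results = []

    def comm_worker():
        while True:
            item = outbox.get()
            if item is None:  # sentinel: no more work
                break
            results.append(communicate(item))

    worker = threading.Thread(target=comm_worker)
    worker.start()
    for chunk in chunks:
        # Hand the result to the communication thread, then start the
        # next compute immediately instead of waiting for the exchange.
        outbox.put(compute(chunk))
    outbox.put(None)
    worker.join()
    return results
```

Calling `run_pipelined([1, 2, 3], local_compute, exchange_with_peers)` processes the chunks in order while the simulated network exchange for one chunk runs concurrently with the compute for the next.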

The researchers implement MPC-Pipe on top of a modified version of CrypTen, an existing MPC framework, in which the online and offline phases are separated. They then evaluate the end-to-end performance benefits for the online phase using deep neural networks (VGG16, ResNet50) and Transformer models under different network settings.
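For readers unfamiliar with the online/offline split, the sketch below shows one standard way it arises: multiplication with Beaver triples over additive shares. The summary does not say which pre-computation MPC-Pipe's offline phase performs, so treat this as a textbook example under assumptions: two parties, a trusted dealer producing the triple, a ring of size 2^64, and hypothetical function names.

```python
import secrets

MODULUS = 2**64  # illustrative ring size


def share(value, n=2):
    """Split value into n additive shares mod MODULUS."""
    parts = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    return parts + [(value - sum(parts)) % MODULUS]


def reconstruct(shares):
    return sum(shares) % MODULUS


def offline_triple(n=2):
    """Offline phase: an input-independent triple (a, b, c) with c = a * b,
    handed out in shared form by a trusted dealer (an assumption here)."""
    a = secrets.randbelow(MODULUS)
    b = secrets.randbelow(MODULUS)
    return share(a, n), share(b, n), share(a * b % MODULUS, n)


def online_multiply(x_sh, y_sh, triple):
    """Online phase: multiply shared x and y using one precomputed triple."""
    a_sh, b_sh, c_sh = triple
    # Communication: the parties open the masked values e = x - a, d = y - b.
    e = reconstruct([(x - a) % MODULUS for x, a in zip(x_sh, a_sh)])
    d = reconstruct([(y - b) % MODULUS for y, b in zip(y_sh, b_sh)])
    # Local computation: z_i = c_i + e * b_i + d * a_i; party 0 also adds e * d.
    z_sh = [(c + e * b + d * a) % MODULUS for a, b, c in zip(a_sh, b_sh, c_sh)]
    z_sh[0] = (z_sh[0] + e * d) % MODULUS
    return z_sh
```

The triple can be generated before the inputs are known, which is why it belongs to the offline phase. The online phase then consists of one round of communication followed by local arithmetic, the kind of compute and communication steps that a pipelining scheme aims to overlap.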

The results show that MPC-Pipe can improve the throughput and latency of ML workloads compared with MPC protocols that perform computation and communication sequentially.

Critical Analysis

The paper demonstrates a clever approach to overcoming the performance limitations of MPC for ML by carefully orchestrating the computation and communication steps so that they overlap. However, the proposed solutions are limited to the semi-honest majority adversary setting, in which parties are assumed to follow the protocol as specified, even if they try to learn from the messages they see, and a majority of them are assumed to be honest.

In real-world deployments, there may be scenarios where this assumption does not hold, and more robust security guarantees are required. Further research could explore extending the pipelining techniques to other adversarial settings, such as malicious adversaries, to address a wider range of practical use cases.

Additionally, the paper focuses on the online phase of MPC, but the offline phase, which involves pre-computation, can also have a significant impact on overall performance. Investigating optimizations for the offline phase could further enhance the practicality of MPC-based ML systems.

Conclusion

The MPC-Pipe system proposed in this paper demonstrates a novel approach to improving the performance of MPC for ML workloads. By pipelining the computation and communication steps, the researchers reduce the online-phase overhead typically associated with MPC protocols.

This work highlights the potential for optimizing the integration of secure computing techniques with machine learning, which could have important implications for privacy-preserving AI applications in areas like healthcare, finance, and other sensitive domains.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

MPC-Pipe: an Efficient Pipeline Scheme for Secure Multi-party Machine Learning Inference

Yongqin Wang, Rachit Rajat, Murali Annavaram

Multi-party computing (MPC) has been gaining popularity as a secure computing model over the past few years. However, prior works have demonstrated that MPC protocols still pay substantial performance penalties compared to plaintext, particularly when applied to ML algorithms. The overhead is due to added computation and communication costs. Prior studies, as well as our own analysis, found that most MPC protocols today sequentially perform communication and computation. The participating parties must compute on their shares first and then perform data communication to allow the distribution of new secret shares before proceeding to the next computation step. In this work, we show that serialization is unnecessary, particularly in the context of ML computations (both in Convolutional neural networks and in Transformer-based models). We demonstrate that it is possible to carefully orchestrate the computation and communication steps to overlap. We propose MPC-Pipe, an efficient MPC system for both training and inference of ML workloads, which pipelines computations and communications in an MPC protocol during the online phase. MPC-Pipe proposes three pipeline schemes to optimize the online phase of ML in the semi-honest majority adversary setting. We implement MPC-Pipe by augmenting a modified version of CrypTen, which separates online and offline phases. We evaluate the end-to-end system performance benefits of the online phase of MPC using deep neural networks (VGG16, ResNet50) and Transformers using different network settings. We show that MPC-Pipe can improve the throughput and latency of ML workloads.

8/28/2024

Low-Latency Privacy-Preserving Deep Learning Design via Secure MPC

Ke Lin, Yasir Glani, Ping Luo

Secure multi-party computation (MPC) facilitates privacy-preserving computation between multiple parties without leaking private information. While most secure deep learning techniques utilize MPC operations to achieve feasible privacy-preserving machine learning on downstream tasks, the overhead of the computation and communication still hampers their practical application. This work proposes a low-latency secret-sharing-based MPC design that reduces unnecessary communication rounds during the execution of MPC protocols. We also present a method for improving the computation of commonly used nonlinear functions in deep learning by integrating multivariate multiplication and coalescing different packets into one to maximize network utilization. Our experimental results indicate that our method is effective in a variety of settings, with a speedup in communication latency of $10\sim20\%$.

7/30/2024

MPC-Minimized Secure LLM Inference

Deevashwer Rathee, Dacheng Li, Ion Stoica, Hao Zhang, Raluca Popa

Many inference services based on large language models (LLMs) pose a privacy concern, either revealing user prompts to the service or the proprietary weights to the user. Secure inference offers a solution to this problem through secure multi-party computation (MPC), however, it is still impractical for modern LLM workload due to the large overhead imposed by MPC. To address this overhead, we propose Marill, a framework that adapts LLM fine-tuning to minimize MPC usage during secure inference. Marill introduces high-level architectural changes during fine-tuning that significantly reduce the number of expensive operations needed within MPC during inference, by removing some and relocating others outside MPC without compromising security. As a result, Marill-generated models are more efficient across all secure inference protocols and our approach complements MPC-friendly approximations for such operations. Compared to standard fine-tuning, Marill results in 3.6-11.3x better runtime and 2.4-6.9x better communication during secure inference across various MPC settings, while typically preserving over 90% performance across downstream tasks.

8/9/2024

SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud

Shijin Duan, Chenghong Wang, Hongwu Peng, Yukui Luo, Wujie Wen, Caiwen Ding, Xiaolin Xu

As privacy-preserving becomes a pivotal aspect of deep learning (DL) development, multi-party computation (MPC) has gained prominence for its efficiency and strong security. However, the practice of current MPC frameworks is limited, especially when dealing with large neural networks, exemplified by the prolonged execution time of 25.8 seconds for secure inference on ResNet-152. The primary challenge lies in the reliance of current MPC approaches on additive secret sharing, which incurs significant communication overhead with non-linear operations such as comparisons. Furthermore, additive sharing suffers from poor scalability on party size. In contrast, the evolving landscape of MPC necessitates accommodating a larger number of compute parties and ensuring robust performance against malicious activities or computational failures. In light of these challenges, we propose SSNet, which for the first time, employs Shamir's secret sharing (SSS) as the backbone of MPC-based ML framework. We meticulously develop all framework primitives and operations for secure DL models tailored to seamlessly integrate with the SSS scheme. SSNet demonstrates the ability to scale up party numbers straightforwardly and embeds strategies to authenticate the computation correctness without incurring significant performance overhead. Additionally, SSNet introduces masking strategies designed to reduce communication overhead associated with non-linear operations. We conduct comprehensive experimental evaluations on commercial cloud computing infrastructure from Amazon AWS, as well as across diverse prevalent DNN models and datasets. SSNet demonstrates a substantial performance boost, achieving speed-ups ranging from 3x to 14x compared to SOTA MPC frameworks. Moreover, SSNet also represents the first framework that is evaluated on a five-party computation setup, in the context of secure DL inference.

6/6/2024