Volley Revolver: A Novel Matrix-Encoding Method for Privacy-Preserving Neural Networks (Inference)

Read original: arXiv:2201.12577 - Published 8/15/2024 by John Chiang


Overview

This paper presents a novel matrix-encoding method for neural networks to make predictions in a privacy-preserving way using homomorphic encryption.
The authors implement a convolutional neural network for handwritten image classification that can operate directly on encrypted data.
The key idea is to encode the neural network's operations, like matrix multiplication and convolution, in a way that can be efficiently computed over encrypted data.

Plain English Explanation

In this paper, the researchers developed a new way to encode matrices that makes it easier for neural networks to make predictions on encrypted data. This is important because it allows people to use machine learning models without revealing their private information.

The main idea is that, to multiply two matrices under encryption, you encrypt one matrix and the transpose of the other into two separate ciphertexts. With a few additional operations on those ciphertexts, the product can be computed without ever decrypting, and the result stays encrypted.

For convolutional neural networks, the researchers break the convolution operation down into simpler steps that can also be computed over encrypted data. Each convolution kernel is expanded in advance into matrices the same size as the input image, and each of these is then combined with the encrypted images to produce part of the convolution result.

The researchers implemented this approach and tested it on the MNIST handwritten digit dataset. They found that on a public cloud server, their system could classify 32 encrypted images in about 287 seconds, while only requiring the data owner to upload a single 19.8 MB ciphertext.

Technical Explanation

The key contribution of this work is a novel matrix-encoding method that enables neural networks to make predictions on homomorphically encrypted data.

For homomorphic multiplication of two matrices A and B, the main idea, in its simple version, is to encrypt A and the transpose of B into two separate ciphertexts. With some additional homomorphic operations, the product can then be computed efficiently over the encrypted matrices.
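
The role of the transpose can be illustrated with a plaintext stand-in for the encrypted computation. The sketch below is an assumption-laden illustration, not the paper's algorithm: ordinary Python lists play the part of ciphertexts, and the function name `matmul_via_transpose` and the row-rotation loop are hypothetical. It only shows why holding A and the transpose of B side by side lets every entry of the product be obtained from row-wise multiply-and-sum steps, which is the kind of slot-wise operation homomorphic schemes support.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul_via_transpose(a, b_t):
    """Multiply square matrices A and B, given A and the transpose of B.

    Plaintext simulation only (assumption): in an encrypted setting the
    row rotation, the element-wise product and the row sums would each be
    homomorphic operations on ciphertexts.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for k in range(n):
        # Rotate the rows of B^T upward by k, so row i lines up with
        # row (i + k) mod n of B^T, i.e. column (i + k) mod n of B.
        rotated = [b_t[(i + k) % n] for i in range(n)]
        for i in range(n):
            # Element-wise product followed by a sum over the row.
            dot = sum(x * y for x, y in zip(a[i], rotated[i]))
            c[i][(i + k) % n] = dot
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = matmul_via_transpose(a, transpose(b))
```

Each of the n rotations fills one wrapped diagonal of the result, so n rounds cover all n x n entries.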

To handle the convolution operation, the researchers span each convolution kernel, in advance, to a matrix space of the same size as the input image, which yields several ciphertexts per kernel. Each of these ciphertexts is used together with the ciphertext encrypting the input images to calculate some of the final convolution results, and the intermediate results are then accumulated to complete the convolution.
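
One way to picture the kernel-spanning step is the plaintext sketch below. It rests on assumptions that are not taken from the paper: stride 1, no padding, a square kernel, and a tiling scheme in which each spanned matrix holds non-overlapping copies of the kernel at a fixed offset. The names `spanned_kernel` and `conv_via_spans` are hypothetical. Each spanned matrix is multiplied element-wise with the image and yields some of the outputs; looping over all offsets accumulates the full result.

```python
def spanned_kernel(kernel, h, w, dr, dc):
    """Tile a k x k kernel over an h x w zero matrix, starting at (dr, dc)."""
    k = len(kernel)
    span = [[0] * w for _ in range(h)]
    for r in range(dr, h - k + 1, k):
        for c in range(dc, w - k + 1, k):
            for i in range(k):
                for j in range(k):
                    span[r + i][c + j] = kernel[i][j]
    return span


def conv_via_spans(image, kernel):
    """Stride-1, unpadded convolution (cross-correlation) via spanned kernels.

    Plaintext simulation only (assumption): the element-wise product and
    the per-tile sums stand in for homomorphic operations.
    """
    h, w, k = len(image), len(image[0]), len(kernel)
    out = [[0] * (w - k + 1) for _ in range(h - k + 1)]
    for dr in range(k):
        for dc in range(k):
            span = spanned_kernel(kernel, h, w, dr, dc)
            # One element-wise product covers every tile at this offset.
            prod = [[image[r][c] * span[r][c] for c in range(w)]
                    for r in range(h)]
            # Summing inside each tile gives the outputs for this offset.
            for r in range(dr, h - k + 1, k):
                for c in range(dc, w - k + 1, k):
                    out[r][c] = sum(prod[r + i][c + j]
                                    for i in range(k) for j in range(k))
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
result = conv_via_spans(image, kernel)
```

With a k x k kernel this takes k * k spanned matrices, each contributing a disjoint subset of the output positions.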

The researchers implemented this approach and evaluated it on the MNIST dataset. On a public cloud server with 40 vCPUs, their convolutional neural network could classify 32 encrypted MNIST images of size 28x28 in around 287 seconds. The data owner only needs to upload a single 19.8 MB ciphertext containing the encrypted images.
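
To put the reported figures in per-image terms, the arithmetic can be worked through as below. The inputs are the numbers quoted above; the variable names are illustrative, and the per-image values are amortized over the batch of 32, so they do not describe the latency of a single isolated query.

```python
batch_size = 32            # images classified together
total_seconds = 287        # approximate time for the whole batch
upload_mb = 19.8           # approximate size of the single uploaded ciphertext
pixels_per_image = 28 * 28

# Amortized cost per image when the whole batch is processed at once.
seconds_per_image = total_seconds / batch_size
mb_per_image = upload_mb / batch_size

# Number of pixel values the single ciphertext has to hold.
values_packed = batch_size * pixels_per_image

print(f"{seconds_per_image:.2f} s per image, {mb_per_image:.2f} MB per image")
print(f"{values_packed} pixel values in one ciphertext")
```

This works out to roughly 9 seconds and 0.6 MB per image, with 25,088 pixel values packed into the one ciphertext.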

Critical Analysis

The paper presents a promising technique for enabling privacy-preserving inference with convolutional neural networks. However, there are a few potential limitations and areas for further research:

The performance, while reasonable, may still be too slow for some real-world applications. More research is needed to optimize the efficiency of the homomorphic operations.
The paper only considers a simple convolutional neural network on the MNIST dataset. It's unclear how well the approach would scale to larger, more complex models and datasets.
The security of the homomorphic encryption scheme is crucial, but the paper does not provide a detailed security analysis or comparison to alternative privacy-preserving techniques like differential privacy.

Overall, this work demonstrates an important step towards making machine learning more privacy-preserving, but further research is needed to fully understand the capabilities and limitations of this approach.

Conclusion

This paper presents a novel matrix-encoding method that enables neural networks to make predictions on homomorphically encrypted data. The key idea is to encode the neural network's operations, like matrix multiplication and convolution, in a way that can be efficiently computed over encrypted data.

The researchers implemented this approach in a convolutional neural network for handwritten image classification and evaluated it on the MNIST dataset. Their system was able to classify 32 encrypted MNIST images in around 287 seconds on a public cloud server, with the data owner only needing to upload a single 19.8 MB ciphertext.

This work represents an important step towards making machine learning more privacy-preserving, but further research is needed to improve the efficiency and security of the approach. As privacy-preserving machine learning continues to advance, it could enable a wide range of applications that protect sensitive user data.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


Volley Revolver: A Novel Matrix-Encoding Method for Privacy-Preserving Neural Networks (Inference)

John Chiang

In this work, we present a novel matrix-encoding method that is particularly convenient for neural networks to make predictions in a privacy-preserving manner using homomorphic encryption. Based on this encoding method, we implement a convolutional neural network for handwritten image classification over encryption. For two matrices $A$ and $B$ to perform homomorphic multiplication, the main idea behind it, in a simple version, is to encrypt matrix $A$ and the transpose of matrix $B$ into two ciphertexts respectively. With additional operations, the homomorphic matrix multiplication can be calculated over encrypted matrices efficiently. For the convolution operation, we in advance span each convolution kernel to a matrix space of the same size as the input image so as to generate several ciphertexts, each of which is later used together with the ciphertext encrypting input images for calculating some of the final convolution results. We accumulate all these intermediate results and thus complete the convolution operation. In a public cloud with 40 vCPUs, our convolutional neural network implementation on the MNIST testing dataset takes $\sim 287$ seconds to compute ten likelihoods of 32 encrypted images of size $28 \times 28$ simultaneously. The data owner only needs to upload one ciphertext ($\sim 19.8$ MB) encrypting these 32 images to the public cloud.

8/15/2024


Privacy-Preserving CNN Training with Transfer Learning: Multiclass Logistic Regression

John Chiang

In this paper, we present a practical solution to implement privacy-preserving CNN training based on mere Homomorphic Encryption (HE) technique. To our best knowledge, this is the first attempt successfully to crack this nut and no work ever before has achieved this goal. Several techniques combine to accomplish the task: (1) with transfer learning, privacy-preserving CNN training can be reduced to homomorphic neural network training, or even multiclass logistic regression (MLR) training; (2) via a faster gradient variant called $\texttt{Quadratic Gradient}$, an enhanced gradient method for MLR with a state-of-the-art performance in convergence speed is applied in this work to achieve high performance; (3) we employ the thought of transformation in mathematics to transform approximating Softmax function in the encryption domain to the approximation of the Sigmoid function. A new type of loss function termed $\texttt{Squared Likelihood Error}$ has been developed alongside to align with this change; and (4) we use a simple but flexible matrix-encoding method named $\texttt{Volley Revolver}$ to manage the data flow in the ciphertexts, which is the key factor to complete the whole homomorphic CNN training. The complete, runnable C++ code to implement our work can be found at: https://github.com/petitioner/HE.CNNtraining. We select $\texttt{REGNET\_X\_400MF}$ as our pre-trained model for transfer learning. We use the first 128 MNIST training images as training data and the whole MNIST testing dataset as the testing data. The client only needs to upload 6 ciphertexts to the cloud and it takes $\sim 21$ mins to perform 2 iterations on a cloud with 64 vCPUs, resulting in a precision of $21.49\%$.

6/5/2024


Homomorphic WiSARDs: Efficient Weightless Neural Network training over encrypted data

Leonardo Neumann, Antonio Guimarães, Diego F. Aranha, Edson Borin

The widespread application of machine learning algorithms is a matter of increasing concern for the data privacy research community, and many have sought to develop privacy-preserving techniques for it. Among existing approaches, the homomorphic evaluation of ML algorithms stands out by performing operations directly over encrypted data, enabling strong guarantees of confidentiality. The homomorphic evaluation of inference algorithms is practical even for relatively deep Convolution Neural Networks (CNNs). However, training is still a major challenge, with current solutions often resorting to lightweight algorithms that can be unfit for solving more complex problems, such as image recognition. This work introduces the homomorphic evaluation of Wilkie, Stonham, and Aleksander's Recognition Device (WiSARD) and subsequent Weightless Neural Networks (WNNs) for training and inference on encrypted data. Compared to CNNs, WNNs offer better performance with a relatively small accuracy drop. We develop a complete framework for it, including several building blocks that can be of independent interest. Our framework achieves 91.7% accuracy on the MNIST dataset after only 3.5 minutes of encrypted training (multi-threaded), going up to 93.8% in 3.5 hours. For the HAM10000 dataset, we achieve 67.9% accuracy in just 1.5 minutes, going up to 69.9% after 1 hour. Compared to the state of the art on the HE evaluation of CNN training, Glyph (Lou et al., NeurIPS 2020), these results represent a speedup of up to 1200 times with an accuracy loss of at most 5.4%. For HAM10000, we even achieved a 0.65% accuracy improvement while being 60 times faster than Glyph. We also provide solutions for small-scale encrypted training. In a single thread on a desktop machine using less than 200MB of memory, we train over 1000 MNIST images in 12 minutes or over the entire Wisconsin Breast Cancer dataset in just 11 seconds.

4/1/2024


Privacy-Preserving 3-Layer Neural Network Training

John Chiang

In this manuscript, we consider the problem of privacy-preserving training of neural networks in the mere homomorphic encryption setting. We combine several existing techniques available, extend some of them, and finally enable the training of 3-layer neural networks for both the regression and classification problems using mere homomorphic encryption technique.

6/4/2024