Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

2405.14457

Published 5/24/2024 by Tudor Cebere, Aurélien Bellet, Nicolas Papernot

Abstract

Machine learning models can be trained with formal privacy guarantees via differentially private optimizers such as DP-SGD. In this work, we study such privacy guarantees when the adversary only accesses the final model, i.e., intermediate model updates are not released. In the existing literature, this hidden state threat model exhibits a significant gap between the lower bound provided by empirical privacy auditing and the theoretical upper bound provided by privacy accounting. To challenge this gap, we propose to audit this threat model with adversaries that craft a gradient sequence to maximize the privacy loss of the final model without accessing intermediate models. We demonstrate experimentally how this approach consistently outperforms prior attempts at auditing the hidden state model. When the crafted gradient is inserted at every optimization step, our results imply that releasing only the final model does not amplify privacy, providing a novel negative result. On the other hand, when the crafted gradient is not inserted at every step, we show strong evidence that a privacy amplification phenomenon emerges in the general non-convex setting (albeit weaker than in convex regimes), suggesting that existing privacy upper bounds can be improved.

Overview

This paper examines privacy guarantees for machine learning models trained using differentially private optimizers like DP-SGD, where only the final model is released and not the intermediate model updates.
The authors identify a significant gap between the lower bound provided by empirical privacy auditing and the theoretical upper bound provided by privacy accounting in this "hidden state" threat model.
To address this gap, the authors propose a novel auditing approach where the adversary crafts a gradient sequence to maximize the privacy loss of the final model without accessing the intermediate models.

Plain English Explanation

The paper looks at machine learning models that are trained with special techniques to protect the privacy of the data used for training. These techniques, called differentially private optimizers, add noise to the training process to make it harder for an attacker to figure out the original data.

In this work, the researchers are interested in a specific scenario where only the final trained model is released, and the intermediate steps of the training process are kept secret. In this "hidden state" scenario, two measurements disagree widely: the privacy leakage that attacks actually achieve in experiments (a lower bound) is much smaller than the worst-case leakage that the theory allows (an upper bound). It is unclear whether the attacks are too weak or the theory is too pessimistic.

To narrow this gap, the researchers developed a stronger way to "audit" the privacy of the final model. The attacker still sees only the final model, but decides in advance on a sequence of gradients (the update directions used during training) designed to leave as large a trace as possible in that final model. The more reliably the attacker can detect this trace, the higher the measured privacy leakage, which gives a tighter lower bound on the true privacy loss in the hidden state scenario.

The researchers found that when the crafted gradient is used at every training step, releasing only the final model does not actually improve privacy at all. However, when the crafted gradient is not used at every step, they found some evidence that privacy can still be "amplified" to some degree, even in the hidden state scenario, although not as much as in simpler cases.

Technical Explanation

The paper explores the privacy guarantees of machine learning models trained using differentially private stochastic gradient descent (DP-SGD), where only the final trained model is released and the intermediate model updates are not publicly accessible (the "hidden state" threat model).
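To make the hidden state threat model concrete, the following is a minimal NumPy sketch of DP-SGD that discards every intermediate iterate and returns only the final parameters. It is an illustration under stated assumptions, not the authors' implementation: it uses full-batch updates without subsampling, and the names `dp_sgd_hidden_state`, `grad_fn`, and `squared_error_grad` are hypothetical.

```python
import numpy as np

def dp_sgd_hidden_state(grad_fn, data, theta0, steps, lr, clip_norm,
                        noise_multiplier, rng):
    """Run full-batch DP-SGD and return only the final parameters."""
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        # Per-example gradients, shape (n_examples, n_params).
        grads = np.stack([grad_fn(theta, x) for x in data])
        # Clip each per-example gradient to L2 norm at most clip_norm.
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        # Sum the clipped gradients, add Gaussian noise scaled to the
        # clipping norm, then average over the batch.
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=theta.shape)
        theta = theta - lr * (grads.sum(axis=0) + noise) / len(data)
    # Intermediate iterates are never exposed: the adversary sees only this.
    return theta

def squared_error_grad(theta, x):
    # Gradient of 0.5 * ||theta - x||^2 with respect to theta (toy loss).
    return theta - x

rng = np.random.default_rng(0)
data = rng.normal(size=(32, 3))
final_theta = dp_sgd_hidden_state(
    squared_error_grad, data, np.zeros(3),
    steps=50, lr=0.5, clip_norm=1.0, noise_multiplier=1.0, rng=rng,
)
```

In a non-hidden-state (white-box) audit, the loop would instead expose `theta` after every step; here the adversary's view is restricted to `final_theta`.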

The authors identify a significant gap between the lower bound on privacy loss provided by empirical auditing and the theoretical upper bound provided by privacy accounting in this hidden state scenario. To challenge this gap, they propose a novel auditing approach where the adversary crafts a gradient sequence to maximize the privacy loss of the final model, without accessing the intermediate models.
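The auditing idea can be illustrated with a toy simulation. The sketch below assumes a deliberately simplified setting that is not taken from the paper: every non-crafted gradient is zero, there is no subsampling, and the adversary scores the final model by projecting it onto the crafted gradient direction. The conversion from attack error rates to an empirical epsilon uses the standard hypothesis-testing inequality for (ε, δ)-DP; it is a point estimate, whereas a rigorous audit would replace the error rates with confidence bounds. All function names are hypothetical.

```python
import math
import numpy as np

def final_model_scores(with_canary, trials, steps, lr, clip_norm,
                       noise_multiplier, rng):
    """Simulate the adversary's score on the final model, one per trial.

    Toy assumption: each step adds only Gaussian noise plus, when the
    canary is present, a crafted gradient of norm clip_norm.
    """
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=(trials, steps))
    crafted = clip_norm if with_canary else 0.0
    # Score = (theta_0 - theta_T) projected on the crafted direction.
    return lr * (crafted * steps + noise.sum(axis=1))

def empirical_epsilon(fpr, fnr, delta):
    """Epsilon lower bound implied by an attack's error rates (point estimate)."""
    best = 0.0
    for a, b in ((fpr, fnr), (fnr, fpr)):
        if 1.0 - delta - a <= 0.0:
            continue
        # (eps, delta)-DP implies a + exp(eps) * b >= 1 - delta.
        best = max(best, math.inf if b == 0.0 else math.log((1.0 - delta - a) / b))
    return best

rng = np.random.default_rng(0)
kwargs = dict(trials=20000, steps=4, lr=0.1, clip_norm=1.0,
              noise_multiplier=2.0, rng=rng)
scores_out = final_model_scores(False, **kwargs)
scores_in = final_model_scores(True, **kwargs)

# Midpoint threshold between the two score means (not tuned for epsilon).
threshold = 0.5 * kwargs["lr"] * kwargs["clip_norm"] * kwargs["steps"]
fpr = float(np.mean(scores_out > threshold))
fnr = float(np.mean(scores_in <= threshold))
eps_hat = empirical_epsilon(fpr, fnr, delta=1e-5)
```

The midpoint threshold is chosen for simplicity; sweeping the threshold and keeping the largest valid estimate would generally give a tighter lower bound.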

Experimentally, the authors demonstrate that this crafted gradient approach consistently outperforms prior attempts at auditing the hidden state model. When the crafted gradient is inserted at every optimization step, their results imply that releasing only the final model does not amplify privacy, providing a novel negative result. On the other hand, when the crafted gradient is not inserted at every step, the authors show strong evidence that a privacy amplification phenomenon emerges in the general non-convex setting (albeit weaker than in convex regimes), suggesting that existing privacy upper bounds can be improved.
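A short toy calculation gives some intuition for the negative result. It is a sketch under assumptions that are not the paper's construction: full-batch updates, all non-crafted gradients equal to zero, a crafted gradient $g$ with $\|g\| = C$ inserted at every one of $T$ steps, learning rate $\eta$, noise multiplier $\sigma$, and $b \in \{0, 1\}$ indicating whether the crafted gradient is present.

```latex
\begin{align*}
\theta_T &= \theta_0 - \eta \sum_{t=1}^{T} \left( b\,g + Z_t \right),
  \qquad Z_t \sim \mathcal{N}(0, \sigma^2 C^2 I) \\
\left\langle \theta_0 - \theta_T,\ \tfrac{g}{C} \right\rangle
  &\sim \mathcal{N}\!\left( b\,\eta\,T\,C,\ \eta^2 \sigma^2 C^2\, T \right) \\
\frac{\text{mean shift}}{\text{standard deviation}}
  &= \frac{\eta\,T\,C}{\eta\,\sigma\,C\,\sqrt{T}} = \frac{\sqrt{T}}{\sigma}
\end{align*}
```

Each step on its own is a Gaussian mechanism with signal-to-noise ratio $1/\sigma$, and composing $T$ such mechanisms in the Gaussian DP framework yields exactly $\sqrt{T}/\sigma$. In this toy case, observing only the final model is therefore as informative as observing every intermediate iterate, which is consistent with the finding that hiding intermediate models does not amplify privacy when the crafted gradient is inserted at every step.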

These findings have important implications for understanding the privacy guarantees of machine learning models trained with differentially private techniques, especially in scenarios where only the final model is released. The authors' novel auditing approach provides a more comprehensive way to assess the true privacy risks in such settings.

Critical Analysis

The paper makes a valuable contribution by highlighting the significant gap between theoretical privacy guarantees and empirical privacy risks in the "hidden state" threat model for differentially private machine learning. The authors' proposed auditing approach using crafted gradients is a novel and insightful technique that provides a more comprehensive assessment of privacy risks in this scenario.

However, the paper also acknowledges several limitations and caveats. The authors note that their results only apply to the specific threat model where the adversary cannot access the intermediate model updates, and that the privacy amplification effect they observe is weaker in the general non-convex setting compared to simpler convex regimes. Further research may be needed to fully understand the implications of these findings and explore alternative threat models or auditing techniques.

Additionally, while the authors' auditing approach is a significant advancement, it still relies on the ability of the adversary to craft a gradient sequence that maximizes the privacy loss. In practice, this may be a challenging task, and the effectiveness of the approach may depend on the specific machine learning task and dataset. Exploring alternative auditing methods could further strengthen the understanding of privacy guarantees in these settings.

Overall, the paper presents important insights and a novel technical contribution to the field of differentially private machine learning. However, as with any research, there are areas that warrant further investigation and refinement to fully understand the implications and limitations of the findings.

Conclusion

This paper tackles the important challenge of understanding the privacy guarantees of machine learning models trained with differentially private techniques, particularly in the "hidden state" scenario where only the final model is released. By proposing a novel auditing approach using crafted gradients, the authors narrow the significant gap between the empirical lower bounds and the theoretical upper bounds on privacy loss in this setting.

The key findings of the paper have important implications for the development and deployment of privacy-preserving machine learning systems. When the crafted gradient is inserted at every step, the negative result shows that releasing only the final model does not amplify privacy. When it is not inserted at every step, the evidence of privacy amplification in the general non-convex setting (weaker than in convex regimes) suggests that existing privacy upper bounds can be improved. Additionally, the authors' auditing technique gives a tighter empirical estimate of the privacy leakage in these scenarios.

Overall, this work contributes valuable insights and methodological advancements to the field of differentially private machine learning, with potential implications for the responsible development and deployment of privacy-preserving AI systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Nearly Tight Black-Box Auditing of Differentially Private Machine Learning

Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro

This paper presents a nearly tight audit of the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in the black-box model. Our auditing procedure empirically estimates the privacy leakage from DP-SGD using membership inference attacks; unlike prior work, the estimates are appreciably close to the theoretical DP bounds. The main intuition is to craft worst-case initial model parameters, as DP-SGD's privacy analysis is agnostic to the choice of the initial model parameters. For models trained with theoretical $\varepsilon=10.0$ on MNIST and CIFAR-10, our auditing procedure yields empirical estimates of $7.21$ and $6.95$, respectively, on 1,000-record samples and $6.48$ and $4.96$ on the full datasets. By contrast, previous work achieved tight audits only in stronger (i.e., less realistic) white-box models that allow the adversary to access the model's inner parameters and insert arbitrary gradients. Our auditing procedure can be used to detect bugs and DP violations more easily and offers valuable insight into how the privacy analysis of DP-SGD can be further improved.

5/24/2024

cs.CR cs.LG

One-shot Empirical Privacy Estimation for Federated Learning

Galen Andrew, Peter Kairouz, Sewoong Oh, Alina Oprea, H. Brendan McMahan, Vinith M. Suriyakumar

Privacy estimation techniques for differentially private (DP) algorithms are useful for comparing against analytical bounds, or to empirically measure privacy loss in settings where known analytical bounds are not tight. However, existing privacy auditing techniques usually make strong assumptions on the adversary (e.g., knowledge of intermediate model iterates or the training data distribution), are tailored to specific tasks, model architectures, or DP algorithm, and/or require retraining the model many times (typically on the order of thousands). These shortcomings make deploying such techniques at scale difficult in practice, especially in federated settings where model training can take days or weeks. In this work, we present a novel one-shot approach that can systematically address these challenges, allowing efficient auditing or estimation of the privacy loss of a model during the same, single training run used to fit model parameters, and without requiring any a priori knowledge about the model architecture, task, or DP training algorithm. We show that our method provides provably correct estimates for the privacy loss under the Gaussian mechanism, and we demonstrate its performance on well-established FL benchmark datasets under several adversarial threat models.

4/19/2024

cs.LG cs.CR

Delving into Differentially Private Transformer

Youlong Ding, Xueyang Wu, Yining Meng, Yonggang Luo, Hao Wang, Weike Pan

Deep learning with differential privacy (DP) has garnered significant attention over the past years, leading to the development of numerous methods aimed at enhancing model accuracy and training efficiency. This paper delves into the problem of training Transformer models with differential privacy. Our treatment is modular: the logic is to 'reduce' the problem of training DP Transformer to the more basic problem of training DP vanilla neural nets. The latter is better understood and amenable to many model-agnostic methods. Such 'reduction' is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. We believe that our work not only casts new light on training DP Transformers but also promotes a modular treatment to advance research in the field of differentially private deep learning.

5/30/2024

cs.LG cs.CR

Differentially Private Fine-Tuning of Diffusion Models

Yu-Lin Tsai, Yizhe Li, Zekai Chen, Po-Yu Chen, Chia-Mu Yu, Xuebin Ren, Francois Buet-Golfouse

The integration of Differential Privacy (DP) with diffusion models (DMs) presents a promising yet challenging frontier, particularly due to the substantial memorization capabilities of DMs that pose significant privacy risks. Differential privacy offers a rigorous framework for safeguarding individual data points during model training, with Differential Privacy Stochastic Gradient Descent (DP-SGD) being a prominent implementation. Diffusion method decomposes image generation into iterative steps, theoretically aligning well with DP's incremental noise addition. Despite the natural fit, the unique architecture of DMs necessitates tailored approaches to effectively balance privacy-utility trade-off. Recent developments in this field have highlighted the potential for generating high-quality synthetic data by pre-training on public data (i.e., ImageNet) and fine-tuning on private data, however, there is a pronounced gap in research on optimizing the trade-offs involved in DP settings, particularly concerning parameter efficiency and model scalability. Our work addresses this by proposing a parameter-efficient fine-tuning strategy optimized for private diffusion models, which minimizes the number of trainable parameters to enhance the privacy-utility trade-off. We empirically demonstrate that our method achieves state-of-the-art performance in DP synthesis, significantly surpassing previous benchmarks on widely studied datasets (e.g., with only 0.47M trainable parameters, achieving a more than 35% improvement over the previous state-of-the-art with a small privacy budget on the CelebA-64 dataset). Anonymous codes available at https://anonymous.4open.science/r/DP-LORA-F02F.

6/4/2024

cs.CV cs.AI cs.CR