Nearly Tight Black-Box Auditing of Differentially Private Machine Learning

2405.14106

Published 5/24/2024 by Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro

Abstract

This paper presents a nearly tight audit of the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in the black-box model. Our auditing procedure empirically estimates the privacy leakage from DP-SGD using membership inference attacks; unlike prior work, the estimates are appreciably close to the theoretical DP bounds. The main intuition is to craft worst-case initial model parameters, as DP-SGD's privacy analysis is agnostic to the choice of the initial model parameters. For models trained with theoretical $\varepsilon=10.0$ on MNIST and CIFAR-10, our auditing procedure yields empirical estimates of $7.21$ and $6.95$, respectively, on 1,000-record samples and $6.48$ and $4.96$ on the full datasets. By contrast, previous work achieved tight audits only in stronger (i.e., less realistic) white-box models that allow the adversary to access the model's inner parameters and insert arbitrary gradients. Our auditing procedure can be used to detect bugs and DP violations more easily and offers valuable insight into how the privacy analysis of DP-SGD can be further improved.

Overview

  • This paper presents a new method for auditing the privacy guarantees of the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in a black-box setting.
  • The authors empirically estimate the privacy leakage from DP-SGD using membership inference attacks, and show that their estimates are much closer to the theoretical DP bounds than those of previous black-box audits.
  • The key insight is to craft worst-case initial model parameters, as DP-SGD's privacy analysis is agnostic to the choice of the initial model parameters.

Plain English Explanation

Differential privacy is an important concept in machine learning that helps protect the privacy of the data used to train models. DP-SGD is a popular algorithm for training machine learning models with differential privacy guarantees.

In this paper, the authors present a new way to audit the privacy of DP-SGD models. They use a technique called "membership inference attacks" to estimate how much information about the training data is being leaked by the model. Unlike previous work, their estimates are much closer to the theoretical privacy bounds promised by differential privacy.

The key insight is that the privacy analysis of DP-SGD does not depend on the model's initial parameters: the guarantee is supposed to hold for any starting point. The authors therefore craft "worst-case" initial parameters, which lets the audit get closer to the theoretical bound on privacy leakage.

For models trained with a theoretical privacy parameter of ε=10.0 on the MNIST and CIFAR-10 datasets, the authors' auditing procedure yielded empirical estimates of 7.21 and 6.95, respectively, on 1,000-record samples, and 6.48 and 4.96 on the full datasets. These estimates are much closer to the theoretical bound than those of earlier black-box audits; previous work achieved tight audits only in less realistic "white-box" settings where the adversary can access the model's internal parameters and insert arbitrary gradients.

The authors' auditing procedure can be used to more easily detect bugs or violations of the promised differential privacy guarantees. It also provides valuable insights that can be used to further improve the privacy analysis of DP-SGD.

Technical Explanation

The paper presents a new auditing procedure for the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in a black-box setting. Unlike the estimates in previous work, the authors' estimates of the privacy leakage from DP-SGD are appreciably close to the theoretical DP bounds.
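For readers unfamiliar with the mechanism being audited, a single DP-SGD update can be sketched as follows. This is a minimal, generic illustration of the standard clip-and-noise step, not the authors' code; the function names (`clip`, `dp_sgd_step`) and parameters (`max_norm`, `noise_multiplier`) are assumptions chosen for this example.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update (illustrative sketch, hypothetical signature).

    Each example's gradient is clipped, the clipped gradients are summed,
    Gaussian noise with standard deviation noise_multiplier * max_norm is
    added to the sum, and the result is averaged over the batch.
    """
    dim = len(params)
    clipped = [clip(g, max_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch_size = len(per_example_grads)
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


if __name__ == "__main__":
    rng = random.Random(0)
    new_params = dp_sgd_step(
        params=[0.0, 0.0],
        per_example_grads=[[3.0, 4.0], [0.0, 0.0]],
        lr=1.0,
        max_norm=1.0,
        noise_multiplier=1.0,
        rng=rng,
    )
    print(new_params)
```

Note that the starting value of `params` plays no role in the clipping or the noise, which is consistent with the observation that the privacy analysis is agnostic to the initial model parameters.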

The key insight is to craft worst-case initial model parameters, as DP-SGD's privacy analysis is agnostic to the choice of the initial model parameters. The authors empirically estimate the privacy leakage using membership inference attacks, which aim to determine whether a given data point was part of the training set.
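A membership inference attack of this kind can be illustrated with a simple loss-threshold rule. The sketch below is an assumption for illustration, not the attack used in the paper: it supposes the auditor has collected the target record's loss from models trained with the record (`losses_in`) and without it (`losses_out`), and guesses "member" whenever the loss is at or below a hypothetical `threshold`.

```python
def attack_error_rates(losses_in, losses_out, threshold):
    """Return (false positive rate, false negative rate) of a threshold attack.

    losses_in:  target-record losses from models trained WITH the record
    losses_out: target-record losses from models trained WITHOUT the record
    The attack predicts "member" when loss <= threshold.
    """
    # False negative: the record was a member, but its loss was above the threshold.
    false_negatives = sum(1 for loss in losses_in if loss > threshold)
    # False positive: the record was not a member, but its loss was at or below it.
    false_positives = sum(1 for loss in losses_out if loss <= threshold)
    fpr = false_positives / len(losses_out)
    fnr = false_negatives / len(losses_in)
    return fpr, fnr


if __name__ == "__main__":
    # Made-up losses, for illustration only.
    fpr, fnr = attack_error_rates(
        losses_in=[0.1, 0.2, 0.9, 0.3],
        losses_out=[1.0, 0.8, 0.25, 1.2],
        threshold=0.5,
    )
    print(fpr, fnr)
```

The lower both error rates are, the more confidently the attacker can distinguish the two cases, and the larger the privacy leakage the audit can certify.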

For models trained with a theoretical privacy parameter of 10.0 on the MNIST and CIFAR-10 datasets, the authors' auditing procedure yields empirical estimates of 7.21 and 6.95, respectively, on 1,000-record samples, and 6.48 and 4.96 on the full datasets. This is in contrast to previous work, which could only achieve tight audits in stronger (i.e., less realistic) white-box models that allow the adversary to access the model's inner parameters and insert arbitrary gradients.
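To see how an empirical ε can be derived from an attack, the sketch below uses the standard hypothesis-testing bound for (ε, δ)-DP: any attack's false positive rate (FPR) and false negative rate (FNR) must satisfy FPR + e^ε · FNR ≥ 1 − δ and FNR + e^ε · FPR ≥ 1 − δ. This is a simplified stand-in, not necessarily the estimator used in the paper; the function name `empirical_epsilon` is hypothetical, and a statistically valid audit would replace the point estimates with upper confidence bounds on the error rates.

```python
import math


def empirical_epsilon(fpr, fnr, delta):
    """Lower bound on epsilon implied by an attack's error rates (sketch).

    Rearranging the (epsilon, delta)-DP hypothesis-testing constraints gives
    epsilon >= ln((1 - delta - fpr) / fnr) and
    epsilon >= ln((1 - delta - fnr) / fpr).
    Terms with a zero denominator or non-positive numerator are skipped
    rather than reported as infinite or undefined.
    """
    candidates = [0.0]
    if fnr > 0 and 1 - delta - fpr > 0:
        candidates.append(math.log((1 - delta - fpr) / fnr))
    if fpr > 0 and 1 - delta - fnr > 0:
        candidates.append(math.log((1 - delta - fnr) / fpr))
    return max(candidates)


if __name__ == "__main__":
    # Made-up error rates, for illustration only.
    print(empirical_epsilon(fpr=0.01, fnr=0.01, delta=1e-5))
```

An audit is considered tight when this lower bound approaches the theoretical ε reported by the privacy accountant; a lower bound that exceeds the theoretical ε signals a bug or a DP violation.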

The authors' auditing procedure can be used to more easily detect bugs and DP violations, and offers valuable insight into how the privacy analysis of DP-SGD can be further improved. The paper also discusses the limitations of the approach and suggests areas for future research, such as extending the auditing procedure to other differential privacy mechanisms.

Critical Analysis

The paper presents a novel and promising approach for auditing the privacy guarantees of DP-SGD models in a black-box setting. The authors' key insight of crafting worst-case initial model parameters to get more realistic privacy estimates is a clever way to address a limitation in the theoretical privacy analysis of DP-SGD.

However, several caveats and limitations deserve more attention. For example, the auditing procedure relies on membership inference attacks, so the empirical estimates are only as strong as the attack used. It is unclear how the estimates would change under more sophisticated attacks or different attack models.

Additionally, the results summarized here cover only two image datasets (MNIST and CIFAR-10) and a single privacy budget (ε=10.0). It is important to evaluate the auditing procedure's performance across a wider range of datasets, model architectures, and privacy parameters to fully understand its strengths and weaknesses.

The paper also does not address the potential implications of this work for the broader field of differential privacy. For example, it would be valuable to understand how the authors' findings relate to uncertainty quantification in DP-SGD or the linear scaling rule for private hyperparameter tuning.

Overall, the paper presents an intriguing new approach for auditing DP-SGD, but more research is needed to fully understand its strengths, limitations, and broader significance. Readers should approach the findings with appropriate caution and critical thinking.

Conclusion

This paper introduces a new method for auditing the privacy guarantees of the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in a black-box setting. The authors' key insight is to craft worst-case initial model parameters, which allows them to obtain empirical estimates of the privacy leakage that are much closer to the theoretical DP bounds than those of previous work.

The authors' auditing procedure can be used to more easily detect bugs or violations of the promised differential privacy guarantees, and it also provides valuable insights that can be used to further improve the privacy analysis of DP-SGD. While the paper presents a promising new approach, more research is needed to fully understand its strengths, limitations, and broader implications for the field of differential privacy.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot

Machine learning models can be trained with formal privacy guarantees via differentially private optimizers such as DP-SGD. In this work, we study such privacy guarantees when the adversary only accesses the final model, i.e., intermediate model updates are not released. In the existing literature, this hidden state threat model exhibits a significant gap between the lower bound provided by empirical privacy auditing and the theoretical upper bound provided by privacy accounting. To challenge this gap, we propose to audit this threat model with adversaries that craft a gradient sequence to maximize the privacy loss of the final model without accessing intermediate models. We demonstrate experimentally how this approach consistently outperforms prior attempts at auditing the hidden state model. When the crafted gradient is inserted at every optimization step, our results imply that releasing only the final model does not amplify privacy, providing a novel negative result. On the other hand, when the crafted gradient is not inserted at every step, we show strong evidence that a privacy amplification phenomenon emerges in the general non-convex setting (albeit weaker than in convex regimes), suggesting that existing privacy upper bounds can be improved.

5/24/2024

One-shot Empirical Privacy Estimation for Federated Learning

Galen Andrew, Peter Kairouz, Sewoong Oh, Alina Oprea, H. Brendan McMahan, Vinith M. Suriyakumar

Privacy estimation techniques for differentially private (DP) algorithms are useful for comparing against analytical bounds, or to empirically measure privacy loss in settings where known analytical bounds are not tight. However, existing privacy auditing techniques usually make strong assumptions on the adversary (e.g., knowledge of intermediate model iterates or the training data distribution), are tailored to specific tasks, model architectures, or DP algorithm, and/or require retraining the model many times (typically on the order of thousands). These shortcomings make deploying such techniques at scale difficult in practice, especially in federated settings where model training can take days or weeks. In this work, we present a novel one-shot approach that can systematically address these challenges, allowing efficient auditing or estimation of the privacy loss of a model during the same, single training run used to fit model parameters, and without requiring any a priori knowledge about the model architecture, task, or DP training algorithm. We show that our method provides provably correct estimates for the privacy loss under the Gaussian mechanism, and we demonstrate its performance on well-established FL benchmark datasets under several adversarial threat models.

4/19/2024

How Private are DP-SGD Implementations?

Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

We demonstrate a substantial gap between the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism under different types of batch sampling: (i) Shuffling, and (ii) Poisson subsampling; the typical analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) follows by interpreting it as a post-processing of ABLQ. While shuffling-based DP-SGD is more commonly used in practical implementations, it has not been amenable to easy privacy analysis, either analytically or even numerically. On the other hand, Poisson subsampling-based DP-SGD is challenging to scalably implement, but has a well-understood privacy analysis, with multiple open-source numerically tight privacy accountants available. This has led to a common practice of using shuffling-based DP-SGD in practice, but using the privacy analysis for the corresponding Poisson subsampling version. Our result shows that there can be a substantial gap between the privacy analysis when using the two types of batch sampling, and thus advises caution in reporting privacy parameters for DP-SGD.

6/7/2024

Black Box Differential Privacy Auditing Using Total Variation Distance

Antti Koskela, Jafar Mohammadi

We present a practical method to audit the differential privacy (DP) guarantees of a machine learning model using a small hold-out dataset that is not exposed to the model during the training. Having a score function such as the loss function employed during the training, our method estimates the total variation (TV) distance between scores obtained with a subset of the training data and the hold-out dataset. With some meta information about the underlying DP training algorithm, these TV distance values can be converted to $(\varepsilon,\delta)$-guarantees for any $\delta$. We show that these score distributions asymptotically give lower bounds for the DP guarantees of the underlying training algorithm, however, we perform a one-shot estimation for practicality reasons. We specify conditions that lead to lower bounds for the DP guarantees with high probability. To estimate the TV distance between the score distributions, we use a simple density estimation method based on histograms. We show that the TV distance gives a very close to optimally robust estimator and has an error rate $\mathcal{O}(k^{-1/3})$, where $k$ is the total number of samples. Numerical experiments on benchmark datasets illustrate the effectiveness of our approach and show improvements over baseline methods for black-box auditing.

6/10/2024