Low-Cost High-Power Membership Inference Attacks

arXiv:2312.03262

Published 6/13/2024 by Sajjad Zarifzadeh, Philippe Liu, Reza Shokri


Abstract

Membership inference attacks aim to detect if a particular data point was used in training a model. We design a novel statistical test to perform robust membership inference attacks (RMIA) with low computational overhead. We achieve this by a fine-grained modeling of the null hypothesis in our likelihood ratio tests, and effectively leveraging both reference models and reference population data samples. RMIA has superior test power compared with prior methods, throughout the TPR-FPR curve (even at extremely low FPR, as low as 0). Under computational constraints, where only a limited number of pre-trained reference models (as few as 1) are available, and also when we vary other elements of the attack (e.g., data distribution), our method performs exceptionally well, unlike prior attacks that approach random guessing. RMIA lays the groundwork for practical yet accurate data privacy risk assessment in machine learning.


Overview

The paper proposes a novel statistical test called Robust Membership Inference Attack (RMIA) to detect if a particular data point was used in training a machine learning model.
RMIA achieves high test power, even under computational constraints where only a limited number of pre-trained reference models are available.
The method effectively leverages both reference models and reference population data samples to model the null hypothesis in likelihood ratio tests.
RMIA outperforms prior membership inference attack methods, even at extremely low false positive rates.

Plain English Explanation

Membership inference attacks aim to determine whether a specific data point was used to train a machine learning model. This is important for assessing the privacy risks of machine learning models.

The researchers in this paper have developed a new technique called Robust Membership Inference Attack (RMIA) that can effectively detect if a data point was used in the model's training, even when there are constraints on the available computational resources.

RMIA works by carefully modeling the "null hypothesis" - the scenario where the data point was not used in training. It does this by leveraging both pre-trained reference models and samples from the overall population of data. This allows RMIA to have high test power, meaning it can accurately identify when a data point was used in training, even in cases where other methods would perform poorly.

The paper shows that RMIA outperforms previous membership inference attack methods, including when only a limited number of reference models are available or when the data distribution differs from what was expected. This makes RMIA a practical tool for assessing the privacy risks of machine learning models.

Technical Explanation

The key innovation in RMIA is its fine-grained modeling of the null hypothesis in the likelihood ratio tests used for membership inference. By leveraging both reference models and reference population data samples, RMIA achieves superior test power compared to prior methods.

Specifically, the researchers design a novel statistical test that compares the likelihood of the target data point under the null hypothesis (not used in training) versus the alternative hypothesis (used in training). This likelihood ratio test is formulated to take advantage of the information contained in the reference models and population data.
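A minimal sketch of how such a test could combine reference models and population samples is given below. It is an illustration under stated assumptions, not the authors' implementation: the function name, the argument names, the plain averaging over reference models, and the threshold `gamma` are all hypothetical choices made for this example. The idea is to compare how much the target model "prefers" the candidate point x (relative to the reference models) against the same ratio for each population sample z, and to report the fraction of population samples that x dominates.

```python
import numpy as np


def rmia_style_score(p_x_target, p_x_refs, p_z_target, p_z_refs, gamma=1.0):
    """Illustrative pairwise likelihood-ratio score (hypothetical helper).

    p_x_target: probability the target model assigns to candidate x
    p_x_refs:   shape (k,), probabilities k reference models assign to x
    p_z_target: shape (m,), target-model probabilities for m population samples
    p_z_refs:   shape (k, m), reference-model probabilities for those samples
    gamma:      assumed threshold on the pairwise ratio
    """
    p_x_refs = np.asarray(p_x_refs, dtype=float)
    p_z_target = np.asarray(p_z_target, dtype=float)
    p_z_refs = np.asarray(p_z_refs, dtype=float)

    # How much more likely x is under the target model than under
    # an average reference model (assumption: simple mean over references).
    ratio_x = p_x_target / p_x_refs.mean()

    # The same ratio for every population sample z.
    ratio_z = p_z_target / p_z_refs.mean(axis=0)

    # Fraction of population samples that x dominates by at least gamma.
    return float(np.mean(ratio_x / ratio_z >= gamma))


# Toy numbers, invented for illustration only.
z_target = [0.5, 0.8, 0.2, 0.6]
z_refs = [[0.5, 0.2, 0.2, 0.6], [0.5, 0.2, 0.2, 0.6]]
likely_member = rmia_style_score(0.9, [0.3, 0.3], z_target, z_refs, gamma=2.0)
likely_non_member = rmia_style_score(0.3, [0.3, 0.3], z_target, z_refs, gamma=2.0)
```

A higher score suggests the point is more likely a training member; a final decision would compare the score against a second threshold chosen for the desired false positive rate. Note that the sketch works with a single reference model (k = 1) as well as with many.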

The paper demonstrates RMIA's effectiveness through extensive experiments, showing that it outperforms prior attacks even in computationally constrained settings where few reference models are available. RMIA also maintains strong performance when the data distribution differs from what was expected, unlike previous methods, which tend to approach random guessing in such cases.
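Attack quality at low false positive rates is usually reported as the true positive rate (TPR) at a fixed false positive rate (FPR). The sketch below shows one common way to compute that number from attack scores. It is a generic evaluation helper with hypothetical names, not code from the paper, and the scores are made up for illustration.

```python
import numpy as np


def tpr_at_fpr(member_scores, nonmember_scores, max_fpr):
    """TPR of the rule 'score > threshold' at the strictest threshold
    that flags at most max_fpr of the non-members (hypothetical helper)."""
    member_scores = np.asarray(member_scores, dtype=float)
    # Non-member scores sorted from highest to lowest.
    nonmember_sorted = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]

    # Number of non-members we are allowed to flag by mistake.
    allowed = int(np.floor(max_fpr * len(nonmember_sorted)))

    if allowed >= len(nonmember_sorted):
        threshold = -np.inf  # every point may be flagged
    else:
        # Strictly exceeding the (allowed + 1)-th largest non-member score
        # guarantees at most `allowed` false positives, even with ties.
        threshold = nonmember_sorted[allowed]

    return float(np.mean(member_scores > threshold))


# Invented scores: members tend to score higher than non-members.
members = [0.9, 0.8, 0.4, 0.95]
non_members = [0.1, 0.2, 0.85, 0.3]
tpr_zero_fpr = tpr_at_fpr(members, non_members, max_fpr=0.0)
tpr_quarter_fpr = tpr_at_fpr(members, non_members, max_fpr=0.25)
```

With `max_fpr=0.0` the threshold sits at the highest non-member score, which corresponds to the "FPR as low as 0" regime mentioned in the abstract.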

Critical Analysis

The paper provides a thorough evaluation of RMIA, including comparisons to prior state-of-the-art membership inference attacks. The researchers acknowledge that their method relies on the availability of reference models and population data samples, which may not always be readily available in real-world scenarios.

Additionally, the paper does not explore how model owners might adapt the training process to evade RMIA. There may be ways for them to obfuscate or defend against membership inference attacks.

Further research could also investigate how dataset properties affect the effectiveness of RMIA and other membership inference attacks.

Conclusion

The Robust Membership Inference Attack (RMIA) proposed in this paper represents a significant advancement in the field of membership inference attacks. By leveraging reference models and population data samples, RMIA can accurately detect if a data point was used in training a machine learning model, even under computational constraints.

The paper's findings highlight the importance of carefully evaluating the privacy risks of machine learning models, as membership inference attacks can have significant implications for individual privacy and data protection. RMIA provides a practical tool for this purpose and lays the groundwork for further research in this area.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Learning-Based Difficulty Calibration for Enhanced Membership Inference Attacks

Haonan Shi, Tu Ouyang, An Wang

Machine learning models, in particular deep neural networks, are currently an integral part of various applications, from healthcare to finance. However, using sensitive data to train these models raises concerns about privacy and security. One method that has emerged to verify if the trained models are privacy-preserving is Membership Inference Attacks (MIA), which allows adversaries to determine whether a specific data point was part of a model's training dataset. While a series of MIAs have been proposed in the literature, only a few can achieve high True Positive Rates (TPR) in the low False Positive Rate (FPR) region (0.01%~1%). This is a crucial factor to consider for an MIA to be practically useful in real-world settings. In this paper, we present a novel approach to MIA that is aimed at significantly improving TPR at low FPRs. Our method, named learning-based difficulty calibration for MIA (LDC-MIA), characterizes data records by their hardness levels using a neural network classifier to determine membership. The experiment results show that LDC-MIA can improve TPR at low FPR by up to 4x compared to the other difficulty calibration based MIAs. It also has the highest Area Under ROC curve (AUC) across all datasets. Our method's cost is comparable with most of the existing MIAs, but is orders of magnitude more efficient than one of the state-of-the-art methods, LiRA, while achieving similar performance.

5/9/2024

cs.CR cs.AI cs.LG


Fundamental Limits of Membership Inference Attacks on Machine Learning Models

Eric Aubinais, Elisabeth Gassiat, Pablo Piantanida

Membership inference attacks (MIA) can reveal whether a particular data point was part of the training dataset, potentially exposing sensitive information about individuals. This article provides theoretical guarantees by exploring the fundamental statistical limitations associated with MIAs on machine learning models. More precisely, we first derive the statistical quantity that governs the effectiveness and success of such attacks. We then theoretically prove that in a non-linear regression setting with overfitting algorithms, attacks may have a high probability of success. Finally, we investigate several situations for which we provide bounds on this quantity of interest. Interestingly, our findings indicate that discretizing the data might enhance the algorithm's security. Specifically, it is demonstrated to be limited by a constant, which quantifies the diversity of the underlying data distribution. We illustrate those results through two simple simulations.

6/12/2024

stat.ML cs.AI cs.LG

Semantic Membership Inference Attack against Large Language Models

Hamid Mozaffari, Virendra J. Marathe

Membership Inference Attacks (MIAs) determine whether a specific data point was included in the training set of a target model. In this paper, we introduce the Semantic Membership Inference Attack (SMIA), a novel approach that enhances MIA performance by leveraging the semantic content of inputs and their perturbations. SMIA trains a neural network to analyze the target model's behavior on perturbed inputs, effectively capturing variations in output probability distributions between members and non-members. We conduct comprehensive evaluations on the Pythia and GPT-Neo model families using the Wikipedia dataset. Our results show that SMIA significantly outperforms existing MIAs; for instance, SMIA achieves an AUC-ROC of 67.39% on Pythia-12B, compared to 58.90% by the second-best attack.

6/17/2024

cs.LG

Lost in the Averages: A New Specific Setup to Evaluate Membership Inference Attacks Against Machine Learning Models

Florent Guépin, Nataša Krčo, Matthieu Meeus, Yves-Alexandre de Montjoye (all Department of Computing, Imperial College London, United Kingdom)

Membership Inference Attacks (MIAs) are widely used to evaluate the propensity of a machine learning (ML) model to memorize an individual record and the privacy risk releasing the model poses. MIAs are commonly evaluated similarly to ML models: the MIA is performed on a test set of models trained on datasets unseen during training, which are sampled from a larger pool, $D_{eval}$. The MIA is evaluated across all datasets in this test set, and is thus evaluated across the distribution of samples from $D_{eval}$. While this was a natural extension of ML evaluation to MIAs, recent work has shown that a record's risk heavily depends on its specific dataset. For example, outliers are particularly vulnerable, yet an outlier in one dataset may not be one in another. The sources of randomness currently used to evaluate MIAs may thus lead to inaccurate individual privacy risk estimates. We propose a new, specific evaluation setup for MIAs against ML models, using weight initialization as the sole source of randomness. This allows us to accurately evaluate the risk associated with the release of a model trained on a specific dataset. Using SOTA MIAs, we empirically show that the risk estimates given by the current setup lead to many records being misclassified as low risk. We derive theoretical results which, combined with empirical evidence, suggest that the risk calculated in the current setup is an average of the risks specific to each sampled dataset, validating our use of weight initialization as the only source of randomness. Finally, we consider an MIA with a stronger adversary leveraging information about the target dataset to infer membership. Taken together, our results show that current MIA evaluation is averaging the risk across datasets leading to inaccurate risk estimates, and the risk posed by attacks leveraging information about the target dataset to be potentially underestimated.

5/27/2024

cs.LG cs.CR