PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action

Read original: arXiv:2409.00138 - Published 9/4/2024 by Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang

Overview

The paper examines how to ensure that language models (LMs) act in accordance with contextual privacy norms now that they are widely used in personalized communication scenarios such as sending emails and writing social media posts.
Quantifying the privacy norm awareness of LMs and the emerging privacy risk in LM-mediated communication is challenging due to the contextual and long-tailed nature of privacy-sensitive cases, as well as the lack of evaluation approaches that capture realistic application scenarios.
To address these challenges, the researchers propose a novel framework called PrivacyLens that extends privacy-sensitive seeds into expressive vignettes and further into agent trajectories, enabling multi-level evaluation of privacy leakage in LM agents' actions.

Plain English Explanation

As language models (LMs) take on more everyday tasks, such as sending emails or posting on social media on a user's behalf, it's crucial that they respect people's privacy. However, it's difficult to measure how well LMs understand and adhere to privacy norms, because these norms depend on context and many sensitive situations are rare and hard to anticipate. Additionally, existing methods for evaluating LMs' privacy awareness don't always reflect real-world usage scenarios.

To tackle this problem, the researchers developed a new framework called PrivacyLens. PrivacyLens starts from short privacy-sensitive seeds, expands each seed into a more detailed vignette, and then expands the vignette into an agent trajectory. This enables evaluation at several levels of how well LMs, including state-of-the-art models like GPT-4 and Llama-3-70B, protect user privacy when carrying out tasks. The researchers found that even when prompted to be privacy-conscious, GPT-4 and Llama-3-70B still leak sensitive information in 25.68% and 38.69% of cases, respectively.

Technical Explanation

The researchers propose PrivacyLens, a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. This allows for multi-level evaluation of privacy leakage in LM agents' actions.
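The three levels described above can be pictured as nested records. The following is only a minimal sketch: the class names and every field name are hypothetical, loosely borrowed from contextual integrity theory, and this summary does not state the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Seed:
    # Hypothetical fields; the real PrivacyLens schema may differ.
    data_type: str               # what kind of information is involved
    data_subject: str            # whom the information is about
    sender: str                  # who would share it
    recipient: str               # who would receive it
    transmission_principle: str  # how it would be shared (e.g. "send an email")


@dataclass
class Vignette:
    seed: Seed
    story: str  # short narrative that adds concrete detail to the seed


@dataclass
class Trajectory:
    vignette: Vignette
    user_instruction: str                           # task given to the LM agent
    steps: List[str] = field(default_factory=list)  # earlier agent steps, as text


def evaluation_levels(traj: Trajectory) -> Dict[str, object]:
    """Return the three levels at which one case could be evaluated."""
    return {
        "seed": traj.vignette.seed,
        "vignette": traj.vignette,
        "trajectory": traj,
    }


# Illustrative, made-up case (not taken from the dataset).
example = Trajectory(
    vignette=Vignette(
        seed=Seed("health condition", "a coworker", "an employee",
                  "a mailing list", "send an email"),
        story="An employee learns about a coworker's health condition in confidence.",
    ),
    user_instruction="Draft the weekly team update email.",
    steps=["read_notes(): notes mention the coworker's condition"],
)
levels = evaluation_levels(example)
```

Keeping the seed inside the vignette and the vignette inside the trajectory means each case can be evaluated at any level without losing track of where it came from.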

The PrivacyLens framework is instantiated with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. Using this dataset, the researchers reveal a discrepancy between LM performance in answering probing questions and their actual behavior when executing user instructions in an agent setup.
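One way to make this discrepancy concrete is to count the cases where a model gives the privacy-appropriate answer to a probing question but still leaks in its action. The sketch below is an assumption about how such a count could be computed, not the paper's metric; `CaseResult`, `knows_but_leaks_rate`, and the toy values are hypothetical.

```python
from typing import List, NamedTuple


class CaseResult(NamedTuple):
    probe_correct: bool  # model gave the privacy-appropriate answer when asked directly
    action_leaked: bool  # model's action still revealed the sensitive information


def knows_but_leaks_rate(results: List[CaseResult]) -> float:
    """Fraction of cases answered correctly in probing yet leaked in action."""
    if not results:
        return 0.0
    hits = sum(1 for r in results if r.probe_correct and r.action_leaked)
    return hits / len(results)


# Toy values for illustration only; these are not results from the paper.
toy_results = [
    CaseResult(probe_correct=True, action_leaked=False),
    CaseResult(probe_correct=True, action_leaked=True),
    CaseResult(probe_correct=False, action_leaked=True),
    CaseResult(probe_correct=True, action_leaked=True),
]
gap = knows_but_leaks_rate(toy_results)
```

A high value would mean the model often states the norm correctly when asked but fails to follow it when acting.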

The study found that state-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, respectively, even when prompted with privacy-enhancing instructions. The researchers also demonstrate the dynamic nature of PrivacyLens by extending each seed into multiple trajectories to "red-team" LM privacy leakage risk.
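A leakage rate like the ones reported is the share of evaluated cases in which the agent's action reveals the sensitive information. The sketch below shows the arithmetic with a deliberately naive substring check standing in for the leak judgment. This summary does not describe how the paper judges leakage, so `leaks`, `leakage_rate`, and the toy cases are assumptions for illustration.

```python
from typing import Iterable, List, Tuple


def leaks(final_action: str, sensitive_items: Iterable[str]) -> bool:
    """Naive stand-in judge: case-insensitive substring match."""
    text = final_action.lower()
    return any(item.lower() in text for item in sensitive_items)


def leakage_rate(cases: List[Tuple[str, List[str]]]) -> float:
    """Percentage of cases whose final action leaks a sensitive item."""
    if not cases:
        return 0.0
    leaked = sum(1 for action, items in cases if leaks(action, items))
    return 100.0 * leaked / len(cases)


# Toy cases for illustration only; not drawn from the PrivacyLens dataset.
toy_cases = [
    ("Hi Sam, the meeting moved to 3pm.", ["diagnosis"]),
    ("Hi Sam, Alex is out because of his diagnosis.", ["diagnosis"]),
    ("Team update: the budget was approved.", ["salary"]),
    ("FYI, Jo's salary is going up next month.", ["salary"]),
]
toy_rate = leakage_rate(toy_cases)
```

A substring check misses paraphrased leaks and can flag harmless mentions, so a realistic evaluation would need a stronger judge. The percentage computation itself stays the same.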

Critical Analysis

The researchers acknowledge the contextual and long-tailed nature of privacy-sensitive cases, which makes it challenging to fully capture the nuances of privacy norms in their framework. While PrivacyLens represents a significant step forward in evaluating LM privacy awareness, there may be additional edge cases or subtle privacy considerations that are not yet addressed.

It's also worth noting that the study focuses on specific state-of-the-art LMs, and the performance of other LMs or future iterations of the models tested may differ. Ongoing research and monitoring will be essential to ensure LMs continue to respect user privacy as the technology evolves.

Conclusion

This research highlights the critical importance of ensuring language models respect privacy norms as they become more deeply integrated into our personal and professional lives. The PrivacyLens framework provides a valuable tool for evaluating LM privacy awareness and risk, but there is still work to be done to fully address the challenges posed by the contextual nature of privacy and the rapid pace of LM development.

By continuing to invest in privacy-preserving AI research, we can work towards language models that reliably protect user privacy while still providing the valuable communication and personalization capabilities that make them so useful.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action

Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang

As language models (LMs) are widely utilized in personalized communication scenarios (e.g., sending emails, writing social media posts) and endowed with a certain level of agency, ensuring they act in accordance with the contextual privacy norms becomes increasingly critical. However, quantifying the privacy norm awareness of LMs and the emerging privacy risk in LM-mediated communication is challenging due to (1) the contextual and long-tailed nature of privacy-sensitive cases, and (2) the lack of evaluation approaches that capture realistic application scenarios. To address these challenges, we propose PrivacyLens, a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories, enabling multi-level evaluation of privacy leakage in LM agents' actions. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. Using this dataset, we reveal a discrepancy between LM performance in answering probing questions and their actual behavior when executing user instructions in an agent setup. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions. We also demonstrate the dynamic nature of PrivacyLens by extending each seed into multiple trajectories to red-team LM privacy leakage risk. Dataset and code are available at https://github.com/SALT-NLP/PrivacyLens.

9/4/2024

Privacy-Aware Visual Language Models

Laurens Samson, Nimrod Barazani, Sennay Ghebreab, Yuki M. Asano

This paper aims to advance our understanding of how Visual Language Models (VLMs) handle privacy-sensitive information, a crucial concern as these technologies become integral to everyday life. To this end, we introduce a new benchmark PrivBench, which contains images from 8 sensitive categories such as passports, or fingerprints. We evaluate 10 state-of-the-art VLMs on this benchmark and observe a generally limited understanding of privacy, highlighting a significant area for model improvement. Based on this we introduce PrivTune, a new instruction-tuning dataset aimed at equipping VLMs with knowledge about visual privacy. By tuning two pretrained VLMs, TinyLLaVa and MiniGPT-v2, on this small dataset, we achieve strong gains in their ability to recognize sensitive content, outperforming even GPT4-V. At the same time, we show that privacy-tuning only minimally affects the VLMs performance on standard benchmarks such as VQA. Overall, this paper lays out a crucial challenge for making VLMs effective in handling real-world data safely and provides a simple recipe that takes the first step towards building privacy-aware VLMs.

5/28/2024

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi

The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from multiple sources in their inputs and are expected to reason about what to share in their outputs, for what purpose and with whom, within a given context. In this work, we draw attention to the highly critical yet overlooked notion of contextual privacy by proposing ConfAIde, a benchmark designed to identify critical weaknesses in the privacy reasoning capabilities of instruction-tuned LLMs. Our experiments show that even the most capable models such as GPT-4 and ChatGPT reveal private information in contexts that humans would not, 39% and 57% of the time, respectively. This leakage persists even when we employ privacy-inducing prompts or chain-of-thought reasoning. Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.

7/2/2024

PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models

Haoran Li, Dadi Guo, Donghao Li, Wei Fan, Qi Hu, Xin Liu, Chunkit Chan, Duanyi Yao, Yuan Yao, Yangqiu Song

The rapid development of language models (LMs) brings unprecedented accessibility and usage for both models and users. On the one hand, powerful LMs achieve state-of-the-art performance over numerous downstream NLP tasks. On the other hand, more and more attention is paid to unrestricted model accesses that may bring malicious privacy risks of data leakage. To address these issues, many recent works propose privacy-preserving language models (PPLMs) with differential privacy (DP). Unfortunately, different DP implementations make it challenging for a fair comparison among existing PPLMs. In this paper, we present PrivLM-Bench, a multi-perspective privacy evaluation benchmark to empirically and intuitively quantify the privacy leakage of LMs. Instead of only reporting DP parameters, PrivLM-Bench sheds light on the neglected inference data privacy during actual usage. PrivLM-Bench first clearly defines multi-faceted privacy objectives. Then, PrivLM-Bench constructs a unified pipeline to perform private fine-tuning. Lastly, PrivLM-Bench performs existing privacy attacks on LMs with pre-defined privacy objectives as the empirical evaluation results. The empirical attack results are used to fairly and intuitively evaluate the privacy leakage of various PPLMs. We conduct extensive experiments on three datasets of GLUE for mainstream LMs.

6/4/2024