Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild

Read original: arXiv:2407.11438 - Published 7/23/2024 by Niloofar Mireshghallah, Maria Antoniak, Yash More, Yejin Choi, Golnoosh Farnadi

Overview

This paper explores how people disclose personal information when conversing with large language models (LLMs) in the real world, and the potential privacy implications.
The researchers analyzed naturally occurring conversations between real users and commercial GPT models to identify personal disclosures.
Key findings: personally identifiable information (PII) appears in unexpected contexts such as translation and code editing, and PII detection alone misses sensitive topics such as detailed sexual preferences or specific drug use habits.

Plain English Explanation

In this study, the researchers looked at how people share personal information with AI chatbots in real-world use. They analyzed actual conversations between users and commercial GPT models to see what kinds of sensitive or private details people reveal, and in which kinds of tasks.

The researchers found that personal information shows up in places one might not expect. Requests for translation or code editing, for example, often contained personally identifiable information. Users also discussed highly sensitive topics, such as detailed sexual preferences or specific drug use habits, that a tool looking only for identifiers such as names, emails, or phone numbers would miss.

This matters because it suggests people are often less cautious about privacy when talking to AI than one might assume. The researchers argue that these disclosure rates are important for researchers and data curators, and they call for nudging mechanisms that help users moderate what they share.

Technical Explanation

The researchers ran a fine-grained analysis of naturally occurring conversations between real users and commercial GPT models, examining the personal disclosures users made "in the wild."

Combining qualitative and quantitative analysis, the team identified numerous instances where users disclosed personally identifiable information (PII) and other sensitive details. PII appeared in unexpected contexts: 48% of the time in translation and 16% of the time in code editing. The team also observed that PII detection alone does not capture the sensitive topics common in these conversations, such as detailed sexual preferences or specific drug use habits.
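
To make concrete what flagging a disclosure might involve, here is a minimal sketch of pattern-based PII detection. The regular expressions, the function name `find_pii`, and the sample messages are illustrative assumptions, not the detector used in the paper.

```python
import re

# Hypothetical patterns for illustration only; real PII detectors cover far more cases.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(text):
    """Return a sorted list of PII category names whose pattern matches the text."""
    return sorted(name for name, pattern in PII_PATTERNS.items() if pattern.search(text))


# Made-up example messages (not taken from any dataset).
messages = [
    "Translate this: please reply to jane.doe@example.com or call 555-123-4567.",
    "I have been skipping my medication and I am scared to tell my doctor.",
]

for message in messages:
    print(find_pii(message), "<-", message)
```

The first message is flagged because it contains identifiers with a recognizable shape. The second contains a sensitive health disclosure but no identifier, so a pattern-based pass returns nothing for it, which is the kind of gap that motivates looking at sensitive topics separately from PII.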

The paper presents a taxonomy of tasks and sensitive topics, developed from the conversations themselves, to characterize the contexts in which users disclose to chatbots. The researchers also discuss the potential privacy harms raised by their findings.
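
One way to read a taxonomy-based analysis is as a per-category disclosure rate: the share of conversations in each category that contain PII. The sketch below computes that rate on made-up records. The category labels, the `contains_pii` flag, and the counts are hypothetical and do not reproduce the paper's data or pipeline.

```python
from collections import defaultdict

# Hypothetical labelled conversations: (task category, whether PII was found).
records = [
    ("translation", True),
    ("translation", False),
    ("code editing", False),
    ("code editing", False),
    ("code editing", True),
    ("code editing", False),
]


def pii_rate_by_task(rows):
    """Return {task: fraction of conversations in that task that contain PII}."""
    totals = defaultdict(int)
    with_pii = defaultdict(int)
    for task, contains_pii in rows:
        totals[task] += 1
        if contains_pii:
            with_pii[task] += 1
    return {task: with_pii[task] / totals[task] for task in totals}


for task, rate in pii_rate_by_task(records).items():
    print(f"{task}: {rate:.0%}")
```

Grouping by task before computing the rate is what lets an analysis of this kind surface categories where PII is unexpectedly common, instead of reporting a single corpus-wide figure.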

Critical Analysis

The study provides valuable insights into how people actually interact with AI language models in uncontrolled, real-world settings. By analyzing naturalistic conversations, the researchers were able to uncover behavioral patterns that may not emerge in more constrained, experimental studies.

That said, the dataset is limited to a specific set of interactions and may not generalize to all possible human-AI dialogues. Additional research is needed to further validate and expand on these findings across a wider range of AI systems, user populations, and conversational contexts.

The paper also does not delve deeply into the psychological or social factors that may drive people to disclose personal information to AI. Exploring these underlying mechanisms could lead to a richer understanding of the phenomenon and inform the development of more privacy-preserving AI interactions.

Conclusion

This study sheds light on the dynamics of human-AI interaction and the potential risks to personal privacy. The researchers found that people often share sensitive information with language models, including in tasks such as translation and code editing where it might not be expected, highlighting the need for careful consideration of privacy implications as AI becomes more prevalent in daily life.

The findings underscore the importance of responsible AI development, with a focus on building systems that respect user privacy and prevent unintended disclosures of personal data. Continued research in this area can help inform best practices and design principles to ensure AI technologies empower users without compromising their digital wellbeing.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild

Niloofar Mireshghallah, Maria Antoniak, Yash More, Yejin Choi, Golnoosh Farnadi

Measuring personal disclosures made in human-chatbot interactions can provide a better understanding of users' AI literacy and facilitate privacy research for large language models (LLMs). We run an extensive, fine-grained analysis on the personal disclosures made by real users to commercial GPT models, investigating the leakage of personally identifiable and sensitive information. To understand the contexts in which users disclose to chatbots, we develop a taxonomy of tasks and sensitive topics, based on qualitative and quantitative analysis of naturally occurring conversations. We discuss these potential privacy harms and observe that: (1) personally identifiable information (PII) appears in unexpected contexts such as in translation or code editing (48% and 16% of the time, respectively) and (2) PII detection alone is insufficient to capture the sensitive topics that are common in human-chatbot interactions, such as detailed sexual preferences or specific drug use habits. We believe that these high disclosure rates are of significant importance for researchers and data curators, and we call for the design of appropriate nudging mechanisms to help users moderate their interactions.

7/23/2024

It's a Fair Game, or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents

Zhiping Zhang, Michelle Jia, Hao-Ping Lee, Bingsheng Yao, Sauvik Das, Ada Lerner, Dakuo Wang, Tianshi Li

The widespread use of Large Language Model (LLM)-based conversational agents (CAs), especially in high-stakes domains, raises many privacy concerns. Building ethical LLM-based CAs that respect user privacy requires an in-depth understanding of the privacy risks that concern users the most. However, existing research, primarily model-centered, does not provide insight into users' perspectives. To bridge this gap, we analyzed sensitive disclosures in real-world ChatGPT conversations and conducted semi-structured interviews with 19 LLM-based CA users. We found that users are constantly faced with trade-offs between privacy, utility, and convenience when using LLM-based CAs. However, users' erroneous mental models and the dark patterns in system design limited their awareness and comprehension of the privacy risks. Additionally, the human-like interactions encouraged more sensitive disclosures, which complicated users' ability to navigate the trade-offs. We discuss practical design guidelines and the needs for paradigm shifts to protect the privacy of LLM-based CA users.

4/3/2024

Reducing Privacy Risks in Online Self-Disclosures with Language Models

Yao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu

Self-disclosure, while being common and rewarding in social media interaction, also poses privacy risks. In this paper, we take the initiative to protect the user-side privacy associated with online self-disclosure through detection and abstraction. We develop a taxonomy of 19 self-disclosure categories and curate a large corpus consisting of 4.8K annotated disclosure spans. We then fine-tune a language model for detection, achieving over 65% partial span F$_1$. We further conduct an HCI user study, with 82% of participants viewing the model positively, highlighting its real-world applicability. Motivated by the user feedback, we introduce the task of self-disclosure abstraction, which is rephrasing disclosures into less specific terms while preserving their utility, e.g., "Im 16F" to "I'm a teenage girl". We explore various fine-tuning strategies, and our best model can generate diverse abstractions that moderately reduce privacy risks while maintaining high utility according to human evaluation. To help users in deciding which disclosures to abstract, we present a task of rating their importance for context understanding. Our fine-tuned model achieves 80% accuracy, on par with GPT-3.5. Given safety and privacy considerations, we will only release our corpus and models to researchers who agree to the ethical guidelines outlined in the Ethics Statement.

6/26/2024

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi

The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from multiple sources in their inputs and are expected to reason about what to share in their outputs, for what purpose and with whom, within a given context. In this work, we draw attention to the highly critical yet overlooked notion of contextual privacy by proposing ConfAIde, a benchmark designed to identify critical weaknesses in the privacy reasoning capabilities of instruction-tuned LLMs. Our experiments show that even the most capable models such as GPT-4 and ChatGPT reveal private information in contexts that humans would not, 39% and 57% of the time, respectively. This leakage persists even when we employ privacy-inducing prompts or chain-of-thought reasoning. Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.

7/2/2024