Reducing Privacy Risks in Online Self-Disclosures with Language Models

Read original: arXiv:2311.09538 - Published 6/26/2024 by Yao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu

Overview

• This paper explores the privacy risks associated with self-disclosure on social media and proposes methods to detect and abstract such disclosures to protect user privacy.

• The researchers develop a taxonomy of 19 self-disclosure categories, curate a corpus of 4.8K annotated disclosure spans, and fine-tune a language model that detects these disclosures with over 65% partial span F1.

• The paper also introduces the task of self-disclosure abstraction, where disclosures are rephrased into less specific terms while maintaining their utility, and explores various fine-tuning strategies to achieve this.

• Additionally, the researchers present a task of rating how important each disclosure is for understanding the context. Their fine-tuned model achieves 80% accuracy on this task, on par with GPT-3.5.

Plain English Explanation

• People often share personal information, or "self-disclose," on social media, which can be rewarding but also poses privacy risks.

• The researchers in this paper wanted to find ways to identify self-disclosures and rephrase them in less specific ways to help protect people's privacy, while still keeping the overall meaning.

• They created a detailed list of 19 different types of self-disclosure, like age, location, or relationships, and gathered a large dataset of examples. Then, they trained a machine learning model to automatically detect these disclosures in text with good accuracy.

• To abstract the disclosures, the researchers tried different fine-tuning strategies for rephrasing them in more general terms, like changing "I'm 16F" to "I'm a teenage girl." According to human evaluation, their best model moderately reduces privacy risk while keeping the text highly useful.

• Finally, the researchers developed a system to rate how important each disclosure is for understanding the context. This can help people decide which disclosures are most important to keep and which ones might be good to rephrase.
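
To make this workflow concrete, here is a minimal Python sketch of how importance ratings and abstractions could be combined once they exist. It is an illustration, not the authors' implementation: the `Disclosure` class, the `apply_abstractions` function, the rating labels ("low", "moderate", "high"), and the rule "keep only high-importance spans as written" are all assumptions made for this example. The spans are assumed not to overlap.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Disclosure:
    """One detected self-disclosure span (hypothetical structure)."""
    start: int         # character offset where the span begins
    end: int           # character offset just past the span
    importance: str    # assumed rating label: "low", "moderate", or "high"
    abstraction: str   # less specific rephrasing proposed for the span


def apply_abstractions(
    text: str,
    disclosures: List[Disclosure],
    keep: Tuple[str, ...] = ("high",),
) -> str:
    """Replace every span whose importance is not in `keep` with its abstraction."""
    # Work from right to left so that earlier character offsets stay valid
    # after a replacement changes the length of the text.
    for d in sorted(disclosures, key=lambda d: d.start, reverse=True):
        if d.importance in keep:
            continue  # important for context: leave the original wording
        text = text[:d.start] + d.abstraction + text[d.end:]
    return text


post = "I'm 16F and I need advice about school."
span = "I'm 16F"
start = post.index(span)
detected = [Disclosure(start, start + len(span), "low", "I'm a teenage girl")]
print(apply_abstractions(post, detected))
```

In a real tool the decision would stay with the user. The sketch only shows that a rating per span is enough to separate disclosures worth keeping verbatim from those that could be generalized.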

Technical Explanation

• The researchers first developed a taxonomy of 19 self-disclosure categories, covering information such as age, location, and relationships. They then curated a corpus of 4.8K annotated disclosure spans from online text to serve as a benchmark.

• Using this dataset, the researchers fine-tuned a pre-trained language model to detect self-disclosures, achieving an F1-score of over 65% on partial span matching. An HCI user study found that 82% of participants viewed the model positively, highlighting its real-world applicability.

• Motivated by the user feedback, the researchers introduced the task of self-disclosure abstraction. They explored various fine-tuning strategies to generate more general paraphrases of disclosures, such as changing "I'm 16F" to "I'm a teenage girl," while preserving the overall utility.

• To help users decide which disclosures to abstract, the researchers also presented a task of rating the importance of disclosures for context understanding. Their fine-tuned model achieved 80% accuracy on this task, on par with GPT-3.5.
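
The detection result above is reported as a partial span F1, which gives credit to a predicted span that overlaps a gold span without matching its boundaries exactly. The summary does not spell out the matching rule, so the Python sketch below assumes one plausible criterion: a prediction matches a gold span of the same category when their overlap covers at least half of the longer span. The function names, the `min_ratio` threshold, and the category labels are hypothetical.

```python
from typing import List, Tuple

# (start, end, category) with an exclusive end offset
Span = Tuple[int, int, str]


def overlap(a: Span, b: Span) -> int:
    """Number of positions shared by two spans (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def partial_match(pred: Span, gold: Span, min_ratio: float = 0.5) -> bool:
    """Assumed rule: same category, overlap >= min_ratio of the longer span."""
    if pred[2] != gold[2]:
        return False
    longer = max(pred[1] - pred[0], gold[1] - gold[0])
    return longer > 0 and overlap(pred, gold) / longer >= min_ratio


def partial_span_f1(preds: List[Span], golds: List[Span],
                    min_ratio: float = 0.5) -> float:
    """F1 where each gold span can be matched by at most one prediction."""
    matched_gold = set()
    true_positives = 0
    for p in preds:
        for i, g in enumerate(golds):
            if i not in matched_gold and partial_match(p, g, min_ratio):
                matched_gold.add(i)
                true_positives += 1
                break
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(golds) if golds else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


golds = [(0, 5, "age"), (20, 30, "location")]
preds = [(0, 4, "age"), (40, 45, "location")]
print(partial_span_f1(preds, golds))  # one of two predictions matches
```

Here the first prediction overlaps the gold "age" span closely enough to count, while the second overlaps nothing, so precision and recall are both one half. An exact-match metric would have scored both predictions as wrong, which is why partial matching tends to be the more forgiving of the two.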

Critical Analysis

• The paper provides a comprehensive and well-designed solution for detecting and abstracting self-disclosures to mitigate privacy risks. However, the researchers acknowledge that their approach may not capture all nuances of self-disclosure, and there could be cases where the abstraction process inadvertently removes important contextual information.

• Additionally, the user study was relatively small, and it would be valuable to conduct a larger-scale evaluation to better understand the real-world applicability and user acceptance of the proposed system.

• While the researchers have provided ethical guidelines for the release of their corpus and models, there may be concerns about the potential misuse of such technologies, especially in sensitive domains. Ongoing monitoring and responsible deployment strategies will be crucial.

• It would also be interesting to see further research on the long-term impacts of self-disclosure abstraction on social interactions and whether it can effectively balance privacy protection and the need for authentic self-expression.

Conclusion

• This paper presents a comprehensive framework for detecting and abstracting self-disclosures on social media, aiming to protect user privacy while preserving the overall utility of the information shared.

• The researchers' development of a self-disclosure taxonomy, a large annotated dataset, and fine-tuned language models for detection and abstraction are significant contributions to the field of online privacy protection.

• While the proposed solutions show promise, ongoing research and careful implementation will be needed to address potential limitations and ensure the responsible use of these technologies to empower users and maintain the benefits of social media interaction.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Reducing Privacy Risks in Online Self-Disclosures with Language Models

Yao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu

Self-disclosure, while being common and rewarding in social media interaction, also poses privacy risks. In this paper, we take the initiative to protect the user-side privacy associated with online self-disclosure through detection and abstraction. We develop a taxonomy of 19 self-disclosure categories and curate a large corpus consisting of 4.8K annotated disclosure spans. We then fine-tune a language model for detection, achieving over 65% partial span F$_1$. We further conduct an HCI user study, with 82% of participants viewing the model positively, highlighting its real-world applicability. Motivated by the user feedback, we introduce the task of self-disclosure abstraction, which is rephrasing disclosures into less specific terms while preserving their utility, e.g., "I'm 16F" to "I'm a teenage girl". We explore various fine-tuning strategies, and our best model can generate diverse abstractions that moderately reduce privacy risks while maintaining high utility according to human evaluation. To help users in deciding which disclosures to abstract, we present a task of rating their importance for context understanding. Our fine-tuned model achieves 80% accuracy, on-par with GPT-3.5. Given safety and privacy considerations, we will only release our corpus and models to researchers who agree to the ethical guidelines outlined in the Ethics Statement.

6/26/2024

Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild

Niloofar Mireshghallah, Maria Antoniak, Yash More, Yejin Choi, Golnoosh Farnadi

Measuring personal disclosures made in human-chatbot interactions can provide a better understanding of users' AI literacy and facilitate privacy research for large language models (LLMs). We run an extensive, fine-grained analysis on the personal disclosures made by real users to commercial GPT models, investigating the leakage of personally identifiable and sensitive information. To understand the contexts in which users disclose to chatbots, we develop a taxonomy of tasks and sensitive topics, based on qualitative and quantitative analysis of naturally occurring conversations. We discuss these potential privacy harms and observe that: (1) personally identifiable information (PII) appears in unexpected contexts such as in translation or code editing (48% and 16% of the time, respectively) and (2) PII detection alone is insufficient to capture the sensitive topics that are common in human-chatbot interactions, such as detailed sexual preferences or specific drug use habits. We believe that these high disclosure rates are of significant importance for researchers and data curators, and we call for the design of appropriate nudging mechanisms to help users moderate their interactions.

7/23/2024

Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey

Victoria Smith, Ali Shahin Shamsabadi, Carolyn Ashurst, Adrian Weller

Large Language Models (LLMs) have shown greatly enhanced performance in recent years, attributed to increased size and extensive training data. This advancement has led to widespread interest and adoption across industries and the public. However, training data memorization in Machine Learning models scales with model size, particularly concerning for LLMs. Memorized text sequences have the potential to be directly leaked from LLMs, posing a serious threat to data privacy. Various techniques have been developed to attack LLMs and extract their training data. As these models continue to grow, this issue becomes increasingly critical. To help researchers and policymakers understand the state of knowledge around privacy attacks and mitigations, including where more work is needed, we present the first SoK on data privacy for LLMs. We (i) identify a taxonomy of salient dimensions where attacks differ on LLMs, (ii) systematize existing attacks, using our taxonomy of dimensions to highlight key trends, (iii) survey existing mitigation strategies, highlighting their strengths and limitations, and (iv) identify key gaps, demonstrating open problems and areas for concern.

6/19/2024

It's a Fair Game, or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents

Zhiping Zhang, Michelle Jia, Hao-Ping Lee, Bingsheng Yao, Sauvik Das, Ada Lerner, Dakuo Wang, Tianshi Li

The widespread use of Large Language Model (LLM)-based conversational agents (CAs), especially in high-stakes domains, raises many privacy concerns. Building ethical LLM-based CAs that respect user privacy requires an in-depth understanding of the privacy risks that concern users the most. However, existing research, primarily model-centered, does not provide insight into users' perspectives. To bridge this gap, we analyzed sensitive disclosures in real-world ChatGPT conversations and conducted semi-structured interviews with 19 LLM-based CA users. We found that users are constantly faced with trade-offs between privacy, utility, and convenience when using LLM-based CAs. However, users' erroneous mental models and the dark patterns in system design limited their awareness and comprehension of the privacy risks. Additionally, the human-like interactions encouraged more sensitive disclosures, which complicated users' ability to navigate the trade-offs. We discuss practical design guidelines and the needs for paradigm shifts to protect the privacy of LLM-based CA users.

4/3/2024