Extracting Training Data from Document-Based VQA Models

Read original: arXiv:2407.08707 - Published 7/12/2024 by Francesco Pinto, Nathalie Rauschmayr, Florian Tramèr, Philip Torr, Federico Tombari

Overview

This paper examines the ability of vision-language models (VLMs) to memorize and regurgitate specific details from their training data, even when the relevant visual information is removed.
The researchers found that VLMs can memorize and reproduce personally identifiable information (PII) that appears only once in the training set, posing a potential privacy risk.
They quantitatively measured the extractability of information in controlled experiments and differentiated between cases where it arises from generalization capabilities or from memorization.
The paper also investigates the factors that influence memorization across multiple state-of-the-art VLMs and proposes a heuristic countermeasure that empirically prevents the extraction of PII.

Plain English Explanation

Vision-language models (VLMs) are AI systems that can answer questions about the contents of an image. These models have become increasingly capable, but the researchers found that they can also remember and repeat specific details from their training data, even if the relevant visual information is removed.

This means that VLMs could potentially divulge sensitive personal information that appeared only once in their training data. To understand this issue better, the researchers conducted experiments to measure how much information can be extracted from these models and whether it comes from genuine generalization or from memorization.

The researchers also looked at the different factors that influence how much VLMs tend to memorize, and they developed a technique that can help prevent these models from revealing sensitive personal information.

Technical Explanation

The researchers conducted a series of experiments to investigate the memorization capabilities of state-of-the-art vision-language models (VLMs) on document-based Visual Question Answering (VQA) tasks. They found that these models can memorize and regurgitate specific details from the training data, including personally identifiable information (PII) that appears only once.

To quantify this, the researchers designed controlled experiments that allowed them to differentiate between cases where the model's responses came from genuine generalization capabilities versus mere memorization. They explored various factors that influence memorization, such as the model architecture, training data size, and prompting techniques.
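One way to picture the generalization-versus-memorization distinction is a counterfactual comparison, sketched below. This is an illustrative sketch and not the paper's actual protocol: it assumes a hypothetical setup in which one model was trained on the sample and a control model was not, and both are queried on the document with the answer region removed. The names `Sample`, `classify`, `memorized_rate`, and `lookup_model`, as well as the exact-match criterion, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Sample:
    doc_id: str    # identifier of the training document (hypothetical)
    question: str  # question asked about the document
    answer: str    # ground-truth answer, assumed occluded in the image at query time


# Hypothetical interface: a model maps (doc_id, question) to its answer
# for the document image with the answer region removed.
Model = Callable[[str, str], str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(text.lower().split())


def is_extracted(model: Model, sample: Sample) -> bool:
    """True if the model reproduces the occluded answer exactly (after normalization)."""
    return normalize(model(sample.doc_id, sample.question)) == normalize(sample.answer)


def classify(samples: List[Sample], trained_with: Model, control: Model) -> Dict[str, str]:
    """Label each sample by comparing the target model against the control model."""
    labels: Dict[str, str] = {}
    for s in samples:
        hit_target = is_extracted(trained_with, s)
        hit_control = is_extracted(control, s)
        if hit_target and not hit_control:
            labels[s.doc_id] = "memorized"      # only recoverable after seeing the sample
        elif hit_target and hit_control:
            labels[s.doc_id] = "generalized"    # recoverable without seeing the sample
        else:
            labels[s.doc_id] = "not extracted"  # the target model did not reproduce it
    return labels


def memorized_rate(labels: Dict[str, str]) -> float:
    """Fraction of samples labeled as memorized."""
    if not labels:
        return 0.0
    return sum(1 for v in labels.values() if v == "memorized") / len(labels)


def lookup_model(table: Dict[str, str]) -> Model:
    """Toy stand-in for a VLM: answers from a fixed table keyed by doc_id."""
    def model(doc_id: str, question: str) -> str:
        return table.get(doc_id, "")
    return model


# Toy data, invented purely for illustration.
samples = [
    Sample("doc_a", "Who signed the letter?", "Jane Doe"),
    Sample("doc_b", "Which country is the sender in?", "USA"),
    Sample("doc_c", "What is the street address?", "42 Elm St"),
]
trained_with = lookup_model({"doc_a": "jane doe", "doc_b": "USA", "doc_c": "unknown"})
control = lookup_model({"doc_b": "usa"})

labels = classify(samples, trained_with, control)
print(labels)
print(memorized_rate(labels))
```

In this toy run, the signer's name is only recoverable by the model that saw the document, so it counts as memorized, while the country is also produced by the control model and is attributed to generalization. Real evaluations would likely need a softer match than exact string equality.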

The researchers' findings suggest that VLMs could pose a privacy risk by inadvertently revealing sensitive information that was present in their training data. To address this, they proposed an effective heuristic countermeasure that empirically prevents the extractability of PII from these models.

Critical Analysis

The paper provides a valuable contribution to understanding the privacy and security implications of vision-language models (VLMs). By demonstrating the models' ability to memorize and reproduce even rare personal details, the researchers highlight an important consideration for the deployment of these technologies.

However, the paper does not fully explore the potential real-world consequences of this memorization behavior. While the researchers propose a countermeasure, it is unclear how effective it would be in practice or how it might impact the models' overall performance and capabilities.

Additionally, the paper focuses solely on document-based VQA tasks, and it is uncertain whether the same issues would arise in other applications of VLMs, such as image captioning or multimodal understanding. Further research is needed to understand the broader implications of memorization in these models.

Conclusion

This paper uncovers a concerning issue with vision-language models (VLMs): their ability to memorize and reproduce even rare personal details from their training data. This poses a potential privacy risk, as these models could inadvertently reveal sensitive information.

The researchers' systematic approach to quantifying and differentiating between generalization and memorization provides valuable insights into the inner workings of VLMs. Their proposed countermeasure offers a promising solution, but additional research is needed to fully understand the implications and develop robust safeguards for the deployment of these powerful AI systems.

As vision-language models continue to advance and become more prevalent, addressing privacy and security concerns like those raised in this paper will be crucial to ensuring the responsible and ethical development of these technologies.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Extracting Training Data from Document-Based VQA Models

Francesco Pinto, Nathalie Rauschmayr, Florian Tramèr, Philip Torr, Federico Tombari

Vision-Language Models (VLMs) have made remarkable progress in document-based Visual Question Answering (i.e., responding to queries about the contents of an input document provided as an image). In this work, we show these models can memorize responses for training samples and regurgitate them even when the relevant visual information has been removed. This includes Personal Identifiable Information (PII) repeated once in the training set, indicating these models could divulge memorised sensitive information and therefore pose a privacy risk. We quantitatively measure the extractability of information in controlled experiments and differentiate between cases where it arises from generalization capabilities or from memorization. We further investigate the factors that influence memorization across multiple state-of-the-art models and propose an effective heuristic countermeasure that empirically prevents the extractability of PII.

7/12/2024

Privacy-Aware Document Visual Question Answering

Rubén Tito, Khanh Nguyen, Marlon Tobaben, Raouf Kerkouche, Mohamed Ali Souibgui, Kangsoo Jung, Joonas Jalko, Vincent Poulain D'Andecy, Aurelie Joseph, Lei Kang, Ernest Valveny, Antti Honkela, Mario Fritz, Dimosthenis Karatzas

Document Visual Question Answering (DocVQA) has quickly grown into a central task of document understanding. But despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees. In this work, we explore privacy in the domain of DocVQA for the first time, highlighting privacy issues in state-of-the-art multi-modal LLMs used for DocVQA, and explore possible solutions. Specifically, we focus on invoice processing as a realistic document understanding scenario, and propose a large-scale DocVQA dataset comprising invoice documents and associated questions and answers. We employ a federated learning scheme that reflects the real-life distribution of documents in different businesses, and we explore the use case where the data of the invoice provider is the sensitive information to be protected. We demonstrate that non-private models tend to memorise, a behaviour that can lead to exposing private information. We then evaluate baseline training schemes employing federated learning and differential privacy in this multi-modal scenario, where the sensitive information might be exposed through either or both of the two input modalities: vision (document image) or language (OCR tokens). Finally, we design attacks exploiting the memorisation effect of the model, and demonstrate their effectiveness in probing representative DocVQA models.

9/4/2024

Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

Zeyuan Allen-Zhu, Yuanzhi Li

Large language models (LLMs) can store a vast amount of world knowledge, often extractable via question-answering (e.g., What is Abraham Lincoln's birthday?). However, do they answer such questions based on exposure to similar questions during training (i.e., cheating), or by genuinely learning to extract knowledge from sources like Wikipedia? In this paper, we investigate this issue using a controlled biography dataset. We find a strong correlation between the model's ability to extract knowledge and various diversity measures of the training data. Essentially, for knowledge to be reliably extracted, it must be sufficiently augmented (e.g., through paraphrasing, sentence shuffling, translations) during pretraining. Without such augmentation, knowledge may be memorized but not extractable, leading to 0% accuracy, regardless of subsequent instruction fine-tuning. To understand why this occurs, we employ (nearly) linear probing to demonstrate a strong connection between the observed correlation and how the model internally encodes knowledge -- whether it is linearly encoded in the hidden embeddings of entity names or distributed across other token embeddings in the training text. This paper provides several key recommendations for LLM pretraining in the industry: (1) rewrite the pretraining data -- using small, auxiliary models -- to provide knowledge augmentation, and (2) incorporate more instruction-finetuning data into the pretraining stage before it becomes too late.

7/17/2024

Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models

Sunny Duan, Mikail Khona, Abhiram Iyer, Rylan Schaeffer, Ila R Fiete

Frontier AI systems are making transformative impacts across society, but such benefits are not without costs: models trained on web-scale datasets containing personal and private data raise profound concerns about data privacy and security. Language models are trained on extensive corpora including potentially sensitive or proprietary information, and the risk of data leakage - where the model response reveals pieces of such information - remains inadequately understood. Prior work has investigated what factors drive memorization and has identified that sequence complexity and the number of repetitions drive memorization. Here, we focus on the evolution of memorization over training. We begin by reproducing findings that the probability of memorizing a sequence scales logarithmically with the number of times it is present in the data. We next show that sequences which are apparently not memorized after the first encounter can be uncovered throughout the course of training even without subsequent encounters, a phenomenon we term latent memorization. The presence of latent memorization presents a challenge for data privacy as memorized sequences may be hidden at the final checkpoint of the model but remain easily recoverable. To this end, we develop a diagnostic test relying on the cross entropy loss to uncover latent memorized sequences with high accuracy.

7/26/2024