Large Language Model Agent for Fake News Detection

2405.01593

Published 5/6/2024 by Xinyi Li, Yongfeng Zhang, Edward C. Malthouse

Abstract

In the current digital era, the rapid spread of misinformation on online platforms presents significant challenges to societal well-being, public trust, and democratic processes, influencing critical decision making and public opinion. To address these challenges, there is a growing need for automated fake news detection mechanisms. Pre-trained large language models (LLMs) have demonstrated exceptional capabilities across various natural language processing (NLP) tasks, prompting exploration into their potential for verifying news claims. Instead of employing LLMs in a non-agentic way, where LLMs generate responses based on direct prompts in a single shot, our work introduces FactAgent, an agentic approach of utilizing LLMs for fake news detection. FactAgent enables LLMs to emulate human expert behavior in verifying news claims without any model training, following a structured workflow. This workflow breaks down the complex task of news veracity checking into multiple sub-steps, where LLMs complete simple tasks using their internal knowledge or external tools. At the final step of the workflow, LLMs integrate all findings throughout the workflow to determine the news claim's veracity. Compared to manual human verification, FactAgent offers enhanced efficiency. Experimental studies demonstrate the effectiveness of FactAgent in verifying claims without the need for any training process. Moreover, FactAgent provides transparent explanations at each step of the workflow and during final decision-making, offering insights into the reasoning process of fake news detection for end users. FactAgent is highly adaptable, allowing for straightforward updates to its tools that LLMs can leverage within the workflow, as well as updates to the workflow itself using domain knowledge. This adaptability enables FactAgent's application to news verification across various domains.
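The workflow described in the abstract (simple sub-steps, each producing a finding, followed by a final integration step) can be pictured with a small sketch. This is only an illustration under stated assumptions, not FactAgent's actual implementation: the step functions, the `Finding` record, the blocklist entry, and the majority-vote rule are all hypothetical stand-ins for what would be LLM prompts, external tool calls, and an LLM-written final judgment in the real system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple
from urllib.parse import urlparse

News = Dict[str, str]  # hypothetical record: {"claim": ..., "url": ...}


@dataclass
class Finding:
    step: str          # name of the sub-step that produced the finding
    suspicious: bool   # True if this step saw a sign that the claim is fake
    explanation: str   # rationale that could be shown to the end user


def phrase_step(news: News) -> Finding:
    # Stand-in for an LLM judging wording from its internal knowledge.
    hit = "shocking" in news["claim"].lower()
    return Finding("phrase", hit, "sensational wording" if hit else "neutral wording")


def punctuation_step(news: News) -> Finding:
    # Another stand-in for an internal-knowledge check.
    hit = news["claim"].count("!") >= 3
    return Finding("punctuation", hit,
                   "excessive exclamation marks" if hit else "ordinary punctuation")


def source_step(news: News) -> Finding:
    # Stand-in for an external tool; the blocklist entry is made up.
    blocklist = {"totally-real-news.example"}
    domain = urlparse(news.get("url", "")).netloc
    hit = domain in blocklist
    return Finding("source", hit,
                   f"domain '{domain}' is on the blocklist" if hit
                   else f"domain '{domain}' is not flagged")


def verify(news: News,
           steps: List[Callable[[News], Finding]]) -> Tuple[str, List[Finding]]:
    # Run every sub-step, then integrate the findings. A simple majority
    # vote is assumed here in place of an LLM-written final judgment.
    findings = [step(news) for step in steps]
    flagged = sum(f.suspicious for f in findings)
    label = "fake" if flagged * 2 > len(findings) else "real"
    return label, findings


STEPS = [phrase_step, punctuation_step, source_step]

if __name__ == "__main__":
    example = {"claim": "SHOCKING!!! Aliens endorse candidate",
               "url": "http://totally-real-news.example/a"}
    label, findings = verify(example, STEPS)
    for f in findings:
        print(f"{f.step}: {f.explanation}")
    print("verdict:", label)
```

Because each step returns its own explanation alongside its verdict, the per-step transparency the abstract mentions falls out of the structure: the caller can show every intermediate finding as well as the final label.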

Overview

This paper presents FactAgent, an agentic approach to using large language models (LLMs) for fake news detection. Rather than asking an LLM for a verdict in a single prompt, the system guides the model through a structured workflow that emulates how a human expert verifies a news claim. The key ideas include:

Using pre-trained large language models for fake news detection without any model training
Breaking news verification into multiple sub-steps that the LLM completes with its internal knowledge or external tools
Integrating the findings from every step into a final decision on the claim's veracity
Providing explanations at each step, and allowing the tools and the workflow itself to be updated with domain knowledge

Plain English Explanation

The researchers in this paper are exploring how powerful AI language models, known as "large language models," can be used to automatically detect fake news. Fake news is a big problem online, and it's important to be able to quickly identify false or misleading information.

The team developed a system, called FactAgent, that takes advantage of the language understanding abilities of large language models. These models are trained on massive amounts of text data, which gives them broad knowledge of how language is used. Rather than retraining a model for fake news detection, the researchers have it follow a step-by-step workflow modeled on how a human expert would check a claim.

Importantly, the system doesn't try to answer in a single shot. It splits the job into smaller, simpler tasks, and for each one the model either draws on what it already knows or calls an external tool.

At the end of the workflow, the model pulls together everything it found along the way and decides whether the claim is real or fake.

Finally, the system explains its reasoning at every step and in its final decision, so readers can see why a claim was judged the way it was. The tools and the workflow itself can also be updated with domain knowledge, which lets the same approach be applied to news from different domains.

Overall, this work demonstrates how advanced AI can be leveraged to combat the growing problem of fake news and misinformation online. By guiding large language models through a structured workflow, the researchers have developed a promising approach to automatically detect and flag unreliable content.

Technical Explanation

The paper describes FactAgent, a system that uses large language models for fake news detection. Instead of prompting an LLM once and taking its direct answer, FactAgent has a pre-trained LLM follow a structured workflow that emulates how a human expert verifies a news claim, with no model training involved.

The workflow breaks the complex task of veracity checking into multiple sub-steps. In each sub-step, the LLM completes a simple task using either its internal knowledge or an external tool.

At the final step of the workflow, the LLM integrates all findings from the earlier steps to determine the veracity of the claim. According to the abstract, experiments demonstrate the effectiveness of this approach without any training process.

Finally, FactAgent provides explanations at each step and for the final decision, and both its tools and its workflow can be updated using domain knowledge. This makes the agent-based framework applicable across news domains and gives end users insight into the reasoning behind each verdict.
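As a rough illustration of how an agent-based framework of this kind could keep its checks easy to swap per domain (the adaptability the abstract describes), the sketch below uses a tool registry plus per-domain workflow lists. Everything in it is an assumption made for illustration: the `register` decorator, the tool names, the domain names, and the canned return strings are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, List

# Registry mapping a tool name to a function that inspects a claim and
# returns a short note. The bodies are placeholders for LLM or tool calls.
TOOLS: Dict[str, Callable[[str], str]] = {}


def register(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Add a tool to the registry under the given name."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return wrap


@register("wording")
def wording(claim: str) -> str:
    return "wording: checked for sensational phrasing"


@register("search")
def search(claim: str) -> str:
    return "search: looked for corroborating reports"


@register("medical_terms")
def medical_terms(claim: str) -> str:
    return "medical_terms: compared against health guidance"


# Per-domain workflows are plain data, so domain experts can reorder steps
# or add tools without touching the runner below.
WORKFLOWS: Dict[str, List[str]] = {
    "politics": ["wording", "search"],
    "health": ["wording", "medical_terms", "search"],
}


def run_workflow(domain: str, claim: str) -> List[str]:
    """Run the tools configured for a domain, in order, and collect notes."""
    return [TOOLS[name](claim) for name in WORKFLOWS[domain]]


if __name__ == "__main__":
    for note in run_workflow("health", "New pill cures everything"):
        print(note)
```

Keeping the workflow as data separate from the tool implementations is one design choice among several; it means that supporting a new domain only requires registering any new tools and adding one entry to `WORKFLOWS`.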

Critical Analysis

The paper presents a promising approach to using large language models for fake news detection, but several limitations and open questions remain.

One likely limitation is that the approach depends on the quality of the LLM's internal knowledge and of the external tools it calls. The abstract does not discuss how errors in either one propagate to the final verdict, and a workflow designed with domain knowledge for one kind of news may need revision before it works well for another.

Additionally, the paper does not delve deeply into the potential biases or blind spots that may be present in the large language models used as the foundation of the system. As these models are trained on vast amounts of online data, they may inherit societal biases or fail to accurately represent certain perspectives.

Further research could explore ways to make the fake news detection system more robust and adaptable, such as by evaluating how faithful its step-by-step explanations are or by developing methods to update its tools and workflow as new fake news tactics emerge.

Overall, the work presented in this paper represents an important step forward in leveraging the power of large language models to combat the growing problem of online misinformation. However, continued research and development will be necessary to create reliable and ethical solutions for this critical challenge.

Conclusion

This paper introduces FactAgent, an agentic approach to using large language models for fake news detection. Rather than training a classifier, the system guides a pre-trained LLM through a structured workflow of simple sub-steps, each drawing on the model's internal knowledge or external tools, and then integrates the findings into a final judgment on the claim's veracity.

The resulting system explains its reasoning at each step and can be adapted to new domains by updating its tools or workflow. The abstract reports that it is effective without any training and more efficient than manual human verification, which highlights the potential of large language models to tackle this important societal challenge.

While limitations and areas for further research remain, the proposed approach demonstrates how current AI technologies can help combat the spread of misinformation and promote the dissemination of accurate, reliable information online.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Adapting Fake News Detection to the Era of Large Language Models

Jinyan Su, Claire Cardie, Preslav Nakov

In the age of large language models (LLMs) and the widespread adoption of AI-driven content creation, the landscape of information dissemination has witnessed a paradigm shift. With the proliferation of both human-written and machine-generated real and fake news, robustly and effectively discerning the veracity of news articles has become an intricate challenge. While substantial research has been dedicated to fake news detection, it either assumes that all news articles are human-written or abruptly assumes that all machine-generated news is fake. Thus, a significant gap exists in understanding the interplay between machine-paraphrased real news, machine-generated fake news, human-written fake news, and human-written real news. In this paper, we study this gap by conducting a comprehensive evaluation of fake news detectors trained in various scenarios. Our primary objectives revolve around the following pivotal question: How to adapt fake news detectors to the era of LLMs? Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa. Moreover, due to the bias of detectors against machine-generated texts \cite{su2023fake}, they should be trained on datasets with a lower machine-generated news ratio than the test set. Building on our findings, we provide a practical strategy for the development of robust fake news detectors.

4/16/2024

cs.CL cs.AI

Multimodal Large Language Models to Support Real-World Fact-Checking

Jiahui Geng, Yova Kementchedjhieva, Preslav Nakov, Iryna Gurevych

Multimodal large language models (MLLMs) carry the potential to support humans in processing vast amounts of information. While MLLMs are already being used as a fact-checking tool, their abilities and limitations in this regard are understudied. Here we aim to bridge this gap. In particular, we propose a framework for systematically assessing the capacity of current multimodal models to facilitate real-world fact-checking. Our methodology is evidence-free, leveraging only these models' intrinsic knowledge and reasoning capabilities. By designing prompts that extract models' predictions, explanations, and confidence levels, we delve into research questions concerning model accuracy, robustness, and reasons for failure. We empirically find that (1) GPT-4V exhibits superior performance in identifying malicious and misleading multimodal claims, with the ability to explain the unreasonable aspects and underlying motives, and (2) existing open-source models exhibit strong biases and are highly sensitive to the prompt. Our study offers insights into combating false multimodal information and building secure, trustworthy multimodal models. To the best of our knowledge, we are the first to evaluate MLLMs for real-world fact-checking.

4/29/2024

cs.CL cs.AI

LingML: Linguistic-Informed Machine Learning for Enhanced Fake News Detection

Jasraj Singh, Fang Liu, Hong Xu, Bee Chin Ng, Wei Zhang

Nowadays, information spreads at an unprecedented pace on social media, and discerning truth from misinformation and fake news has become an acute societal challenge. Machine learning (ML) models have been employed to identify fake news but are far from perfect, with challenging problems such as limited accuracy, interpretability, and generalizability. In this paper, we enhance ML-based solutions with linguistics input and propose LingML, linguistic-informed ML, for fake news detection. We conducted an experimental study with a popular dataset on fake news during the pandemic. The experimental results show that our proposed solution is highly effective. There are fewer than two errors out of every ten attempts with only linguistic input used in ML, and the knowledge is highly explainable. When linguistics input is integrated with advanced large-scale ML models for natural language processing, our solution outperforms existing ones with a 1.8% average error rate. LingML creates a new path with linguistics to push the frontier of effective and efficient fake news detection. It also sheds light on real-world multi-disciplinary applications requiring both ML and domain expertise to achieve optimal performance.

5/8/2024

cs.CL

FakeGPT: Fake News Generation, Explanation and Detection of Large Language Models

Yue Huang, Lichao Sun

The rampant spread of fake news has adversely affected society, resulting in extensive research on curbing its spread. As a notable milestone in large language models (LLMs), ChatGPT has gained significant attention due to its exceptional natural language processing capabilities. In this study, we present a thorough exploration of ChatGPT's proficiency in generating, explaining, and detecting fake news as follows. Generation -- We employ four prompt methods to generate fake news samples and prove the high quality of these samples through both self-assessment and human evaluation. Explanation -- We obtain nine features to characterize fake news based on ChatGPT's explanations and analyze the distribution of these factors across multiple public datasets. Detection -- We examine ChatGPT's capacity to identify fake news. We explore its detection consistency and then propose a reason-aware prompt method to improve its performance. Although our experiments demonstrate that ChatGPT shows commendable performance in detecting fake news, there is still room for its improvement. Consequently, we further probe into the potential extra information that could bolster its effectiveness in detecting fake news.

4/9/2024

cs.CL