Complex Claim Verification with Evidence Retrieved in the Wild

Read original: arXiv:2305.11859 - Published 6/18/2024 by Jifan Chen, Grace Kim, Aniruddh Sriram, Greg Durrett, Eunsol Choi

Overview

This paper presents a fully automated pipeline for checking real-world claims by retrieving raw evidence from the web.
The pipeline includes five components: claim decomposition, raw document retrieval, fine-grained evidence retrieval, claim-focused summarization, and veracity judgment.
The researchers conduct experiments on complex political claims in the ClaimDecomp dataset and show that the aggregated evidence produced by their pipeline improves veracity judgments.

Plain English Explanation

The paper addresses the challenge of automatically verifying complex real-world claims, which is a crucial task for combating the spread of misinformation. Prior research has made simplifying assumptions about the availability of evidence: it assumes either no access to evidence, access to evidence curated by a human fact-checker, or access to evidence that only became available long after the claim was made.

In contrast, this paper presents the first fully automated pipeline that can check real-world claims by retrieving raw evidence from the web, modelling the realistic scenario where an emerging claim needs to be quickly verified. The pipeline starts by breaking the claim down into smaller parts that can be checked separately, then retrieves relevant documents from the web, extracts fine-grained evidence, summarizes the evidence in a claim-focused way, and finally makes a judgment on the veracity of the claim.

The researchers test their pipeline on complex political claims from the ClaimDecomp dataset, and find that the aggregated evidence it produces can improve the accuracy of veracity judgments. They also conduct a human evaluation, which suggests that the evidence summaries generated by the system are reliable (i.e., do not contain made-up information) and relevant to answering key questions about a claim, even when the system cannot surface a complete set of evidence.

Technical Explanation

This paper presents a fully automated pipeline for checking real-world claims by retrieving raw evidence from the web. The pipeline consists of five key components:

Claim Decomposition: The system first breaks the input claim into a set of sub-questions, each targeting one aspect of the claim that needs to be checked, using a claim decomposition model.
Raw Document Retrieval: Next, the system retrieves potentially relevant documents from the web for the decomposed claim, restricting the search to documents that were available before the claim was made.
Fine-Grained Evidence Retrieval: The system then extracts the snippets from the retrieved documents that are most relevant to answering those sub-questions.
Claim-Focused Summarization: The system synthesizes the relevant evidence snippets into a concise summary that is focused on addressing the original claim.
Veracity Judgment: Finally, the system makes a judgment on the veracity of the claim based on the aggregated evidence.
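The five stages above can be sketched as a chain of plain functions. This is only an illustrative skeleton, not the authors' implementation: the `Document` type, all function names, the word-overlap scoring, and the stubbed decomposition, summarization, and judgment steps are assumptions made for this example. The one behaviour taken directly from the paper's description is the date filter, which keeps only documents available before the claim was made.

```python
from dataclasses import dataclass
from datetime import date
import re


@dataclass
class Document:
    """Hypothetical container for a retrieved web page."""
    url: str
    published: date
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for a real retriever's tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def decompose_claim(claim: str) -> list[str]:
    """Stage 1 (stub): a real system would call a decomposition model here."""
    return [f"Is it true that {claim.rstrip('.')}?"]


def retrieve_documents(corpus: list[Document], claim_date: date) -> list[Document]:
    """Stage 2: keep only documents published before the claim was made."""
    return [doc for doc in corpus if doc.published < claim_date]


def retrieve_evidence(questions: list[str], docs: list[Document], top_k: int = 2) -> list[str]:
    """Stage 3: rank sentences by word overlap with the questions (assumed scoring)."""
    query = set().union(*(tokenize(q) for q in questions))
    sentences = [s.strip() for d in docs for s in d.text.split(".") if s.strip()]
    ranked = sorted(sentences, key=lambda s: len(query & tokenize(s)), reverse=True)
    return ranked[:top_k]


def summarize(claim: str, evidence: list[str]) -> str:
    """Stage 4 (stub): a real system would generate a claim-focused summary."""
    return " ".join(e + "." for e in evidence)


def judge(claim: str, summary: str) -> str:
    """Stage 5 (stub): a real system would use a trained veracity classifier."""
    return "needs review" if summary else "no evidence found"


def check_claim(claim: str, claim_date: date, corpus: list[Document]) -> dict:
    """Run all five stages in order and return the intermediate outputs."""
    questions = decompose_claim(claim)
    docs = retrieve_documents(corpus, claim_date)
    evidence = retrieve_evidence(questions, docs)
    summary = summarize(claim, evidence)
    return {"questions": questions, "documents": docs,
            "summary": summary, "verdict": judge(claim, summary)}


# Toy corpus with made-up content; the second page postdates the claim.
corpus = [
    Document("https://example.org/a", date(2020, 1, 5),
             "The city budget grew by 4 percent in 2019. Parks funding was flat."),
    Document("https://example.org/b", date(2021, 3, 1),
             "A later audit revised the city budget figures."),
]
result = check_claim("The city budget grew by 10 percent in 2019.", date(2020, 6, 1), corpus)
print(result["summary"])
```

Keeping each stage as a separate function makes it possible to swap a stub for a real model without touching the rest of the chain, and the date filter in `retrieve_documents` is what prevents the sketch from using evidence written after the claim.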

The researchers evaluate this pipeline on the ClaimDecomp dataset of complex political claims. They find that the evidence summaries produced by the system improve veracity judgments compared to baselines. A human evaluation also suggests that the summaries are reliable (they do not contain hallucinated information) and relevant to answering key questions about the claims.
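As a rough illustration of how veracity judgments can be scored, the sketch below computes exact-match accuracy and mean absolute error over an ordered label scale. The six-way scale is an assumption modelled on PolitiFact-style ratings, the function names are hypothetical, and the gold and predicted labels are made up for the example; none of the numbers are results from the paper.

```python
# Assumed ordered label scale, from least to most truthful.
LABELS = ["pants-fire", "false", "barely-true", "half-true", "mostly-true", "true"]
RANK = {label: i for i, label in enumerate(LABELS)}


def accuracy(gold: list[str], pred: list[str]) -> float:
    """Fraction of claims whose predicted label exactly matches the gold label."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def mean_absolute_error(gold: list[str], pred: list[str]) -> float:
    """Average distance between labels on the ordered scale (lower is better)."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and the same length")
    return sum(abs(RANK[g] - RANK[p]) for g, p in zip(gold, pred)) / len(gold)


# Made-up labels for four toy claims.
gold = ["false", "half-true", "true", "barely-true"]
pred = ["false", "mostly-true", "true", "false"]

print(accuracy(gold, pred))             # 0.5
print(mean_absolute_error(gold, pred))  # 0.5
```

Reporting a distance-based score alongside accuracy distinguishes a near miss (predicting "mostly-true" for "half-true") from a prediction at the opposite end of the scale, which plain accuracy treats the same.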

Critical Analysis

The key strength of this work is that it presents the first fully automated pipeline for fact-checking real-world claims by retrieving raw evidence from the web, rather than making simplifying assumptions about evidence availability. This models a more realistic scenario where an emerging claim needs to be quickly verified.

However, the authors acknowledge several limitations and areas for future work. First, the pipeline's performance is still limited by the quality of the underlying retrieval and summarization models, which could be improved with further research. Second, the experiments are focused on political claims, so the generalizability to other domains is unclear.

Additionally, while the human evaluation suggests the system's summaries are reliable, there may still be cases where the system hallucinates or misinterprets evidence in subtle ways. Further research is needed to better understand the failure modes and robustness of such automated fact-checking systems.

It would also be interesting to see how this pipeline could be extended to handle evolving claims and evidence over time, rather than just a static snapshot.

Overall, this paper makes an important contribution by moving towards more realistic and fully automated fact-checking, but there is still significant room for improvement and further research in this critical area of automated fact-checking.

Conclusion

This paper presents the first fully automated pipeline for checking real-world claims by retrieving raw evidence from the web, in contrast to prior work that made simplifying assumptions about evidence availability. The pipeline decomposes claims, retrieves relevant documents, extracts fine-grained evidence, summarizes the evidence, and makes a veracity judgment.

Experiments on complex political claims show that the aggregated evidence produced by this pipeline can improve the accuracy of veracity judgments. A human evaluation also suggests the system's evidence summaries are reliable and relevant, even when they cannot surface a complete set of evidence.

While this work represents an important step towards more realistic and automated fact-checking, there is still significant room for improvement, particularly in enhancing the underlying retrieval and summarization models, expanding to other domains, and addressing issues of robustness and evolving claims over time. Continued research in this area is crucial for combating the spread of misinformation in the real world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Complex Claim Verification with Evidence Retrieved in the Wild

Jifan Chen, Grace Kim, Aniruddh Sriram, Greg Durrett, Eunsol Choi

Evidence retrieval is a core part of automatic fact-checking. Prior work makes simplifying assumptions in retrieval that depart from real-world use cases: either no access to evidence, access to evidence curated by a human fact-checker, or access to evidence available long after the claim has been made. In this work, we present the first fully automated pipeline to check real-world claims by retrieving raw evidence from the web. We restrict our retriever to only search documents available prior to the claim's making, modeling the realistic scenario where an emerging claim needs to be checked. Our pipeline includes five components: claim decomposition, raw document retrieval, fine-grained evidence retrieval, claim-focused summarization, and veracity judgment. We conduct experiments on complex political claims in the ClaimDecomp dataset and show that the aggregated evidence produced by our pipeline improves veracity judgments. Human evaluation finds the evidence summary produced by our system is reliable (it does not hallucinate information) and relevant to answering key questions about a claim, suggesting that it can assist fact-checkers even when it cannot surface a complete evidence set.

6/18/2024

Robust Claim Verification Through Fact Detection

Nazanin Jafari, James Allan

Claim verification can be a challenging task. In this paper, we present a method to enhance the robustness and reasoning capabilities of automated claim verification through the extraction of short facts from evidence. Our novel approach, FactDetect, leverages Large Language Models (LLMs) to generate concise factual statements from evidence and label these facts based on their semantic relevance to the claim and evidence. The generated facts are then combined with the claim and evidence. To train a lightweight supervised model, we incorporate a fact-detection task into the claim verification process as a multitasking approach to improve both performance and explainability. We also show that augmenting FactDetect in the claim verification prompt enhances performance in zero-shot claim verification using LLMs. Our method demonstrates competitive results in the supervised claim verification model by 15% on the F1 score when evaluated for challenging scientific claim verification datasets. We also demonstrate that FactDetect can be augmented with claim and evidence for zero-shot prompting (AugFactDetect) in LLMs for verdict prediction. We show that AugFactDetect outperforms the baseline with statistical significance on three challenging scientific claim verification datasets with an average of 17.3% performance gain compared to the best performing baselines.

7/29/2024

Navigating the Noisy Crowd: Finding Key Information for Claim Verification

Haisong Gong, Huanhuan Ma, Qiang Liu, Shu Wu, Liang Wang

Claim verification is a task that involves assessing the truthfulness of a given claim based on multiple evidence pieces. Using large language models (LLMs) for claim verification is a promising way. However, simply feeding all the evidence pieces to an LLM and asking if the claim is factual does not yield good results. The challenge lies in the noisy nature of both the evidence and the claim: evidence passages typically contain irrelevant information, with the key facts hidden within the context, while claims often convey multiple aspects simultaneously. To navigate this noisy crowd of information, we propose EACon (Evidence Abstraction and Claim Deconstruction), a framework designed to find key information within evidence and verify each aspect of a claim separately. EACon first finds keywords from the claim and employs fuzzy matching to select relevant keywords for each raw evidence piece. These keywords serve as a guide to extract and summarize critical information into abstracted evidence. Subsequently, EACon deconstructs the original claim into subclaims, which are then verified against both abstracted and raw evidence individually. We evaluate EACon using two open-source LLMs on two challenging datasets. Results demonstrate that EACon consistently and substantially improve LLMs' performance in claim verification.

7/18/2024

Document-level Claim Extraction and Decontextualisation for Fact-Checking

Zhenyun Deng, Michael Schlichtkrull, Andreas Vlachos

Selecting which claims to check is a time-consuming task for human fact-checkers, especially from documents consisting of multiple sentences and containing multiple claims. However, existing claim extraction approaches focus more on identifying and extracting claims from individual sentences, e.g., identifying whether a sentence contains a claim or the exact boundaries of the claim within a sentence. In this paper, we propose a method for document-level claim extraction for fact-checking, which aims to extract check-worthy claims from documents and decontextualise them so that they can be understood out of context. Specifically, we first recast claim extraction as extractive summarization in order to identify central sentences from documents, then rewrite them to include necessary context from the originating document through sentence decontextualisation. Evaluation with both automatic metrics and a fact-checking professional shows that our method is able to extract check-worthy claims from documents more accurately than previous work, while also improving evidence retrieval.

6/13/2024