Towards a Framework for Evaluating Explanations in Automated Fact Verification

arXiv:2403.20322

Published 5/21/2024 by Neema Kotonya, Francesca Toni

Abstract

As deep neural models in NLP become more complex, and as a consequence opaque, the necessity to interpret them becomes greater. A burgeoning interest has emerged in rationalizing explanations to provide short and coherent justifications for predictions. In this position paper, we advocate for a formal framework for key concepts and properties about rationalizing explanations to support their evaluation systematically. We also outline one such formal framework, tailored to rationalizing explanations of increasingly complex structures, from free-form explanations to deductive explanations, to argumentative explanations (with the richest structure). Focusing on the automated fact verification task, we provide illustrations of the use and usefulness of our formalization for evaluating explanations, tailored to their varying structures.

Overview

This position paper argues for a formal framework for systematically evaluating the explanations that automated fact verification systems give for their predictions, and outlines one such framework.
Fact verification systems aim to determine whether a given claim is true or false based on evidence from various sources.
Explanations are crucial for understanding and trusting the decisions made by these systems, but there is currently no standard way to evaluate the quality of the explanations.

Plain English Explanation

The paper focuses on the explanations produced by automated fact verification systems, which are designed to determine whether a claim is true or false based on evidence from different sources. These systems often explain their decisions, but there is no standard way to evaluate how good or useful those explanations are.

Imagine you have a friend who is really good at finding the truth about things. Whenever you ask them about a claim, they can quickly tell you whether it's true or false and explain their reasoning. But sometimes, their explanations are hard to understand or don't seem to fully justify their conclusion.

Similarly, automated fact verification systems can make decisions, but the way they explain those decisions is not always clear or helpful. The researchers in this paper want to develop a framework, or a set of guidelines, for evaluating the quality of the explanations provided by these systems. This would help ensure that the explanations are meaningful and trustworthy, just like you'd want from your knowledgeable friend.

Technical Explanation

The paper proposes a framework for evaluating the explanations generated by automated fact verification systems. These systems aim to determine whether a given claim is true or false based on evidence from various sources, such as news articles, social media posts, and fact-checking websites.

The researchers identify three key criteria for evaluating the quality of explanations:

Faithfulness: The explanation should accurately reflect the reasoning and decision-making process of the fact verification system.
Completeness: The explanation should provide sufficient information for the user to understand the system's decision.
Coherence: The explanation should be logically organized and easy to follow.

The paper also discusses the importance of defining appropriate evaluation tasks and datasets to assess these criteria. The researchers propose several potential tasks, such as asking users to rate the quality of explanations or to complete comprehension tests based on the explanations.
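As a rough illustration of the rating task described above, the sketch below averages annotator scores for one explanation across the three criteria. The `Rating` class, the 1-to-5 scale, and the `mean_scores` helper are hypothetical names introduced here for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

# The three criteria named in this summary.
CRITERIA = ("faithfulness", "completeness", "coherence")


@dataclass(frozen=True)
class Rating:
    """One annotator's scores for one explanation (assumed 1-5 scale)."""
    explanation_id: str
    faithfulness: int
    completeness: int
    coherence: int


def mean_scores(ratings: List[Rating]) -> Dict[str, float]:
    """Average each criterion over all ratings of a single explanation."""
    if not ratings:
        raise ValueError("need at least one rating")
    return {c: mean(getattr(r, c) for r in ratings) for c in CRITERIA}


# Two hypothetical annotators rating the same explanation.
ratings = [
    Rating("exp-1", faithfulness=4, completeness=3, coherence=5),
    Rating("exp-1", faithfulness=2, completeness=3, coherence=4),
]
print(mean_scores(ratings))
```

Keeping the criteria as separate scores, rather than collapsing them into a single number, makes it possible to see when an explanation reads well (high coherence) but may not reflect the system's actual reasoning (low faithfulness).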

Additionally, the paper highlights the need to consider explanations of increasingly complex structure, from free-form explanations to deductive explanations to argumentative explanations, and to tailor the evaluation to each structure.
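The abstract distinguishes free-form, deductive, and argumentative explanations. One way to picture the difference in structure is with simple data types, sketched below. These classes, their fields, and the `unattacked` helper are assumptions made for illustration and should not be read as the paper's formal definitions.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class FreeFormExplanation:
    """Unstructured text justifying a verdict (assumed representation)."""
    text: str


@dataclass
class DeductiveExplanation:
    """Premises that are meant to entail a conclusion (assumed representation)."""
    premises: List[str]
    conclusion: str


@dataclass
class ArgumentativeExplanation:
    """Arguments plus attack relations between them (assumed representation)."""
    arguments: List[str]
    # Each pair is (attacker index, target index) into `arguments`.
    attacks: Set[Tuple[int, int]] = field(default_factory=set)

    def unattacked(self) -> List[str]:
        """Return the arguments that no other argument attacks."""
        targets = {target for _, target in self.attacks}
        return [a for i, a in enumerate(self.arguments) if i not in targets]


# Hypothetical example: the second argument attacks the first.
explanation = ArgumentativeExplanation(
    arguments=[
        "Source A supports the claim.",
        "Source A is outdated.",
        "Source B supports the claim.",
    ],
    attacks={(1, 0)},
)
print(explanation.unattacked())
```

The richer the structure, the more there is to check: a free-form explanation can only be judged as text, while an argumentative one also exposes relations between its parts that an evaluation could inspect.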

Critical Analysis

The paper provides a valuable framework for evaluating the quality of explanations in automated fact verification systems, which is an important but often overlooked aspect of these systems. By focusing on faithfulness, completeness, and coherence, the framework can help ensure that the explanations generated by these systems are meaningful and trustworthy.

However, the paper also acknowledges some potential limitations and areas for further research. For example, the researchers note that defining appropriate evaluation tasks and datasets can be challenging, as the quality of explanations may depend on the specific context and user needs.

Additionally, the paper does not address the potential trade-offs between the quality of explanations and other system performance metrics, such as accuracy or efficiency. It's possible that optimizing for explanation quality could come at the expense of other desirable system characteristics, and this tension would need to be carefully considered.

Overall, the proposed framework provides a solid foundation for evaluating explanations in automated fact verification systems, but more research may be needed to fully address the nuances and complexities of this problem.

Conclusion

This paper presents a framework for evaluating the quality of explanations provided by automated fact verification systems. By focusing on the faithfulness, completeness, and coherence of explanations, the framework can help ensure that these systems generate meaningful and trustworthy justifications for their decisions.

The proposed approach is an important step towards improving the transparency and interpretability of automated fact verification, which is crucial for building user trust and acceptance of these systems. As automated decision-making becomes more prevalent in various domains, developing robust evaluation frameworks for explanation quality will be essential for ensuring the responsible and ethical development of these technologies.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards a Unified Framework for Evaluating Explanations

Juan D. Pinto, Luc Paquette

The challenge of creating interpretable models has been taken up by two main research communities: ML researchers primarily focused on lower-level explainability methods that suit the needs of engineers, and HCI researchers who have more heavily emphasized user-centered approaches often based on participatory design methods. This paper reviews how these communities have evaluated interpretability, identifying overlaps and semantic misalignments. We propose moving towards a unified framework of evaluation criteria and lay the groundwork for such a framework by articulating the relationships between existing criteria. We argue that explanations serve as mediators between models and stakeholders, whether for intrinsically interpretable models or opaque black-box models analyzed via post-hoc techniques. We further argue that useful explanations require both faithfulness and intelligibility. Explanation plausibility is a prerequisite for intelligibility, while stability is a prerequisite for explanation faithfulness. We illustrate these criteria, as well as specific evaluation methods, using examples from an ongoing study of an interpretable neural network for predicting a particular learner behavior.

5/24/2024

cs.LG cs.AI

What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception

Chaitanya Malaviya, Subin Lee, Dan Roth, Mark Yatskar

Eliciting feedback from end users of NLP models can be beneficial for improving models. However, how should we present model responses to users so they are most amenable to be corrected from user feedback? Further, what properties do users value to understand and trust responses? We answer these questions by analyzing the effect of rationales (or explanations) generated by QA models to support their answers. We specifically consider decomposed QA models that first extract an intermediate rationale based on a context and a question and then use solely this rationale to answer the question. A rationale outlines the approach followed by the model to answer the question. Our work considers various formats of these rationales that vary according to well-defined properties of interest. We sample rationales from language models using few-shot prompting for two datasets, and then perform two user studies. First, we present users with incorrect answers and corresponding rationales in various formats and ask them to provide natural language feedback to revise the rationale. We then measure the effectiveness of this feedback in patching these rationales through in-context learning. The second study evaluates how well different rationale formats enable users to understand and trust model answers, when they are correct. We find that rationale formats significantly affect how easy it is (1) for users to give feedback for rationales, and (2) for models to subsequently execute this feedback. In addition, formats with attributions to the context and in-depth reasoning significantly enhance user-reported understanding and trust of model outputs.

4/3/2024

cs.CL

Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving

Xin Quan, Marco Valentino, Louise A. Dennis, André Freitas

Natural language explanations have become a proxy for evaluating explainable and multi-step Natural Language Inference (NLI) models. However, assessing the validity of explanations for NLI is challenging as it typically involves the crowd-sourcing of apposite datasets, a process that is time-consuming and prone to logical errors. To address existing limitations, this paper investigates the verification and refinement of natural language explanations through the integration of Large Language Models (LLMs) and Theorem Provers (TPs). Specifically, we present a neuro-symbolic framework, named Explanation-Refiner, that augments a TP with LLMs to generate and formalise explanatory sentences and suggest potential inference strategies for NLI. In turn, the TP is employed to provide formal guarantees on the logical validity of the explanations and to generate feedback for subsequent improvements. We demonstrate how Explanation-Refiner can be jointly used to evaluate explanatory reasoning, autoformalisation, and error correction mechanisms of state-of-the-art LLMs as well as to automatically enhance the quality of human-annotated explanations of variable complexity in different domains.

5/9/2024

cs.CL

Even-if Explanations: Formal Foundations, Priorities and Complexity

Gianvincenzo Alfano, Sergio Greco, Domenico Mandaglio, Francesco Parisi, Reza Shahbazian, Irina Trubitsyna

Explainable AI has received significant attention in recent years. Machine learning models often operate as black boxes, lacking explainability and transparency while supporting decision-making processes. Local post-hoc explainability queries attempt to answer why individual inputs are classified in a certain way by a given model. While there has been important work on counterfactual explanations, less attention has been devoted to semifactual ones. In this paper, we focus on local post-hoc explainability queries within the semifactual 'even-if' thinking and their computational complexity among different classes of models, and show that both linear and tree-based models are strictly more interpretable than neural networks. After this, we introduce a preference-based framework that enables users to personalize explanations based on their preferences, both in the case of semifactuals and counterfactuals, enhancing interpretability and user-centricity. Finally, we explore the complexity of several interpretability problems in the proposed preference-based framework and provide algorithms for polynomial cases.

5/24/2024

cs.AI cs.LG