Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model

Read original: arXiv:2210.02498 - Published 4/26/2024 by Jacob Eisenstein, Daniel Andor, Bernd Bohnet, Michael Collins, David Mimno

Overview

This paper explores a new approach called "markup-and-mask" for generating rationales that explain the reasoning behind open-book question answering systems.
The key idea is to augment the input passage with free-text markup that allows each sentence to stand on its own, and then select a sub-span of the marked-up passage as the rationale.
To train this system without rationale annotations, the authors leverage in-context learning, using a frozen pretrained language model as a "teacher" to generate silver-annotated data.
The student model is then fine-tuned on the subset of rationales that led to correct answers, creating a trustworthy pipeline where the rationale acts as a bottleneck between the passage and the answer.

Plain English Explanation

Explainable AI systems should not only provide accurate answers, but also justify their reasoning in a way that humans can understand and verify. This paper proposes a new approach called "markup-and-mask" to generate these kinds of explanations.

The key idea is to start with a passage of text and add free-text annotations, or "markup", to its sentences. This markup helps each sentence stand on its own, without relying on the context of the full passage. Then, the system selects a sub-span of the marked-up passage as the "rationale": the evidence for why the answer was chosen.

To train this system without a large set of annotated rationales, the authors use a large pretrained language model as a "teacher" to generate example rationales. They then train a smaller "student" model on only those teacher rationales that led to the correct answer. The student is "honest" because of how it is built: it must produce its answer from the rationale alone, so the rationale it shows is the evidence it actually used.

The result is a pipeline system where the rationale acts as a bottleneck between the input passage and the final answer. This makes the system more trustworthy, since humans can inspect the rationale to understand and verify the model's reasoning.

Technical Explanation

The paper introduces "markup-and-mask", a new style of rationale for open-book question answering that combines aspects of extractive and free-text explanations.

In the "markup" phase, the input passage is augmented with free-text annotations that allow each sentence to stand on its own, without relying on the broader context. Then, in the "masking" phase, a sub-span of the marked-up passage is selected as the rationale.
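The two phases can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's actual format: the `MarkedUpSentence` class, the square-bracket rendering of markup, and the simplification of selecting whole sentences (the paper describes selecting a sub-span of the marked-up passage).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MarkedUpSentence:
    """One passage sentence plus optional free-text markup (hypothetical type)."""
    text: str    # the original sentence
    markup: str  # clarification that lets the sentence stand alone ("" if none)

    def render(self) -> str:
        # Assumed rendering: append the markup in square brackets.
        return f"{self.text} [{self.markup}]" if self.markup else self.text


def mask(sentences: List[MarkedUpSentence], start: int, end: int) -> str:
    """Masking phase (simplified): keep sentences[start:end] as the rationale."""
    return " ".join(s.render() for s in sentences[start:end])


# Markup phase: each sentence is annotated so it no longer depends on context.
passage = [
    MarkedUpSentence("Marie Curie was born in Warsaw.", ""),
    MarkedUpSentence("She won the Nobel Prize in Physics in 1903.",
                     "She = Marie Curie"),
    MarkedUpSentence("It was shared with two others.",
                     "It = the 1903 Nobel Prize in Physics"),
]

# Masking phase: select the span that is relevant to the question.
rationale = mask(passage, 1, 2)
print(rationale)
```

Because the pronoun is resolved by the markup, the selected sentence remains interpretable once the rest of the passage has been masked out.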

To train this system without rationale annotations, the authors leverage in-context learning. They use a frozen pretrained language model as a "teacher" to generate silver-annotated data: they send a series of prompts to the teacher, which responds with candidate rationales. They then fine-tune a smaller "student" model on the subset of those rationales that led to correct answers.
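The selection of teacher output for student training might look roughly like the sketch below. The `SilverExample` fields, the `normalize` helper, and the exact-match correctness check are assumptions for illustration; the source does not say how correctness is measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SilverExample:
    """One teacher-generated training candidate (hypothetical schema)."""
    question: str
    rationale: str       # markup-and-mask rationale produced by the teacher
    teacher_answer: str  # answer the teacher reached using that rationale
    gold_answer: str     # end-task annotation from the QA dataset


def normalize(text: str) -> str:
    # Assumed comparison rule: case-insensitive, whitespace-collapsed match.
    return " ".join(text.lower().split())


def keep_correct(examples: List[SilverExample]) -> List[SilverExample]:
    """Keep only examples whose rationale led to the gold answer."""
    return [
        ex for ex in examples
        if normalize(ex.teacher_answer) == normalize(ex.gold_answer)
    ]


silver = [
    SilverExample("When did Curie win the Physics prize?",
                  "She [Marie Curie] won the Nobel Prize in Physics in 1903.",
                  "1903", "1903"),
    SilverExample("Where was Curie born?",
                  "She [Marie Curie] won the Nobel Prize in Physics in 1903.",
                  "Paris", "Warsaw"),
]

student_training_set = keep_correct(silver)
print(len(student_training_set))
```

Only answer labels are needed for this filter, which is why the approach requires end-task annotations but no human-written rationales.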

This creates a trustworthy pipeline system, where the rationale acts as a bottleneck between the passage and the final answer. The student is "honest" in the sense that its answer is computed from the rationale rather than from the full passage, while the untrusted teacher operates under no such constraint.
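The bottleneck is a structural property, which a small sketch can make concrete. The functions `toy_rationalizer` and `toy_answerer` are hypothetical stand-ins for the fine-tuned student components, not the models used in the paper; the point is only that the answering step has no access to the passage.

```python
import re
from typing import Callable, Tuple

Rationalizer = Callable[[str, str], str]  # (question, passage) -> rationale
Answerer = Callable[[str, str], str]      # (question, rationale) -> answer


def make_pipeline(rationalizer: Rationalizer,
                  answerer: Answerer) -> Callable[[str, str], Tuple[str, str]]:
    """Compose two stages so the answer can depend only on the rationale."""
    def run(question: str, passage: str) -> Tuple[str, str]:
        rationale = rationalizer(question, passage)
        # The passage is deliberately NOT passed to the answerer.
        answer = answerer(question, rationale)
        return rationale, answer
    return run


def toy_rationalizer(question: str, passage: str) -> str:
    # Stand-in: pick the sentence sharing the most words with the question.
    q_words = set(question.lower().replace("?", "").split())
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    return max(sentences, key=lambda s: len(q_words & set(s.lower().split())))


def toy_answerer(question: str, rationale: str) -> str:
    # Stand-in: return the first four-digit number found in the rationale.
    match = re.search(r"\d{4}", rationale)
    return match.group(0) if match else ""


pipeline = make_pipeline(toy_rationalizer, toy_answerer)
rationale, answer = pipeline(
    "When did Marie Curie win the Nobel Prize?",
    "Marie Curie was born in 1867. "
    "Marie Curie won the Nobel Prize in Physics in 1903.",
)
print(rationale)
print(answer)
```

A reader who inspects the rationale sees everything the answering step saw, which is the sense in which the pipeline can be checked.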

Critical Analysis

The markup-and-mask approach proposed in this paper is a novel and promising direction for building explainable question answering systems. Using in-context learning to generate silver-annotated data sidesteps the challenge of obtaining human-annotated rationales, so the system can be trained from end-task answer annotations alone, without costly and time-consuming manual rationale annotation.

However, the authors acknowledge that the markup-and-mask rationales may not always be as informative as human-written explanations. There is also the potential risk of the student model "gaming" the system by generating rationales that appear plausible but do not truly reflect its reasoning.

Further research is needed to explore the robustness of this approach and its ability to scale to more complex question answering tasks. The authors suggest several potential directions, such as incorporating additional constraints or leveraging self-supervised techniques to improve the quality of the rationales.

Overall, this paper presents an intriguing approach to the important problem of building trustworthy and explainable AI systems. While there are still challenges to address, the markup-and-mask technique offers a promising step towards more transparent and accountable question answering systems.

Conclusion

This paper introduces a new "markup-and-mask" approach for generating rationales in open-book question answering systems. By augmenting input passages with free-text markup and leveraging in-context learning to generate silver-annotated data, the authors show a way to train explainable question answering systems without costly human-annotated rationales, relying instead on end-task annotations and a frozen pretrained language model.

The resulting pipeline model is "honest" in the sense that the rationale acts as a bottleneck between the input and the final answer, making the system's reasoning more transparent and trustworthy. Further research is needed to address the limitations and scale the approach. Even so, this work is a step toward AI systems that not only provide accurate answers, but also justify their reasoning in a way that humans can understand and verify.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model

Jacob Eisenstein, Daniel Andor, Bernd Bohnet, Michael Collins, David Mimno

Explainable question answering systems should produce not only accurate answers but also rationales that justify their reasoning and allow humans to check their work. But what sorts of rationales are useful and how can we train systems to produce them? We propose a new style of rationale for open-book question answering, called markup-and-mask, which combines aspects of extractive and free-text explanations. In the markup phase, the passage is augmented with free-text markup that enables each sentence to stand on its own outside the discourse context. In the masking phase, a sub-span of the marked-up passage is selected. To train a system to produce markup-and-mask rationales without annotations, we leverage in-context learning. Specifically, we generate silver annotated data by sending a series of prompts to a frozen pretrained language model, which acts as a teacher. We then fine-tune a smaller student model by training on the subset of rationales that led to correct answers. The student is honest in the sense that it is a pipeline: the rationale acts as a bottleneck between the passage and the answer, while the untrusted teacher operates under no such constraints. Thus, we offer a new way to build trustworthy pipeline systems from a combination of end-task annotations and frozen pretrained language models.

4/26/2024

Evaluating Human Alignment and Model Faithfulness of LLM Rationale

Mohsen Fayyaz, Fan Yin, Jiao Sun, Nanyun Peng

We study how well large language models (LLMs) explain their generations with rationales -- a set of tokens extracted from the input texts that reflect the decision process of LLMs. We examine LLM rationales extracted with two methods: 1) attribution-based methods that use attention or gradients to locate important tokens, and 2) prompting-based methods that guide LLMs to extract rationales using prompts. Through extensive experiments, we show that prompting-based rationales align better with human-annotated rationales than attribution-based rationales, and demonstrate reasonable alignment with humans even when model performance is poor. We additionally find that the faithfulness limitations of prompting-based methods, which are identified in previous work, may be linked to their collapsed predictions. By fine-tuning these models on the corresponding datasets, both prompting and attribution methods demonstrate improved faithfulness. Our study sheds light on more rigorous and fair evaluations of LLM rationales, especially for prompting-based ones.

7/2/2024

Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations

Ritam Dutt, Zhen Wu, Kelly Shi, Divyanshu Sheth, Prakhar Gupta, Carolyn Penstein Rose

We present a generalizable classification approach that leverages Large Language Models (LLMs) to facilitate the detection of implicitly encoded social meaning in conversations. We design a multi-faceted prompt to extract a textual explanation of the reasoning that connects visible cues to underlying social meanings. These extracted explanations or rationales serve as augmentations to the conversational text to facilitate dialogue understanding and transfer. Our empirical results over 2,340 experimental settings demonstrate the significant positive impact of adding these rationales. Our findings hold true for in-domain classification, zero-shot, and few-shot domain transfer for two different social meaning detection tasks, each spanning two different corpora.

7/1/2024

Why Would You Suggest That? Human Trust in Language Model Responses

Manasi Sharma, Ho Chit Siu, Rohan Paleja, Jaime D. Peña

The emergence of Large Language Models (LLMs) has revealed a growing need for human-AI collaboration, especially in creative decision-making scenarios where trust and reliance are paramount. Through human studies and model evaluations on the open-ended News Headline Generation task from the LaMP benchmark, we analyze how the framing and presence of explanations affect user trust and model performance. Overall, we provide evidence that adding an explanation in the model response to justify its reasoning significantly increases self-reported user trust in the model when the user has the opportunity to compare various responses. Position and faithfulness of these explanations are also important factors. However, these gains disappear when users are shown responses independently, suggesting that humans trust all model responses, including deceptive ones, equitably when they are shown in isolation. Our findings urge future research to delve deeper into the nuanced evaluation of trust in human-machine teaming systems.

6/5/2024