Can formal argumentative reasoning enhance LLMs performances?

2405.13036

Published 5/24/2024 by Federico Castagna, Isabel Sassoon, Simon Parsons

Abstract

Recent years witnessed significant performance advancements in deep-learning-driven natural language models, with a strong focus on the development and release of Large Language Models (LLMs). These improvements resulted in better quality AI-generated output but rely on resource-expensive training and upgrading of models. Although different studies have proposed a range of techniques to enhance LLMs without retraining, none have considered computational argumentation as an option. This is a missed opportunity since computational argumentation is an intuitive mechanism that formally captures agents' interactions and the information conflict that may arise during such interplays, and so it seems well-suited for boosting the reasoning and conversational abilities of LLMs in a seamless manner. In this paper, we present a pipeline (MQArgEng) and preliminary study to evaluate the effect of introducing computational argumentation semantics on the performance of LLMs. Our experiment's goal was to provide a proof-of-concept and a feasibility analysis in order to foster (or deter) future research towards a fully-fledged argumentation engine plugin for LLMs. Exploratory results using the MT-Bench indicate that MQArgEng provides a moderate performance gain in most of the examined topical categories and, as such, show promise and warrant further research.

Overview

Recent years have seen significant advances in deep learning-based natural language models, with a focus on the development and release of Large Language Models (LLMs).
These improvements have led to better quality AI-generated output, but rely on resource-intensive training and model upgrades.
While various techniques have been proposed to enhance LLMs without retraining, computational argumentation has not been considered as an option.
This paper presents a pipeline (MQArgEng) and a preliminary study to evaluate the effect of introducing computational argumentation semantics on the performance of LLMs.

Plain English Explanation

Natural language models, especially large language models, have made great strides in recent years, producing higher-quality AI-generated text. However, these improvements often require significant computational resources to train and update the models.

Researchers have explored ways to enhance LLMs without retraining, but they have not yet considered computational argumentation for this purpose. Computational argumentation is a formal approach that models how agents interact and how conflicts between pieces of information arise during those interactions. That makes it a natural candidate for boosting the reasoning and conversational abilities of LLMs without changing the underlying model.

This paper presents a pipeline called MQArgEng and a preliminary study to evaluate whether incorporating computational argumentation semantics can improve the performance of LLMs. The goal is to provide a proof-of-concept and assess the feasibility of developing a full-fledged argumentation engine plugin for LLMs. The results, using a benchmark called MT-Bench, suggest that MQArgEng can provide a moderate performance boost in most of the tested topic areas, indicating that this approach warrants further research.

Technical Explanation

The paper describes the development of a pipeline called MQArgEng, which aims to introduce computational argumentation semantics to enhance the performance of large language models.
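
To make "computational argumentation semantics" more concrete, the following is a minimal sketch of one standard semantics: the grounded extension of a Dung-style abstract argumentation framework, where arguments are nodes and attacks are directed edges. This summary does not say which semantics MQArgEng applies or how it extracts arguments from LLM output, so the choice of grounded semantics, the function name grounded_extension, and the toy arguments below are assumptions for illustration only, not the authors' implementation.

```python
from typing import Set, Tuple


def grounded_extension(arguments: Set[str], attacks: Set[Tuple[str, str]]) -> Set[str]:
    """Return the grounded extension of an abstract argumentation framework.

    `attacks` holds (attacker, target) pairs. The result is the least fixed
    point of the characteristic function: start from the empty set and keep
    accepting every argument whose attackers are all defeated by arguments
    that are already accepted.
    """
    # Pre-compute, for each argument, the set of arguments attacking it.
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}

    accepted: Set[str] = set()
    while True:
        # Anything attacked by an accepted argument counts as defeated.
        defeated = {y for (x, y) in attacks if x in accepted}
        # An argument is acceptable if all of its attackers are defeated.
        new_accepted = {a for a in arguments if attackers[a] <= defeated}
        if new_accepted == accepted:
            return accepted
        accepted = new_accepted


# Hypothetical toy framework: c attacks b, and b attacks a.
args = {"a", "b", "c"}
atts = {("b", "a"), ("c", "b")}
print(sorted(grounded_extension(args, atts)))  # ['a', 'c']
```

In this toy case, c is unattacked and so accepted, which defeats b and in turn reinstates a. When two arguments attack each other and nothing else breaks the tie, grounded semantics accepts neither, which is one reason other semantics (for example, preferred) are sometimes used instead.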

The researchers conducted a preliminary study to evaluate the effects of this approach. They used the MT-Bench benchmark to assess the performance of LLMs with and without the MQArgEng pipeline. The MT-Bench benchmark covers a range of topical categories, allowing the researchers to examine the impact of the argumentation semantics across different domains.
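
The with-and-without comparison described above can be sketched as a per-category difference of mean scores. This is only an illustration of the bookkeeping: the function name category_gains, the input layout (category name mapped to a list of judge scores), and every number below are hypothetical placeholders, not values or code from the paper.

```python
from statistics import mean
from typing import Dict, List


def category_gains(
    baseline: Dict[str, List[float]],
    augmented: Dict[str, List[float]],
) -> Dict[str, float]:
    """Mean score gain per category (augmented minus baseline).

    Only categories present in both inputs are compared.
    """
    return {
        category: mean(augmented[category]) - mean(baseline[category])
        for category in baseline
        if category in augmented
    }


# Placeholder scores, invented purely for illustration (NOT paper results).
baseline_scores = {"reasoning": [5.0, 6.0], "math": [4.0, 4.0]}
augmented_scores = {"reasoning": [6.0, 7.0], "math": [4.0, 3.0]}

for category, gain in category_gains(baseline_scores, augmented_scores).items():
    print(f"{category}: {gain:+.2f}")
```

A positive value means the augmented pipeline scored higher on average in that category. A per-category breakdown like this is what allows a claim such as "a gain in most categories" rather than a single aggregate number.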

The exploratory results indicate that the MQArgEng pipeline provides a moderate performance gain in most of the examined topical categories. This suggests that the integration of computational argumentation has the potential to improve the reasoning and conversational abilities of LLMs in a seamless manner, without the need for resource-intensive retraining.

Critical Analysis

The paper presents a promising approach to enhancing LLMs by incorporating computational argumentation semantics. However, the study is still in the preliminary stage, and the authors acknowledge the need for further research to fully evaluate the feasibility and potential of this approach.

One limitation is the scope of the evaluation, which was restricted to the MT-Bench benchmark. It would be valuable to test the MQArgEng pipeline on a wider range of tasks and benchmarks, including ones that probe the interventional reasoning capabilities of the enhanced LLMs.

Additionally, the paper does not provide detailed insights into the specific mechanisms by which the computational argumentation semantics improve the performance of LLMs. Further research could delve deeper into the de-formalization of natural language and how it interacts with the reasoning and conversational abilities of the models.

It would also be interesting to investigate the persuasiveness and argument quality of the AI-generated outputs when the computational argumentation semantics are applied, as this could have significant implications for the explainability and contestability of LLM-driven decision-making.

Conclusion

This paper presents a promising approach to enhancing the performance of large language models by incorporating computational argumentation semantics. The preliminary results suggest that the MQArgEng pipeline can provide a moderate performance boost across various topical categories, indicating that this approach warrants further research and exploration.

As natural language models continue to play an increasingly significant role in our lives, it is crucial to explore innovative ways to improve their reasoning and conversational abilities without relying solely on resource-intensive model retraining. The integration of computational argumentation, as demonstrated in this study, offers a potential avenue for enhancing LLMs in a more efficient and seamless manner, with implications for the explainability and contestability of AI-driven decision-making.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

I'd Like to Have an Argument, Please: Argumentative Reasoning in Large Language Models

Adrian de Wynter, Tangming Yuan

We evaluate the ability of two large language models (LLMs) to perform argumentative reasoning. We experiment with argument mining (AM) and argument pair extraction (APE), and evaluate the LLMs' ability to recognize arguments under progressively more abstract input and output (I/O) representations (e.g., arbitrary label sets, graphs, etc.). Unlike the well-known evaluation of prompt phrasings, abstraction evaluation retains the prompt's phrasing but tests reasoning capabilities. We find that scoring-wise the LLMs match or surpass the SOTA in AM and APE, and under certain I/O abstractions LLMs perform well, even beating chain-of-thought; we call this symbolic prompting. However, statistical analysis on the LLMs' outputs when subject to small, yet still human-readable, alterations in the I/O representations (e.g., asking for BIO tags as opposed to line numbers) showed that the models are not performing reasoning. This suggests that LLM applications to some tasks, such as data labelling and paper reviewing, must be done with care.

6/11/2024

cs.CL

Argumentative Large Language Models for Explainable and Contestable Decision-Making

Gabriel Freedman, Adam Dejl, Deniz Gorur, Xiang Yin, Antonio Rago, Francesca Toni

The diversity of knowledge encoded in large language models (LLMs) and their ability to apply this knowledge zero-shot in a range of settings makes them a promising candidate for use in decision-making. However, they are currently limited by their inability to reliably provide outputs which are explainable and contestable. In this paper, we attempt to reconcile these strengths and weaknesses by introducing a method for supplementing LLMs with argumentative reasoning. Concretely, we introduce argumentative LLMs, a method utilising LLMs to construct argumentation frameworks, which then serve as the basis for formal reasoning in decision-making. The interpretable nature of these argumentation frameworks and formal reasoning means that any decision made by the supplemented LLM may be naturally explained to, and contested by, humans. We demonstrate the effectiveness of argumentative LLMs experimentally in the decision-making task of claim verification. We obtain results that are competitive with, and in some cases surpass, comparable state-of-the-art techniques.

5/6/2024

cs.CL cs.AI

ArgMed-Agents: Explainable Clinical Decision Reasoning with LLM Disscusion via Argumentation Schemes

Shengxin Hong, Liang Xiao, Xin Zhang, Jianxia Chen

There are two main barriers to using large language models (LLMs) in clinical reasoning. Firstly, while LLMs exhibit significant promise in Natural Language Processing (NLP) tasks, their performance in complex reasoning and planning falls short of expectations. Secondly, LLMs use uninterpretable methods to make clinical decisions that are fundamentally different from the clinician's cognitive processes. This leads to user distrust. In this paper, we present a multi-agent framework called ArgMed-Agents, which aims to enable LLM-based agents to make explainable clinical decision reasoning through interaction. ArgMed-Agents performs self-argumentation iterations via Argumentation Scheme for Clinical Discussion (a reasoning mechanism for modeling cognitive processes in clinical reasoning), and then constructs the argumentation process as a directed graph representing conflicting relationships. Ultimately, a symbolic solver is used to identify a series of rational and coherent arguments to support the decision. We construct a formal model of ArgMed-Agents and present conjectures for theoretical guarantees. ArgMed-Agents enables LLMs to mimic the process of clinical argumentative reasoning by generating explanations of reasoning in a self-directed manner. The experiments show that ArgMed-Agents not only improves accuracy in complex clinical decision reasoning problems compared to other prompt methods, but more importantly, it provides users with decision explanations that increase their confidence.

6/24/2024

cs.AI cs.MA cs.SC

Large Language Models are as persuasive as humans, but why? About the cognitive effort and moral-emotional language of LLM arguments

Carlos Carrasco-Farre

Large Language Models (LLMs) are already as persuasive as humans. However, we know very little about how they do it. This paper investigates the persuasion strategies of LLMs, comparing them with human-generated arguments. Using a dataset of 1,251 participants in an experiment, we analyze the persuasion strategies of LLM-generated and human-generated arguments using measures of cognitive effort (lexical and grammatical complexity) and moral-emotional language (sentiment and moral analysis). The study reveals that LLMs produce arguments that require higher cognitive effort, exhibiting more complex grammatical and lexical structures than human counterparts. Additionally, LLMs demonstrate a significant propensity to engage more deeply with moral language, utilizing both positive and negative moral foundations more frequently than humans. In contrast with previous research, no significant difference was found in the emotional content produced by LLMs and humans. These findings contribute to the discourse on AI and persuasion, highlighting the dual potential of LLMs to both enhance and undermine informational integrity through communication strategies for digital persuasion.

4/23/2024

cs.CL