Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency

Read original: arXiv:2407.08790 - Published 7/15/2024 by Abeba Birhane, Marek McGann

Overview

This paper critically examines the common claim that large language models (LLMs) demonstrate human-like linguistic agency.
The authors argue that the impressive feats of LLMs are better understood as engineering achievements than as manifestations of language understanding.
The paper contrasts language as a way of acting, bound up with human expression, with language as a fixed object that statistical pattern-matching over data could fully capture.

Plain English Explanation

The paper makes an important distinction between the remarkable engineering accomplishments of large language models (LLMs) and the mistaken belief that these models demonstrate true human-like linguistic agency.

The authors explain that while LLMs can generate fluent-sounding text that may appear intelligent, this is fundamentally different from the way humans use language to express themselves and understand the world. LLMs excel at statistical pattern-matching, allowing them to produce convincing language outputs. However, this does not mean they possess genuine language understanding or the ability to use language the way humans do.

The paper highlights two contrasting conceptions of language. On one hand, language can be viewed as a tool for human expression, creativity, and meaning-making. In this view, language is intrinsically linked to human agency, cognition, and the ability to reason about the world. On the other hand, language can be seen as a statistical phenomenon, where models can learn to generate plausible-sounding text by identifying and replicating patterns in large datasets, without necessarily comprehending the underlying meaning.

The authors argue that the achievements of LLMs are often misread as demonstrations of human-like linguistic agency. In reality they are engineering feats: the models excel at pattern-matching and text generation, but they do not capture the deeper aspects of human language use and cognition.

Technical Explanation

The paper presents a critical examination of the common perception that large language models (LLMs) demonstrate human-like linguistic agency. The authors argue that the impressive capabilities of LLMs are better understood as engineering achievements rather than manifestations of true language understanding.

The paper contrasts two conceptions of language. The first, which the authors find behind many claims about LLM capabilities, rests on two unfounded assumptions: that a natural language is a distinct and complete thing whose essential characteristics a model can comprehensively capture (language completeness), and that a language can be quantified and wholly captured by data (data completeness). The second, drawn from the enactive approach to cognitive science, treats language as a means or way of acting, so that "languaging" does not admit of complete or comprehensive modelling.

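From an enactive perspective, the authors identify three characteristics of enacted language that are absent in LLMs and likely incompatible in principle with current architectures: embodiment, participation, and precariousness. They conclude that LLMs are not, and in their present form cannot be, linguistic agents the way humans are. They illustrate the point with "algospeak", a recently described pattern of high-stakes human language activity in heavily controlled online environments. Fluent, coherent output is therefore evidence of advanced pattern-matching and text generation, not of language understanding akin to human cognition.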

Critical Analysis

The paper raises valid concerns about the tendency to anthropomorphize the capabilities of large language models (LLMs) and mistake their engineering achievements for human-like linguistic agency. The authors effectively challenge the common perception that LLMs' impressive language generation abilities equate to true language understanding.

One important aspect the paper highlights is the distinction between language as a tool for human expression, creativity, and meaning-making versus language as a statistical pattern-matching task. This distinction is crucial in understanding the limitations of LLMs and the key differences between human language use and the way these models operate.

While the paper acknowledges the remarkable engineering accomplishments behind LLMs, it cautions against the overgeneralization of their capabilities and the potential risks of anthropomorphizing these systems. The authors rightly point out that the impressive outputs of LLMs do not necessarily imply genuine language comprehension or the kind of deeper human-like reasoning and agency that is often attributed to them.

The paper's critical analysis encourages readers to think more deeply about the nature of language and the differences between human linguistic abilities and the pattern-matching prowess of large-scale language models. This is an important contribution to the ongoing discussions and debates surrounding the capabilities and limitations of such models.

Conclusion

This paper provides a thought-provoking analysis of the common misconception that large language models (LLMs) demonstrate human-like linguistic agency. The authors effectively challenge the tendency to anthropomorphize the impressive capabilities of LLMs and emphasize the distinction between language as a tool for human expression and language as a statistical pattern-matching task.

By highlighting the contrasting conceptions of language, the paper encourages a more nuanced understanding of the engineering achievements behind LLMs and the risks of mistaking these accomplishments for true human-like language use and cognition. The critical analysis presented in this paper contributes to a more balanced and informed perspective on the capabilities and limitations of large language models, which is crucial as these systems continue to evolve and become increasingly prominent in various applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency

Abeba Birhane, Marek McGann

In this paper we argue that key, often sensational and misleading, claims regarding linguistic capabilities of Large Language Models (LLMs) are based on at least two unfounded assumptions; the assumption of language completeness and the assumption of data completeness. Language completeness assumes that a distinct and complete thing such as 'a natural language' exists, the essential characteristics of which can be effectively and comprehensively modelled by an LLM. The assumption of data completeness relies on the belief that a language can be quantified and wholly captured by data. Work within the enactive approach to cognitive science makes clear that, rather than a distinct and complete thing, language is a means or way of acting. Languaging is not the kind of thing that can admit of a complete or comprehensive modelling. From an enactive perspective we identify three key characteristics of enacted language; embodiment, participation, and precariousness, that are absent in LLMs, and likely incompatible in principle with current architectures. We argue that these absences imply that LLMs are not now and cannot in their present form be linguistic agents the way humans are. We illustrate the point in particular through the phenomenon of 'algospeak', a recently described pattern of high stakes human language activity in heavily controlled online environments. On the basis of these points, we conclude that sensational and misleading claims about LLM agency and capabilities emerge from a deep misconception of both what human language is and what LLMs are.

7/15/2024

Transforming Agency. On the mode of existence of Large Language Models

Xabier E. Barandiaran, Lola S. Almendros

This paper investigates the ontological characterization of Large Language Models (LLMs) like ChatGPT. Between inflationary and deflationary accounts, we pay special attention to their status as agents. This requires explaining in detail the architecture, processing, and training procedures that enable LLMs to display their capacities, and the extensions used to turn LLMs into agent-like systems. After a systematic analysis we conclude that a LLM fails to meet necessary and sufficient conditions for autonomous agency in the light of embodied theories of mind: the individuality condition (it is not the product of its own activity, it is not even directly affected by it), the normativity condition (it does not generate its own norms or goals), and, partially the interactional asymmetry condition (it is not the origin and sustained source of its interaction with the environment). If not agents, then ... what are LLMs? We argue that ChatGPT should be characterized as an interlocutor or linguistic automaton, a library-that-talks, devoid of (autonomous) agency, but capable to engage performatively on non-purposeful yet purpose-structured and purpose-bounded tasks. When interacting with humans, a ghostly component of the human-machine interaction makes it possible to enact genuine conversational experiences with LLMs. Despite their lack of sensorimotor and biological embodiment, LLMs textual embodiment (the training corpus) and resource-hungry computational embodiment, significantly transform existing forms of human agency. Beyond assisted and extended agency, the LLM-human coupling can produce midtended forms of agency, closer to the production of intentional agency than to the extended instrumentality of any previous technologies.

7/17/2024

Artificial Agency and Large Language Models

Maud van Lier, Gorka Muñoz-Gil

The arrival of Large Language Models (LLMs) has stirred up philosophical debates about the possibility of realizing agency in an artificial manner. In this work we contribute to the debate by presenting a theoretical model that can be used as a threshold conception for artificial agents. The model defines agents as systems whose actions and goals are always influenced by a dynamic framework of factors that consists of the agent's accessible history, its adaptive repertoire and its external environment. This framework, in turn, is influenced by the actions that the agent takes and the goals that it forms. We show with the help of the model that state-of-the-art LLMs are not agents yet, but that there are elements to them that suggest a way forward. The paper argues that a combination of the agent architecture presented in Park et al. (2023) together with the use of modules like the Coscientist in Boiko et al. (2023) could potentially be a way to realize agency in an artificial manner. We end the paper by reflecting on the obstacles one might face in building such an artificial agent and by presenting possible directions for future research.

7/25/2024

Large Language Models as Instruments of Power: New Regimes of Autonomous Manipulation and Control

Yaqub Chaudhary, Jonnie Penn

Large language models (LLMs) can reproduce a wide variety of rhetorical styles and generate text that expresses a broad spectrum of sentiments. This capacity, now available at low cost, makes them powerful tools for manipulation and control. In this paper, we consider a set of underestimated societal harms made possible by the rapid and largely unregulated adoption of LLMs. Rather than consider LLMs as isolated digital artefacts used to displace this or that area of work, we focus on the large-scale computational infrastructure upon which they are instrumentalised across domains. We begin with discussion on how LLMs may be used to both pollute and uniformize information environments and how these modalities may be leveraged as mechanisms of control. We then draw attention to several areas of emerging research, each of which compounds the capabilities of LLMs as instruments of power. These include (i) persuasion through the real-time design of choice architectures in conversational interfaces (e.g., via AI personas), (ii) the use of LLM-agents as computational models of human agents (e.g., silicon subjects), (iii) the use of LLM-agents as computational models of human agent populations (e.g., silicon societies) and finally, (iv) the combination of LLMs with reinforcement learning to produce controllable and steerable strategic dialogue models. We draw these strands together to discuss how these areas may be combined to build LLM-based systems that serve as powerful instruments of individual, social and political control via the simulation and disingenuous prediction of human behaviour, intent, and action.

5/8/2024