Speakers Fill Lexical Semantic Gaps with Context

Read original: arXiv:2010.02172 - Published 5/29/2024 by Tiago Pimentel, Rowan Hall Maudslay, Damián Blasi, Ryan Cotterell

Overview

Lexical ambiguity is a common feature of language, allowing for efficient reuse of words, but it can also lead to miscommunication.
The paper investigates the relationship between a word's lexical ambiguity and the amount of information its context provides.
The researchers propose two ways to measure lexical ambiguity - one using human-annotated data and one using the BERT language model, which can be applied to more languages.
They find that a word's lexical ambiguity is negatively correlated with its contextual uncertainty: the more ambiguous a word is, the more information its context provides. This suggests that speakers compensate for ambiguity by making contexts more informative.

Plain English Explanation

Words can have multiple meanings, a phenomenon known as lexical ambiguity. This allows language to be more efficient by reusing the same word forms. However, if the intended meaning cannot be determined from the context, this ambiguity can lead to miscommunication.

The researchers propose that for a language to be both clear and efficiently encoded, a word's level of ambiguity should be related to how much information its surrounding context provides about its meaning. To test this, they developed two ways to measure a word's lexical ambiguity: one using human-annotated data from WordNet, and one using the BERT language model, which can be applied to more languages.

They found that the BERT-based measure of ambiguity correlated with the number of synonyms a word has in WordNet, validating their approach. Then, across 18 diverse languages, they showed that the more ambiguous a word is, the less uncertain it is once its context is known. This suggests that when words are ambiguous, speakers compensate by using more informative contexts to convey the intended meaning.

Technical Explanation

The researchers first operationalized lexical ambiguity as the entropy of the possible meanings a word can take, using two different approaches:

WordNet-based: They estimated the entropy of a word's meanings from WordNet, a resource that relies on human annotation.
BERT-based: They used the BERT language model to estimate the entropy of a word's meaning distribution. This requires no human annotation, so the measure can be applied to a much wider range of languages.
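Both estimates target the same quantity, the entropy of a word's distribution over meanings. The sketch below illustrates only that final step, assuming a sense distribution is already available. The function name `meaning_entropy` and the example distributions are hypothetical and are not taken from the paper, which estimates these distributions from WordNet or BERT.

```python
import math


def meaning_entropy(sense_probs):
    """Shannon entropy (in bits) of a distribution over a word's senses.

    `sense_probs` is a hypothetical list of non-negative weights, one per
    sense; it is normalised here so that it sums to 1.
    """
    total = sum(sense_probs)
    if total <= 0:
        raise ValueError("sense_probs must contain positive mass")
    entropy = 0.0
    for weight in sense_probs:
        prob = weight / total
        if prob > 0:  # a zero-probability sense contributes nothing
            entropy -= prob * math.log2(prob)
    return entropy


# Illustrative (made-up) sense distributions.
unambiguous = meaning_entropy([1.0])                  # one sense only
balanced = meaning_entropy([0.25, 0.25, 0.25, 0.25])  # four equally likely senses
skewed = meaning_entropy([0.9, 0.1])                  # one dominant sense
```

Under this definition a word with a single sense has zero entropy, and a word whose senses are equally likely has the maximum entropy for that number of senses.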

They validated these ambiguity measures by showing significant Pearson correlations between the BERT-based ambiguity estimate and the number of synonyms a word has in WordNet, across six high-resource languages (for example, ρ = 0.40 in English).
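As a rough illustration of this validation step, the following sketch computes a Pearson correlation between two lists of per-word scores. The helper `pearson` and the toy numbers are assumptions made for illustration; they are not the paper's data or code, and the sketch omits the significance test.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-word values, for illustration only.
bert_ambiguity = [0.2, 1.1, 0.7, 1.9, 1.4]  # hypothetical entropy estimates
synonym_counts = [1, 4, 2, 7, 3]            # hypothetical WordNet counts
r = pearson(bert_ambiguity, synonym_counts)
```

A positive `r` on real data would indicate that words the BERT-based estimate treats as more ambiguous also tend to have more synonyms in WordNet.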

The researchers then tested their main hypothesis: that a word's lexical ambiguity should be negatively correlated with its contextual uncertainty. They quantified contextual uncertainty using the conditional entropy of a word given its context, so a low value means that context provides a lot of information about the word.
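One common way to approximate the conditional entropy of a word given its context is to average the word's surprisal over the contexts in which it occurs. The sketch below assumes that some language model has already supplied the probability of the word in each context. The function `contextual_uncertainty` and the probabilities are hypothetical, and the paper's actual estimator may differ.

```python
import math


def contextual_uncertainty(probs_in_context):
    """Mean surprisal (in bits) of a word across its observed contexts.

    `probs_in_context` is a hypothetical list of p(word | context_i), one
    value per occurrence of the word, as given by some language model.
    """
    if not probs_in_context:
        raise ValueError("need at least one occurrence")
    total = 0.0
    for prob in probs_in_context:
        if not 0.0 < prob <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
        total += -math.log2(prob)
    return total / len(probs_in_context)


# Made-up probabilities: the first word is easier to predict from context.
predictable_word = contextual_uncertainty([0.5, 0.5, 0.25])
unpredictable_word = contextual_uncertainty([0.05, 0.1, 0.02])
```

A lower value means the word is more predictable from its surroundings, which is the sense in which its contexts are more informative.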

Across 18 typologically diverse languages, they found significant negative correlations between their measures of ambiguity and contextual uncertainty. This suggests that in the face of lexical ambiguity, speakers compensate by using more informative contexts to convey the intended meaning.

Critical Analysis

The paper provides a robust empirical investigation of the relationship between lexical ambiguity and contextual information, using both human-annotated and language model-based measures across a diverse set of languages.

One potential limitation is that the BERT-based ambiguity measure may not fully capture all aspects of lexical ambiguity, as language models can have biases and blind spots. Further research could explore semantic density and uncertainty quantification in semantic space to refine the ambiguity estimates.

Additionally, the study focuses on lexical ambiguity at the word level, but language also exhibits ambiguity at higher levels, such as syntactic or pragmatic ambiguity. Extending the analysis beyond the lexical level could provide a more comprehensive understanding of how ambiguity and context interact.

Overall, this work makes an important contribution to our understanding of how the linguistic system balances the competing demands of efficiency and clarity, and opens up avenues for further research on the dynamics of lexical ambiguity and its implications for language learning and communication.

Conclusion

This paper investigates the relationship between lexical ambiguity and contextual information, proposing that a word's ambiguity should be negatively correlated with its contextual uncertainty. The researchers developed measures of ambiguity using both human-annotated and language model-based approaches, and found significant correlations supporting their hypothesis across 18 diverse languages.

These findings suggest that speakers compensate for lexical ambiguity by using more informative contexts. This work advances our understanding of the tradeoffs between efficiency and clarity in language, and opens up avenues for further research on ambiguity beyond the lexical level and on quantifying uncertainty in semantic spaces.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Speakers Fill Lexical Semantic Gaps with Context

Tiago Pimentel, Rowan Hall Maudslay, Damián Blasi, Ryan Cotterell

Lexical ambiguity is widespread in language, allowing for the reuse of economical word forms and therefore making language more efficient. If ambiguous words cannot be disambiguated from context, however, this gain in efficiency might make language less clear -- resulting in frequent miscommunication. For a language to be clear and efficiently encoded, we posit that the lexical ambiguity of a word type should correlate with how much information context provides about it, on average. To investigate whether this is the case, we operationalise the lexical ambiguity of a word as the entropy of meanings it can take, and provide two ways to estimate this -- one which requires human annotation (using WordNet), and one which does not (using BERT), making it readily applicable to a large number of languages. We validate these measures by showing that, on six high-resource languages, there are significant Pearson correlations between our BERT-based estimate of ambiguity and the number of synonyms a word has in WordNet (e.g. $\rho = 0.40$ in English). We then test our main hypothesis -- that a word's lexical ambiguity should negatively correlate with its contextual uncertainty -- and find significant correlations on all 18 typologically diverse languages we analyse. This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.

5/29/2024

Bidirectional Transformer Representations of (Spanish) Ambiguous Words in Context: A New Lexical Resource and Empirical Analysis

Pamela D. Rivière (Department of Cognitive Science UC San Diego), Anne L. Beatty-Martínez (Department of Cognitive Science UC San Diego), Sean Trott (Department of Cognitive Science UC San Diego, Computational Social Science UC San Diego)

Lexical ambiguity -- where a single wordform takes on distinct, context-dependent meanings -- serves as a useful tool to compare across different large language models' (LLMs') ability to form distinct, contextualized representations of the same stimulus. Few studies have systematically compared LLMs' contextualized word embeddings for languages beyond English. Here, we evaluate multiple bidirectional transformers' (BERTs') semantic representations of Spanish ambiguous nouns in context. We develop a novel dataset of minimal-pair sentences evoking the same or different sense for a target ambiguous noun. In a pre-registered study, we collect contextualized human relatedness judgments for each sentence pair. We find that various BERT-based LLMs' contextualized semantic representations capture some variance in human judgments but fall short of the human benchmark, and for Spanish -- unlike English -- model scale is uncorrelated with performance. We also identify stereotyped trajectories of target noun disambiguation as a proportion of traversal through a given LLM family's architecture, which we partially replicate in English. We contribute (1) a dataset of controlled, Spanish sentence stimuli with human relatedness norms, and (2) to our evolving understanding of the impact that LLM specification (architectures, training protocols) exerts on contextualized embeddings.

6/24/2024

To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models

Bastien Liétard, Pascal Denis, Mikaella Keller

Polysemy and synonymy are two crucial interrelated facets of lexical ambiguity. While both phenomena have been studied extensively in NLP, leading to dedicated systems, they have often been considered independently. While many tasks dealing with polysemy (e.g. Word Sense Disambiguation or Induction) highlight the role of a word's senses, the study of synonymy is rooted in the study of concepts, i.e. meaning shared across the lexicon. In this paper, we introduce Concept Induction, the unsupervised task of learning a soft clustering among words that defines a set of concepts directly from data. This task generalizes that of Word Sense Induction. We propose a bi-level approach to Concept Induction that leverages both a local lemma-centric view and a global cross-lexicon perspective to induce concepts. We evaluate the obtained clustering on SemCor's annotated data and obtain good performances (BCubed F1 above 0.60). We find that the local and the global levels are mutually beneficial to induce concepts and also senses in our setting. Finally, we create static embeddings representing our induced concepts and use them on the Word-in-Context task, obtaining competitive performances with the State-of-the-Art.

7/1/2024

Contextual modulation of language comprehension in a dynamic neural model of lexical meaning

Michael C. Stern, Maria M. Piñango

We propose and computationally implement a dynamic neural model of lexical meaning, and experimentally test its behavioral predictions. We demonstrate the architecture and behavior of the model using as a test case the English lexical item 'have', focusing on its polysemous use. In the model, 'have' maps to a semantic space defined by two continuous conceptual dimensions, connectedness and control asymmetry, previously proposed to parameterize the conceptual system for language. The mapping is modeled as coupling between a neural node representing the lexical item and neural fields representing the conceptual dimensions. While lexical knowledge is modeled as a stable coupling pattern, real-time lexical meaning retrieval is modeled as the motion of neural activation patterns between metastable states corresponding to semantic interpretations or readings. Model simulations capture two previously reported empirical observations: (1) contextual modulation of lexical semantic interpretation, and (2) individual variation in the magnitude of this modulation. Simulations also generate a novel prediction that the by-trial relationship between sentence reading time and acceptability should be contextually modulated. An experiment combining self-paced reading and acceptability judgments replicates previous results and confirms the new model prediction. Altogether, results support a novel perspective on lexical polysemy: that the many related meanings of a word are metastable neural activation states that arise from the nonlinear dynamics of neural populations governing interpretation on continuous semantic dimensions.

7/23/2024