How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning

Read original: arXiv:2402.02872 - Published 6/12/2024 by Zeping Yu, Sophia Ananiadou

Overview

The paper investigates the mechanism behind in-context learning (ICL) on sentence classification tasks with semantically unrelated labels (e.g., "foo" and "bar").
The authors find that intervening in only 1% of the attention heads (called "in-context heads") cuts ICL accuracy from 87.6% to 24.4%.
To understand this phenomenon, the authors analyze the value-output vectors in these in-context heads and discover that the vectors at each label position contain substantial information about the corresponding labels.
They observe that a shift in prediction from one label to another comes from the in-context heads' attention scores decreasing at the positions of the first label and increasing at the positions of the second.
Based on these findings, the authors propose a hypothesis for ICL, explaining the majority label bias and recency bias, and suggesting two methods to reduce these biases.

Plain English Explanation

In this research, the authors looked at how large language models perform sentence classification from a handful of examples in the prompt when the labels have no meaningful connection to the sentence content (e.g., classifying a sentence as "foo" or "bar"). They found that disabling a small set of attention heads, which they call "in-context heads," has a large impact on the model's performance.

By analyzing the internal workings of these in-context heads, the authors discovered that the heads carry information about the labels shown in the prompt's examples, and that the prediction depends on how much attention the heads pay to each label. As a result, the model can be biased toward the most common label or the most recent label in the prompt, rather than the label that best fits the sentence.

The authors propose a way to understand this process: the model learns a "similarity metric" that compares the features of the sentence being classified with the features of each labeled example in the prompt. They also suggest two methods that reduce these biases, which could help improve in-context learning in language models.

Technical Explanation

The researchers conducted experiments on sentence classification tasks with semantically unrelated labels (e.g., "foo" or "bar"). They found that intervening in only 1% of the attention heads, the "in-context heads," reduced in-context learning (ICL) accuracy from 87.6% to 24.4%.
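To make the idea of a head intervention concrete, here is a minimal toy sketch in Python. It is not the authors' procedure: the additive per-head logit contributions, the 100-head setup, the numeric values, and the `predict` helper are all assumptions chosen for illustration.

```python
# Toy illustration only (hypothetical setup, not the paper's code):
# each "head" adds a logit contribution over the two labels [foo, bar].

def predict(head_outputs, ablated=frozenset()):
    """Sum per-head logit contributions, skipping any ablated heads."""
    logits = [0.0, 0.0]
    for idx, contribution in enumerate(head_outputs):
        if idx in ablated:
            continue  # intervention: drop this head's output entirely
        logits[0] += contribution[0]
        logits[1] += contribution[1]
    return "foo" if logits[0] > logits[1] else "bar"

# 100 assumed heads: head 0 plays the role of an "in-context head" that
# carries the label signal; the other 99 carry a weak prior toward "bar".
head_outputs = [(5.0, 0.0)] + [(0.0, 0.01)] * 99

full_prediction = predict(head_outputs)
ablated_prediction = predict(head_outputs, ablated={0})  # remove 1% of heads
print(full_prediction, ablated_prediction)
```

In this toy setup, removing the single signal-carrying head flips the prediction, while removing any of the other heads leaves it unchanged. That mirrors, in spirit, why a small set of heads can dominate ICL accuracy.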

To understand this phenomenon, the authors analyzed the value-output vectors in these in-context heads. They discovered that the vectors at each label position contained substantial information about the corresponding labels. Furthermore, they observed that the shift in prediction from one label to another was due to changes in the attention scores of the in-context heads at the label positions.

Based on these findings, the authors proposed a hypothesis for ICL: in the in-context heads, the value-output matrices extract label features, while the query-key matrices compute the similarity between the features at the last position and those at each label position. The query and key matrices can be viewed as two towers that learn a similarity metric between the last position's features and each demonstration's features at its label position.
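The hypothesis can be sketched as a single attention head in plain Python. This is an illustrative reading, not the authors' implementation: the 2-dimensional features, the identity `W_Q` and `W_K` matrices, the one-hot value-output vectors, and the helper names (`matvec`, `softmax`, `in_context_head`) are all assumptions.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def in_context_head(last_feat, label_feats, label_values, W_Q, W_K):
    """One toy head: the query tower encodes the last position, the key
    tower encodes each label position, and attention mixes label values."""
    q = matvec(W_Q, last_feat)                    # query tower
    keys = [matvec(W_K, f) for f in label_feats]  # key tower
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    attn = softmax(scores)
    dim = len(label_values[0])
    out = [sum(a * v[j] for a, v in zip(attn, label_values)) for j in range(dim)]
    return attn, out

# Assumed toy inputs: two demonstrations, the first labeled foo, the second bar.
identity = [[1.0, 0.0], [0.0, 1.0]]
label_feats = [[2.0, 0.0], [0.0, 2.0]]   # features stored at each label position
label_values = [[1.0, 0.0], [0.0, 1.0]]  # value-output vectors as [foo, bar] logits
last_feat = [1.8, 0.2]                   # test sentence resembles the foo demonstration

attn, out = in_context_head(last_feat, label_feats, label_values, identity, identity)
prediction = "foo" if out[0] > out[1] else "bar"
print(attn, prediction)
```

Because the test sentence's features are closer to the first demonstration, the head attends more to the foo label position and the mixed value-output vector favors foo. In a real model both towers are learned projections rather than identity matrices.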

Using this hypothesis, the authors were able to explain the majority label bias and recency bias observed in ICL. They also proposed two methods that reduce these biases by 22% and 17%, respectively.
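One way to see how attention-based label aggregation could produce both biases is the toy sketch below. It is an illustration under stated assumptions, not the paper's analysis: the `label_attention_mass` helper, the equal similarity scores, and especially the additive `position_bonus` term that favors later demonstrations are hypothetical.

```python
import math
from collections import defaultdict

def label_attention_mass(scores, labels):
    """Softmax the scores, then sum the attention weight received by each label."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    mass = defaultdict(float)
    for e, label in zip(exps, labels):
        mass[label] += e / total
    return dict(mass)

# Majority label bias: four equally similar demonstrations, three labeled foo.
# The majority label collects more total attention simply by appearing more often.
majority = label_attention_mass([1.0, 1.0, 1.0, 1.0], ["foo", "foo", "foo", "bar"])

# Recency bias: two equally similar demonstrations, with an ASSUMED additive
# bonus that grows with position, so the later label receives more attention.
similarity = [1.0, 1.0]
position_bonus = 0.5  # hypothetical value, for illustration only
scores = [s + position_bonus * i for i, s in enumerate(similarity)]
recency = label_attention_mass(scores, ["foo", "bar"])

print(majority, recency)
```

In the first case foo receives three quarters of the attention mass despite no demonstration being more similar than the others; in the second, the later bar demonstration wins only because of the assumed position term.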

Critical Analysis

The paper provides a detailed and insightful analysis of the underlying mechanisms of in-context learning in language models. The authors' hypothesis about the role of the query-key and value-output matrices in learning a similarity metric between input features and label features is a compelling explanation for the observed biases.

However, the authors acknowledge that their analysis is limited to a specific set of sentence classification tasks with semantically-unrelated labels. It would be valuable to see if their findings and proposed methods generalize to a wider range of in-context learning tasks, including those with more realistic and semantically-related labels.

Additionally, the authors' experiments focused on intervening in only 1% of the attention heads. It would be interesting to explore the impact of intervening in different proportions of heads, or in other components of the network, to gain a more comprehensive understanding of the underlying mechanisms.

Overall, this paper provides a significant contribution to the understanding of in-context learning in language models and offers promising directions for further research and potential improvements in this area.

Conclusion

This research paper offers valuable insights into the underlying mechanisms of in-context learning (ICL) in language models. The authors' finding that the value-output and query-key matrices in a small subset of attention heads, called "in-context heads," play a crucial role in ICL provides a new perspective on how these models make predictions from examples in the prompt.

By proposing a hypothesis that explains the majority label bias and recency bias in ICL, and suggesting methods to reduce these biases, the authors have made important strides towards improving the robustness and reliability of in-context learning. These findings could have significant implications for the development of more advanced and versatile language models, with the potential to enhance their ability to perform a wide range of tasks in an effective and unbiased manner.

Related Papers

How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning

Zeping Yu, Sophia Ananiadou

We investigate the mechanism of in-context learning (ICL) on sentence classification tasks with semantically-unrelated labels (foo/bar). We find intervening in only 1% heads (named in-context heads) significantly affects ICL accuracy from 87.6% to 24.4%. To understand this phenomenon, we analyze the value-output vectors in these heads and discover that the vectors at each label position contain substantial information about the corresponding labels. Furthermore, we observe that the prediction shift from foo to bar is due to the respective reduction and increase in these heads' attention scores at foo and bar positions. Therefore, we propose a hypothesis for ICL: in in-context heads, the value-output matrices extract label features, while the query-key matrices compute the similarity between the features at the last position and those at each label position. The query and key matrices can be considered as two towers that learn the similarity metric between the last position's features and each demonstration at label positions. Using this hypothesis, we explain the majority label bias and recency bias in ICL and propose two methods to reduce these biases by 22% and 17%, respectively.

6/12/2024

Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism

Anhao Zhao, Fanghua Ye, Jinlan Fu, Xiaoyu Shen

Large language models (LLMs) exhibit remarkable in-context learning (ICL) capabilities. However, the underlying working mechanism of ICL remains poorly understood. Recent research presents two conflicting views on ICL: One attributes it to LLMs' inherent ability of task recognition, deeming label correctness and shot numbers of demonstrations as not crucial; the other emphasizes the impact of similar examples in the demonstrations, stressing the need for label correctness and more shots. In this work, we provide a Two-Dimensional Coordinate System that unifies both views into a systematic framework. The framework explains the behavior of ICL through two orthogonal variables: whether LLMs can recognize the task and whether similar examples are presented in the demonstrations. We propose the peak inverse rank metric to detect the task recognition ability of LLMs and study LLMs' reactions to different definitions of similarity. Based on these, we conduct extensive experiments to elucidate how ICL functions across each quadrant on multiple representative classification tasks. Finally, we extend our analyses to generation tasks, showing that our coordinate system can also be used to interpret ICL for generation tasks effectively.

7/25/2024

Why Larger Language Models Do In-context Learning Differently?

Zhenmei Shi, Junyi Wei, Zhuoyan Xu, Yingyu Liang

Large language models (LLM) have emerged as a powerful tool for AI, with the key ability of in-context learning (ICL), where they can perform well on unseen tasks based on a brief series of task examples without necessitating any adjustments to the model parameters. One recent interesting mysterious observation is that models of different scales may have different ICL behaviors: larger models tend to be more sensitive to noise in the test context. This work studies this observation theoretically aiming to improve the understanding of LLM and ICL. We analyze two stylized settings: (1) linear regression with one-layer single-head linear transformers and (2) parity classification with two-layer multiple attention heads transformers (non-linear data and non-linear model). In both settings, we give closed-form optimal solutions and find that smaller models emphasize important hidden features while larger ones cover more hidden features; thus, smaller models are more robust to noise while larger ones are more easily distracted, leading to different ICL behaviors. This sheds light on where transformers pay attention to and how that affects ICL. Preliminary experimental results on large base and chat models provide positive support for our analysis.

5/31/2024

Large Language Models Know What Makes Exemplary Contexts

Quanyu Long, Jianda Chen, Wenya Wang, Sinno Jialin Pan

In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language models (LLMs). By instructing LLMs using few-shot demonstrative examples, ICL enables them to perform a wide range of tasks without needing to update millions of parameters. This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts; self-rank candidates with different demonstration compositions; self-optimize the demonstration selection and ordering through reinforcement learning. Specifically, our method designs a parameter-efficient retrieval head that generates the optimized demonstration after training with rewards from LLM's own preference. Experimental results validate the proposed method's effectiveness in enhancing ICL performance. Additionally, our approach effectively identifies and selects the most representative examples for the current task, and includes more diversity in retrieval.

8/21/2024