In-Context Learning with Iterative Demonstration Selection

2310.09881

Published 6/26/2024 by Chengwei Qin, Aston Zhang, Chen Chen, Anirudh Dagar, Wenming Ye

Abstract

Spurred by advancements in scale, large language models (LLMs) have demonstrated strong few-shot learning ability via in-context learning (ICL). However, the performance of ICL has been shown to be highly sensitive to the selection of few-shot demonstrations. Selecting the most suitable examples as context remains an ongoing challenge and an open problem. Existing literature has highlighted the importance of selecting examples that are diverse or semantically similar to the test sample while ignoring the fact that the optimal selection dimension, i.e., diversity or similarity, is task-specific. Based on how the test sample is answered, we propose Iterative Demonstration Selection (IDS) to leverage the merits of both dimensions. Using zero-shot chain-of-thought reasoning (Zero-shot-CoT), IDS iteratively selects examples that are diverse but still strongly correlated with the test sample as ICL demonstrations. Specifically, IDS applies Zero-shot-CoT to the test sample before demonstration selection. The output reasoning path is then used to choose demonstrations that are prepended to the test sample for inference. The generated answer is followed by its corresponding reasoning path for extracting a new set of demonstrations in the next iteration. After several iterations, IDS adopts majority voting to obtain the final result. Through extensive experiments on tasks including reasoning, question answering, and topic classification, we demonstrate that IDS can consistently outperform existing ICL demonstration selection methods.

Overview

Large language models (LLMs) have demonstrated strong few-shot learning abilities through in-context learning (ICL).
The performance of ICL is highly sensitive to the selection of few-shot demonstrations.
Selecting the most suitable examples as context remains an ongoing challenge.
Existing research has highlighted the importance of selecting diverse or semantically similar examples, but the optimal dimension is task-specific.

Plain English Explanation

Large language models (LLMs) are powerful AI systems that can learn and generate human-like text. Recent advancements have allowed these models to perform well on new tasks by simply providing a few examples as context, a technique known as in-context learning (ICL). However, the effectiveness of ICL is highly dependent on the specific examples, or "demonstrations," that are used to provide context.

Choosing the right demonstrations is an ongoing challenge. Some research has suggested that selecting demonstrations that are diverse (different from each other) or semantically similar to the test sample can be beneficial. But the best approach may depend on the particular task at hand.

Technical Explanation

The paper proposes a new method called Iterative Demonstration Selection (IDS) to address the challenge of selecting effective ICL demonstrations. IDS leverages the strengths of both diverse and similar demonstrations, based on how the test sample is answered.

IDS first applies zero-shot chain-of-thought reasoning to the test sample to generate a reasoning path. This reasoning path is then used to select demonstrations that are diverse but still strongly correlated with the test sample. These demonstrations are prepended to the test sample for inference.
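The selection step can be illustrated with a minimal sketch. The summary does not say how "strongly correlated" is measured, so this sketch assumes cosine similarity over bag-of-words vectors as a toy stand-in for a sentence embedding. The names `bow_vector`, `cosine`, `select_demonstrations`, and the candidate pool are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def bow_vector(text):
    """Lowercased bag-of-words counts; a toy stand-in for a sentence embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_demonstrations(reasoning_path, candidates, k=2):
    """Return the k candidates whose stored reasoning is closest to reasoning_path.

    Assumption: each candidate is a dict with a "reasoning" string. The
    similarity measure is illustrative, not the paper's stated choice.
    """
    query = bow_vector(reasoning_path)
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query, bow_vector(c["reasoning"])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical candidate pool, for illustration only.
candidates = [
    {"question": "What is 3 + 4?", "reasoning": "add 3 and 4 to get 7", "answer": "7"},
    {"question": "Capital of France?", "reasoning": "recall the capital city of France", "answer": "Paris"},
    {"question": "What is 10 - 6?", "reasoning": "subtract 6 from 10 to get 4", "answer": "4"},
]

picked = select_demonstrations("add 2 and 5 to get 7", candidates, k=1)
print(picked[0]["question"])
```

Matching on the reasoning path rather than on the raw question is what lets the chosen demonstrations differ on the surface while still being relevant to how the test sample is solved.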

The generated answer and its reasoning path are then used to extract a new set of demonstrations for the next iteration. After several iterations, IDS adopts majority voting to obtain the final result.
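The iterative loop and the final vote can be sketched as follows. This is a minimal outline under stated assumptions, not the authors' implementation: the three callables (`zero_shot_cot`, `select`, `answer_with_demos`) are hypothetical placeholders for the LLM calls and the selection step, and the default of three iterations is arbitrary.

```python
from collections import Counter


def iterative_demonstration_selection(
    question, zero_shot_cot, select, answer_with_demos, iterations=3
):
    """Run the loop described above and return the majority-voted answer.

    All three callables are hypothetical placeholders supplied by the caller:
      zero_shot_cot(question)            -> reasoning path (str)
      select(reasoning_path)             -> list of demonstrations
      answer_with_demos(question, demos) -> (answer, reasoning_path)
    """
    # Step 1: zero-shot chain-of-thought on the test sample, before any selection.
    reasoning = zero_shot_cot(question)
    answers = []
    for _ in range(iterations):
        # Step 2: pick demonstrations using the current reasoning path.
        demos = select(reasoning)
        # Step 3: answer with those demonstrations; keep the new reasoning
        # path so the next iteration selects a fresh set of demonstrations.
        answer, reasoning = answer_with_demos(question, demos)
        answers.append(answer)
    # Step 4: majority vote over the per-iteration answers.
    return Counter(answers).most_common(1)[0][0]


# Scripted stand-ins for the model calls, so the sketch runs without an LLM.
scripted = iter([("4", "path a"), ("5", "path b"), ("4", "path c")])
result = iterative_demonstration_selection(
    "What is 2 + 2?",
    zero_shot_cot=lambda q: "initial path",
    select=lambda path: [path],
    answer_with_demos=lambda q, demos: next(scripted),
)
print(result)
```

In this sketch a tie in the vote resolves to the answer seen first, which is a property of `Counter.most_common` and not something the summary specifies.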

The authors demonstrate through extensive experiments on tasks like reasoning, question answering, and topic classification that IDS can consistently outperform existing ICL demonstration selection methods.

Critical Analysis

The paper presents a novel and promising approach to addressing the challenge of demonstration selection for in-context learning. By combining the merits of diverse and similar demonstrations, and iterating the process based on the test sample's reasoning, IDS appears to offer improvements over existing methods.

However, the paper does not explore the limits or potential issues with IDS. For example, it's unclear how the method would scale to more complex tasks or larger demonstration sets. Additionally, the paper does not discuss the computational cost or time required for the iterative process, which could be a practical concern.

Further research would be needed to fully understand the strengths, weaknesses, and appropriate use cases of the IDS approach. Exploring its performance on a wider range of tasks and comparing it to other emerging techniques, such as iterative forward tuning, could provide valuable insights.

Conclusion

The paper presents a novel Iterative Demonstration Selection (IDS) method to address the challenge of selecting effective in-context learning demonstrations. IDS leverages the merits of both diverse and semantically similar demonstrations, guided by the test sample's reasoning path, and in the authors' experiments it consistently outperforms existing ICL demonstration selection approaches.

While the results are promising, further research is needed to fully understand the limits and practical implications of the IDS method. Exploring its scalability, computational cost, and performance on a wider range of tasks could help refine and validate the approach, contributing to the ongoing advancements in few-shot learning and in-context learning.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

In-Context Learning Demonstration Selection via Influence Analysis

Vinay M. S., Minh-Hao Van, Xintao Wu

Large Language Models (LLMs) have showcased their In-Context Learning (ICL) capabilities, enabling few-shot learning without the need for gradient updates. Despite its advantages, the effectiveness of ICL heavily depends on the choice of demonstrations. Selecting the most effective demonstrations for ICL remains a significant research challenge. To tackle this issue, we propose a demonstration selection method named InfICL, which utilizes influence functions to analyze impacts of training samples. By identifying the most influential training samples as demonstrations, InfICL aims to enhance the ICL generalization performance. To keep InfICL cost-effective, we only use the LLM to generate sample input embeddings, avoiding expensive fine-tuning. Through empirical studies on various real-world datasets, we demonstrate advantages of InfICL compared to state-of-the-art baselines.

6/19/2024

cs.CL

Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning

Hui Liu, Wenya Wang, Hao Sun, Chris Xing Tian, Chenqi Kong, Xin Dong, Haoliang Li

Large Language Models (LLMs) have demonstrated impressive in-context learning (ICL) capabilities from few-shot demonstration exemplars. While recent learning-based demonstration selection methods have proven beneficial to ICL by choosing more useful exemplars, their underlying mechanisms are opaque, hindering efforts to address limitations such as high training costs and poor generalization across tasks. These methods generally assume the selection process captures similarities between the exemplar and the target instance; however, it remains unknown what kinds of similarities are captured and vital to performing ICL. To dive into this question, we analyze the working mechanisms of the learning-based demonstration selection methods and empirically identify two important factors related to similarity measurement: 1) The ability to integrate different levels of task-agnostic text similarities between the input of exemplars and test cases enhances generalization power across different tasks. 2) Incorporating task-specific labels when measuring the similarities significantly improves the performance on each specific task. We validate these two findings through extensive quantitative and qualitative analyses across ten datasets and various LLMs. Based on our findings, we introduce two effective yet simplified exemplar selection methods catering to task-agnostic and task-specific demands, eliminating the costly LLM inference overhead.

6/19/2024

cs.LG cs.AI cs.CL

Unifying Demonstration Selection and Compression for In-Context Learning

Jun Gao, Ziqiang Cao, Wenjie Li

In-context learning (ICL) facilitates large language models (LLMs) exhibiting spectacular emergent capabilities in various scenarios. Unfortunately, introducing demonstrations easily makes the prompt length explode, bringing a significant burden to hardware. In addition, random demonstrations usually achieve limited improvements in ICL, necessitating demonstration selection among accessible candidates. Previous studies introduce extra modules to perform demonstration compression or selection independently. In this paper, we propose an ICL framework UniICL, which Unifies demonstration selection and compression, and final response generation via a single frozen LLM. Specifically, UniICL first projects actual demonstrations and inference text inputs into short virtual tokens, respectively. Then, virtual tokens are applied to select suitable demonstrations by measuring semantic similarity within latent space among candidate demonstrations and inference input. Finally, inference text inputs together with selected virtual demonstrations are fed into the same frozen LLM for response generation. Notably, UniICL is a parameter-efficient framework that only contains 17M trainable parameters originating from the projection layer. We conduct experiments and analysis over in- and out-domain datasets of both generative and understanding tasks, encompassing ICL scenarios with plentiful and limited demonstration candidates. Results show that UniICL effectively unifies $12\times$ compression, demonstration selection, and response generation, efficiently scaling up the baseline from 4-shot to 64-shot ICL in IMDb with 24 GB CUDA allocation.

6/18/2024

cs.CL

Revisiting Demonstration Selection Strategies in In-Context Learning

Keqin Peng, Liang Ding, Yancheng Yuan, Xuebo Liu, Min Zhang, Yuanxin Ouyang, Dacheng Tao

Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL), where a few examples are used to describe a task to the model. However, the performance of ICL varies significantly with the choice of demonstrations, and it is still unclear why this happens or what factors will influence its choice. In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent. We further propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples, resulting in a simple and effective recipe for ICL. Empirically, our method yields consistent improvements in both language understanding and generation tasks with different model scales. Further analyses confirm that, besides the generality and stability under different circumstances, our method provides a unified explanation for the effectiveness of previous methods. Code will be released.

6/26/2024

cs.CL