Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars

Read original: arXiv:2405.16122 - Published 5/28/2024 by Zhaoxuan Wu, Xiaoqiang Lin, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Bryan Kian Hsiang Low

Overview

This paper presents EASE (Efficient Ordering-aware Automated Selection of Exemplars), a method for choosing the in-context exemplars that go into a prompt for a large language model (LLM).
The key idea is to efficiently search for an ordered set of exemplars that work well together, taking into account the order in which they appear in the prompt.
The authors report that EASE outperforms existing exemplar selection methods in their empirical evaluations, which include novel tasks.

Plain English Explanation

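Large language models like GPT-3 are powerful tools, but they often require carefully crafted prompts to get good results. One common technique is in-context learning: the prompt includes a few input-label examples (exemplars) of the task, and the model adapts to the task without any fine-tuning. Which exemplars are included strongly affects performance, so prompt optimization here means finding the best exemplars for a given task.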

The EASE method proposed in this paper aims to make this selection efficient and automatic. Instead of judging exemplars one at a time, EASE searches for an ordered set of exemplars that work well together, recognizing that the order in which they appear in the prompt can change how well the model performs.
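To see why order matters at the level of the prompt text, consider how exemplars are laid out for in-context learning. The sketch below is illustrative only: the function name `build_prompt`, the "Input:/Output:" template, and the toy sentiment exemplars are assumptions, not taken from the paper. It shows that the same three exemplars yield a different prompt string for every ordering, so each ordering is a distinct candidate the model may respond to differently.

```python
from itertools import permutations

def build_prompt(instruction, exemplars, query):
    """Concatenate an instruction, the exemplars in the given order, and the query."""
    blocks = [instruction]
    for text, label in exemplars:
        blocks.append(f"Input: {text}\nOutput: {label}")
    # The query is left unanswered so the model completes the label.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

# Hypothetical exemplars for a toy sentiment task.
exemplars = [
    ("great movie", "positive"),
    ("dull plot", "negative"),
    ("loved it", "positive"),
]

# One prompt per ordering of the same three exemplars.
prompts = {
    build_prompt("Classify the sentiment.", list(order), "not bad")
    for order in permutations(exemplars)
}
print(len(prompts))  # 6 distinct prompts from 3 exemplars
```

Whether a given ordering actually helps or hurts can only be determined by evaluating the model on it, which is what makes the search problem expensive.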

This matters because many recent approaches retrieve different exemplars for every test query, which adds computation at test time and increases the risk of data exposure. EASE instead finds a single ordered set of exemplars that performs well across all test queries for a task, so nothing extra has to be computed at test time.

The authors show that EASE outperforms other exemplar selection techniques in their experiments. By accounting for exemplar order, and optionally optimizing the instruction jointly with the exemplars, EASE finds more effective prompts than previous methods.

Technical Explanation

The key innovation in this paper is the EASE algorithm for efficient exemplar selection. Unlike retrieval-based methods that pick exemplars per test query, and unlike prior selection methods that do not adequately account for ordering, EASE optimizes over ordered sets of exemplars for the task as a whole.

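At a high level, EASE proceeds as follows:

Forming candidate ordered sets of exemplars from the available input-label pairs
Representing each ordered set by the hidden embedding from a pre-trained language model
Using a neural bandit algorithm to choose which ordered sets to evaluate, iteratively improving the best set found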

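Because the embedding is computed from the ordered sequence of exemplars, the same exemplars in a different order get a different representation. This lets the search favor exemplar sequences that work well together, rather than scoring exemplars individually. The same procedure can be extended to jointly optimize the exemplars and the instruction.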
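The search loop can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it swaps the neural bandit for a linear upper-confidence-bound (UCB) rule, replaces the language-model embedding with a toy position-weighted feature, and uses a synthetic score in place of LLM validation accuracy. All names (`embed`, `validation_score`, `ucb_search`) and all sizes are hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
N_POOL, K, DIM = 6, 3, 8  # pool size, exemplars per prompt, feature size (all assumed)
exemplar_vecs = rng.normal(size=(N_POOL, DIM))  # toy per-exemplar features

def embed(ordered_ids):
    """Toy order-sensitive feature: position-weighted sum of exemplar vectors.
    Stands in for the hidden embedding of a pre-trained language model."""
    weights = np.arange(1, len(ordered_ids) + 1, dtype=float)
    return (weights[:, None] * exemplar_vecs[list(ordered_ids)]).sum(axis=0)

true_w = rng.normal(size=DIM)  # hidden parameters of the synthetic score

def validation_score(ordered_ids):
    """Synthetic noisy score; a real run would query the LLM on validation data."""
    return float(embed(ordered_ids) @ true_w + 0.1 * rng.normal())

# Every ordered selection of K distinct exemplars from the pool.
candidates = list(itertools.permutations(range(N_POOL), K))
feats = np.array([embed(c) for c in candidates])

def ucb_search(budget=30, beta=1.0, lam=1.0):
    """Linear UCB over ordered sets (a simpler stand-in for a neural bandit)."""
    gram = lam * np.eye(DIM)           # regularized feature covariance
    target = np.zeros(DIM)             # running sum of score * feature
    tried = np.zeros(len(candidates), dtype=bool)
    best, best_score = None, -np.inf
    for _ in range(budget):
        gram_inv = np.linalg.inv(gram)
        mean = feats @ (gram_inv @ target)                       # predicted score
        var = np.einsum("ij,jk,ik->i", feats, gram_inv, feats)   # uncertainty
        ucb = mean + beta * np.sqrt(np.maximum(var, 0.0))
        ucb[tried] = -np.inf           # never re-evaluate the same ordered set
        i = int(np.argmax(ucb))
        tried[i] = True
        score = validation_score(candidates[i])
        gram += np.outer(feats[i], feats[i])
        target += score * feats[i]
        if score > best_score:
            best, best_score = candidates[i], score
    return best, best_score

best, best_score = ucb_search()
print(best, round(best_score, 3))
```

The point of the sketch is the division of labor: an order-sensitive representation distinguishes permutations of the same exemplars, and a bandit rule spends a limited evaluation budget on the ordered sets that look most promising or most uncertain.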

The authors evaluate EASE through extensive experiments, including novel tasks, and report that it outperforms previous exemplar selection methods. They also report practical insights about how much exemplar selection matters for in-context learning.

Critical Analysis

One potential limitation of this work is that it relies on having access to a sufficiently large and diverse pool of candidate exemplars to start with. In practice, it may be difficult to assemble such a pool. The authors acknowledge this and suggest using prompt generation techniques to address it.

Additionally, the computational cost of EASE may be higher than that of simpler selection methods, especially as the number of candidate exemplars grows, because the number of possible ordered sets grows much faster than the pool itself. The authors argue that the performance gains justify the additional computational effort, but this trade-off should be weighed for each application.
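A quick count illustrates how the search space scales. The pool sizes and the choice of 5 exemplars per prompt below are arbitrary illustrative values, not figures from the paper, and the helper names are hypothetical. Accounting for order multiplies the number of unordered subsets by k!.

```python
import math

def num_unordered_sets(pool_size, k):
    """Ways to choose k distinct exemplars when order is ignored."""
    return math.comb(pool_size, k)

def num_ordered_sets(pool_size, k):
    """Ways to choose k distinct exemplars when order matters."""
    return math.perm(pool_size, k)

K = 5  # assumed number of exemplars per prompt
for pool_size in (10, 50, 100):
    print(pool_size, num_unordered_sets(pool_size, K), num_ordered_sets(pool_size, K))
```

With a pool of only 10 exemplars there are already 30240 ordered sets of size 5, which is why exhaustive evaluation is impractical and a guided search is needed.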

Overall, this paper presents a promising approach to prompt optimization that takes the ordering of exemplars into account. While there are some practical considerations, EASE is a useful step toward making in-context learning with large language models more effective in real-world applications.

Conclusion

This paper introduces EASE, a method for selecting and ordering the exemplars used in prompts for large language models. By accounting for exemplar ordering, EASE finds more effective exemplar sets than previous selection techniques.

The authors demonstrate the benefits of EASE in extensive empirical evaluations, reporting better performance than existing methods. This work highlights the importance of exemplar ordering in real-world applications of language models, and suggests promising avenues for further research in this area.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars

Zhaoxuan Wu, Xiaoqiang Lin, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Bryan Kian Hsiang Low

Large language models (LLMs) have shown impressive capabilities in real-world applications. The capability of in-context learning (ICL) allows us to adapt an LLM to downstream tasks by including input-label exemplars in the prompt without model fine-tuning. However, the quality of these exemplars in the prompt greatly impacts performance, highlighting the need for an effective automated exemplar selection method. Recent studies have explored retrieval-based approaches to select exemplars tailored to individual test queries, which can be undesirable due to extra test-time computation and an increased risk of data exposure. Moreover, existing methods fail to adequately account for the impact of exemplar ordering on the performance. On the other hand, the impact of the instruction, another essential component in the prompt given to the LLM, is often overlooked in existing exemplar selection methods. To address these challenges, we propose a novel method named EASE, which leverages the hidden embedding from a pre-trained language model to represent ordered sets of exemplars and uses a neural bandit algorithm to optimize the sets of exemplars while accounting for exemplar ordering. Our EASE can efficiently find an ordered set of exemplars that performs well for all test queries from a given task, thereby eliminating test-time computation. Importantly, EASE can be readily extended to jointly optimize both the exemplars and the instruction. Through extensive empirical evaluations (including novel tasks), we demonstrate the superiority of EASE over existing methods, and reveal practical insights about the impact of exemplar selection on ICL, which may be of independent interest. Our code is available at https://github.com/ZhaoxuanWu/EASE-Prompt-Optimization.

5/28/2024

Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization

Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Sercan O. Arik

Large language models have demonstrated remarkable capabilities, but their performance is heavily reliant on effective prompt engineering. Automatic prompt optimization (APO) methods are designed to automate this and can be broadly categorized into those targeting instructions (instruction optimization, IO) vs. those targeting exemplars (exemplar selection, ES). Despite their shared objective, these have evolved rather independently, with IO recently receiving more research attention. This paper seeks to bridge this gap by comprehensively comparing the performance of representative IO and ES techniques, both in isolation and in combination, on a diverse set of challenging tasks. Our findings reveal that intelligently reusing model-generated input-output pairs obtained from evaluating prompts on the validation set as exemplars consistently improves performance over IO methods but is currently under-investigated. We also find that despite the recent focus on IO, how we select exemplars can outweigh how we optimize instructions, with ES strategies as simple as random search outperforming state-of-the-art IO methods with seed instructions without any optimization. Moreover, we observe synergy between ES and IO, with optimal combinations surpassing individual contributions. We conclude that studying exemplar selection as a standalone method and its optimal combination with instruction optimization remains a crucial aspect of APO and deserves greater consideration in future research, even in the era of highly capable instruction-following models.

6/26/2024

One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

Ruochen Wang, Sohyun An, Minhao Cheng, Tianyi Zhou, Sung Ju Hwang, Cho-Jui Hsieh

Large Language Models (LLMs) exhibit strong generalization capabilities to novel tasks when prompted with language instructions and in-context demos. Since this ability sensitively depends on the quality of prompts, various methods have been explored to automate the instruction design. While these methods demonstrated promising results, they also restricted the searched prompt to one instruction. Such simplification significantly limits their capacity, as a single demo-free instruction might not be able to cover the entire complex problem space of the targeted task. To alleviate this issue, we adopt the Mixture-of-Expert paradigm and divide the problem space into a set of sub-regions; Each sub-region is governed by a specialized expert, equipped with both an instruction and a set of demos. A two-phase process is developed to construct the specialized expert for each region: (1) demo assignment: Inspired by the theoretical connection between in-context learning and kernel regression, we group demos into experts based on their semantic similarity; (2) instruction assignment: A region-based joint search of an instruction per expert complements the demos assigned to it, yielding a synergistic effect. The resulting method, codenamed Mixture-of-Prompts (MoP), achieves an average win rate of 81% against prior arts across several major benchmarks.

7/2/2024

NICE: To Optimize In-Context Examples or Not?

Pragya Srivastava, Satvik Golechha, Amit Deshpande, Amit Sharma

Recent work shows that in-context learning and optimization of in-context examples (ICE) can significantly improve the accuracy of large language models (LLMs) on a wide range of tasks, leading to an apparent consensus that ICE optimization is crucial for better performance. However, most of these studies assume a fixed or no instruction provided in the prompt. We challenge this consensus by investigating the necessity of optimizing ICE when task-specific instructions are provided and find that there are many tasks for which it yields diminishing returns. In particular, using a diverse set of tasks and a systematically created instruction set with gradually added details, we find that as the prompt instruction becomes more detailed, the returns on ICE optimization diminish. To characterize this behavior, we introduce a task-specific metric called Normalized Invariability to Choice of Examples (NICE) that quantifies the learnability of tasks from a given instruction, and provides a heuristic to help decide whether to optimize instructions or ICE for a new task. Given a task, the proposed metric can reliably predict the utility of optimizing ICE compared to using random ICE. Our code is available at https://github.com/microsoft/nice-icl.

6/7/2024