Gauri K.

LLMs as Prompt Optimizers

A Review of Optimization by PROmpting (OPRO).

Tags: tech, llms, prompt engineering


Why is Prompt Optimization important?

LLMs are known to be sensitive to prompt format (Zhao et al., 2021; Lu et al., 2021; Wei et al., 2023; Madaan & Yazdanbakhsh, 2022); in particular, semantically similar prompts can have drastically different performance (Kojima et al., 2022; Zhou et al., 2022b; Zhang et al., 2023), and the optimal prompt format can be model-specific and task-specific (Ma et al., 2023; Chen et al., 2023c). Prompt engineering is therefore often essential for LLMs to achieve good performance (Reynolds & McDonell, 2021). However, the large, discrete prompt space makes optimization challenging, especially when only API access to the LLM is available. Prior work on continuous and discrete prompt optimization (Lester et al., 2021; Li & Liang, 2021; Zhou et al., 2022b; Pryzant et al., 2023) assumes that a training set is available to compute training accuracy as the optimization objective. Studies show that optimizing a prompt for accuracy on a small training set is often sufficient to achieve high performance on the test set.

What is OPRO?

OPRO (Optimization by PROmpting) introduces a revolutionary framework that leverages Large Language Models (LLMs) as optimizers, enabling users to frame optimization problems in natural language rather than in complex code or formulas. This approach has the potential to democratize optimization, making it more adaptable and accessible.

OPRO reframes the optimization process by allowing users to describe their problem, constraints, and desired solution characteristics through a structured meta-prompt. This meta-prompt includes not only the problem setup but also a record of previous solutions and their corresponding scores, forming an optimization trajectory that the LLM uses to iteratively refine new solutions.

OPRO’s significance lies in three key advantages:

  1. Accessibility:
    By using natural language descriptions, OPRO removes the barrier of requiring advanced mathematical or programming skills, making it feasible for non-experts to tackle complex optimization tasks.

  2. Adaptability:
    Switching tasks in OPRO is as simple as updating the problem description in the meta-prompt, making OPRO highly adaptable to a wide range of challenges.

  3. Novel Approach:
    Unlike traditional gradient-based methods, OPRO works without derivatives, opening new doors for solving complex problems, especially those involving discrete search spaces or non-differentiable functions.

Case Studies in Optimization with LLMs

To demonstrate the potential of LLMs for optimization, OPRO can be applied to classic problems such as linear regression and the traveling salesman problem (TSP), two foundational challenges in mathematical optimization and operations research.

On small-scale optimization problems, these case studies reveal that LLMs are capable of finding high-quality solutions via prompting and, in some cases, matching or surpassing hand-designed heuristic algorithms.
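
For instance, a linear regression instance could be framed as a meta-prompt along the following lines. This is a minimal sketch loosely modeled on the paper's setup; the (w, b) values, losses, and wording are invented for illustration.

```python
# A hypothetical meta-prompt for a toy linear regression instance.
# The (w, b) pairs, losses, and wording below are invented for illustration.
solutions = [  # (w, b, squared_error) triples observed so far, best last
    (1.0, 1.0, 404.0),
    (2.5, 3.0, 178.2),
    (3.1, 4.2, 96.7),
]

trajectory = "\n".join(
    f"w={w:.1f}, b={b:.1f}, loss={loss:.1f}" for w, b, loss in solutions
)

meta_prompt = f"""You are minimizing the squared error of a linear model y = w * x + b
on a hidden dataset. Below are previously evaluated (w, b) pairs and their
losses, where lower is better:

{trajectory}

Give a new (w, b) pair that is different from all pairs above and has a loss
lower than any of the above. Answer in the format: w=<value>, b=<value>."""
```

The optimizer LLM's reply is parsed, evaluated against the data, and appended to the trajectory for the next round.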

Key Components of OPRO

Description of OPRO

1. The Goal

The primary objective of OPRO in prompt optimization is to find the prompt that maximizes the performance of a scorer LLM on a given task, such as text classification or summarization. Optimization is guided by accuracy on a training set, while a held-out test set provides an objective measure of how well the selected prompt generalizes.
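
Stated a bit more formally, the objective can be sketched as below; the notation (D_train, f, LLM_scorer) is generic rather than the paper's exact formulation.

```latex
% Rough formalization; the symbols are generic notation, not the paper's.
p^{*} = \arg\max_{p} \; \frac{1}{|D_{\text{train}}|}
        \sum_{(x,\,y) \in D_{\text{train}}}
        f\big(\mathrm{LLM}_{\text{scorer}}(x; p),\, y\big)
```

Here f is the per-example task metric (for instance, exact-match accuracy) and LLM_scorer(x; p) denotes the scorer LLM's answer to input x when prompt p is applied.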

2. The Setup

OPRO involves two LLMs playing different roles: an optimizer LLM, which generates new candidate prompts from the meta-prompt, and a scorer LLM, which applies each candidate prompt to the task and whose accuracy serves as that prompt's score. The two roles can be filled by the same model or by different models.

3. The Meta-Prompt

At the heart of OPRO lies the meta-prompt, which is structured to guide the optimizer LLM in generating effective prompts. The meta-prompt consists of:

  1. Optimization trajectory: previously generated prompts paired with their training accuracies, typically sorted so that the best-scoring prompts appear last.
  2. Task description and exemplars: a brief description of the task together with a few example inputs and outputs, showing where the generated prompt will be inserted.
  3. Meta-instructions: instructions that tell the optimizer LLM how to use the trajectory and what kind of new, higher-scoring prompt to produce.
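
As a concrete illustration, such a meta-prompt might be assembled along the following lines. The instructions, scores, and exemplar in this sketch are invented for the example and are not the paper's exact meta-prompt (although "Take a deep breath and work on this problem step-by-step" is one of the instructions the paper reports discovering).

```python
# Assemble a hypothetical meta-prompt for prompt optimization. The example
# instructions, scores, and exemplar below are invented for illustration.
trajectory = [  # (instruction, training accuracy), sorted worst to best
    ("Let's solve the problem.", 61.2),
    ("Let's think step by step.", 72.5),
    ("Take a deep breath and work on this problem step-by-step.", 78.9),
]

exemplars = [
    ("Q: Tom has 3 apples and buys 2 more. How many apples does he have now?",
     "A: 5"),
]

trajectory_text = "\n\n".join(
    f"text:\n{inst}\nscore:\n{acc:.1f}" for inst, acc in trajectory
)
exemplar_text = "\n".join(f"{q}\n{a}" for q, a in exemplars)

meta_prompt = f"""Your task is to generate an instruction (<INS>) for a question-answering model.
Below are some previous instructions with their scores. Higher scores are better.

{trajectory_text}

The instruction will be placed where <INS> appears in examples like this one:

<INS>
{exemplar_text}

Write a new instruction that is different from the old ones and has a score
as high as possible. Write the instruction in square brackets."""
```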

Optimization Process

Each optimization step in OPRO generates new prompts with the aim of improving task accuracy, based on a trajectory of previously generated prompts. This approach contrasts with recent methods that iteratively edit a single input prompt (Pryzant et al., 2023) or constrain new prompts to adhere to a consistent semantic meaning (Zhou et al., 2022b). By leveraging the entire optimization trajectory, OPRO enables the LLM to gradually create prompts that enhance task accuracy.

The OPRO process follows a straightforward yet effective iterative workflow:

  1. Initialization: The process starts with an initial prompt or an empty string.
  2. Iterative Refinement: The optimizer LLM is tasked with generating new prompts based on the meta-prompt.
  3. Evaluation: Each prompt is paired with training examples and tested on the scorer LLM, and the prompt’s score is recorded.
  4. Update Meta-Prompt: Newly generated prompts and their scores are added to the optimization trajectory, while lower-scoring prompts may be discarded to maintain the context window within the LLM’s limits.
  5. Repeat: The process iterates until a predefined number of rounds is reached or a satisfactory score is achieved.
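
Putting these steps together, the loop can be sketched in a few lines of Python. Everything here is schematic: optimizer_llm, scorer_llm, and build_meta_prompt are hypothetical callables standing in for real API calls and meta-prompt construction, and scoring uses a simple exact-match check.

```python
def opro_optimize(optimizer_llm, scorer_llm, build_meta_prompt, train_set,
                  num_steps=50, keep_top_k=20, candidates_per_step=8):
    """Schematic OPRO loop.

    optimizer_llm, scorer_llm: hypothetical callables mapping a prompt string
        to a model response string (stand-ins for real API calls).
    build_meta_prompt: hypothetical callable turning the trajectory into a
        meta-prompt string.
    train_set: list of (question, answer) pairs used to score prompts.
    """
    trajectory = []  # (prompt, training_accuracy) pairs

    for _ in range(num_steps):
        # Build the meta-prompt from the current trajectory and ask the
        # optimizer LLM for several new candidate prompts.
        meta_prompt = build_meta_prompt(trajectory)
        candidates = [optimizer_llm(meta_prompt)
                      for _ in range(candidates_per_step)]

        # Score each candidate: prepend it to every training question and
        # check the scorer LLM's answer with a simple exact match.
        for prompt in candidates:
            correct = sum(
                scorer_llm(f"{prompt}\n{question}").strip() == answer
                for question, answer in train_set
            )
            trajectory.append((prompt, correct / len(train_set)))

        # Keep only the highest-scoring prompts so the meta-prompt stays
        # within the optimizer LLM's context window.
        trajectory = sorted(trajectory, key=lambda pair: pair[1])[-keep_top_k:]

    # Return the prompt with the highest training accuracy, plus its score.
    return max(trajectory, key=lambda pair: pair[1])
```

Keeping only the top-scoring prompts in the trajectory mirrors how OPRO trims the meta-prompt to fit within the context window.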

Output

The optimal prompt – the one with the highest score on the training set – is selected as the final output of the OPRO process. This prompt is then evaluated on the test set to assess its generalization performance.
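
Continuing the sketch above, the final selection and generalization check might look like this; the model wrappers and datasets are placeholders for whatever a real implementation provides.

```python
# my_optimizer_model, my_scorer_model, my_meta_prompt_fn, train_examples, and
# test_examples are placeholders, not part of any real library.
best_prompt, train_acc = opro_optimize(
    optimizer_llm=my_optimizer_model,
    scorer_llm=my_scorer_model,
    build_meta_prompt=my_meta_prompt_fn,
    train_set=train_examples,
)

# Generalization check: score the selected prompt once on held-out data.
test_acc = sum(
    my_scorer_model(f"{best_prompt}\n{q}").strip() == a
    for q, a in test_examples
) / len(test_examples)

print(f"train accuracy: {train_acc:.1%}, test accuracy: {test_acc:.1%}")
```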

Limitations of OPRO

While OPRO offers significant advantages, it also faces certain limitations. These challenges reflect both the inherent constraints of LLMs and the specific demands of optimization tasks:

  1. Scale:
    The meta-prompt’s length is constrained by the LLM’s context window, limiting OPRO’s applicability to simpler problem instances. Large-scale problems or those requiring detailed descriptions may exceed the LLM’s capacity.

  2. Complex Landscapes:
    Optimization problems with irregular, multi-peaked loss landscapes can pose difficulties for OPRO. The LLM may become trapped in local optima, hindering consistent improvement.

  3. Overfitting:
    During prompt optimization, OPRO may overfit to the training set used for prompt evaluation. While optimized prompts often generalize well, overfitting can still reduce performance on unseen data.

  4. Hallucination and Repetition:
    The optimizer LLM may generate nonsensical or repetitive prompts, even when instructed otherwise. This reflects ongoing challenges with LLM reliability, necessitating careful monitoring of generated outputs.

Conclusion

OPRO represents a paradigm shift in optimization, allowing users to harness the power of LLMs by describing their problems in natural language. By lowering technical barriers, OPRO holds transformative potential for applications ranging from prompt optimization to solving complex discrete optimization problems.

Despite its limitations, OPRO’s novel approach offers an exciting glimpse into the future of optimization, where natural language and artificial intelligence converge to unlock new possibilities. As LLM capabilities evolve and larger context windows become available, OPRO may pave the way for increasingly sophisticated optimization solutions accessible to all.

Citations

This article is based on concepts discussed in "Large Language Models as Optimizers" by Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, and Xinyun Chen.