Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents

2402.00798

Published 6/19/2024 by Zelong Li, Wenyue Hua, Hao Wang, He Zhu, Yongfeng Zhang

Abstract

Recent advancements on Large Language Models (LLMs) enable AI Agents to automatically generate and execute multi-step plans to solve complex tasks. However, since LLM's content generation process is hardly controllable, current LLM-based agents frequently generate invalid or non-executable plans, which jeopardizes the performance of the generated plans and corrupts users' trust in LLM-based agents. In response, this paper proposes a novel "Formal-LLM" framework for LLM-based agents by integrating the expressiveness of natural language and the precision of formal language. Specifically, the framework allows human users to express their requirements or constraints for the planning process as an automaton. A stack-based LLM plan generation process is then conducted under the supervision of the automaton to ensure that the generated plan satisfies the constraints, making the planning process controllable. We conduct experiments on both benchmark tasks and practical real-life tasks, and our framework achieves over 50% overall performance increase, which validates the feasibility and effectiveness of employing Formal-LLM to guide the plan generation of agents, preventing the agents from generating invalid and unsuccessful plans. Further, more controllable LLM-based agents can facilitate the broader utilization of LLM in application scenarios where high validity of planning is essential. The work is open-sourced at https://github.com/agiresearch/Formal-LLM.

Overview

Recent advancements in Large Language Models (LLMs) have enabled AI agents to automatically generate and execute multi-step plans to solve complex tasks.
However, the content generation process of LLMs is often uncontrollable, leading to the generation of invalid or non-executable plans, which can jeopardize the performance of the generated plans and undermine users' trust in LLM-based agents.
To address this issue, the paper proposes a novel "Formal-LLM" framework for LLM-based agents by integrating the expressiveness of natural language and the precision of formal language.

Plain English Explanation

The paper presents a way to make LLM-based agents better at planning. Typically, these agents can generate plans to solve complex tasks, but the plans they create are often not valid or executable. This can cause problems and make people lose trust in the agents.

The new "Formal-LLM" framework allows human users to express their requirements or constraints for the planning process as an automaton, a formal model that spells out which sequences of steps are allowed. The agent then generates its plan under the supervision of this automaton, which ensures that the plan satisfies the constraints. This makes the planning process more controllable and reduces the chances of the agent producing invalid or unsuccessful plans.

The researchers tested this framework on both standard benchmark tasks and real-world practical tasks. They found that it increased the overall performance of the agent's planning by over 50%. This shows that using the Formal-LLM approach can help create more controllable LLM-based agents that are better at planning and executing complex tasks. This could lead to broader use of LLMs in applications where reliable planning is crucial.

Technical Explanation

The paper proposes the "Formal-LLM" framework to address the issue of LLM-based agents generating invalid or non-executable plans. The key components of this framework are:

Automaton-based Requirement Expression: Human users express their requirements or constraints for the planning process as an automaton, a formal model whose states and transitions define which sequences of plan steps are allowed.
Stack-based LLM Plan Generation: The agent generates the plan under the supervision of the automaton, using a stack-based process. Supervision ensures that the generated plan satisfies the constraints specified by the automaton, making the planning process more controllable.
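
The two components above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the constraint is written as a small grammar-like rule table, that a stack holds the symbols still to be expanded, and that a `choose` callback stands in for the LLM. The rule table, the tool names, and every function name here are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical constraint (an assumption for illustration): a plan must load
# an image, apply one or more operations, then save. Keys are nonterminals;
# any symbol that is not a key is a concrete plan step.
RULES: Dict[str, List[List[str]]] = {
    "PLAN": [["load_image", "OPS", "save_image"]],
    "OPS": [["OP"], ["OP", "OPS"]],
    "OP": [["denoise"], ["colorize"], ["upscale"]],
}

# The chooser stands in for the LLM: given the symbol being expanded, the
# legal expansions, and the plan so far, it returns the index of one option.
Chooser = Callable[[str, Sequence[List[str]], List[str]], int]


def generate_plan(start: str, choose: Chooser, max_steps: int = 50) -> List[str]:
    """Expand symbols from a stack, letting the chooser pick only legal options."""
    stack: List[str] = [start]
    plan: List[str] = []
    steps = 0
    while stack:
        if steps >= max_steps:
            raise RuntimeError("plan did not terminate within max_steps")
        steps += 1
        symbol = stack.pop()
        if symbol not in RULES:
            plan.append(symbol)  # concrete step: emit it into the plan
            continue
        options = RULES[symbol]
        # Only ask the chooser when there is a real decision to make.
        index = 0 if len(options) == 1 else choose(symbol, options, plan)
        if not 0 <= index < len(options):
            raise ValueError(f"illegal choice {index} for {symbol}")
        # Push in reverse so the leftmost symbol is expanded first.
        stack.extend(reversed(options[index]))
    return plan


def scripted_chooser(script: Sequence[int]) -> Chooser:
    """Deterministic stand-in for an LLM that replays a fixed list of choices."""
    choices = iter(script)

    def choose(symbol: str, options: Sequence[List[str]], plan: List[str]) -> int:
        return next(choices)

    return choose


if __name__ == "__main__":
    print(generate_plan("PLAN", scripted_chooser([1, 0, 0, 2])))
```

Because the chooser can only return an index into the legal options, any plan this loop returns follows the rule table by construction. In a real agent the chooser would prompt a model with the options and parse its answer, and an out-of-range answer would be rejected as it is here.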

The researchers conducted experiments on both standard benchmark tasks and practical real-life tasks. They found that the Formal-LLM framework achieved an overall performance increase of over 50% compared to traditional LLM-based planning approaches. This supports the feasibility and effectiveness of using the automaton to guide the agents' plan generation, preventing them from producing invalid and unsuccessful plans.

Critical Analysis

The paper provides a promising solution to the problem of LLM-based agents generating invalid or non-executable plans. By integrating the expressiveness of natural language with the precision of formal language, the Formal-LLM framework enables more controllable LLM-based agents that can satisfy user requirements and constraints.

However, the paper does not fully address the computational complexity and scalability of the automaton-based requirement expression. As the complexity of the task and the number of constraints increase, the size and complexity of the automaton may grow, potentially impacting the efficiency of the planning process.
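
To make the scalability concern concrete, here is a small sketch under a loud assumption: that independent constraints are combined into one automaton by tracking all of them at once (a product-style composition). The paper summary does not say constraints are combined this way, and the "has tool i been used yet" constraint and the function name are hypothetical.

```python
from typing import List, Set, Tuple


def count_combined_states(num_tools: int) -> int:
    """Count reachable states when one flag per tool is tracked jointly.

    Assumed constraint i: remember whether tool i has been used yet, which
    needs only 2 states on its own. The combined state is a tuple of flags.
    """
    start: Tuple[bool, ...] = (False,) * num_tools
    seen: Set[Tuple[bool, ...]] = {start}
    frontier: List[Tuple[bool, ...]] = [start]
    while frontier:
        state = frontier.pop()
        for tool in range(num_tools):
            # Using a tool sets its flag and leaves the others unchanged.
            nxt = state[:tool] + (True,) + state[tool + 1:]
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, count_combined_states(k))
```

Under this assumption, each added two-state constraint doubles the number of combined states, so the count grows exponentially in the number of constraints even though each constraint is trivial by itself.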

Additionally, the paper focuses on the planning aspect of LLM-based agents, but does not delve into the broader implications of using such agents in real-world applications. Further research is needed to understand the robustness and reliability of LLM-based agents in diverse scenarios, as well as the potential ethical and societal consequences of their widespread deployment.

Conclusion

The Formal-LLM framework presented in this paper offers a promising approach to improving the reliability and trustworthiness of LLM-based agents by enabling more controllable plan generation. This advancement could pave the way for the broader adoption of LLMs in application domains where high-validity planning is essential, such as critical infrastructure management, medical decision-making, and disaster response. As the field of AI continues to evolve, frameworks like Formal-LLM may play a crucial role in ensuring that LLM-based agents can be safely and effectively deployed to tackle complex real-world challenges.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication

Weize Chen, Chenfei Yuan, Jiarui Yuan, Yusheng Su, Chen Qian, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun

Natural language (NL) has long been the predominant format for human cognition and communication, and by extension, has been similarly pivotal in the development and application of Large Language Models (LLMs). Yet, besides NL, LLMs have seen various non-NL formats during pre-training, such as code and logical expression. NL's status as the optimal format for LLMs, particularly in single-LLM reasoning and multi-agent communication, has not been thoroughly examined. In this work, we challenge the default use of NL by exploring the utility of non-NL formats in these contexts. We show that allowing LLMs to autonomously select the most suitable format before reasoning or communicating leads to a 3.3 to 5.7% improvement in reasoning efficiency for different LLMs, and up to a 72.7% reduction in token usage in multi-agent communication, all while maintaining communicative effectiveness. Our comprehensive analysis further reveals that LLMs can devise a format from limited task instructions and that the devised format is effectively transferable across different LLMs. Intriguingly, the structured communication format decided by LLMs exhibits notable parallels with established agent communication languages, suggesting a natural evolution towards efficient, structured communication in agent communication. Our code is released at https://github.com/thunlp/AutoForm.

6/21/2024

cs.CL cs.AI

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the problem specification from one syntactic format to another, and ship the problem off to external symbolic solvers. In this position paper, we take the view that both these extremes are misguided. We argue that auto-regressive LLMs cannot, by themselves, do planning or self-verification (which is after all a form of reasoning), and shed some light on the reasons for misunderstandings in the literature. We will also argue that LLMs should be viewed as universal approximate knowledge sources that have much more meaningful roles to play in planning/reasoning tasks beyond simple front-end/back-end format translators. We present a vision of LLM-Modulo Frameworks that combine the strengths of LLMs with external model-based verifiers in a tighter bi-directional interaction regime. We will show how the models driving the external verifiers themselves can be acquired with the help of LLMs. We will also argue that rather than simply pipelining LLMs and symbolic components, this LLM-Modulo Framework provides a better neuro-symbolic approach that offers tighter integration between LLMs and symbolic components, and allows extending the scope of model-based planning/reasoning regimes towards more flexible knowledge, problem and preference specifications.

6/13/2024

cs.AI cs.LG

From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems

Jianliang He, Siyu Chen, Fengzhuo Zhang, Zhuoran Yang

In this work, from a theoretical lens, we aim to understand why large language model (LLM) empowered agents are able to solve decision-making problems in the physical world. To this end, consider a hierarchical reinforcement learning (RL) model where the LLM Planner and the Actor perform high-level task planning and low-level execution, respectively. Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting. Under proper assumptions on the pretraining data, we prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning. Additionally, we highlight the necessity for exploration beyond the subgoals derived from BAIL by proving that naively executing the subgoals returned by LLM leads to a linear regret. As a remedy, we introduce an $\epsilon$-greedy exploration strategy to BAIL, which is proven to incur sublinear regret when the pretraining error is small. Finally, we extend our theoretical framework to include scenarios where the LLM Planner serves as a world model for inferring the transition model of the environment and to multi-agent settings, enabling coordination among multiple Actors.

5/31/2024

cs.LG cs.AI cs.CL

Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs

Yusuke Mikami, Andrew Melnik, Jun Miura, Ville Hautamaki

We demonstrate experimental results with LLMs that address robotics task planning problems. Recently, LLMs have been applied in robotics task planning, particularly using a code generation approach that converts complex high-level instructions into mid-level policy codes. In contrast, our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning, and outputs coordinate level control commands, thus reducing the necessity for intermediate representation code as policies with pre-defined APIs. Our approach is evaluated on a multi-modal prompt simulation benchmark, demonstrating that our prompt engineering experiments with natural language reasoning significantly enhance success rates compared to its absence. Furthermore, our approach illustrates the potential for natural language descriptions to transfer robotics skills from known tasks to previously unseen tasks. The project website: https://natural-language-as-policies.github.io/

4/9/2024

cs.RO cs.AI cs.CL