NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions

arXiv:2405.04215

Published 5/8/2024 by Elliot Gestrin, Marco Kuhlmann, Jendrik Seipp

Abstract

Today's classical planners are powerful, but modeling input tasks in formats such as PDDL is tedious and error-prone. In contrast, planning with Large Language Models (LLMs) allows for almost any input text, but offers no guarantees on plan quality or even soundness. In an attempt to merge the best of these two approaches, some work has begun to use LLMs to automate parts of the PDDL creation process. However, these methods still require various degrees of expert input. We present NL2Plan, the first domain-agnostic offline LLM-driven planning system. NL2Plan uses an LLM to incrementally extract the necessary information from a short text prompt before creating a complete PDDL description of both the domain and the problem, which is finally solved by a classical planner. We evaluate NL2Plan on four planning domains and find that it solves 10 out of 15 tasks - a clear improvement over a plain chain-of-thought reasoning LLM approach, which only solves 2 tasks. Moreover, in two out of the five failure cases, instead of returning an invalid plan, NL2Plan reports that it failed to solve the task. In addition to using NL2Plan in end-to-end mode, users can inspect and correct all of its intermediate results, such as the PDDL representation, increasing explainability and making it an assistive tool for PDDL creation.

Overview

This paper presents NL2Plan, a system that can generate detailed action plans from minimal natural language descriptions.
NL2Plan uses a large language model (LLM) to extract the information it needs from a short text prompt and to write a complete PDDL description of both the planning domain and the problem.
A classical planner then solves the PDDL task. In the paper's evaluation, NL2Plan solves 10 of 15 tasks across four planning domains, compared with 2 for a plain chain-of-thought LLM baseline.

Plain English Explanation

The paper describes a system called NL2Plan that can take a brief natural language description of a task or problem and automatically generate a detailed step-by-step plan to solve it. For example, if you gave NL2Plan the description "I need to bake a cake for my friend's birthday," it would understand the goal and then outline all the necessary actions, like gathering the ingredients, mixing the batter, setting the oven temperature, and so on.

NL2Plan uses advanced language models, which are AI systems trained on massive amounts of text data, to comprehend the underlying intent behind the natural language input. It then translates that understanding into a formal planning representation, called PDDL, that can be used to logically reason about the steps required to achieve the goal.

The key innovation of NL2Plan is its ability to generate robust and flexible plans from minimal descriptions. Rather than relying on highly structured instructions, the system can handle vague or open-ended natural language and still produce coherent and sensible plans. This makes it potentially useful for a wide range of applications, from helping with everyday tasks to tackling complex problem-solving challenges.

Technical Explanation

NL2Plan uses a large language model (LLM) to process a short natural language description of a planning task. Rather than translating the text in a single pass, the LLM extracts the necessary information incrementally, and the system then assembles a complete PDDL description of both the domain and the problem. Users can inspect and correct each intermediate result, including the generated PDDL.
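The abstract describes the LLM extracting information incrementally before the PDDL is written. The following is a minimal sketch of that idea, not the authors' implementation: the stage names in `STAGES`, the prompt wording, and the `fake_llm` stand-in are all assumptions made for illustration. Each stage sees the original description plus everything extracted so far.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stage names; the source only says extraction is incremental.
STAGES: List[str] = ["types", "predicates", "actions", "objects", "initial_state", "goal"]

LLM = Callable[[str], str]  # any function that maps a prompt to a completion


@dataclass
class Extraction:
    description: str
    results: Dict[str, str] = field(default_factory=dict)


def build_prompt(stage: str, state: Extraction) -> str:
    """Compose a prompt that shows the model everything extracted so far."""
    context = "\n".join(f"{name}: {text}" for name, text in state.results.items())
    return (
        f"Task description:\n{state.description}\n\n"
        f"Already extracted:\n{context or '(nothing yet)'}\n\n"
        f"Now list the {stage}."
    )


def run_pipeline(description: str, llm: LLM) -> Extraction:
    """Run the stages in order, feeding earlier results into later prompts."""
    state = Extraction(description)
    for stage in STAGES:
        state.results[stage] = llm(build_prompt(stage, state))
    return state


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    stage = prompt.rsplit("list the ", 1)[1].rstrip(".")
    return f"<{stage} placeholder>"


if __name__ == "__main__":
    out = run_pipeline("Stack block a on block b.", fake_llm)
    for name, text in out.results.items():
        print(name, "->", text)
```

Because every intermediate result is stored in `results`, a user could inspect or overwrite any stage before the next one runs, which mirrors the inspect-and-correct workflow mentioned in the abstract.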

The PDDL description has two parts: a domain, which defines the available actions, and a problem, which defines the initial state and the goal. NL2Plan then hands both to a classical planner, such as Fast Downward, which searches for a sequence of actions that transforms the initial state into a state satisfying the goal.
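To make the planning step concrete, here is a toy illustration of what a classical planner does with an initial state, a goal, and a set of actions. It is a plain breadth-first search over sets of facts, far simpler than a real planner, and the two-block task and its fact and action names are invented for this example.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # facts that must hold before the action
    add: FrozenSet[str]     # facts the action makes true
    delete: FrozenSet[str]  # facts the action makes false


def act(name: str, pre: Iterable[str], add: Iterable[str], delete: Iterable[str]) -> Action:
    return Action(name, frozenset(pre), frozenset(add), frozenset(delete))


def plan(init: FrozenSet[str], goal: FrozenSet[str], actions: List[Action]) -> Optional[List[str]]:
    """Breadth-first search over states; returns a shortest plan or None."""
    frontier = deque([(init, [])])
    seen = {init}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return path
        for a in actions:
            if a.pre <= state:  # action is applicable
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [a.name]))
    return None  # search space exhausted: report failure instead of guessing


# Hypothetical two-block task: blocks a and b start on the table.
init = frozenset({"on-table a", "on-table b", "clear a", "clear b", "hand-empty"})
actions = [
    act("pickup a", ["on-table a", "clear a", "hand-empty"], ["holding a"],
        ["on-table a", "clear a", "hand-empty"]),
    act("pickup b", ["on-table b", "clear b", "hand-empty"], ["holding b"],
        ["on-table b", "clear b", "hand-empty"]),
    act("stack a b", ["holding a", "clear b"], ["on a b", "clear a", "hand-empty"],
        ["holding a", "clear b"]),
    act("stack b a", ["holding b", "clear a"], ["on b a", "clear b", "hand-empty"],
        ["holding b", "clear a"]),
]

if __name__ == "__main__":
    print(plan(init, frozenset({"on a b"}), actions))  # ['pickup a', 'stack a b']
    print(plan(init, frozenset({"on a a"}), actions))  # None: goal is unreachable
```

Every returned plan is sound with respect to the given actions, and an unreachable goal yields `None` rather than a made-up answer. That is the kind of guarantee the abstract contrasts with plain LLM planning.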

The authors evaluate NL2Plan on four planning domains, where it solves 10 out of 15 tasks, compared with 2 for a plain chain-of-thought LLM baseline. In two of the five failure cases, NL2Plan reports that it failed to solve the task instead of returning an invalid plan.
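The figures reported in the abstract can be turned into rates with a few lines of arithmetic. The counts below come from the abstract. The "silent" failure count is derived by subtraction, and the abstract does not break those cases down further.

```python
TOTAL_TASKS = 15
solved = {"NL2Plan": 10, "chain-of-thought LLM": 2}

for system, count in solved.items():
    print(f"{system}: {count}/{TOTAL_TASKS} = {count / TOTAL_TASKS:.1%}")

failures = TOTAL_TASKS - solved["NL2Plan"]  # tasks NL2Plan did not solve
reported = 2                                # failures NL2Plan flagged itself
silent = failures - reported                # remaining failures (derived)

print(f"NL2Plan failures: {failures} ({reported} reported, {silent} not reported)")
```

This gives a success rate of 66.7% for NL2Plan against 13.3% for the baseline, with 3 of NL2Plan's 5 failures not flagged by the system.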

Critical Analysis

The paper provides a compelling demonstration of the potential for LLMs to drive robust planning capabilities from minimal natural language inputs. The authors have addressed an important challenge in AI, namely the ability to translate natural language into formal representations that can be used for reasoning and problem-solving.

However, the paper also acknowledges several limitations and areas for further research. For example, the system's performance may degrade on very complex or ambiguous inputs, and the planning algorithms used may not be able to handle certain types of planning problems. Additionally, the system's reliance on PDDL may limit its applicability in domains where other planning representations are more appropriate.

Further research is needed to better understand the strengths and weaknesses of LLM-driven planning systems like NL2Plan, and to explore ways to make them more robust, flexible, and scalable. There are also open questions around the transparency and interpretability of these systems, and how to ensure they behave in alignment with human values and intentions.

Overall, the NL2Plan system represents an exciting step forward in the field of natural language-driven planning, and the research presented in this paper is a valuable contribution to the ongoing efforts to bridge the gap between human language and machine reasoning.

Conclusion

The NL2Plan system demonstrates the potential for large language models to enable robust, flexible planning capabilities from minimal natural language inputs. By translating natural language descriptions into formal planning representations, the system can generate detailed action plans to solve a wide range of tasks and problems.

This research represents an important step forward in the field of natural language-driven planning, highlighting the ability of advanced language models to comprehend the underlying intent behind natural language and translate it into formal reasoning frameworks. As LLMs continue to improve and become more widely adopted, systems like NL2Plan may find increasing applications in areas like task automation, decision support, and even general problem-solving.

However, the paper also acknowledges the limitations of the current approach and the need for further research to address challenges around system robustness, transparency, and alignment with human values. As the field of LLM-driven planning evolves, it will be crucial to carefully consider these issues and work towards developing systems that are not only capable, but also trustworthy and beneficial to society.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Large Language Models as Planning Domain Generators

James Oswald, Kavitha Srinivas, Harsha Kokel, Junkyu Lee, Michael Katz, Shirin Sohrabi

Developing domain models is one of the few remaining places that require manual human labor in AI planning. Thus, in order to make planning more accessible, it is desirable to automate the process of domain model generation. To this end, we investigate if large language models (LLMs) can be used to generate planning domain models from simple textual descriptions. Specifically, we introduce a framework for automated evaluation of LLM-generated domains by comparing the sets of plans for domain instances. Finally, we perform an empirical analysis of 7 large language models, including coding and chat models across 9 different planning domains, and under three classes of natural language domain descriptions. Our results indicate that LLMs, particularly those with high parameter counts, exhibit a moderate level of proficiency in generating correct planning domains from natural language descriptions. Our code is available at https://github.com/IBM/NL2PDDL.

5/14/2024

cs.CL cs.AI

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the problem specification from one syntactic format to another, and ship the problem off to external symbolic solvers. In this position paper, we take the view that both these extremes are misguided. We argue that auto-regressive LLMs cannot, by themselves, do planning or self-verification (which is after all a form of reasoning), and shed some light on the reasons for misunderstandings in the literature. We will also argue that LLMs should be viewed as universal approximate knowledge sources that have much more meaningful roles to play in planning/reasoning tasks beyond simple front-end/back-end format translators. We present a vision of LLM-Modulo Frameworks that combine the strengths of LLMs with external model-based verifiers in a tighter bi-directional interaction regime. We will show how the models driving the external verifiers themselves can be acquired with the help of LLMs. We will also argue that rather than simply pipelining LLMs and symbolic components, this LLM-Modulo Framework provides a better neuro-symbolic approach that offers tighter integration between LLMs and symbolic components, and allows extending the scope of model-based planning/reasoning regimes towards more flexible knowledge, problem and preference specifications.

6/13/2024

cs.AI cs.LG

PDDLEGO: Iterative Planning in Textual Environments

Li Zhang, Peter Jansen, Tianyi Zhang, Peter Clark, Chris Callison-Burch, Niket Tandon

Planning in textual environments has been shown to be a long-standing challenge even for current models. A recent, promising line of work uses LLMs to generate a formal representation of the environment that can be solved by a symbolic planner. However, existing methods rely on a fully-observed environment where all entity states are initially known, so a one-off representation can be constructed, leading to a complete plan. In contrast, we tackle partially-observed environments where there is initially no sufficient information to plan for the end-goal. We propose PDDLEGO that iteratively constructs a planning representation that can lead to a partial plan for a given sub-goal. By accomplishing the sub-goal, more information is acquired to augment the representation, eventually achieving the end-goal. We show that plans produced by few-shot PDDLEGO are 43% more efficient than generating plans end-to-end on the Coin Collector simulation, with strong performance (98%) on the more complex Cooking World simulation where end-to-end LLMs fail to generate coherent plans (4%).

5/31/2024

cs.CL

Language Models can Infer Action Semantics for Classical Planners from Environment Feedback

Wang Zhu, Ishika Singh, Robin Jia, Jesse Thomason

Classical planning approaches guarantee finding a set of actions that can achieve a given goal state when possible, but require an expert to specify logical action semantics that govern the dynamics of the environment. Researchers have shown that Large Language Models (LLMs) can be used to directly infer planning steps based on commonsense knowledge and minimal domain information alone, but such plans often fail on execution. We bring together the strengths of classical planning and LLM commonsense inference to perform domain induction, learning and validating action pre- and post-conditions based on closed-loop interactions with the environment itself. We propose PSALM, which leverages LLM inference to heuristically complete partial plans emitted by a classical planner given partial domain knowledge, as well as to infer the semantic rules of the domain in a logical language based on environment feedback after execution. Our analysis on 7 environments shows that with just one expert-curated example plan, using LLMs as heuristic planners and rule predictors achieves lower environment execution steps and environment resets than random exploration while simultaneously recovering the underlying ground truth action semantics of the domain.

6/6/2024

cs.AI cs.CL cs.RO