LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs

Read original: arXiv:2405.12433 - Published 5/22/2024 by Sudhir Agarwal, Anu Sreepathy, David H. Alonso, Prarit Lamba


Overview

Recent advancements in Large Language Models (LLMs) have enabled the development of natural language interfaces for various end-user tasks.
These end-user tasks often require orchestrating multiple APIs, but user queries can be incomplete, lacking all the necessary information.
While LLMs excel at natural language processing, they frequently hallucinate missing information or struggle to orchestrate the APIs.
The proposed approach leverages logical reasoning and classical AI planning along with an LLM to accurately answer user queries, including identifying and gathering any missing information.

Plain English Explanation

The recent availability of Large Language Models (LLMs) has led to the creation of numerous natural language interfaces for helping users with various tasks. These tasks often involve using a set of pre-existing software tools, or APIs, in a specific order.

However, users don't always provide all the necessary information in their initial requests. While LLMs are great at understanding natural language, they can sometimes make up information or have trouble coordinating the right sequence of API calls to fully address the user's needs.

The researchers in this paper have come up with a new approach that combines the power of LLMs with classical AI planning techniques. Their method first translates the user's request into a formal representation that can be understood by an AI planning system. This system then figures out the correct set of API calls, including any additional information that needs to be gathered, to properly answer the user's query.

The key idea is to use both the natural language processing capabilities of LLMs and the logical reasoning abilities of classical AI planning to robustly handle incomplete user requests. In the authors' evaluation, this hybrid approach significantly outperformed an LLM-only baseline, reaching a success rate above 95% in most cases on both simple and more complex multi-step queries.

Technical Explanation

The paper proposes a novel approach that leverages logical reasoning and classical AI planning along with an LLM to accurately answer user queries, even when they are missing information.

The system uses an LLM together with an Answer Set Programming (ASP) solver to translate the user's natural language request into Planning Domain Definition Language (PDDL), the input format of classical AI planners. The translation passes through an intermediate representation in ASP.

The key innovation is the introduction of a special get_info_api that the planner can use to identify and gather any missing information required to fully address the user's query. The researchers also model all the available APIs as PDDL actions, capturing the dataflow between them.

The planner then generates an orchestration of API calls, including calls to get_info_api, to answer the user's request. In the authors' evaluation, this hybrid approach significantly outperforms a pure LLM-based approach, achieving a success rate above 95% in most cases on a dataset of complete and incomplete queries, both single-goal and multi-goal, where the multi-goal queries may or may not require dataflow among the APIs.
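To make the idea of APIs as planning actions more concrete, here is a minimal Python sketch. It is not the authors' ASP or PDDL encoding: it swaps the classical planner for a plain breadth-first search over sets of known facts, and the API names (find_flight, book_flight) and fact names are hypothetical, chosen only for illustration. Each API is an action whose inputs must be known before the call and whose outputs become known afterwards, which is one simple way to capture dataflow. One get_info_api action is added per fact the user could supply.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Api:
    name: str
    inputs: frozenset   # facts that must be known before the call
    outputs: frozenset  # facts that become known after the call


def plan(apis, known, goal, askable):
    """Breadth-first search over sets of known facts.

    Returns a shortest list of call names that makes every goal fact
    known, or None if no such sequence exists. This stands in for a
    classical planner; it is an illustrative substitute, not PDDL.
    """
    # Hypothetical modeling choice: one get_info_api action per askable fact.
    ask_actions = [
        Api(f"get_info_api({fact})", frozenset(), frozenset({fact}))
        for fact in sorted(askable)
    ]
    actions = list(apis) + ask_actions

    start = frozenset(known)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, calls = queue.popleft()
        if goal <= state:
            return calls
        for action in actions:
            # Applicable if all inputs are known and the call adds something new.
            if action.inputs <= state and not action.outputs <= state:
                nxt = state | action.outputs
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, calls + [action.name]))
    return None


# Hypothetical APIs; the output of find_flight feeds book_flight (dataflow).
APIS = [
    Api("find_flight", frozenset({"origin", "destination", "date"}),
        frozenset({"flight_id"})),
    Api("book_flight", frozenset({"flight_id", "passenger_name"}),
        frozenset({"booking"})),
]
ASKABLE = {"origin", "destination", "date", "passenger_name"}
GOAL = frozenset({"booking"})

# Incomplete query: the user gave only a destination and a date.
incomplete_plan = plan(APIS, {"destination", "date"}, GOAL, ASKABLE)
# Complete query: every required fact was in the request.
complete_plan = plan(APIS, ASKABLE, GOAL, ASKABLE)

print(incomplete_plan)
print(complete_plan)
```

For the incomplete query, the resulting plan interleaves two get_info_api calls with the two API calls; for the complete query it contains only find_flight followed by book_flight. Because missing facts are obtained through explicit actions, the search asks for them only when some API on the path to the goal actually needs them, rather than guessing values.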

Critical Analysis

The paper presents a promising approach that combines an LLM with classical AI planning to robustly handle incomplete user queries. However, there are a few potential caveats and areas for further research:

The evaluation was limited to a specific dataset, and it's unclear how well the approach would generalize to a wider range of real-world user requests.
The paper doesn't discuss the performance or computational overhead of the planning system, which could be a practical concern for real-time applications.
The researchers mention that the get_info_api mechanism relies on the availability of relevant information sources, but they don't explore how this could be scaled or automated in a practical system.
While the hybrid approach outperforms a pure LLM-based solution, it's still not clear how it would compare to other potential methods, such as training the LLM to reason about missing information or using the LLM to generate planning domains.

Overall, the proposed approach is a valuable contribution to the field of natural language interfaces and AI planning, but further research and real-world testing would be needed to fully evaluate its practical benefits and limitations.

Conclusion

This paper presents a novel approach that combines the strengths of Large Language Models (LLMs) and classical AI planning to enable robust natural language interfaces for end-user tasks. By leveraging logical reasoning and a specialized get_info_api, the system can accurately answer user queries, even when they are missing information.

The key innovation is the hybrid architecture that translates natural language requests into a formal planning representation, allowing a classical planner to orchestrate the necessary API calls, including those to gather any missing information. The evaluation shows this approach significantly outperforms a pure LLM-based solution, suggesting it could be a valuable tool for building more capable and reliable natural language interfaces.

While the paper presents promising results, further research is needed to explore the scalability, performance, and generalization of this approach in real-world applications. Nonetheless, the proposed system represents an important step forward in the integration of modern language models and classical AI planning techniques to enhance the capabilities of natural language interfaces.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs

Sudhir Agarwal, Anu Sreepathy, David H. Alonso, Prarit Lamba

Recent availability of Large Language Models (LLMs) has led to the development of numerous LLM-based approaches aimed at providing natural language interfaces for various end-user tasks. These end-user tasks in turn can typically be accomplished by orchestrating a given set of APIs. In practice, natural language task requests (user queries) are often incomplete, i.e., they may not contain all the information required by the APIs. While LLMs excel at natural language processing (NLP) tasks, they frequently hallucinate on missing information or struggle with orchestrating the APIs. The key idea behind our proposed approach is to leverage logical reasoning and classical AI planning along with an LLM for accurately answering user queries including identification and gathering of any missing information in these queries. Our approach uses an LLM and ASP (Answer Set Programming) solver to translate a user query to a representation in Planning Domain Definition Language (PDDL) via an intermediate representation in ASP. We introduce a special API get_info_api for gathering missing information. We model all the APIs as PDDL actions in a way that supports dataflow between the APIs. Our approach then uses a classical AI planner to generate an orchestration of API calls (including calls to get_info_api) to answer the user query. Our evaluation results show that our approach significantly outperforms a pure LLM based approach by achieving over 95% success rate in most cases on a dataset containing complete and incomplete single goal and multi-goal queries where the multi-goal queries may or may not require dataflow among the APIs.

5/22/2024


LASP: Surveying the State-of-the-Art in Large Language Model-Assisted AI Planning

Haoming Li, Zhaoliang Chen, Jonathan Zhang, Fei Liu

Effective planning is essential for the success of any task, from organizing a vacation to routing autonomous vehicles and developing corporate strategies. It involves setting goals, formulating plans, and allocating resources to achieve them. LLMs are particularly well-suited for automated planning due to their strong capabilities in commonsense reasoning. They can deduce a sequence of actions needed to achieve a goal from a given state and identify an effective course of action. However, it is frequently observed that plans generated through direct prompting often fail upon execution. Our survey aims to highlight the existing challenges in planning with language models, focusing on key areas such as embodied environments, optimal scheduling, competitive and cooperative games, task decomposition, reasoning, and planning. Through this study, we explore how LLMs transform AI planning and provide unique insights into the future of LM-assisted planning.

9/4/2024

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the problem specification from one syntactic format to another, and ship the problem off to external symbolic solvers. In this position paper, we take the view that both these extremes are misguided. We argue that auto-regressive LLMs cannot, by themselves, do planning or self-verification (which is after all a form of reasoning), and shed some light on the reasons for misunderstandings in the literature. We will also argue that LLMs should be viewed as universal approximate knowledge sources that have much more meaningful roles to play in planning/reasoning tasks beyond simple front-end/back-end format translators. We present a vision of LLM-Modulo Frameworks that combine the strengths of LLMs with external model-based verifiers in a tighter bi-directional interaction regime. We will show how the models driving the external verifiers themselves can be acquired with the help of LLMs. We will also argue that rather than simply pipelining LLMs and symbolic components, this LLM-Modulo Framework provides a better neuro-symbolic approach that offers tighter integration between LLMs and symbolic components, and allows extending the scope of model-based planning/reasoning regimes towards more flexible knowledge, problem and preference specifications.

6/13/2024

Language Models can Infer Action Semantics for Classical Planners from Environment Feedback

Wang Zhu, Ishika Singh, Robin Jia, Jesse Thomason

Classical planning approaches guarantee finding a set of actions that can achieve a given goal state when possible, but require an expert to specify logical action semantics that govern the dynamics of the environment. Researchers have shown that Large Language Models (LLMs) can be used to directly infer planning steps based on commonsense knowledge and minimal domain information alone, but such plans often fail on execution. We bring together the strengths of classical planning and LLM commonsense inference to perform domain induction, learning and validating action pre- and post-conditions based on closed-loop interactions with the environment itself. We propose PSALM, which leverages LLM inference to heuristically complete partial plans emitted by a classical planner given partial domain knowledge, as well as to infer the semantic rules of the domain in a logical language based on environment feedback after execution. Our analysis on 7 environments shows that with just one expert-curated example plan, using LLMs as heuristic planners and rule predictors achieves lower environment execution steps and environment resets than random exploration while simultaneously recovering the underlying ground truth action semantics of the domain.

6/6/2024