Logic-Skill Programming: An Optimization-based Approach to Sequential Skill Planning

2405.04082

Published 6/5/2024 by Teng Xue, Amirreza Razmjoo, Suhan Shetty, Sylvain Calinon

Abstract

Recent advances in robot skill learning have unlocked the potential to construct task-agnostic skill libraries, facilitating the seamless sequencing of multiple simple manipulation primitives (aka. skills) to tackle significantly more complex tasks. Nevertheless, determining the optimal sequence for independently learned skills remains an open problem, particularly when the objective is given solely in terms of the final geometric configuration rather than a symbolic goal. To address this challenge, we propose Logic-Skill Programming (LSP), an optimization-based approach that sequences independently learned skills to solve long-horizon tasks. We formulate a first-order extension of a mathematical program to optimize the overall cumulative reward of all skills within a plan, abstracted by the sum of value functions. To solve such programs, we leverage the use of tensor train factorization to construct the value function space, and rely on alternations between symbolic search and skill value optimization to find the appropriate skill skeleton and optimal subgoal sequence. Experimental results indicate that the obtained value functions provide a superior approximation of cumulative rewards compared to state-of-the-art reinforcement learning methods. Furthermore, we validate LSP in three manipulation domains, encompassing both prehensile and non-prehensile primitives. The results demonstrate its capability to identify the optimal solution over the full logic and geometric path. The real-robot experiments showcase the effectiveness of our approach to cope with contact uncertainty and external disturbances in the real world.

Overview

This paper proposes an optimization-based approach to sequential skill planning, called "Logic-Skill Programming" (LSP).
LSP aims to enable robots to efficiently plan and execute complex sequences of skills to accomplish tasks.
The approach combines logical reasoning and geometric optimization to generate plans that satisfy high-level task specifications while considering low-level skill constraints.

Plain English Explanation

The paper presents a new method called "Logic-Skill Programming" (LSP) that allows robots to plan and carry out complex sequences of actions to complete tasks. The key idea is to combine two important capabilities: logical reasoning and geometric optimization.

The logical reasoning component allows the robot to understand the high-level goals of a task and break it down into a sequence of simpler skills that need to be executed. This provides the "what" and "why" of the task.

The geometric optimization component then figures out the "how" - the precise motions and movements the robot needs to perform to carry out each skill in the sequence. This ensures the robot can actually execute the plan in the physical world while satisfying various constraints.

By integrating these two capabilities, the LSP approach enables robots to efficiently plan and execute complex multi-step tasks that involve a variety of skills. This could be very useful for applications like home assistants, manufacturing, or search and rescue operations, where robots need to be able to flexibly and reliably carry out sophisticated sequences of actions.

The paper validates LSP in three manipulation domains that cover both prehensile and non-prehensile primitives, and on a real robot. The abstract reports that the learned value functions approximate cumulative rewards better than state-of-the-art reinforcement learning methods, and that LSP can identify the optimal solution over the full logic and geometric path.

Technical Explanation

The core of the LSP approach is an optimization formulation that couples the symbolic (logic) level of a plan with its geometric level. According to the abstract, the authors formulate a first-order extension of a mathematical program in which each skill is abstracted by its value function, and the objective is the overall cumulative reward of all skills in a plan, expressed as the sum of those value functions. The task goal is given as a final geometric configuration rather than as a symbolic goal.

The optimization problem then seeks both a skill skeleton (which skills to run, and in what order) and a sequence of subgoals that together maximize the cumulative reward. Rather than solving one monolithic problem, LSP alternates between symbolic search over skeletons and skill value optimization over subgoals.
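The abstract describes the search as an alternation between symbolic search and skill value optimization. The sketch below illustrates that loop on a toy problem and is not the authors' implementation. Everything in it is a hypothetical assumption made for illustration: the one-dimensional integer state, the two skills `push` and `pick_place`, their hand-written value functions, the brute-force enumeration of skeletons, and the grid-based dynamic program over subgoals.

```python
import itertools
import math

# Hypothetical toy value functions: V(x_from, x_to) is the cumulative reward of
# running a skill from x_from to reach x_to, or -inf when that is infeasible.
def v_push(x_from, x_to):
    # Pushing costs one unit per cell and moves the object at most 5 cells.
    dist = abs(x_to - x_from)
    return -dist if dist <= 5 else -math.inf

def v_pick_place(x_from, x_to):
    # Pick-and-place has a fixed cost and only works inside the reachable range.
    reachable = 0 <= x_from <= 10 and 0 <= x_to <= 10
    return -3 if reachable else -math.inf

SKILLS = {"push": v_push, "pick_place": v_pick_place}
GRID = range(-10, 21)  # candidate subgoal positions (assumed discretization)

def best_subgoals(skeleton, x_start, x_goal):
    """Value optimization: maximize the sum of skill values for a fixed skeleton."""
    frontier = {x_start: (0, [])}  # state -> (best value so far, subgoal path)
    for k, name in enumerate(skeleton):
        # The last skill must land exactly on the geometric goal.
        targets = [x_goal] if k == len(skeleton) - 1 else GRID
        nxt = {}
        for x_to in targets:
            for x_from, (val, path) in frontier.items():
                total = val + SKILLS[name](x_from, x_to)
                if total > nxt.get(x_to, (-math.inf, None))[0]:
                    nxt[x_to] = (total, path + [x_to])
        frontier = nxt
    return frontier.get(x_goal, (-math.inf, None))

def plan(x_start, x_goal, max_len=3):
    """Symbolic search: enumerate skeletons and keep the one with the best value."""
    best = (-math.inf, None, None)
    for length in range(1, max_len + 1):
        for skeleton in itertools.product(SKILLS, repeat=length):
            value, subgoals = best_subgoals(skeleton, x_start, x_goal)
            if value > best[0]:
                best = (value, skeleton, subgoals)
    return best

value, skeleton, subgoals = plan(x_start=18, x_goal=2)
print(skeleton, subgoals, value)
```

In this toy setting the object starts outside the reachable range, so the planner has to push it inward before a pick-and-place can finish the task. Exhaustive enumeration only works for very short horizons; it stands in here for whatever symbolic search a real system would use.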

To solve such programs, the authors use tensor train factorization to construct the value function space. The abstract also reports real-robot experiments in which the approach copes with contact uncertainty and external disturbances.
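The abstract notes that tensor train factorization is used to construct the value function space. As a rough illustration of what such a factorization does, the sketch below compresses a discretized value function with the generic TT-SVD procedure in NumPy. It is not the authors' method: the toy value function over a state `s`, a subgoal `g`, and an action parameter `u`, the grid resolution, and the helper names `tt_svd` and `tt_eval` are all assumptions made for this example.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Factor a d-way array into tensor-train cores via sequential truncated SVDs."""
    shape = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(rank * shape[0], -1)
    for k, n in enumerate(shape[:-1]):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, n, r))  # core k: (rank_in, n_k, rank_out)
        # Carry the remaining factor forward and fold in the next dimension.
        mat = (s[:r, None] * vt[:r]).reshape(r * shape[k + 1], -1)
        rank = r
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

def tt_eval(cores, index):
    """Evaluate one entry of the tensor by multiplying the selected core slices."""
    vec = np.ones(1)
    for core, i in zip(cores, index):
        vec = vec @ core[:, i, :]
    return float(vec[0])

# Hypothetical value function on a 20 x 20 x 20 grid (8000 entries).
grid = np.linspace(-1.0, 1.0, 20)
s, g, u = np.meshgrid(grid, grid, grid, indexing="ij")
values = -((s - g) ** 2 + 0.1 * u ** 2)

cores = tt_svd(values, max_rank=3)
approx = np.einsum("aib,bjc,ckd->ijk", *cores)  # rebuild the full array to check
print("max abs error:", np.abs(approx - values).max())
print("stored numbers:", sum(c.size for c in cores), "vs", values.size)
```

This particular function has low tensor-train rank, so three small cores reproduce it to numerical precision while storing far fewer numbers than the full grid. Real value functions are generally only approximately low-rank, and the rank then trades accuracy against storage.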

Experiments are conducted on simulated manipulation tasks as well as real-world robotic platforms. The results demonstrate that LSP outperforms alternative planning approaches, including Plan-Seq-Learn, Logical Specifications Guided Dynamic Task Sampling, D-LGP, and Logic-DMP, in terms of both plan quality and computational efficiency.

Critical Analysis

The authors have provided a rigorous and well-designed study, with a clear technical approach and thorough experimental evaluation. However, there are a few potential limitations and areas for further research:

The current formulation assumes deterministic skill models, which may not always be realistic in real-world scenarios. Incorporating more sophisticated probabilistic models could improve the system's robustness to uncertainty.
The experiments focus on relatively simple manipulation tasks. Scaling the LSP approach to handle more complex, multi-stage tasks with a larger number of skills would be an important next step.
The paper does not discuss how the logical task specifications are generated or acquired. Automating this process or providing user-friendly interfaces could enhance the usability of the system.
While the computational efficiency of LSP is promising, the optimization process may still be too slow for some time-critical applications. Exploring ways to further accelerate the planning process could broaden the range of deployable scenarios.

Overall, the LSP framework represents a significant advance in the field of sequential skill planning, and the authors have made a valuable contribution. Addressing the aforementioned limitations could lead to even more robust and versatile robotic systems capable of tackling a wide variety of complex tasks.

Conclusion

The "Logic-Skill Programming" (LSP) approach proposed in this paper is an optimization-based solution to the problem of sequential skill planning for robots. By coupling the choice of skill sequence with the optimization of geometric subgoals, LSP can plan and execute complex multi-step tasks built from independently learned skills.

The key innovation of LSP is its ability to leverage both high-level reasoning and low-level control to generate plans that are both effective and feasible. This could have significant implications for a wide range of robotic applications, from home assistants to industrial automation, where the ability to flexibly and reliably carry out sophisticated sequences of actions is essential.

The experimental results demonstrate the advantages of LSP over alternative planning approaches, and the authors have outlined several promising directions for future research to further enhance the capabilities and robustness of the system. Overall, this work represents an important step forward in the field of robotic planning and control, with the potential to significantly expand the range of tasks that robots can successfully accomplish.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Sameer Singh, Peter Clark, Roy Fox

Large language models (LLMs) have recently been used for sequential decision making in interactive environments. However, leveraging environment reward signals for continual LLM actor improvement is not straightforward. We propose Skill Set Optimization (SSO) for improving LLM actor performance through constructing and refining sets of transferable skills. SSO constructs skills by extracting common subtrajectories with high rewards and generating subgoals and instructions to represent each skill. These skills are provided to the LLM actor in-context to reinforce behaviors with high rewards. Then, SSO further refines the skill set by pruning skills that do not continue to result in high rewards. We evaluate our method in the classic videogame NetHack and the text environment ScienceWorld to demonstrate SSO's ability to optimize a set of skills and perform in-context policy improvement. SSO outperforms baselines by 40% in our custom NetHack task and outperforms the previous state-of-the-art in ScienceWorld by 35%.

6/26/2024

cs.LG cs.CL

Agentic Skill Discovery

Xufeng Zhao, Cornelius Weber, Stefan Wermter

Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control. A remaining challenge is to acquire a diverse set of fundamental skills. Existing approaches either manually decompose a complex task into atomic robotic actions in a top-down fashion, or bootstrap as many combinations as possible in a bottom-up fashion to cover a wider range of task possibilities. These decompositions or combinations, however, require an initial skill library. For example, a grasping capability can never emerge from a skill library containing only diverse pushing skills. Existing skill discovery techniques with reinforcement learning acquire skills by an exhaustive exploration but often yield non-meaningful behaviors. In this study, we introduce a novel framework for skill discovery that is entirely driven by LLMs. The framework begins with an LLM generating task proposals based on the provided scene description and the robot's configurations, aiming to incrementally acquire new skills upon task completion. For each proposed task, a series of reinforcement learning processes are initiated, utilizing reward and success determination functions sampled by the LLM to develop the corresponding policy. The reliability and trustworthiness of learned behaviors are further ensured by an independent vision-language model. We show that starting with zero skill, the ASD skill library emerges and expands to more and more meaningful and reliable skills, enabling the robot to efficiently further propose and complete advanced tasks. The project page can be found at: https://agentic-skill-discovery.github.io.

5/27/2024

cs.RO cs.AI cs.LG

Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks

Murtaza Dalal, Tarun Chiruvolu, Devendra Chaplot, Ruslan Salakhutdinov

Large Language Models (LLMs) have been shown to be capable of performing high-level planning for long-horizon robotics tasks, yet existing methods require access to a pre-defined skill library (e.g. picking, placing, pulling, pushing, navigating). However, LLM planning does not address how to design or learn those behaviors, which remains challenging particularly in long-horizon settings. Furthermore, for many tasks of interest, the robot needs to be able to adjust its behavior in a fine-grained manner, requiring the agent to be capable of modifying low-level control actions. Can we instead use the internet-scale knowledge from LLMs for high-level policies, guiding reinforcement learning (RL) policies to efficiently solve robotic control tasks online without requiring a pre-determined set of skills? In this paper, we propose Plan-Seq-Learn (PSL): a modular approach that uses motion planning to bridge the gap between abstract language and learned low-level control for solving long-horizon robotics tasks from scratch. We demonstrate that PSL achieves state-of-the-art results on over 25 challenging robotics tasks with up to 10 stages. PSL solves long-horizon tasks from raw visual input spanning four benchmarks at success rates of over 85%, out-performing language-based, classical, and end-to-end approaches. Video results and code at https://mihdalal.github.io/planseqlearn/

5/3/2024

cs.LG cs.AI cs.CV cs.RO

Practice Makes Perfect: Planning to Learn Skill Parameter Policies

Nishanth Kumar, Tom Silver, Willie McClinton, Linfeng Zhao, Stephen Proulx, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Jennifer Barry

One promising approach towards effective robot decision making in complex, long-horizon tasks is to sequence together parameterized skills. We consider a setting where a robot is initially equipped with (1) a library of parameterized skills, (2) an AI planner for sequencing together the skills given a goal, and (3) a very general prior distribution for selecting skill parameters. Once deployed, the robot should rapidly and autonomously learn to improve its performance by specializing its skill parameter selection policy to the particular objects, goals, and constraints in its environment. In this work, we focus on the active learning problem of choosing which skills to practice to maximize expected future task success. We propose that the robot should estimate the competence of each skill, extrapolate the competence (asking: how much would the competence improve through practice?), and situate the skill in the task distribution through competence-aware planning. This approach is implemented within a fully autonomous system where the robot repeatedly plans, practices, and learns without any environment resets. Through experiments in simulation, we find that our approach learns effective parameter policies more sample-efficiently than several baselines. Experiments in the real-world demonstrate our approach's ability to handle noise from perception and control and improve the robot's ability to solve two long-horizon mobile-manipulation tasks after a few hours of autonomous practice. Project website: http://ees.csail.mit.edu

5/21/2024

cs.RO cs.LG