Learning Extrinsic Dexterity with Parameterized Manipulation Primitives

Read original: arXiv:2310.17785 - Published 5/10/2024 by Shih-Min Yang, Martin Magnusson, Johannes A. Stork, Todor Stoyanov


Overview

This paper tackles the problem of robot grasping when the target object is occluded by the environment, making single-shot grasp planning ineffective.
The authors propose a hierarchical reinforcement learning approach to learn a sequence of manipulation primitives that can exploit the environment to change the object's pose and enable successful grasping.
The method learns low-level manipulation policies to control the object's state through interactions between the object, the gripper, and the environment, without the need for accurate physical modeling.
Experiments show the approach can pick up box-shaped objects of varying weight, shape, and friction from a constrained tabletop workspace, completing the task in 98% of trials on a real robot.

Plain English Explanation

Robots often need to grasp and pick up objects to complete tasks, but sometimes the object they need to grab is hidden or blocked by the environment. In these cases, single-shot grasp planning, where the robot tries to grab the object in one attempt, won't work. Instead, the robot needs to first manipulate the object into a position where it can be successfully grasped.

To solve this problem, the researchers used a hierarchical reinforcement learning approach. They trained the robot to learn a sequence of low-level manipulation skills, like pushing, sliding, or tipping the object, that it can use to change the object's position and orientation until it's in a grabbable configuration. The robot learns these skills directly from depth perception data, without needing to first detect the object's shape or estimate its pose.

By learning how to interact with the object and the environment, the robot can figure out how to exploit the surroundings to move the object into the right position, even in complex scenarios where analytically modeling the physics would be very difficult. The researchers tested this approach on picking up box-shaped objects with different weights, shapes, and surface properties, and found it was successful 98% of the time when tried on a real robot arm.

Technical Explanation

The key innovation of this work is the use of a hierarchical reinforcement learning framework to learn a sequence of parameterized manipulation primitives that can be combined to change the pose of an occluded object, enabling successful grasping.
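
To make the idea of a "parameterized" primitive concrete, the following is a minimal sketch of one way to represent such an action: a discrete primitive type paired with continuous parameters that are clipped to safe bounds. The primitive names, parameter meanings, and ranges are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical bounds per primitive. Names and ranges are illustrative
# assumptions, not values reported by the authors.
PARAM_BOUNDS: Dict[str, Tuple[Tuple[float, float], ...]] = {
    "push": ((-0.10, 0.10), (-0.10, 0.10)),  # planar displacement in metres
    "tip": ((0.0, 1.57),),                   # rotation angle in radians
}


@dataclass(frozen=True)
class PrimitiveAction:
    """A discrete primitive type plus its continuous parameters."""
    name: str
    params: Tuple[float, ...]


def make_action(name: str, raw_params: Tuple[float, ...]) -> PrimitiveAction:
    """Build an action, clipping each raw parameter into its allowed range."""
    bounds = PARAM_BOUNDS[name]
    if len(raw_params) != len(bounds):
        raise ValueError(f"{name} expects {len(bounds)} parameters")
    clipped = tuple(
        min(max(value, low), high)
        for value, (low, high) in zip(raw_params, bounds)
    )
    return PrimitiveAction(name, clipped)
```

In a learned system, the raw parameters would come from a policy network rather than being written by hand. The clipping step simply keeps whatever the policy outputs within a range the robot can execute.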

Rather than relying on accurate physical modeling or object detection/pose estimation, the method learns low-level manipulation policies that directly operate on depth perception data. These policies control the interactions between the gripper, the object, and the environment to gradually manipulate the object into a graspable configuration.

The hierarchical structure allows the system to learn high-level strategies for sequencing the low-level primitives, such as pushing, sliding, or tipping the object, to achieve the desired pose. This avoids the need for manual design of complex controllers, which would be infeasible given the unstructured nature of the environment and object interactions.
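
The two-level control flow described above can be sketched roughly as follows. This is a toy illustration under stated assumptions: the world model, the observation vector (a stand-in for depth features), the hand-written selection rule, and all names are hypothetical, whereas in the paper both levels are learned.

```python
from typing import Callable, Dict, List, Tuple

Observation = List[float]  # toy stand-in for features of a depth image
Params = Tuple[float, ...]


class ToyWorld:
    """Hypothetical 1-D world: a box must reach a wall, be tipped, then grasped."""

    def __init__(self, distance_to_wall: float) -> None:
        self.distance_to_wall = distance_to_wall
        self.tilted = False
        self.grasped = False

    def observe(self) -> Observation:
        return [self.distance_to_wall, 1.0 if self.tilted else 0.0]

    def execute(self, primitive: str, params: Params) -> None:
        if primitive == "push":
            # Move the box toward the wall by the requested step.
            self.distance_to_wall = max(0.0, self.distance_to_wall - params[0])
        elif primitive == "tip" and self.distance_to_wall == 0.0:
            # Tipping only works once the wall can act as a pivot.
            self.tilted = True
        elif primitive == "grasp" and self.tilted:
            self.grasped = True


def select_primitive(obs: Observation) -> str:
    """High level: choose which primitive to run (hand-written stand-in)."""
    distance, tilted = obs
    if tilted > 0.5:
        return "grasp"
    return "push" if distance > 0.0 else "tip"


# Low level: one parameter generator per primitive (stand-ins for learned policies).
LOW_LEVEL: Dict[str, Callable[[Observation], Params]] = {
    "push": lambda obs: (min(obs[0], 0.05),),  # step of at most 5 cm
    "tip": lambda obs: (0.6,),                 # nominal angle, ignored by the toy world
    "grasp": lambda obs: (),
}


def run_episode(world: ToyWorld, max_steps: int = 10) -> Tuple[bool, List[str]]:
    """Alternate high-level selection and low-level parameterization."""
    trace: List[str] = []
    for _ in range(max_steps):
        if world.grasped:
            break
        obs = world.observe()
        primitive = select_primitive(obs)
        params = LOW_LEVEL[primitive](obs)
        world.execute(primitive, params)
        trace.append(primitive)
    return world.grasped, trace
```

Calling run_episode(ToyWorld(0.12)) pushes the box to the wall in three steps, tips it, and then grasps it. The point of the sketch is only the division of labor: the high level decides what to do next, and the low level decides how to do it.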

The authors evaluate their approach on a tabletop object picking task with box-shaped objects of varying weight, shape, and friction properties. They demonstrate successful grasping in 98% of trials on a real robot, showing the method's ability to handle diverse object and environmental conditions without requiring explicit modeling.
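
When reading a headline figure such as 98%, it helps to remember that the uncertainty depends on how many trials were run. The sketch below computes a Wilson score interval for a success rate. The trial counts in the usage note are hypothetical, since this summary does not state how many real-robot trials were performed.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial success rate (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials))
        / denom
    )
    return centre - half_width, centre + half_width
```

For example, wilson_interval(49, 50) and wilson_interval(490, 500) both describe a 98% observed success rate, but the second, with ten times as many hypothetical trials, yields a much narrower interval.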

This work relates to research on high-DOF reaching, pre-grasp manipulation, and manipulation in dynamic shared environments, but tackles the distinct challenge of occluded object grasping by learning flexible, environment-exploiting manipulation skills.

Critical Analysis

The paper provides a compelling solution to the problem of robot grasping in occluded environments, which is a common and practically relevant challenge. The use of hierarchical reinforcement learning to learn a sequence of manipulation primitives is a clever approach that avoids the need for accurate physical modeling or object detection/pose estimation.

One potential limitation is that the method was only evaluated on box-shaped objects in a constrained tabletop workspace. It would be interesting to see how well the approach generalizes to more diverse object shapes and environments, such as unknown objects in assistive robotics scenarios.

Additionally, the paper does not discuss the sample efficiency or training time required to learn the manipulation policies, which could be an important practical consideration. It would also be valuable to understand how the hierarchical structure and choice of primitive skills impact the overall performance and robustness of the system.

Despite these potential areas for further exploration, the authors have presented a notable contribution to the field of robot grasping and manipulation by demonstrating a learning-based approach that can effectively exploit the environment to overcome occlusion challenges.

Conclusion

This paper addresses the problem of robot grasping when the target object is occluded by the environment, which is a common and practically relevant challenge. The authors propose a hierarchical reinforcement learning approach that can learn a sequence of manipulation primitives to change the object's pose and enable successful grasping, without the need for accurate physical modeling or object detection/pose estimation.

The experimental results on a real robot show the method can successfully pick up box-shaped objects with various weight, shape, and friction properties in a constrained tabletop workspace, with a 98% success rate. This work demonstrates the power of learning-based approaches to tackle complex robot manipulation tasks by leveraging interactions with the environment, rather than relying on analytical modeling.

While there are opportunities for further exploration, such as evaluating the approach on more diverse object shapes and environments, this paper presents a significant contribution to the field of robot grasping and manipulation, with potential applications in areas like assistive robotics and warehouse automation.



Related Papers


Learning Extrinsic Dexterity with Parameterized Manipulation Primitives

Shih-Min Yang, Martin Magnusson, Johannes A. Stork, Todor Stoyanov

Many practically relevant robot grasping problems feature a target object for which all grasps are occluded, e.g., by the environment. Single-shot grasp planning invariably fails in such scenarios. Instead, it is necessary to first manipulate the object into a configuration that affords a grasp. We solve this problem by learning a sequence of actions that utilize the environment to change the object's pose. Concretely, we employ hierarchical reinforcement learning to combine a sequence of learned parameterized manipulation primitives. By learning the low-level manipulation policies, our approach can control the object's state through exploiting interactions between the object, the gripper, and the environment. Designing such a complex behavior analytically would be infeasible under uncontrolled conditions, as an analytic approach requires accurate physical modeling of the interaction and contact dynamics. In contrast, we learn a hierarchical policy model that operates directly on depth perception data, without the need for object detection, pose estimation, or manual design of controllers. We evaluate our approach on picking box-shaped objects of various weight, shape, and friction properties from a constrained table-top workspace. Our method transfers to a real robot and is able to successfully complete the object picking task in 98% of experimental trials. Supplementary information and videos can be found at https://shihminyang.github.io/ED-PMP/.

5/10/2024

Learning Cross-hand Policies for High-DOF Reaching and Grasping

Qijin She, Shishun Zhang, Yunfan Ye, Ruizhen Hu, Kai Xu

Reaching-and-grasping is a fundamental skill for robotic manipulation, but existing methods usually train models on a specific gripper and cannot be reused on another gripper. In this paper, we propose a novel method that can learn a unified policy model that can be easily transferred to different dexterous grippers. Our method consists of two stages: a gripper-agnostic policy model that predicts the displacements of pre-defined key points on the gripper, and a gripper-specific adaptation model that translates these displacements into adjustments for controlling the grippers' joints. The gripper state and interactions with objects are captured at the finger level using robust geometric representations, integrated with a transformer-based network to address variations in gripper morphology and geometry. In the experiments, we evaluate our method on several dexterous grippers and diverse objects, and the result shows that our method significantly outperforms the baseline methods. Pioneering the transfer of grasp policies across dexterous grippers, our method effectively demonstrates its potential for learning generalizable and transferable manipulation skills for various robotic hands.

7/16/2024

Hand-Object Interaction Pretraining from Videos

Himanshu Gaurav Singh, Antonio Loquercio, Carmelo Sferrazza, Jane Wu, Haozhi Qi, Pieter Abbeel, Jitendra Malik

We present an approach to learn general robot manipulation priors from 3D hand-object interaction trajectories. We build a framework to use in-the-wild videos to generate sensorimotor robot trajectories. We do so by lifting both the human hand and the manipulated object in a shared 3D space and retargeting human motions to robot actions. Generative modeling on this data gives us a task-agnostic base policy. This policy captures a general yet flexible manipulation prior. We empirically demonstrate that finetuning this policy, with both reinforcement learning (RL) and behavior cloning (BC), enables sample-efficient adaptation to downstream tasks and simultaneously improves robustness and generalizability compared to prior approaches. Qualitative experiments are available at: https://hgaurav2k.github.io/hop/.

9/14/2024

Dexterous Functional Pre-Grasp Manipulation with Diffusion Policy

Tianhao Wu, Yunchong Gan, Mingdong Wu, Jingbo Cheng, Yaodong Yang, Yixin Zhu, Hao Dong

In real-world scenarios, objects often require repositioning and reorientation before they can be grasped, a process known as pre-grasp manipulation. Learning universal dexterous functional pre-grasp manipulation requires precise control over the relative position, orientation, and contact between the hand and object while generalizing to diverse dynamic scenarios with varying objects and goal poses. To address this challenge, we propose a teacher-student learning approach that utilizes a novel mutual reward, incentivizing agents to optimize three key criteria jointly. Additionally, we introduce a pipeline that employs a mixture-of-experts strategy to learn diverse manipulation policies, followed by a diffusion policy to capture complex action distributions from these experts. Our method achieves a success rate of 72.6% across more than 30 object categories by leveraging extrinsic dexterity and adjusting from feedback.

5/7/2024