GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation

2401.00929

Published 6/17/2024 by Zifan Wang, Junyu Chen, Ziqing Chen, Pengwei Xie, Rui Chen, Li Yi


Abstract

This paper presents GenH2R, a framework for learning generalizable vision-based human-to-robot (H2R) handover skills. The goal is to equip robots with the ability to reliably receive objects with unseen geometry handed over by humans in various complex trajectories. We acquire such generalizability by learning H2R handover at scale with a comprehensive solution including procedural simulation assets creation, automated demonstration generation, and effective imitation learning. We leverage large-scale 3D model repositories, dexterous grasp generation methods, and curve-based 3D animation to create an H2R handover simulation environment named GenH2R-Sim, surpassing the number of scenes in existing simulators by three orders of magnitude. We further introduce a distillation-friendly demonstration generation method that automatically generates a million high-quality demonstrations suitable for learning. Finally, we present a 4D imitation learning method augmented by a future forecasting objective to distill demonstrations into a visuo-motor handover policy. Experimental evaluations in both simulators and the real world demonstrate significant improvements (at least +10% success rate) over baselines in all cases. The project page is https://GenH2R.github.io/.


Overview

This paper presents GenH2R, a system for learning generalizable human-to-robot handover skills through scalable simulation, demonstration, and imitation.
The researchers develop a simulation-based training framework that can generate diverse handover scenarios and learn robust policies for transferring objects from humans to robots.
The system is evaluated in both simulators and the real world, where the abstract reports improvements of at least +10% success rate over baselines in all cases.

Plain English Explanation

The paper describes a new system called GenH2R that aims to help robots learn how to take objects from humans in a smooth and natural way. This is an important skill for robots to have if they are going to work alongside people in tasks like manufacturing or home assistance.

The key idea behind GenH2R is to use computer simulation to create a wide variety of handover scenarios that the robot can practice. This allows the robot to learn general principles for how to approach a person, grasp an object, and complete the handover without dropping or damaging it. The researchers then use a technique called "imitation learning" to distill the robot's simulated experiences into a policy that can be deployed on a real robot.
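The abstract names curve-based 3D animation as one ingredient for producing varied hand motions in simulation. As a rough illustration only, and not the authors' implementation, the Python sketch below samples a smooth 3D hand path from a cubic Bézier curve with randomized control points. The function names, the `jitter` parameter, and the choice of a cubic Bézier curve are all assumptions made for this example.

```python
import random


def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve in 3D at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein weights for the four control points.
    weights = (u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3)
    return tuple(
        sum(w * p[axis] for w, p in zip(weights, (p0, p1, p2, p3)))
        for axis in range(3)
    )


def sample_hand_trajectory(start, end, num_steps=50, jitter=0.2, rng=None):
    """Return num_steps points on a randomized smooth path from start to end.

    Hypothetical helper: the two inner control points are placed one third
    and two thirds of the way along the straight line, then perturbed by up
    to `jitter` on each axis so that every call yields a different curve.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    rng = rng or random.Random()

    def jittered(alpha):
        return tuple(
            s + alpha * (e - s) + rng.uniform(-jitter, jitter)
            for s, e in zip(start, end)
        )

    c1, c2 = jittered(1 / 3), jittered(2 / 3)
    return [
        cubic_bezier(start, c1, c2, end, i / (num_steps - 1))
        for i in range(num_steps)
    ]


if __name__ == "__main__":
    path = sample_hand_trajectory((0.0, 0.0, 0.0), (1.0, 0.5, 0.2),
                                  num_steps=5, rng=random.Random(0))
    for point in path:
        print(point)
```

Calling `sample_hand_trajectory` repeatedly with different seeds gives many distinct paths that share the same start and end points, which is the general sense in which a procedural approach can scale up the number of scenes.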

The system is evaluated on several different handover tasks, and the results show that GenH2R outperforms previous approaches. The robot is able to successfully complete handovers in a variety of situations, including when the human is in different positions or the object being handed over has different shapes and sizes.

Technical Explanation

The paper introduces GenH2R, a framework for learning generalizable human-to-robot handover skills through a combination of scalable simulation, demonstration, and imitation learning.

The key elements of the system include:

Simulation-based Training: The researchers build a simulation environment, GenH2R-Sim, from large-scale 3D model repositories, dexterous grasp generation methods, and curve-based 3D animation. According to the abstract, it contains three orders of magnitude more scenes than existing handover simulators, so the robot can practice on a wide range of object shapes and hand trajectories.
Demonstration Generation: Rather than recording human demonstrations, the authors generate them automatically. A distillation-friendly generation method produces about a million high-quality demonstrations suitable for policy learning.
Imitation Learning: A 4D imitation learning method, augmented by a future forecasting objective, distills the generated demonstrations into a visuo-motor handover policy. This policy lets the robot adapt its approach and grasp to the specific handover situation.
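To make the idea of imitation learning with an auxiliary forecasting objective more concrete, here is a minimal Python sketch of a combined loss. It is an assumption-laden illustration, not the paper's formulation: the function names, the use of mean squared error, the `forecast_weight` value, and the forecast target (some future quantity such as the object's next position) are all hypothetical, since this summary does not specify them.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def imitation_loss_with_forecast(pred_action, expert_action,
                                 pred_future, true_future,
                                 forecast_weight=0.5):
    """Combine an action-imitation term with an auxiliary forecasting term.

    pred_action / expert_action: the policy's action and the demonstrated one.
    pred_future / true_future: the policy's forecast of a future quantity and
    its actual value in the demonstration (hypothetical target).
    forecast_weight: assumed trade-off between the two terms.
    """
    action_loss = mse(pred_action, expert_action)
    forecast_loss = mse(pred_future, true_future)
    return action_loss + forecast_weight * forecast_loss


if __name__ == "__main__":
    loss = imitation_loss_with_forecast(
        pred_action=[1.0, 0.0], expert_action=[0.0, 0.0],
        pred_future=[2.0], true_future=[0.0],
    )
    print(loss)  # 0.5 + 0.5 * 4.0 = 2.5
```

The design intuition is that the forecasting term pushes the policy's internal features to encode where the hand and object are heading, while the action term keeps its outputs close to the demonstrations.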

The paper evaluates GenH2R in both simulators and the real world, on handovers where the robot must receive objects with unseen geometry from humans whose hands follow varied, complex trajectories. The abstract reports improvements of at least +10% success rate over baselines in all cases.

Critical Analysis

The paper provides a comprehensive and well-designed approach for enabling robots to learn generalizable human-to-robot handover skills. The use of scalable simulation to generate diverse handover scenarios is a particularly notable aspect, as it allows the robot to practice a wide range of situations without the need for extensive real-world data collection.

However, the paper does acknowledge some limitations of the current system. For example, the simulation environment may not perfectly capture all the nuances of real-world handover interactions, and the handover policy learned through imitation may not be optimal for all possible situations.

Additionally, the paper does not address potential safety and trust concerns that could arise when deploying a robot capable of handling objects in close proximity to humans. Further research may be needed to ensure that the robot's handover behaviors are safe, predictable, and aligned with human preferences.

Overall, the GenH2R framework represents an important step forward in enabling robots to collaborate more seamlessly with humans. By combining scalable simulation, demonstration, and imitation learning, the researchers have developed a system that can learn robust and generalizable handover skills, paving the way for more natural and effective human-robot interaction.

Conclusion

The GenH2R paper presents a novel approach for enabling robots to learn generalizable human-to-robot handover skills through a combination of scalable simulation, demonstration, and imitation learning. The system is shown to outperform previous methods in terms of success rate and generalization to new situations, indicating its potential for enabling more natural and effective human-robot collaboration in a variety of applications.

While the paper acknowledges some limitations of the current system, the overall approach represents an important contribution to the field of robotics and human-robot interaction. By developing techniques for robots to learn fluent handover behaviors, the researchers are helping to bridge the gap between robots and humans, paving the way for a future where robots can seamlessly assist and collaborate with people in a wide range of tasks.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

ContactHandover: Contact-Guided Robot-to-Human Object Handover

Zixi Wang, Zeyi Liu, Nicolas Ouporov, Shuran Song

Robot-to-human object handover is an important step in many human robot collaboration tasks. A successful handover requires the robot to maintain a stable grasp on the object while making sure the human receives the object in a natural and easy-to-use manner. We propose ContactHandover, a robot to human handover system that consists of two phases: a contact-guided grasping phase and an object delivery phase. During the grasping phase, ContactHandover predicts both 6-DoF robot grasp poses and a 3D affordance map of human contact points on the object. The robot grasp poses are reranked by penalizing those that block human contact points, and the robot executes the highest ranking grasp. During the delivery phase, the robot end effector pose is computed by maximizing human contact points close to the human while minimizing the human arm joint torques and displacements. We evaluate our system on 27 diverse household objects and show that our system achieves better visibility and reachability of human contacts to the receiver compared to several baselines. More results can be found on https://clairezixiwang.github.io/ContactHandover.github.io

4/3/2024

cs.RO cs.AI cs.CV

Model Predictive Trajectory Planning for Human-Robot Handovers

Thies Oelerich, Christian Hartl-Nesic, Andreas Kugi

This work develops a novel trajectory planner for human-robot handovers. The handover requirements can naturally be handled by a path-following-based model predictive controller, where the path progress serves as a progress measure of the handover. Moreover, the deviations from the path are used to follow human motion by adapting the path deviation bounds with a handover location prediction. A Gaussian process regression model, which is trained on known handover trajectories, is employed for this prediction. Experiments with a collaborative 7-DoF robotic manipulator show the effectiveness and versatility of the proposed approach.

4/12/2024

cs.RO

GenHeld: Generating and Editing Handheld Objects

Chaerin Min, Srinath Sridhar

Grasping is an important human activity that has long been studied in robotics, computer vision, and cognitive science. Most existing works study grasping from the perspective of synthesizing hand poses conditioned on 3D or 2D object representations. We propose GenHeld to address the inverse problem of synthesizing held objects conditioned on 3D hand model or 2D image. Given a 3D model of hand, GenHeld 3D can select a plausible held object from a large dataset using compact object representations called object codes. The selected object is then positioned and oriented to form a plausible grasp without changing hand pose. If only a 2D hand image is available, GenHeld 2D can edit this image to add or replace a held object. GenHeld 2D operates by combining the abilities of GenHeld 3D with diffusion-based image editing. Results and experiments show that we outperform baselines and can generate plausible held objects in both 2D and 3D. Our experiments demonstrate that our method achieves high quality and plausibility of held object synthesis in both 3D and 2D.

6/18/2024

cs.CV

Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer

Xinyang Gu, Yen-Jen Wang, Jianyu Chen

Humanoid-Gym is an easy-to-use reinforcement learning (RL) framework based on Nvidia Isaac Gym, designed to train locomotion skills for humanoid robots, emphasizing zero-shot transfer from simulation to the real-world environment. Humanoid-Gym also integrates a sim-to-sim framework from Isaac Gym to Mujoco that allows users to verify the trained policies in different physical simulations to ensure the robustness and generalization of the policies. This framework is verified by RobotEra's XBot-S (1.2-meter tall humanoid robot) and XBot-L (1.65-meter tall humanoid robot) in a real-world environment with zero-shot sim-to-real transfer. The project website and source code can be found at: https://sites.google.com/view/humanoid-gym/.

5/21/2024

cs.RO cs.AI cs.LG cs.SY eess.SY