Unsupervised Neural Motion Retargeting for Humanoid Teleoperation

Read original: arXiv:2406.00727 - Published 6/4/2024 by Satoshi Yagi, Mitsunori Tada, Eiji Uchibe, Suguru Kanoga, Takamitsu Matsubara, Jun Morimoto

Overview

This paper presents an unsupervised neural network approach for retargeting human motion to control a humanoid robot.
The method learns to map human motion to the robot's joint angles without any explicit supervision, allowing for natural teleoperation.
The technique could enable more intuitive and responsive control of humanoid robots for applications like remote assistance, exploration, and disaster response.

Plain English Explanation

The paper describes a new way to control humanoid robots using the natural movements of a human operator. Typically, operating a robot requires training it on specific motion patterns or painstakingly adjusting its joints. This can make the robot feel rigid and unnatural to control.

The researchers developed a neural network that can automatically learn to translate a human's movements into the corresponding joint positions for the robot. This "motion retargeting" happens without any direct supervision - the network figures out the mapping on its own by observing the human and robot motions.

The advantage is that the robot can then be controlled in a much more fluid and intuitive way, mimicking the human operator. This could be useful for applications like remote assistance, where a human expert controls a robot to perform delicate tasks. It could also enable more natural control of humanoid robots for exploration, disaster response, and other domains where flexibility and adaptability are important.

Technical Explanation

The key innovation of this paper is an unsupervised neural network architecture that learns to map human motion to the corresponding joint configurations of a humanoid robot. Rather than requiring paired human-robot training data or hand-written mapping rules, the network discovers the relationship between human and robot kinematics from unpaired motion data.

The approach consists of two main components: a human pose estimation module that tracks the 3D positions of the human's joints, and a motion retargeting module that translates this human pose into robot joint angles. The paper's abstract describes the retargeting as GAN-based and online: the mapping is trained adversarially on unpaired human and humanoid motions, so no pairwise dataset is needed to identify the relationship between the two kinematic structures.
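To make the input and output of the retargeting step concrete, here is a minimal sketch of a hand-written geometric baseline that turns three tracked 3D joint positions into a single joint angle. This is not the learned mapping from the paper; the function names and the keypoint values are hypothetical and chosen only for illustration.

```python
import math

def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def angle_between(u, v):
    """Angle in radians between two 3D vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("degenerate limb: zero-length segment")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)

def elbow_flexion(shoulder, elbow, wrist):
    """Flexion angle: 0 for a straight arm, growing as the arm bends."""
    upper_arm = sub(elbow, shoulder)
    forearm = sub(wrist, elbow)
    return angle_between(upper_arm, forearm)

# Hypothetical keypoints in metres (not data from the paper).
shoulder = (0.0, 0.0, 1.4)
elbow = (0.3, 0.0, 1.4)
wrist = (0.3, 0.0, 1.65)
print(round(math.degrees(elbow_flexion(shoulder, elbow, wrist)), 1))
```

A rule like this has to be written and tuned per joint and per robot, which is the kind of manual effort a learned retargeting module is meant to avoid.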

During training, the network observes unpaired examples of human and robot motions, with no direct correspondences provided. It learns to align the latent spaces of the two modalities in an unsupervised manner, discovering the implicit transformation between them. At inference time, the trained model can then take in a new human motion and output the appropriate robot joint configurations to mimic that motion.
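The idea of learning from unpaired examples can be illustrated with a deliberately simple stand-in. The sketch below fits an affine map for one joint by matching the mean and spread of the mapped human angles to those of the robot angles. This is plain moment matching, not the adversarial objective used in the paper, and all names and sample ranges are assumptions made for the example.

```python
import random
import statistics

def fit_unpaired_affine(human_angles, robot_angles):
    """Fit y = a * x + b so mapped human angles match the robot set's mean and spread.

    Only per-set statistics are used, so no human sample is ever matched
    to a specific robot sample.
    """
    mean_h = statistics.fmean(human_angles)
    mean_r = statistics.fmean(robot_angles)
    std_h = statistics.pstdev(human_angles)
    std_r = statistics.pstdev(robot_angles)
    if std_h == 0.0:
        raise ValueError("human angles have no variation")
    a = std_r / std_h
    b = mean_r - a * mean_h
    return a, b

# Hypothetical, independently sampled joint angles in radians.
rng = random.Random(0)
human = [rng.uniform(0.0, 2.4) for _ in range(500)]
robot = [rng.uniform(0.2, 1.4) for _ in range(500)]

a, b = fit_unpaired_affine(human, robot)

# Shuffling one set destroys any accidental pairing but leaves the fit unchanged.
rng.shuffle(robot)
a_shuffled, b_shuffled = fit_unpaired_affine(human, robot)

print(f"a={a:.3f}, b={b:.3f}")
print(abs(a - a_shuffled) < 1e-9 and abs(b - b_shuffled) < 1e-9)
```

The shuffle check is the point of the example: the fit depends only on the two distributions, never on which human sample sits next to which robot sample. The toy also shows a limitation of pure distribution matching, since a mirrored map with slope `-a` would match the same mean and spread equally well.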

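According to the abstract, the authors retarget a range of upper-body human motions to a humanoid, including a body jab and a basketball shot, and evaluate human-in-the-loop teleoperation by measuring end-effector position errors between the human and the retargeted humanoid motions. The reported error is comparable to that of conventional retargeting methods that require pairwise motion datasets, and a box pick-and-place task demonstrates the usability of the teleoperation system.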
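The paper's abstract mentions measuring end-effector position errors between the human and the retargeted humanoid motions. One plausible way to compute such a metric is the mean Euclidean distance between corresponding end-effector positions, sketched below with a planar two-link arm. The arm model, link lengths, and function names are assumptions for illustration; the paper's exact metric and any normalization it applies are not given in this summary.

```python
import math

def forward_kinematics(shoulder_angle, elbow_angle, upper_len, fore_len):
    """End-effector (x, y) of a planar two-link arm; angles in radians."""
    x = (upper_len * math.cos(shoulder_angle)
         + fore_len * math.cos(shoulder_angle + elbow_angle))
    y = (upper_len * math.sin(shoulder_angle)
         + fore_len * math.sin(shoulder_angle + elbow_angle))
    return x, y

def mean_end_effector_error(human_points, robot_points):
    """Mean Euclidean distance between time-aligned end-effector positions."""
    if not human_points or len(human_points) != len(robot_points):
        raise ValueError("trajectories must be non-empty and equally long")
    total = sum(math.dist(h, r) for h, r in zip(human_points, robot_points))
    return total / len(human_points)

# Hypothetical joint-angle trajectories (radians) and link lengths (metres).
human_angles = [(0.0, 0.2), (0.3, 0.6), (0.6, 1.0)]
robot_angles = [(0.0, 0.25), (0.28, 0.65), (0.62, 0.95)]

human_traj = [forward_kinematics(s, e, 0.30, 0.25) for s, e in human_angles]
robot_traj = [forward_kinematics(s, e, 0.30, 0.25) for s, e in robot_angles]

print(f"{mean_end_effector_error(human_traj, robot_traj):.4f} m")
```

Both arms use the same link lengths here so that positions are directly comparable. With differently sized limbs, some rescaling would be needed before taking the distance.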

Critical Analysis

A key advantage of this unsupervised motion retargeting approach is that it removes the need to construct a pairwise human-robot motion dataset. The abstract argues that this reduces the complexity and setup requirements typically associated with humanoid controllers, which could make the technique more practical for real-world deployment than methods that depend on carefully curated datasets or expert knowledge.

However, the paper does not extensively explore the limitations of the approach. For example, it is unclear how well the method would handle large differences in kinematic structure or degrees of freedom between the human and the robot. Additionally, the evaluation described in the abstract covers upper-body motions and a box pick-and-place task; it remains to be seen how well the technique would scale to more complex, dexterous tasks that require fine motor control.

Further research could also investigate ways to incorporate domain knowledge or safety constraints into the motion retargeting process, to ensure the robot's movements remain physically feasible and do not risk damage to itself or the environment. Integrating uncertainty-aware motion prediction techniques, as explored in related work, could also improve the system's robustness.
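As an illustration of what a simple safety constraint could look like, the sketch below post-processes retargeted joint targets by limiting the change per control step and clamping to joint ranges. It is a generic filter, not something described in the paper, and the limits and step size are hypothetical values.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))

def make_safe(target, previous, limits, max_step):
    """Limit each joint's change per step, then clamp it to its allowed range.

    target, previous: joint angles in radians, one entry per joint.
    limits: (min, max) pair per joint.
    max_step: largest allowed change per control step, in radians.
    """
    if not (len(target) == len(previous) == len(limits)):
        raise ValueError("target, previous and limits must have equal length")
    safe = []
    for q_target, q_prev, (q_min, q_max) in zip(target, previous, limits):
        step = clamp(q_target - q_prev, -max_step, max_step)
        safe.append(clamp(q_prev + step, q_min, q_max))
    return safe

# Hypothetical two-joint example: the raw targets jump too far and too fast.
raw_target = [2.0, -0.5]
previous = [0.0, 0.0]
joint_limits = [(-1.0, 1.0), (-0.3, 0.3)]

print(make_safe(raw_target, previous, joint_limits, max_step=0.2))
```

A filter like this only enforces per-joint bounds. Self-collision avoidance and contact with the environment would need a model of the whole robot.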

Overall, this paper presents a promising step towards more natural and intuitive control of humanoid robots through unsupervised motion retargeting. By enabling robots to better mimic human movements, the approach could unlock new possibilities for remote manipulation, human-robot interaction, and autonomous systems that seamlessly collaborate with people.

Conclusion

This paper introduces an unsupervised neural network method for retargeting human motion to control a humanoid robot. The key innovation is the ability to learn the mapping between human and robot kinematics without any explicit supervision, allowing for more natural and responsive teleoperation.

The technique could enable a wide range of applications where flexible, intuitive control of humanoid robots is desirable, such as remote assistance, exploration, and disaster response. While the current evaluation shows promising results, further research is needed to address potential limitations and expand the method's capabilities.

Overall, this work represents an important step towards bridging the gap between human and robot motion, potentially leading to more seamless and effective human-robot interaction and collaboration in the future.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Unsupervised Neural Motion Retargeting for Humanoid Teleoperation

Satoshi Yagi, Mitsunori Tada, Eiji Uchibe, Suguru Kanoga, Takamitsu Matsubara, Jun Morimoto

This study proposes an approach to human-to-humanoid teleoperation using GAN-based online motion retargeting, which obviates the need for the construction of pairwise datasets to identify the relationship between the human and the humanoid kinematics. Consequently, it can be anticipated that our proposed teleoperation system will reduce the complexity and setup requirements typically associated with humanoid controllers, thereby facilitating the development of more accessible and intuitive teleoperation systems for users without robotics knowledge. The experiments demonstrated the efficacy of the proposed method in retargeting a range of upper-body human motions to humanoid, including a body jab motion and a basketball shoot motion. Moreover, the human-in-the-loop teleoperation performance was evaluated by measuring the end-effector position errors between the human and the retargeted humanoid motions. The results demonstrated that the error was comparable to those of conventional motion retargeting methods that require pairwise motion datasets. Finally, a box pick-and-place task was conducted to demonstrate the usability of the developed humanoid teleoperation system.

6/4/2024

ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space

Yashuai Yan, Esteve Valls Mascaro, Dongheui Lee

This paper introduces a novel deep-learning approach for human-to-robot motion retargeting, enabling robots to mimic human poses accurately. Contrary to prior deep-learning-based works, our method does not require paired human-to-robot data, which facilitates its translation to new robots. First, we construct a shared latent space between humans and robots via adaptive contrastive learning that takes advantage of a proposed cross-domain similarity metric between the human and robot poses. Additionally, we propose a consistency term to build a common latent space that captures the similarity of the poses with precision while allowing direct robot motion control from the latent space. For instance, we can generate in-between motion through simple linear interpolation between two projected human poses. We conduct a comprehensive evaluation of robot control from diverse modalities (i.e., texts, RGB videos, and key poses), which facilitates robot control for non-expert users. Our model outperforms existing works regarding human-to-robot retargeting in terms of efficiency and precision. Finally, we implemented our method in a real robot with self-collision avoidance through a whole-body controller to showcase the effectiveness of our approach. More information on our website https://evm7.github.io/UnsH2R/

4/9/2024

High-Speed and Impact Resilient Teleoperation of Humanoid Robots

Sylvain Bertrand, Luigi Penco, Dexton Anderson, Duncan Calvert, Valentine Roy, Stephen McCrory, Khizar Mohammed, Sebastian Sanchez, Will Griffith, Steve Morfey, Alexis Maslyczyk, Achintya Mohan, Cody Castello, Bingyin Ma, Kartik Suryavanshi, Patrick Dills, Jerry Pratt, Victor Ragusila, Brandon Shrewsbury, Robert Griffin

Teleoperation of humanoid robots has long been a challenging domain, necessitating advances in both hardware and software to achieve seamless and intuitive control. This paper presents an integrated solution based on several elements: calibration-free motion capture and retargeting, a low-latency, fast whole-body kinematics streaming toolbox, and high-bandwidth cycloidal actuators. Our motion retargeting approach stands out for its simplicity, requiring only 7 IMUs to generate full-body references for the robot. The kinematics streaming toolbox ensures real-time, responsive control of the robot's movements, significantly reducing latency and enhancing operational efficiency. Additionally, the use of cycloidal actuators makes it possible to withstand high speeds and impacts with the environment. Together, these approaches contribute to a teleoperation framework that offers unprecedented performance. Experimental results on the humanoid robot Nadia demonstrate the effectiveness of the integrated system.

9/10/2024

Redefining Data Pairing for Motion Retargeting Leveraging a Human Body Prior

Xiyana Figuera, Soogeun Park, Hyemin Ahn

We propose MR.HuBo (Motion Retargeting leveraging a HUman BOdy prior), a cost-effective and convenient method to collect high-quality upper body paired $\langle \text{robot, human} \rangle$ pose data, which is essential for data-driven motion retargeting methods. Unlike existing approaches which collect $\langle \text{robot, human} \rangle$ pose data by converting human MoCap poses into robot poses, our method goes in reverse. We first sample diverse random robot poses, and then convert them into human poses. However, since random robot poses can result in extreme and infeasible human poses, we propose an additional technique to sort out extreme poses by exploiting a human body prior trained from a large amount of human pose data. Our data collection method can be used for any humanoid robots, if one designs or optimizes the system's hyperparameters which include a size scale factor and the joint angle ranges for sampling. In addition to this data collection method, we also present a two-stage motion retargeting neural network that can be trained via supervised learning on a large amount of paired data. Compared to other learning-based methods trained via unsupervised learning, we found that our deep neural network trained with ample high-quality paired data achieved notable performance. Our experiments also show that our data filtering method yields better retargeting results than training the model with raw and noisy data. Our code and video results are available on https://sites.google.com/view/mr-hubo/

9/23/2024