WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds

Read original: arXiv:2407.18946 - Published 7/30/2024 by Peizhuo Li, Sebastian Starke, Yuting Ye, Olga Sorkine-Hornung

WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds

Overview

This paper introduces "WalkTheDog", a method for aligning the motion of different characters with different morphologies (body shapes) to a common reference motion.
The approach uses a phase manifold representation to capture the temporal dynamics of motion and enable cross-morphology alignment.
Experiments show the method can transfer motion from one character to another while preserving the style and timing of the original motion.

Plain English Explanation

The paper presents a technique called "WalkTheDog" that allows animating the movement of different animated characters, even if they have very different body shapes or "morphologies". The key idea is to represent the character's motion in terms of a "phase manifold" - a mathematical representation that captures the temporal dynamics and rhythms of the movement.

By aligning these phase manifolds across different characters, the method can transfer the motion from one character to another while preserving the overall style and timing of the original motion. For example, it could take the walking motion of a human character and apply it to an animal character like a dog, making the dog move in a very human-like way.

This is useful for creating animated content, as it allows animators to more easily reuse and adapt motion capture data across different characters, saving time and effort. The phase manifold representation also provides a more principled way to blend and combine different motions compared to previous techniques.

Technical Explanation

The paper introduces a new method called "WalkTheDog" for aligning the motion of characters with different body morphologies to a common reference motion. The key innovation is the use of a phase manifold representation to capture the temporal dynamics of motion.

The method first learns a phase manifold encoder that maps the raw joint angle trajectories of a reference motion into a low-dimensional phase manifold representation. It then learns a separate phase manifold encoder for each target character. By aligning the phase manifolds of the target characters to the reference, the method can transfer the motion while preserving its style and timing.

Experiments show that WalkTheDog can effectively transfer motion between characters with very different body shapes, enabling the animation of diverse characters using a common reference motion.

Critical Analysis

The paper presents a novel and promising approach for motion transfer across different character morphologies. The use of phase manifolds is an interesting way to model the temporal dynamics of motion in a principled manner.

However, the paper does not extensively evaluate the method's ability to handle very large differences in morphology, such as transferring motion between a human and a completely different animal like a dog or a bird. The experiments are mostly focused on humanoid characters, so further testing would be needed to understand the broader applicability of the approach.

Additionally, the paper does not address potential issues around the faithfulness of the transferred motion or the ability to preserve subtle stylistic nuances. While the method can preserve overall motion timing, there may be challenges in retaining more detailed motion characteristics when transferring between very different body shapes.

Further research could explore ways to better model and preserve such fine-grained motion details, as well as expanding the range of morphologies that can be effectively aligned using the phase manifold representation.

Conclusion

Overall, the "WalkTheDog" method presented in this paper represents an interesting step forward in enabling cross-morphology motion alignment and transfer. By leveraging a phase manifold representation, the approach provides a principled way to adapt motion capture data across diverse animated characters.

This has the potential to significantly streamline the animation production process, allowing animators to more easily reuse and repurpose motion data. While further research is needed to fully understand the capabilities and limitations of the approach, this work demonstrates the value of exploring novel representations for modeling the temporal dynamics of character motion.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds

Peizhuo Li, Sebastian Starke, Yuting Ye, Olga Sorkine-Hornung

We present a new approach for understanding the periodicity structure and semantics of motion datasets, independently of the morphology and skeletal structure of characters. Unlike existing methods using an overly sparse high-dimensional latent, we propose a phase manifold consisting of multiple closed curves, each corresponding to a latent amplitude. With our proposed vector quantized periodic autoencoder, we learn a shared phase manifold for multiple characters, such as a human and a dog, without any supervision. This is achieved by exploiting the discrete structure and a shallow network as bottlenecks, such that semantically similar motions are clustered into the same curve of the manifold, and the motions within the same component are aligned temporally by the phase variable. In combination with an improved motion matching framework, we demonstrate the manifold's capability of timing and semantics alignment in several applications, including motion retrieval, transfer and stylization. Code and pre-trained models for this paper are available at https://peizhuoli.github.io/walkthedog.

7/30/2024

Generative Motion Stylization of Cross-structure Characters within Canonical Motion Space

Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu

Stylized motion breathes life into characters. However, the fixed skeleton structure and style representation hinder existing data-driven motion synthesis methods from generating stylized motion for various characters. In this work, we propose a generative motion stylization pipeline, named MotionS, for synthesizing diverse and stylized motion on cross-structure characters using cross-modality style prompts. Our key insight is to embed motion style into a cross-modality latent space and perceive the cross-structure skeleton topologies, allowing for motion stylization within a canonical motion space. Specifically, the large-scale Contrastive-Language-Image-Pre-training (CLIP) model is leveraged to construct the cross-modality latent space, enabling flexible style representation within it. Additionally, two topology-encoded tokens are learned to capture the canonical and specific skeleton topologies, facilitating cross-structure topology shifting. Subsequently, the topology-shifted stylization diffusion is designed to generate motion content for the particular skeleton and stylize it in the shifted canonical motion space using multi-modality style descriptions. Through an extensive set of examples, we demonstrate the flexibility and generalizability of our pipeline across various characters and style descriptions. Qualitative and quantitative comparisons show the superiority of our pipeline over state-of-the-arts, consistently delivering high-quality stylized motion across a broad spectrum of skeletal structures.

7/24/2024

Learning Human Motion from Monocular Videos via Cross-Modal Manifold Alignment

Shuaiying Hou, Hongyu Tao, Junheng Fang, Changqing Zou, Hujun Bao, Weiwei Xu

Learning 3D human motion from 2D inputs is a fundamental task in the realms of computer vision and computer graphics. Many previous methods grapple with this inherently ambiguous task by introducing motion priors into the learning process. However, these approaches face difficulties in defining the complete configurations of such priors or training a robust model. In this paper, we present the Video-to-Motion Generator (VTM), which leverages motion priors through cross-modal latent feature space alignment between 3D human motion and 2D inputs, namely videos and 2D keypoints. To reduce the complexity of modeling motion priors, we model the motion data separately for the upper and lower body parts. Additionally, we align the motion data with a scale-invariant virtual skeleton to mitigate the interference of human skeleton variations to the motion priors. Evaluated on AIST++, the VTM showcases state-of-the-art performance in reconstructing 3D human motion from monocular videos. Notably, our VTM exhibits the capabilities for generalization to unseen view angles and in-the-wild videos.

4/16/2024

MoManifold: Learning to Measure 3D Human Motion via Decoupled Joint Acceleration Manifolds

Ziqiang Dang, Tianxing Fan, Boming Zhao, Xujie Shen, Lei Wang, Guofeng Zhang, Zhaopeng Cui

Incorporating temporal information effectively is important for accurate 3D human motion estimation and generation which have wide applications from human-computer interaction to AR/VR. In this paper, we present MoManifold, a novel human motion prior, which models plausible human motion in continuous high-dimensional motion space. Different from existing mathematical or VAE-based methods, our representation is designed based on the neural distance field, which makes human dynamics explicitly quantified to a score and thus can measure human motion plausibility. Specifically, we propose novel decoupled joint acceleration manifolds to model human dynamics from existing limited motion data. Moreover, we introduce a novel optimization method using the manifold distance as guidance, which facilitates a variety of motion-related tasks. Extensive experiments demonstrate that MoManifold outperforms existing SOTAs as a prior in several downstream tasks such as denoising real-world human mocap data, recovering human motion from partial 3D observations, mitigating jitters for SMPL-based pose estimators, and refining the results of motion in-betweening.

9/4/2024