PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

Read original: arXiv:2402.10636 - Published 4/3/2024 by Hyunsoo Cha, Byungjun Kim, Hanbyul Joo

Overview

Researchers present a method called PEGASUS to create personalized 3D face avatars from monocular video sources
The avatars allow selective alteration of facial attributes like hair or nose while preserving the person's identity
The approach involves two stages: generating a synthetic database of the target identity and then building a personalized generative 3D avatar

Plain English Explanation

The researchers have developed a way to create realistic 3D avatars of people's faces from regular video footage. These avatars have a special capability: you can change certain features of the face, like the hair or nose, without losing the overall identity of the person.

The process works in two steps. First, the researchers generate a library of synthetic videos of the target person, where they borrow and combine different facial features from other people's videos. This gives them a diverse set of videos to work with.

Then, they use this synthetic video database to build a personalized 3D avatar that can dynamically modify its own facial attributes. So you could make the avatar's nose smaller or give it a different hairstyle, for example, while still clearly recognizing it as that specific person.

The key benefit is that this allows you to create highly customizable 3D avatars that preserve the user's identity. This could be valuable for virtual reality, video games, or other applications where people want to represent themselves digitally in a personalized way.

Technical Explanation

The PEGASUS method consists of two main stages. First, the researchers generate a synthetic database of the target person's identity by borrowing facial attributes from diverse monocular video sources. They achieve this by swapping in different facial features (like hair, nose, etc.) from other videos and combining them with the target person's identity.
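To make the shape of this first stage concrete, the following is a minimal sketch of how such a database could be indexed: one entry per (attribute, donor) pair, all sharing the same target identity. It does not perform any video synthesis, and the names `SyntheticClip` and `build_synthetic_index` are hypothetical, introduced here only for illustration rather than taken from the paper.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SyntheticClip:
    """One planned synthetic video: the target identity with one borrowed attribute."""
    target_id: str   # the person whose identity must be preserved
    attribute: str   # which facial attribute is borrowed, e.g. "hair" or "nose"
    donor_id: str    # the monocular video the attribute is borrowed from


def build_synthetic_index(target_id, donor_ids, attributes):
    """Enumerate every (attribute, donor) combination for a single target identity.

    Hypothetical helper: the actual synthesis of each clip is out of scope here.
    """
    return [
        SyntheticClip(target_id=target_id, attribute=attr, donor_id=donor)
        for attr, donor in product(attributes, donor_ids)
    ]


clips = build_synthetic_index("target", ["donor_a", "donor_b"], ["hair", "nose"])
for clip in clips:
    print(clip)
```

Under this assumed layout, the size of the database grows as the number of attributes times the number of donor videos, while the target identity stays constant across every entry.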

This synthetic database then serves as the input to the second stage: constructing a personalized generative 3D avatar. The researchers build a generative model that can continuously modify the facial attributes of the avatar while preserving the underlying identity. In their experiments, they report that combining synthetic database generation with a generative 3D avatar was the most effective of the approaches they compared at preserving identity while achieving high realism.
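One common way to picture continuous attribute control is to keep an identity code fixed and blend only the code for a single attribute. The sketch below assumes that view; the summary does not say how PEGASUS represents attributes internally, so the per-attribute vectors, `lerp`, and `avatar_code` are hypothetical illustrations rather than the authors' design.

```python
def lerp(a, b, t):
    """Linearly interpolate between two equal-length vectors, with t in [0, 1]."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def avatar_code(identity, hair_a, hair_b, t):
    """Blend only the hair code; the identity code is copied through unchanged.

    Hypothetical representation: assumes identity and attributes live in
    separate vectors, which is an assumption made for this example.
    """
    return {"identity": list(identity), "hair": lerp(hair_a, hair_b, t)}


identity = [1.0, -1.0]
short_hair = [0.0, 2.0]
long_hair = [4.0, 6.0]

for t in (0.0, 0.5, 1.0):
    print(t, avatar_code(identity, short_hair, long_hair, t))
```

Sweeping `t` from 0 to 1 moves the hair code smoothly from one style to the other, while the identity entry is the same in every result.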

Additionally, the paper introduces a "zero-shot" approach that pursues the same generative modeling goal more efficiently by leveraging a previously constructed personalized generative model. This could make the PEGASUS framework more scalable and practical.

Critical Analysis

The PEGASUS paper presents a compelling approach for creating personalized 3D face avatars with fine-grained control over facial attributes. The synthetic database generation and personalized generative model together form a robust solution for this challenging task.

However, the paper does not delve deeply into potential limitations or ethical considerations around the technology. For example, the ability to easily manipulate a person's digital likeness raises questions about consent, privacy, and the potential for misuse. The researchers could have discussed safeguards or precautions to mitigate these concerns.

Additionally, the performance evaluation is primarily focused on objective metrics like realism and identity preservation. It would be valuable to also get feedback from end-users on the subjective quality and usefulness of the generated avatars for various applications.

Overall, the PEGASUS framework represents an impressive technical advancement, but the paper could have provided a more comprehensive discussion of the broader implications and future research directions.

Conclusion

The PEGASUS method offers a novel approach to creating personalized 3D face avatars that can selectively modify facial attributes while preserving the user's identity. By generating a synthetic database and training a personalized generative model, the researchers demonstrate an effective way to enable this type of fine-grained control and customization.

This technology could unlock new possibilities for virtual self-representation in gaming, social media, and other digital spaces. However, it also raises important ethical considerations around consent, privacy, and the potential for misuse that warrant further examination.

As the field of 3D avatar generation continues to advance, it will be crucial to thoughtfully address these broader implications to ensure the technology is developed and deployed responsibly. The PEGASUS framework represents a significant step forward, but there is still more work to be done to fully realize the benefits while mitigating the risks.

Related Papers

PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

Hyunsoo Cha, Byungjun Kim, Hanbyul Joo

We present PEGASUS, a method for constructing a personalized generative 3D face avatar from monocular video sources. Our generative 3D avatar enables disentangled controls to selectively alter the facial attributes (e.g., hair or nose) while preserving the identity. Our approach consists of two stages: synthetic database generation and constructing a personalized generative avatar. We generate a synthetic video collection of the target identity with varying facial attributes, where the videos are synthesized by borrowing the attributes from monocular videos of diverse identities. Then, we build a person-specific generative 3D avatar that can modify its attributes continuously while preserving its identity. Through extensive experiments, we demonstrate that our method of generating a synthetic database and creating a 3D generative avatar is the most effective in preserving identity while achieving high realism. Subsequently, we introduce a zero-shot approach to achieve the same goal of generative modeling more efficiently by leveraging a previously constructed personalized generative model.

4/3/2024

Gaussian Déjà-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities

Peizhi Yan, Rabab Ward, Qiang Tang, Shan Du

Recent advancements in 3D Gaussian Splatting (3DGS) have unlocked significant potential for modeling 3D head avatars, providing greater flexibility than mesh-based methods and more efficient rendering compared to NeRF-based approaches. Despite these advancements, the creation of controllable 3DGS-based head avatars remains time-intensive, often requiring tens of minutes to hours. To expedite this process, we here introduce the "Gaussian Déjà-vu" framework, which first obtains a generalized model of the head avatar and then personalizes the result. The generalized model is trained on large 2D (synthetic and real) image datasets. This model provides a well-initialized 3D Gaussian head that is further refined using a monocular video to achieve the personalized head avatar. For personalizing, we propose learnable expression-aware rectification blendmaps to correct the initial 3D Gaussians, ensuring rapid convergence without the reliance on neural networks. Experiments demonstrate that the proposed method meets its objectives. It outperforms state-of-the-art 3D Gaussian head avatars in terms of photorealistic quality as well as reduces training time consumption to at least a quarter of the existing methods, producing the avatar in minutes.

9/27/2024

My3DGen: A Scalable Personalized 3D Generative Model

Luchao Qi, Jiaye Wu, Annie N. Wang, Shengze Wang, Roni Sengupta

In recent years, generative 3D face models (e.g., EG3D) have been developed to tackle the problem of synthesizing photo-realistic faces. However, these models are often unable to capture facial features unique to each individual, highlighting the importance of personalization. Some prior works have shown promise in personalizing generative face models, but these studies primarily focus on 2D settings. Also, these methods require both fine-tuning and storing a large number of parameters for each user, posing a hindrance to achieving scalable personalization. Another challenge of personalization is the limited number of training images available for each individual, which often leads to overfitting when using full fine-tuning methods. Our proposed approach, My3DGen, generates a personalized 3D prior of an individual using as few as 50 training images. My3DGen allows for novel view synthesis, semantic editing of a given face (e.g. adding a smile), and synthesizing novel appearances, all while preserving the original person's identity. We decouple the 3D facial features into global features and personalized features by freezing the pre-trained EG3D and training additional personalized weights through low-rank decomposition. As a result, My3DGen introduces only 240K personalized parameters per individual, leading to a 127× reduction in trainable parameters compared to the 30.6M required for fine-tuning the entire parameter space. Despite this significant reduction in storage, our model preserves identity features without compromising the quality of downstream applications.

5/21/2024

PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation

Lukas Meyer, Floris Erich, Yusuke Yoshiyasu, Marc Stamminger, Noriaki Ando, Yukiyasu Domae

We introduce Physically Enhanced Gaussian Splatting Simulation System (PEGASUS) for 6DoF object pose dataset generation, a versatile dataset generator based on 3D Gaussian Splatting. Environment and object representations can be easily obtained using commodity cameras to reconstruct with Gaussian Splatting. PEGASUS allows the composition of new scenes by merging the respective underlying Gaussian Splatting point cloud of an environment with one or multiple objects. Leveraging a physics engine enables the simulation of natural object placement within a scene through interaction between meshes extracted for the objects and the environment. Consequently, an extensive amount of new scenes, static or dynamic, can be created by combining different environments and objects. By rendering scenes from various perspectives, diverse data points such as RGB images, depth maps, semantic masks, and 6DoF object poses can be extracted. Our study demonstrates that training on data generated by PEGASUS enables pose estimation networks to successfully transfer from synthetic data to real-world data. Moreover, we introduce the Ramen dataset, comprising 30 Japanese cup noodle items. This dataset includes spherical scans that capture images from both object hemispheres and the Gaussian Splatting reconstruction, making them compatible with PEGASUS.

7/16/2024