HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models

Read original: arXiv:2404.04876 - Published 4/22/2024 by Yifan Yang, Dong Liu, Shuhai Zhang, Zeshuai Deng, Zixiong Huang, Mingkui Tan

Overview

  • This paper, "HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models," presents a method for reconstructing detailed 3D geometry of clothed humans from an RGB image, using both the high-frequency and the low-frequency information of a parametric body model.
  • The key idea is that high-frequency information enhances geometric detail, while low-frequency information improves robustness to noise in the estimated parametric model.
  • The method is evaluated on the THuman2.0 and CAPE datasets, where it outperforms prior state-of-the-art methods in Chamfer distance and stays robust to challenging poses and varied clothing styles.

Plain English Explanation

The paper describes a new way to create 3D models of people wearing clothes. Current methods can struggle to accurately capture both the fine details of the clothing and the overall shape of the person's body. The "HiLo" approach aims to address this by combining two types of information:

  1. High-frequency information: the fine-scale signal, which the method uses to sharpen small details such as folds and wrinkles.
  2. Low-frequency information: the coarse, smooth signal, which captures the general shape of the body and is less sensitive to errors in the body estimate.

By using both the high-frequency (detailed) and low-frequency (overall shape) data, the method is able to produce 3D models that are more realistic and accurate than previous techniques. The researchers tested their approach on various datasets and found it outperformed existing methods in terms of reconstruction quality and robustness.
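The split between fine detail and coarse shape can be illustrated on a 1D signal. The following is a toy sketch only: the moving-average filter, the window size, and the synthetic `profile` are assumptions made for illustration, not anything taken from the paper. It shows that a signal separates into a smooth low-frequency part and a residual high-frequency part, and that the two parts add back up to the original.

```python
import math


def moving_average(signal, window):
    """Crude low-pass filter: mean of each sample's neighbourhood (truncated at the ends)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def split_frequencies(signal, window=9):
    """Return (low, high), where high is whatever the low-pass filter removed."""
    low = moving_average(signal, window)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


# Hypothetical surface profile: one slow wave ("body shape") plus a small fast ripple ("wrinkles").
n = 64
profile = [
    math.sin(2 * math.pi * i / n) + 0.05 * math.sin(2 * math.pi * 16 * i / n)
    for i in range(n)
]

low, high = split_frequencies(profile)
reconstructed = [l + h for l, h in zip(low, high)]
```

Here `low` follows the slow wave and `high` carries most of the ripple. HiLo works with 3D geometry rather than a 1D profile, so this is an analogy for the decomposition, not a description of the method.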

The key insight is that you need both the fine details and the broad shape to create compelling 3D models of clothed humans. This makes the work relevant to neighboring problems, such as reconstructing multiple interacting people in clothing and learning physics-based 3D avatars from data.

Technical Explanation

The paper introduces a new method called "HiLo" for 3D reconstruction of clothed human bodies. The key innovation is the use of both high-frequency and low-frequency information to capture detailed surface features as well as the overall body shape.

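The high-frequency (HF) component targets the fine-grained details of the clothing, such as folds and wrinkles. According to the abstract, this is done with a progressive HF signed distance function (SDF) that enhances the detailed 3D geometry; the authors argue that learning progressively alleviates the large gradients that would otherwise hinder convergence. The low-frequency (LF) component targets robustness: a spatial interaction implicit function exploits complementary spatial information from a low-resolution voxel grid of the parametric body model (for example, SMPL), so that an inaccurate body estimate does less damage to the result.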
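The paper's abstract names two geometric ingredients: a signed distance function and a low-resolution voxel grid of the parametric model. The sketch below is a toy illustration of why a coarse grid can tolerate a slightly wrong body estimate. A sphere stands in for the parametric body, and the function names, grid resolution, and 0.02 offset are all assumptions chosen for the example; none of it is HiLo's network.

```python
import math


def sphere_sdf(point, centre, radius):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, centre) - radius


def occupancy_grid(centre, radius, resolution=8, extent=2.0):
    """Coarse voxel grid over [-extent, extent]^3; a voxel is occupied if its centre is inside."""
    cell = 2 * extent / resolution
    coords = [-extent + (i + 0.5) * cell for i in range(resolution)]
    return [
        [[sphere_sdf((x, y, z), centre, radius) <= 0.0 for z in coords] for y in coords]
        for x in coords
    ]


def count_occupied(grid):
    return sum(voxel for plane in grid for row in plane for voxel in row)


# "Clean" body estimate versus one whose position is off by a small amount.
clean = occupancy_grid((0.0, 0.0, 0.0), 1.0)
noisy = occupancy_grid((0.02, 0.0, 0.0), 1.0)

# The exact SDF at a surface point does register the shift...
query = (1.0, 0.0, 0.0)
sdf_shift = abs(
    sphere_sdf(query, (0.0, 0.0, 0.0), 1.0) - sphere_sdf(query, (0.02, 0.0, 0.0), 1.0)
)

# ...while the coarse occupancy grid is identical for both estimates.
grids_match = clean == noisy
```

In this toy setup the fine SDF value moves with the perturbation while the 8×8×8 occupancy grid does not change at all, which is the intuition behind using coarse, low-frequency information for robustness.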

The final 3D reconstruction uses both components, which allows the method to produce detailed and robust clothed human models. The authors report that HiLo outperforms state-of-the-art methods by 10.43% and 9.54% in Chamfer distance on the THuman2.0 and CAPE datasets, respectively, and that it remains robust to noise in the parametric model, challenging poses, and varied clothing styles.
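Reconstruction accuracy in this setting is reported as Chamfer distance, the metric quoted in the paper's abstract. A minimal sketch of one common variant follows, for point clouds sampled from the predicted and reference surfaces. Conventions differ between papers (squared versus unsquared distances, sum versus mean of the two directions), so the exact form used by the HiLo authors is an assumption here, and the sample points are made up.

```python
import math


def nearest_distance(point, cloud):
    """Euclidean distance from one point to its nearest neighbour in a cloud."""
    return min(math.dist(point, other) for other in cloud)


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance: average of the two mean nearest-neighbour distances."""
    a_to_b = sum(nearest_distance(p, cloud_b) for p in cloud_a) / len(cloud_a)
    b_to_a = sum(nearest_distance(q, cloud_a) for q in cloud_b) / len(cloud_b)
    return 0.5 * (a_to_b + b_to_a)


# Hypothetical point samples from a predicted surface and a reference surface.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]

distance = chamfer_distance(predicted, reference)
```

Lower is better, and identical clouds score zero. This brute-force version compares every pair of points; evaluations on dense scans usually rely on a spatial index such as a k-d tree instead.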

Critical Analysis

The paper presents a compelling approach to 3D human reconstruction that addresses some important limitations of prior work. By jointly modeling high-frequency and low-frequency information, HiLo is able to capture both the fine details of clothing and the overall body shape, leading to more realistic and accurate results.

That said, the paper does not discuss certain potential limitations or areas for future work. For example, the method assumes the availability of a parametric body model like SMPL, which may not always be the case. It would be interesting to explore how HiLo could be extended to template-free settings, where no parametric body prior is available.

Additionally, the paper focuses on static 3D reconstruction and does not address the challenge of high-quality 3D human generation or animation. Extending the HiLo approach to these domains could further expand its applicability and impact.

Overall, the HiLo method represents an important step forward in 3D clothed human reconstruction, and the ideas presented in this work could inspire future research in this rapidly evolving field.

Conclusion

The "HiLo" method introduced in this paper offers a novel approach to 3D reconstruction of clothed human bodies. By combining high-frequency and low-frequency information, the technique is able to capture both the detailed surface features of clothing and the overall shape of the human form, leading to more realistic and robust 3D models.

The demonstrated improvements over existing methods, in terms of reconstruction accuracy and robustness, highlight the potential of this approach. As 3D human reconstruction continues to be an active area of research, with applications in virtual try-on, animation, and beyond, the ideas presented in this work could have a significant impact on the field.





Related Papers

HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models

Yifan Yang, Dong Liu, Shuhai Zhang, Zeshuai Deng, Zixiong Huang, Mingkui Tan

Reconstructing a 3D clothed human involves creating a detailed geometry of individuals in clothing, with applications ranging from virtual try-on and movies to games. To enable practical and widespread applications, recent advances propose to generate a clothed human from an RGB image. However, they struggle to reconstruct detailed and robust avatars simultaneously. We empirically find that the high-frequency (HF) and low-frequency (LF) information from a parametric model has the potential to enhance geometry details and improve robustness to noise, respectively. Based on this, we propose HiLo, namely clothed human reconstruction with high- and low-frequency information, which contains two components. 1) To recover detailed geometry using HF information, we propose a progressive HF Signed Distance Function to enhance the detailed 3D geometry of a clothed human. We show analytically that our progressive learning manner alleviates large gradients that hinder model convergence. 2) To achieve robust reconstruction against inaccurate estimation of the parametric model by using LF information, we propose a spatial interaction implicit function. This function effectively exploits the complementary spatial information from a low-resolution voxel grid of the parametric model. Experimental results demonstrate that HiLo outperforms the state-of-the-art methods by 10.43% and 9.54% in terms of Chamfer distance on the THuman2.0 and CAPE datasets, respectively. Additionally, HiLo demonstrates robustness to noise from the parametric model, challenging poses, and various clothing styles.

4/22/2024

HumanCoser: Layered 3D Human Generation via Semantic-Aware Diffusion Model

Yi Wang, Jian Ma, Ruizhi Shao, Qiao Feng, Yu-kun Lai, Kun Li

This paper aims to generate physically-layered 3D humans from text prompts. Existing methods either generate 3D clothed humans as a whole or support only tight and simple clothing generation, which limits their applications to virtual try-on and part-level editing. To achieve physically-layered 3D human generation with reusable and complex clothing, we propose a novel layer-wise dressed human representation based on a physically-decoupled diffusion model. Specifically, to achieve layer-wise clothing generation, we propose a dual-representation decoupling framework for generating clothing decoupled from the human body, in conjunction with an innovative multi-layer fusion volume rendering method. To match the clothing with different body shapes, we propose an SMPL-driven implicit field deformation network that enables the free transfer and reuse of clothing. Extensive experiments demonstrate that our approach not only achieves state-of-the-art layered 3D human generation with complex clothing but also supports virtual try-on and layered human animation.

8/22/2024

TELA: Text to Layer-wise 3D Clothed Human Generation

Junting Dong, Qi Fang, Zehuan Huang, Xudong Xu, Jingbo Wang, Sida Peng, Bo Dai

This paper addresses the task of 3D clothed human generation from textual descriptions. Previous works usually encode the human body and clothes as a holistic model and generate the whole model in a single-stage optimization, which makes them struggle with clothing editing and lose fine-grained control over the whole generation process. To solve this, we propose a layer-wise clothed human representation combined with a progressive optimization strategy, which produces clothing-disentangled 3D human models while providing control capacity for the generation process. The basic idea is progressively generating a minimal-clothed human body and layer-wise clothes. During clothing generation, a novel stratified compositional rendering method is proposed to fuse multi-layer human models, and a new loss function is utilized to help decouple the clothing model from the human body. The proposed method achieves high-quality disentanglement, which thereby provides an effective way for 3D garment generation. Extensive experiments demonstrate that our approach achieves state-of-the-art 3D clothed human generation while also supporting cloth editing applications such as virtual try-on. Project page: http://jtdong.com/tela_layer/

4/26/2024

Representing Animatable Avatar via Factorized Neural Fields

Chunjin Song, Zhijie Wu, Bastian Wandt, Leonid Sigal, Helge Rhodin

For reconstructing high-fidelity human 3D models from monocular videos, it is crucial to maintain consistent large-scale body shapes along with finely matched subtle wrinkles. This paper explores the observation that the per-frame rendering results can be factorized into a pose-independent component and a corresponding pose-dependent equivalent to facilitate frame consistency. Pose adaptive textures can be further improved by restricting frequency bands of these two components. In detail, pose-independent outputs are expected to be low-frequency, while high-frequency information is linked to pose-dependent factors. We achieve a coherent preservation of both coarse body contours across the entire input video and fine-grained texture features that are time variant with a dual-branch network with distinct frequency components. The first branch takes coordinates in canonical space as input, while the second branch additionally considers features outputted by the first branch and pose information of each frame. Our network integrates the information predicted by both branches and utilizes volume rendering to generate photo-realistic 3D human images. Through experiments, we demonstrate that our network surpasses the neural radiance fields (NeRF) based state-of-the-art methods in preserving high-frequency details and ensuring consistent body contours.

6/4/2024