NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation

Read original: arXiv:2403.18241 - Published 7/15/2024 by Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji

Overview

This paper presents NeuSDFusion, a spatial-aware generative model for 3D shape completion, reconstruction, and generation.
The model represents each shape as a neural signed distance field (SDF), learned using orthogonal 2D planes, to capture 3D geometry.
NeuSDFusion targets unconditional shape generation, multi-modal shape completion, single-view reconstruction, and text-to-shape synthesis.

Plain English Explanation

NeuSDFusion is a machine learning model that works with 3D shapes in several ways. It can take an incomplete 3D shape and fill in the missing parts to produce a complete model. It can reconstruct a full 3D object from a single 2D image, or synthesize a shape from a text description. It can also generate new 3D shapes from scratch.

The key idea is how the model represents shapes: a neural signed distance field (SDF), learned on a set of orthogonal 2D planes rather than as a sequence of isolated local pieces. According to the abstract, a transformer-based autoencoder enforces correspondences across those planes. This is what makes NeuSDFusion "spatially aware": it keeps track of how the parts of a shape relate to one another in space, which is meant to help with completion, reconstruction, and generation alike.

Technical Explanation

NeuSDFusion models 3D shapes with a neural SDF. A signed distance field maps each point in space to its distance from the nearest point on the object's surface, with the sign indicating whether the point lies inside or outside the object; the surface itself is the zero-level set. This gives a continuous description of the shape that can be queried at any location.
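To make the signed distance idea concrete, here is a minimal sketch using an analytic sphere rather than a learned network. The function name `sphere_sdf` and the negative-inside sign convention are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to the surface of a sphere.

    Convention assumed here: negative inside, zero on the surface,
    positive outside.
    """
    return math.dist(point, center) - radius

print(sphere_sdf((0.0, 0.0, 0.0)))  # -1.0 (center of the unit sphere, inside)
print(sphere_sdf((1.0, 0.0, 0.0)))  # 0.0 (exactly on the surface)
print(sphere_sdf((2.0, 0.0, 0.0)))  # 1.0 (one unit outside)
```

A neural SDF replaces the closed-form expression with a learned function, but it is queried the same way: a 3D point goes in and a signed distance comes out.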

The abstract describes two main components. First, a hybrid shape representation learns a continuous SDF of the shape using orthogonal 2D planes, which the authors credit with ensuring spatial coherence and reducing memory usage. Second, a transformer-based autoencoder enforces spatial correspondences across the distinct planes, so that spatial relationships are preserved in the generated shapes. Conditioning inputs such as partial shapes, single views, or text then steer the output for completion, reconstruction, and text-to-shape synthesis.
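The paper's abstract describes the shape representation as a continuous SDF learned using orthogonal 2D planes. One common way to realize such a plane-based field is sketched below: a 3D point is projected onto three axis-aligned feature planes, the sampled features are combined, and a small MLP decodes them to a signed distance. Everything specific here is an assumption for illustration, not the authors' implementation: the names `bilinear_sample` and `query_sdf`, the plane resolution and channel count, aggregation by summation, the two-layer MLP, and the random (untrained) weights.

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Bilinearly interpolate a (R, R, C) feature plane at (u, v) in [-1, 1]."""
    res = plane.shape[0]
    # Map [-1, 1] to continuous grid coordinates in [0, res - 1].
    x = (np.clip(u, -1.0, 1.0) + 1.0) * 0.5 * (res - 1)
    y = (np.clip(v, -1.0, 1.0) + 1.0) * 0.5 * (res - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, res - 1), min(y0 + 1, res - 1)
    tx, ty = x - x0, y - y0
    # Interpolate along the first axis, then along the second.
    low = (1 - tx) * plane[x0, y0] + tx * plane[x1, y0]
    high = (1 - tx) * plane[x0, y1] + tx * plane[x1, y1]
    return (1 - ty) * low + ty * high

def query_sdf(point, planes, w1, b1, w2, b2):
    """Decode a signed distance for one 3D point from three feature planes."""
    x, y, z = point
    # Project the point onto each orthogonal plane and sum the features
    # (summation is an assumed aggregation; concatenation is another option).
    feat = (bilinear_sample(planes["xy"], x, y)
            + bilinear_sample(planes["xz"], x, z)
            + bilinear_sample(planes["yz"], y, z))
    hidden = np.maximum(feat @ w1 + b1, 0.0)  # ReLU hidden layer
    return float(hidden @ w2 + b2)

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
R, C, H = 32, 8, 16
planes = {k: rng.normal(size=(R, R, C)) for k in ("xy", "xz", "yz")}
w1, b1 = rng.normal(size=(C, H)), np.zeros(H)
w2, b2 = rng.normal(size=H), 0.0

print(query_sdf((0.1, -0.3, 0.5), planes, w1, b1, w2, b2))
```

Because the weights are random, the printed value carries no geometric meaning; in a trained model the planes and the MLP would be fit so that the output approximates the true signed distance.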

With this design, the authors report that NeuSDFusion consistently outperforms state-of-the-art 3D shape generation methods across completion, reconstruction, and generation tasks. The spatial-aware SDF representation is central to these capabilities.

Critical Analysis

The paper provides a thorough evaluation of NeuSDFusion on several 3D shape tasks, demonstrating its strong performance. However, the authors acknowledge some limitations. For example, the model may struggle with highly complex or detailed 3D shapes, and the generation of entirely novel shapes is still a challenging problem.

Additionally, the computational and memory requirements of the neural SDF representation could be a concern for real-world deployment, especially for resource-constrained devices. Further research may be needed to improve the efficiency and scalability of the approach.
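To put the memory question in rough terms, the sketch below compares the number of stored values in a dense 3D feature grid against three orthogonal 2D planes, the layout the abstract credits with reducing memory usage. The resolutions and channel count are illustrative assumptions, not figures from the paper, and the comparison ignores network weights and activation memory.

```python
def dense_grid_floats(resolution, channels):
    """Number of stored values in a dense resolution^3 feature grid."""
    return resolution ** 3 * channels

def triplane_floats(resolution, channels):
    """Number of stored values in three resolution^2 feature planes."""
    return 3 * resolution ** 2 * channels

# Illustrative settings only; the same channel count is used for both layouts.
CHANNELS = 8
for res in (64, 128, 256):
    grid = dense_grid_floats(res, CHANNELS)
    tri = triplane_floats(res, CHANNELS)
    print(f"res={res}: grid={grid}, planes={tri}, ratio={grid / tri:.2f}")
```

With equal channel counts the ratio is simply resolution / 3, so the saving grows linearly with resolution. In practice plane-based models often use more channels per plane than a grid would, which narrows the gap.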

Overall, NeuSDFusion represents an interesting and promising approach to 3D shape modeling that leverages spatial awareness. With continued advancements in this area, such models could have significant impact on applications like 3D content creation, virtual/augmented reality, and robotic perception.

Conclusion

The NeuSDFusion paper presents a novel 3D shape completion, reconstruction, and generation model that uses a spatial-aware neural SDF representation. This approach allows the model to effectively handle various 3D shape tasks by exploiting the detailed geometric information captured in the SDF. While there are still some limitations to address, NeuSDFusion demonstrates the potential of spatially-aware generative models for 3D shape understanding and synthesis.

Related Papers

NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation

Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji

3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints. Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation without considering spatial consistency. As a result, these approaches exhibit limited versatility in 3D data representation and shape generation, hindering their ability to generate highly diverse 3D shapes that comply with the specified constraints. In this paper, we introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling. To ensure spatial coherence and reduce memory usage, we incorporate a hybrid shape representation technique that directly learns a continuous signed distance field representation of the 3D shape using orthogonal 2D planes. Additionally, we meticulously enforce spatial correspondences across distinct planes using a transformer-based autoencoder structure, promoting the preservation of spatial relationships in the generated 3D shapes. This yields an algorithm that consistently outperforms state-of-the-art 3D shape generation methods on various tasks, including unconditional shape generation, multi-modal shape completion, single-view reconstruction, and text-to-shape synthesis. Our project page is available at https://weizheliu.github.io/NeuSDFusion/ .

7/15/2024

GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Salvatore Esposito, Qingshan Xu, Kacper Kania, Charlie Hewitt, Octave Mariotti, Lohit Petikam, Julien Valentin, Arno Onken, Oisin Mac Aodha

We introduce a new generative approach for synthesizing 3D geometry and images from single-view collections. Most existing approaches predict volumetric density to render multi-view consistent images. By employing volumetric rendering using neural radiance fields, they inherit a key limitation: the generated geometry is noisy and unconstrained, limiting the quality and utility of the output meshes. To address this issue, we propose GeoGen, a new SDF-based 3D generative model trained in an end-to-end manner. Initially, we reinterpret the volumetric density as a Signed Distance Function (SDF). This allows us to introduce useful priors to generate valid meshes. However, those priors prevent the generative model from learning details, limiting the applicability of the method to real-world scenarios. To alleviate that problem, we make the transformation learnable and constrain the rendered depth map to be consistent with the zero-level set of the SDF. Through the lens of adversarial training, we encourage the network to produce higher fidelity details on the output meshes. For evaluation, we introduce a synthetic dataset of human avatars captured from 360-degree camera angles, to overcome the challenges presented by real-world datasets, which often lack 3D consistency and do not cover all camera angles. Our experiments on multiple datasets show that GeoGen produces visually and quantitatively better geometry than the previous generative models based on neural radiance fields.

6/17/2024

Few-Shot Unsupervised Implicit Neural Shape Representation Learning with Spatial Adversaries

Amine Ouasfi, Adnane Boukhayma

Implicit Neural Representations have gained prominence as a powerful framework for capturing complex data modalities, encompassing a wide range from 3D shapes to images and audio. Within the realm of 3D shape representation, Neural Signed Distance Functions (SDF) have demonstrated remarkable potential in faithfully encoding intricate shape geometry. However, learning SDFs from sparse 3D point clouds in the absence of ground truth supervision remains a very challenging task. While recent methods rely on smoothness priors to regularize the learning, our method introduces a regularization term that leverages adversarial samples around the shape to improve the learned SDFs. Through extensive experiments and evaluations, we illustrate the efficacy of our proposed method, highlighting its capacity to improve SDF learning with respect to baselines and the state-of-the-art using synthetic and real data.

8/28/2024

OctFusion: Octree-based Diffusion Models for 3D Shape Generation

Bojun Xiong, Si-Tong Wei, Xin-Yang Zheng, Yan-Pei Cao, Zhouhui Lian, Peng-Shuai Wang

Diffusion models have emerged as a popular method for 3D generation. However, it is still challenging for diffusion models to efficiently generate diverse and high-quality 3D shapes. In this paper, we introduce OctFusion, which can generate 3D shapes with arbitrary resolutions in 2.5 seconds on a single Nvidia 4090 GPU, and the extracted meshes are guaranteed to be continuous and manifold. The key components of OctFusion are the octree-based latent representation and the accompanying diffusion models. The representation combines the benefits of both implicit neural representations and explicit spatial octrees and is learned with an octree-based variational autoencoder. The proposed diffusion model is a unified multi-scale U-Net that enables weights and computation sharing across different octree levels and avoids the complexity of widely used cascaded diffusion schemes. We verify the effectiveness of OctFusion on the ShapeNet and Objaverse datasets and achieve state-of-the-art performances on shape generation tasks. We demonstrate that OctFusion is extendable and flexible by generating high-quality color fields for textured mesh generation and high-quality 3D shapes conditioned on text prompts, sketches, or category labels. Our code and pre-trained models are available at https://github.com/octree-nn/octfusion.

8/28/2024