GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Read original: arXiv:2406.04254 - Published 6/17/2024 by Salvatore Esposito, Qingshan Xu, Kacper Kania, Charlie Hewitt, Octave Mariotti, Lohit Petikam, Julien Valentin, Arno Onken, Oisin Mac Aodha

Overview

This paper introduces GeoGen, a geometry-aware 3D generative model built on signed distance functions (SDFs).
GeoGen reinterprets the volumetric density used in neural radiance field rendering as an SDF, which constrains the generated geometry and yields cleaner output meshes.
The model is trained end-to-end on single-view image collections, with adversarial training used to encourage higher-fidelity surface detail.

Plain English Explanation

GeoGen is a new way of generating 3D shapes using a mathematical concept called a signed distance function (SDF). An SDF describes an object by reporting, for any point in space, how far that point is from the object's surface and whether it lies inside or outside. Rather than learning from a library of 3D models, GeoGen is trained on collections of single-view images and learns an SDF that explains them. Because the surface is defined explicitly as the set of points where the distance is zero, the generated shapes are better constrained than those from methods that only predict a volumetric density, which the authors describe as noisy.

This is a useful approach because it could allow high-quality 3D content to be created without manual 3D modeling or design skills. The "Geometry-Aware Reconstruction and Fusion for Refined Rendering and Generalizable 3D Reconstruction" and "Text-to-3D Using Gaussian Splatting" papers explore related geometry-aware reconstruction and generative approaches to 3D content, reflecting growing interest in this area.

Technical Explanation

The key idea in GeoGen is to represent 3D shapes with signed distance functions (SDFs). An SDF maps any point in 3D space to its distance from the nearest surface of an object, with the sign indicating whether the point lies inside or outside the object; the surface itself is the zero-level set. This compact, geometry-aware representation lets GeoGen capture detailed 3D shapes, including fine-grained features and complex topologies.
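As a minimal illustration of what an SDF computes, the sketch below evaluates the analytic SDF of a sphere. It is a stand-in for the neural SDF a model like GeoGen would learn, not the paper's implementation; the function name `sphere_sdf` and the "negative inside" sign convention are assumptions chosen for the example.

```python
import numpy as np

def sphere_sdf(points, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside.

    The sign convention is an assumption; some codebases use the opposite one.
    """
    points = np.asarray(points, dtype=float)
    offset = points - np.asarray(center, dtype=float)
    return np.linalg.norm(offset, axis=-1) - radius

query = np.array([
    [0.0, 0.0, 0.0],  # sphere center, deepest interior point
    [1.0, 0.0, 0.0],  # exactly on the surface
    [0.0, 2.0, 0.0],  # one unit outside the surface
])
distances = sphere_sdf(query)
print(distances)
```

The three query points fall inside, on, and outside the unit sphere, so their signed distances are negative, zero, and positive respectively.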

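According to the paper's abstract, GeoGen is trained end-to-end on single-view image collections rather than on a dataset of ground-truth SDFs. The authors reinterpret the volumetric density used in neural radiance field rendering as an SDF, which lets them introduce priors that favor valid meshes. Because those priors alone prevent the model from learning details, they make the transformation learnable, constrain the rendered depth map to be consistent with the zero-level set of the SDF, and use adversarial training to encourage higher-fidelity detail. The generated SDF can then be converted into a mesh by extracting its zero-level set. Related SDF-based reconstruction work includes Depth Reconstruction with Neural Signed Distance Fields in Structured Light Systems and GenS: Generalizable Neural Surface Reconstruction from Multi-View Images.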
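The paper's abstract (reproduced under Related Papers) describes reinterpreting volumetric density as an SDF and constraining the rendered depth to agree with the SDF's zero-level set. The sketch below shows one way such a check could look for a single ray. It is not GeoGen's actual formulation: the sigmoid-shaped SDF-to-density mapping is a common choice in SDF-based volume rendering and is assumed here, and the names `sdf_to_density`, `render_and_locate`, and the sharpness parameter `alpha` are hypothetical.

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Analytic SDF of an origin-centered sphere (stand-in for a learned SDF)."""
    return np.linalg.norm(points, axis=-1) - radius

def sdf_to_density(sdf, alpha):
    """Assumed mapping: density near 1/alpha inside the surface, near 0 outside.

    alpha sets the sharpness of the transition; in a learnable variant it
    would be a trained parameter.
    """
    return (1.0 / alpha) / (1.0 + np.exp(sdf / alpha))

def render_and_locate(origin, direction, alpha=0.02, near=0.0, far=4.0, n_samples=512):
    """Return (volume-rendered depth, depth of the SDF zero crossing) for one ray."""
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction
    sdf = sphere_sdf(points)

    # Standard volume-rendering quadrature along the ray.
    sigma = sdf_to_density(sdf, alpha)
    delta = np.diff(t, append=t[-1])  # last interval has zero length
    opacity = 1.0 - np.exp(-sigma * delta)
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - opacity[:-1])))
    weights = transmittance * opacity
    rendered_depth = np.sum(weights * t) / max(np.sum(weights), 1e-8)

    # First outside-to-inside sign change, refined by linear interpolation.
    crossing = np.flatnonzero((sdf[:-1] > 0) & (sdf[1:] <= 0))[0]
    s0, s1 = sdf[crossing], sdf[crossing + 1]
    surface_depth = t[crossing] + s0 / (s0 - s1) * (t[crossing + 1] - t[crossing])
    return rendered_depth, surface_depth

origin = np.array([0.0, 0.0, -3.0])
direction = np.array([0.0, 0.0, 1.0])
rendered_depth, surface_depth = render_and_locate(origin, direction)
depth_consistency = abs(rendered_depth - surface_depth)  # would act as a loss term
print(rendered_depth, surface_depth, depth_consistency)
```

With a small `alpha` the density rises sharply at the surface, so the rendered depth lands close to the zero crossing. A larger `alpha` blurs the transition, which is the kind of trade-off a learnable transformation would have to balance.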

The authors evaluate GeoGen on multiple datasets, including a new synthetic dataset of human avatars captured from 360-degree camera angles, and report geometry that is visually and quantitatively better than that of previous generative models based on neural radiance fields. They also show that GeoGen can be used for tasks like shape interpolation and conditional generation.

Critical Analysis

One limitation of GeoGen is its reliance on training data with good camera coverage: the authors introduce a synthetic dataset of human avatars captured from 360-degree camera angles because real-world datasets often lack 3D consistency and do not cover all camera angles. Further research is needed to explore how well GeoGen performs on less controlled real-world data and with other 3D representations, such as point clouds or meshes.

Additionally, while GeoGen is able to generate high-quality 3D shapes, the authors do not extensively discuss the computational efficiency or inference speed of the model. Because interactive 3D applications often need fast generation, this is an important consideration for practical use.

Further research could also explore the integration of GeoGen with other 3D modeling and rendering techniques, such as those explored in the GaussianRoom: Improving 3D Gaussian Splatting with SDF Guidance paper. Such synergies could unlock even more powerful and versatile 3D content creation capabilities.

Conclusion

GeoGen is a step forward in geometry-aware generative modeling for 3D shapes. By building on signed distance functions, the model captures detailed 3D geometry and, according to the authors, produces better meshes than previous generative models based on neural radiance fields. This approach could make 3D content creation more accessible and efficient for applications ranging from virtual environments to product design.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Salvatore Esposito, Qingshan Xu, Kacper Kania, Charlie Hewitt, Octave Mariotti, Lohit Petikam, Julien Valentin, Arno Onken, Oisin Mac Aodha

We introduce a new generative approach for synthesizing 3D geometry and images from single-view collections. Most existing approaches predict volumetric density to render multi-view consistent images. By employing volumetric rendering using neural radiance fields, they inherit a key limitation: the generated geometry is noisy and unconstrained, limiting the quality and utility of the output meshes. To address this issue, we propose GeoGen, a new SDF-based 3D generative model trained in an end-to-end manner. Initially, we reinterpret the volumetric density as a Signed Distance Function (SDF). This allows us to introduce useful priors to generate valid meshes. However, those priors prevent the generative model from learning details, limiting the applicability of the method to real-world scenarios. To alleviate that problem, we make the transformation learnable and constrain the rendered depth map to be consistent with the zero-level set of the SDF. Through the lens of adversarial training, we encourage the network to produce higher fidelity details on the output meshes. For evaluation, we introduce a synthetic dataset of human avatars captured from 360-degree camera angles, to overcome the challenges presented by real-world datasets, which often lack 3D consistency and do not cover all camera angles. Our experiments on multiple datasets show that GeoGen produces visually and quantitatively better geometry than the previous generative models based on neural radiance fields.

6/17/2024

GenS: Generalizable Neural Surface Reconstruction from Multi-View Images

Rui Peng, Xiaodong Gu, Luyang Tang, Shihe Shen, Fanqi Yu, Ronggang Wang

Combining the signed distance function (SDF) and differentiable volume rendering has emerged as a powerful paradigm for surface reconstruction from multi-view images without 3D supervision. However, current methods are impeded by requiring long-time per-scene optimizations and cannot generalize to new scenes. In this paper, we present GenS, an end-to-end generalizable neural surface reconstruction model. Unlike coordinate-based methods that train a separate network for each scene, we construct a generalized multi-scale volume to directly encode all scenes. Compared with existing solutions, our representation is more powerful, which can recover high-frequency details while maintaining global smoothness. Meanwhile, we introduce a multi-scale feature-metric consistency to impose the multi-view consistency in a more discriminative multi-scale feature space, which is robust to the failures of the photometric consistency. And the learnable feature can be self-enhanced to continuously improve the matching accuracy and mitigate aggregation ambiguity. Furthermore, we design a view contrast loss to force the model to be robust to those regions covered by few viewpoints through distilling the geometric prior from dense input to sparse input. Extensive experiments on popular benchmarks show that our model can generalize well to new scenes and outperform existing state-of-the-art methods even those employing ground-truth depth supervision. Code is available at https://github.com/prstrive/GenS.

6/5/2024

Few-Shot Unsupervised Implicit Neural Shape Representation Learning with Spatial Adversaries

Amine Ouasfi, Adnane Boukhayma

Implicit Neural Representations have gained prominence as a powerful framework for capturing complex data modalities, encompassing a wide range from 3D shapes to images and audio. Within the realm of 3D shape representation, Neural Signed Distance Functions (SDF) have demonstrated remarkable potential in faithfully encoding intricate shape geometry. However, learning SDFs from sparse 3D point clouds in the absence of ground truth supervision remains a very challenging task. While recent methods rely on smoothness priors to regularize the learning, our method introduces a regularization term that leverages adversarial samples around the shape to improve the learned SDFs. Through extensive experiments and evaluations, we illustrate the efficacy of our proposed method, highlighting its capacity to improve SDF learning with respect to baselines and the state-of-the-art using synthetic and real data.

8/28/2024

Depth Reconstruction with Neural Signed Distance Fields in Structured Light Systems

Rukun Qiao, Hiroshi Kawasaki, Hongbin Zha

We introduce a novel depth estimation technique for multi-frame structured light setups using neural implicit representations of 3D space. Our approach employs a neural signed distance field (SDF), trained through self-supervised differentiable rendering. Unlike passive vision, where joint estimation of radiance and geometry fields is necessary, we capitalize on known radiance fields from projected patterns in structured light systems. This enables isolated optimization of the geometry field, ensuring convergence and network efficacy with fixed device positioning. To enhance geometric fidelity, we incorporate an additional color loss based on object surfaces during training. Real-world experiments demonstrate our method's superiority in geometric performance for few-shot scenarios, while achieving comparable results with increased pattern availability.

5/21/2024