WildGaussians: 3D Gaussian Splatting in the Wild

Read original: arXiv:2407.08447 - Published 7/12/2024 by Jonas Kulhanek, Songyou Peng, Zuzana Kukelova, Marc Pollefeys, Torsten Sattler

Overview

• The paper introduces WildGaussians, an extension of 3D Gaussian Splatting (3DGS) for novel view synthesis from in-the-wild data, where occlusions, dynamic objects, and varying illumination are common.

• Like 3DGS, the method represents a scene explicitly as a set of 3D Gaussian primitives, which are rendered in real time using GPU-accelerated splatting.

• It handles occlusions with robust DINO features and appearance changes with an appearance modeling module. The authors report that it surpasses both 3DGS and NeRF baselines on in-the-wild data while matching the rendering speed of 3DGS.

Plain English Explanation

The paper presents a new way to create 3D models from images and videos captured in the real world. Traditional 3D modeling often requires carefully controlled environments or expensive equipment. WildGaussians aims to make 3D modeling more accessible by working with regular photos and videos taken in uncontrolled settings, like a person's home or a busy city street.

The method builds on 3D Gaussian Splatting, which represents the 3D world using many simple, soft-edged shapes called Gaussians. These Gaussians can be drawn quickly on a computer's graphics card, so a reconstructed scene can be explored interactively in real time.

What WildGaussians adds is a way to cope with the messiness of real-world photos: objects that block the view, things that move between shots, and lighting that changes from one image to the next. According to the paper's abstract, it handles occluders with the help of DINO image features and handles appearance changes with a dedicated appearance modeling module, while remaining as fast to render as standard 3DGS.

Technical Explanation

WildGaussians builds on 3DGS, which represents a scene explicitly as a set of 3D Gaussian primitives optimized from multi-view images. Each Gaussian encodes a piece of the scene's geometry and appearance in a compact way.
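To make the representation concrete, here is a minimal sketch of one Gaussian primitive in the style commonly used by 3DGS implementations. The class name `Gaussian3D`, its field names, and the helper functions are illustrative assumptions, not taken from the paper or its code. The sketch shows how a 3D covariance can be built from per-axis scales and a rotation quaternion as Σ = R S Sᵀ Rᵀ.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian3D:
    """Hypothetical container for one Gaussian primitive (names are illustrative)."""
    mean: tuple      # (x, y, z) center of the Gaussian
    scale: tuple     # (sx, sy, sz) per-axis standard deviations
    rotation: tuple  # (w, x, y, z) quaternion, normalized before use
    opacity: float   # in [0, 1]
    color: tuple     # (r, g, b) base color in [0, 1]


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(g):
    """Build the 3x3 covariance Sigma = R S S^T R^T of a Gaussian."""
    R = quat_to_matrix(g.rotation)
    s2 = [s * s for s in g.scale]  # diagonal of S S^T
    return [
        [sum(R[i][k] * s2[k] * R[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
```

For an unrotated Gaussian with scales (1, 2, 3), `covariance` returns a diagonal matrix with entries 1, 4, and 9. Parameterizing by scale and rotation, rather than storing the covariance directly, keeps the matrix symmetric and positive semi-definite by construction.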

To render the 3D scene, the system uses GPU-accelerated Gaussian splatting. Each Gaussian primitive is "splatted", or projected, onto the image plane, and the overlapping projections are blended to produce a smooth, high-quality image. An appearance modeling module integrated into 3DGS lets the rendering account for appearance changes between images, such as varying illumination.
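The splatting step can be pictured for a single pixel with a small sketch. It uses the front-to-back alpha compositing that is standard in Gaussian splatting renderers. The appearance adjustment is modeled here as a simple per-channel affine transform, which is an assumption made for illustration and not necessarily the paper's exact module. All function names and the early-exit threshold are hypothetical.

```python
import math


def gaussian_falloff(dx, dy, cov2d):
    """Weight of a projected 2D Gaussian at offset (dx, dy) from its center.

    cov2d is a 2x2 covariance [[a, b], [b, c]] in pixel units.
    """
    (a, b), (_, c) = cov2d
    det = a * c - b * b
    # Quadratic form d^T Sigma^{-1} d using the closed-form 2x2 inverse.
    q = (c * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q)


def apply_appearance(color, gain, bias):
    """Assumed affine appearance adjustment per channel, clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, g * c + b)) for c, g, b in zip(color, gain, bias))


def composite(splats, background=(0.0, 0.0, 0.0)):
    """Front-to-back alpha compositing of splats covering one pixel.

    splats: list of (depth, alpha, color), where alpha is the Gaussian's
    opacity multiplied by its falloff at this pixel.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for _, alpha, color in sorted(splats, key=lambda s: s[0]):
        weight = transmittance * alpha
        for ch in range(3):
            pixel[ch] += weight * color[ch]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # illustrative early exit once nearly opaque
            break
    for ch in range(3):
        pixel[ch] += transmittance * background[ch]
    return tuple(pixel)
```

As a usage example, a red splat at depth 1 and a blue splat at depth 2, each with alpha 0.5 over a black background, composite to (0.5, 0.0, 0.25): the nearer splat contributes half its color and the farther one contributes a quarter.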

The researchers evaluate WildGaussians on in-the-wild data and report state-of-the-art results. According to the abstract, the method matches the real-time rendering speed of 3DGS while surpassing both 3DGS and NeRF baselines in handling occlusions and appearance changes, all within a simple architectural framework.

Critical Analysis

The WildGaussians approach is a promising step towards making 3D modeling and rendering more accessible, but it does have some limitations. Like 3DGS, it still depends on multi-view images of the scene, which may not be readily available in all scenarios. Additionally, its occlusion handling and appearance modeling may struggle with highly complex or intricate scenes.

Further research could explore ways to make the system more robust to incomplete or noisy input data, or to extend the Gaussian representation to capture even more detailed geometries and appearances. Integrating WildGaussians with other 3D reconstruction techniques could also lead to interesting synergies and expanded capabilities.
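The paper's abstract credits robust DINO features for handling occlusions. One simple way to picture feature-based robustness is to down-weight pixels whose features disagree between the rendered image and the captured photo, so that transient occluders contribute less to the training loss. The sketch below is an illustration under that assumption, not the paper's formulation: the function names, the hard threshold, and the use of plain lists in place of real DINO feature maps are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def robust_color_loss(rendered_rgb, target_rgb, rendered_feat, target_feat,
                      threshold=0.5):
    """Mean absolute color error over pixels whose features agree.

    Each argument is a list with one entry per pixel. Pixels whose feature
    similarity falls below `threshold` (an illustrative value) are treated
    as likely occluders and excluded from the loss.
    """
    total = 0.0
    kept = 0
    for r, t, fr, ft in zip(rendered_rgb, target_rgb, rendered_feat, target_feat):
        if cosine_similarity(fr, ft) < threshold:
            continue  # features disagree: skip this pixel
        total += sum(abs(rc - tc) for rc, tc in zip(r, t)) / len(r)
        kept += 1
    return total / kept if kept else 0.0
```

Comparing features instead of raw colors is meant to make the mask less sensitive to lighting changes, since a building photographed at noon and at dusk differs in color but should still look like the same structure to a semantic feature extractor.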

Conclusion

WildGaussians represents a step forward in making 3D modeling and rendering more accessible to a wider audience. By combining the explicit Gaussian representation of 3DGS with DINO-based occlusion handling and an appearance modeling module, the method reconstructs scenes from in-the-wild data and renders them in real time. This could have significant implications for a variety of applications, from virtual and augmented reality to 3D content creation and scene understanding.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

WildGaussians: 3D Gaussian Splatting in the Wild

Jonas Kulhanek, Songyou Peng, Zuzana Kukelova, Marc Pollefeys, Torsten Sattler

While the field of 3D scene reconstruction is dominated by NeRFs due to their photorealistic quality, 3D Gaussian Splatting (3DGS) has recently emerged, offering similar quality with real-time rendering speeds. However, both methods primarily excel with well-controlled 3D scenes, while in-the-wild data - characterized by occlusions, dynamic objects, and varying illumination - remains challenging. NeRFs can adapt to such conditions easily through per-image embedding vectors, but 3DGS struggles due to its explicit representation and lack of shared parameters. To address this, we introduce WildGaussians, a novel approach to handle occlusions and appearance changes with 3DGS. By leveraging robust DINO features and integrating an appearance modeling module within 3DGS, our method achieves state-of-the-art results. We demonstrate that WildGaussians matches the real-time rendering speed of 3DGS while surpassing both 3DGS and NeRF baselines in handling in-the-wild data, all within a simple architectural framework.

7/12/2024

Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections

Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang

Novel view synthesis from unconstrained in-the-wild images remains a meaningful but challenging task. The photometric variation and transient occluders in those unconstrained images make it difficult to reconstruct the original scene accurately. Previous approaches tackle the problem by introducing a global appearance feature in Neural Radiance Fields (NeRF). However, in the real world, the unique appearance of each tiny point in a scene is determined by its independent intrinsic material attributes and the varying environmental impacts it receives. Inspired by this fact, we propose Gaussian in the wild (GS-W), a method that uses 3D Gaussian points to reconstruct the scene and introduces separated intrinsic and dynamic appearance feature for each point, capturing the unchanged scene appearance along with dynamic variation like illumination and weather. Additionally, an adaptive sampling strategy is presented to allow each Gaussian point to focus on the local and detailed information more effectively. We also reduce the impact of transient occluders using a 2D visibility map. More experiments have demonstrated better reconstruction quality and details of GS-W compared to NeRF-based methods, with a faster rendering speed. Video results and code are available at https://eastbeanzhang.github.io/GS-W/.

7/16/2024

Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections

Jiacong Xu, Yiqun Mei, Vishal M. Patel

Photographs captured in unstructured tourist environments frequently exhibit variable appearances and transient occlusions, challenging accurate scene reconstruction and inducing artifacts in novel view synthesis. Although prior approaches have integrated the Neural Radiance Field (NeRF) with additional learnable modules to handle the dynamic appearances and eliminate transient objects, their extensive training demands and slow rendering speeds limit practical deployments. Recently, 3D Gaussian Splatting (3DGS) has emerged as a promising alternative to NeRF, offering superior training and inference efficiency along with better rendering quality. This paper presents Wild-GS, an innovative adaptation of 3DGS optimized for unconstrained photo collections while preserving its efficiency benefits. Wild-GS determines the appearance of each 3D Gaussian by their inherent material attributes, global illumination and camera properties per image, and point-level local variance of reflectance. Unlike previous methods that model reference features in image space, Wild-GS explicitly aligns the pixel appearance features to the corresponding local Gaussians by sampling the triplane extracted from the reference image. This novel design effectively transfers the high-frequency detailed appearance of the reference view to 3D space and significantly expedites the training process. Furthermore, 2D visibility maps and depth regularization are leveraged to mitigate the transient effects and constrain the geometry, respectively. Extensive experiments demonstrate that Wild-GS achieves state-of-the-art rendering performance and the highest efficiency in both training and inference among all the existing techniques.

6/18/2024

Recent Advances in 3D Gaussian Splatting

Tong Wu, Yu-Jie Yuan, Ling-Xiao Zhang, Jie Yang, Yan-Pei Cao, Ling-Qi Yan, Lin Gao

The emergence of 3D Gaussian Splatting (3DGS) has greatly accelerated the rendering speed of novel view synthesis. Unlike neural implicit representations like Neural Radiance Fields (NeRF) that represent a 3D scene with position and viewpoint-conditioned neural networks, 3D Gaussian Splatting utilizes a set of Gaussian ellipsoids to model the scene so that efficient rendering can be accomplished by rasterizing Gaussian ellipsoids into images. Apart from the fast rendering speed, the explicit representation of 3D Gaussian Splatting facilitates editing tasks like dynamic reconstruction, geometry editing, and physical simulation. Considering the rapid change and growing number of works in this field, we present a literature review of recent 3D Gaussian Splatting methods, which can be roughly classified into 3D reconstruction, 3D editing, and other downstream applications by functionality. Traditional point-based rendering methods and the rendering formulation of 3D Gaussian Splatting are also illustrated for a better understanding of this technique. This survey aims to help beginners get into this field quickly and provide experienced researchers with a comprehensive overview, which can stimulate the future development of the 3D Gaussian Splatting representation.

4/16/2024