HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting

2405.15125

Published 5/28/2024 by Yuanhao Cai, Zihao Xiao, Yixun Liang, Minghan Qin, Yulun Zhang, Xiaokang Yang, Yaoyao Liu, Alan Yuille

cs.CV

Abstract

High dynamic range (HDR) novel view synthesis (NVS) aims to create photorealistic images from novel viewpoints using HDR imaging techniques. The rendered HDR images capture a wider range of brightness levels containing more details of the scene than normal low dynamic range (LDR) images. Existing HDR NVS methods are mainly based on NeRF. They suffer from long training time and slow inference speed. In this paper, we propose a new framework, High Dynamic Range Gaussian Splatting (HDR-GS), which can efficiently render novel HDR views and reconstruct LDR images with a user input exposure time. Specifically, we design a Dual Dynamic Range (DDR) Gaussian point cloud model that uses spherical harmonics to fit HDR color and employs an MLP-based tone-mapper to render LDR color. The HDR and LDR colors are then fed into two Parallel Differentiable Rasterization (PDR) processes to reconstruct HDR and LDR views. To establish the data foundation for the research of 3D Gaussian splatting-based methods in HDR NVS, we recalibrate the camera parameters and compute the initial positions for Gaussian point clouds. Experiments demonstrate that our HDR-GS surpasses the state-of-the-art NeRF-based method by 3.84 and 1.91 dB on LDR and HDR NVS while enjoying 1000x inference speed and only requiring 6.3% training time. Code, models, and recalibrated data will be publicly available at https://github.com/caiyuanhao1998/HDR-GS

Overview

This paper introduces HDR-GS, a method for efficient high dynamic range (HDR) novel view synthesis that runs at about 1000x the inference speed of the state-of-the-art NeRF-based method.
The method builds on Gaussian splatting, a technique that renders 3D scenes quickly by representing them as a cloud of Gaussian primitives.
HDR-GS can render HDR novel views and reconstruct LDR images at a user-specified exposure time, which is relevant to applications like virtual and augmented reality.

Plain English Explanation

HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting is a new way to create 3D scenes with a wide range of brightness levels (high dynamic range or HDR) and from different viewpoints (novel views). The key innovation is a technique called Gaussian splatting that allows these HDR novel views to be generated extremely quickly - over 1000 times faster than previous methods.

Normally, creating HDR novel views is computationally intensive, because it requires accurately modeling the 3D geometry and lighting of a scene. HDR-GS reduces this cost by representing the scene with simple Gaussian "blobs" that can be rendered very efficiently. This lets the method render high-quality HDR novel views much faster than earlier NeRF-based approaches.

The speed and efficiency of HDR-GS make it particularly well-suited for applications like virtual and augmented reality, where the ability to generate immersive 3D environments in real-time is crucial. By using HDR-GS, developers can create more realistic and visually compelling virtual worlds without the heavy computational burden that has historically been a major challenge in this field.

Technical Explanation

HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting presents a method for rendering HDR novel views and reconstructing LDR images at a user-specified exposure time. Its foundation is Gaussian splatting, a technique that related work such as SRGS: Efficient 3D Scene Reconstruction via Sparse-to-Dense Gaussian Splatting and SparseGS: Real-time 360° Novel View Synthesis from Sparse Views via Gaussian Splatting has shown to be highly efficient for 3D reconstruction and novel view synthesis.

According to the abstract, the scene is modeled as a Dual Dynamic Range (DDR) Gaussian point cloud. Each Gaussian uses spherical harmonics to fit HDR color, and an MLP-based tone-mapper converts that HDR color into LDR color for a user-specified exposure time. To initialize the model, the authors recalibrate the camera parameters and compute the initial positions of the Gaussian point clouds. Because the Gaussians can be rasterized efficiently, the method reaches about 1000x the inference speed of the NeRF-based baseline.
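The abstract describes spherical harmonics fitting HDR color and an MLP-based tone-mapper producing LDR color for a given exposure time. The following is a minimal sketch of that idea, not the authors' implementation. The function names (`sh_to_hdr`, `tone_map`), the fixed weights, the clamp, and the choice to feed the tone-mapper the sum of log HDR color and log exposure time are all assumptions made for illustration; the spherical harmonics constants are the standard degree-0 and degree-1 real basis values.

```python
import math

SH_C0 = 0.28209479177387814  # degree-0 real spherical harmonics constant
SH_C1 = 0.4886025119029199   # degree-1 real spherical harmonics constant


def sh_to_hdr(coeffs, direction):
    """Evaluate degree-0/1 spherical harmonics per channel for one view direction.

    coeffs: three lists (R, G, B), each holding 4 SH coefficients.
    direction: view direction (x, y, z); it is normalized here.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    x, y, z = (d / norm for d in direction)
    hdr = []
    for c in coeffs:
        value = SH_C0 * c[0] - SH_C1 * y * c[1] + SH_C1 * z * c[2] - SH_C1 * x * c[3]
        # Assumption: clamp to a small positive value so the log below is defined.
        hdr.append(max(value, 1e-6))
    return hdr


# Hypothetical fixed weights for a tiny 1-3-1 MLP; a real tone-mapper is learned.
W1 = [1.0, 0.5, 0.25]
B1 = [0.0, 0.5, 1.0]
W2 = [0.8, 0.6, 0.4]
B2 = -1.0


def tone_map(hdr, exposure_time):
    """Map HDR color to LDR color in (0, 1) for a given exposure time."""
    ldr = []
    for value in hdr:
        # Assumption: work in the log domain, where exposure scaling is additive.
        s = math.log(value) + math.log(exposure_time)
        hidden = [max(w * s + b, 0.0) for w, b in zip(W1, B1)]  # ReLU layer
        logit = sum(w * h for w, h in zip(W2, hidden)) + B2
        ldr.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid keeps LDR in (0, 1)
    return ldr
```

With these illustrative weights, calling `tone_map` on the same HDR color with a longer exposure time yields a brighter LDR value, which matches the intuition of a user-controlled exposure.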

The HDR and LDR colors are then fed into two Parallel Differentiable Rasterization (PDR) processes, one reconstructing the HDR view and the other the LDR view. Because both processes are differentiable, the Gaussian point cloud and the tone-mapper can be optimized from image supervision, without a separate lighting reconstruction stage.
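The abstract mentions two Parallel Differentiable Rasterization (PDR) processes that take the HDR and LDR colors. One way to picture this for a single pixel is standard front-to-back alpha blending applied to both color sets. The sketch below assumes that both passes share the same per-Gaussian opacities, which the abstract does not state, and the helper names `blend_weights` and `render_pixel_dual` are hypothetical.

```python
def blend_weights(alphas):
    """Front-to-back blending weights: w_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    weights = []
    transmittance = 1.0  # fraction of light not yet absorbed by nearer Gaussians
    for alpha in alphas:
        weights.append(alpha * transmittance)
        transmittance *= 1.0 - alpha
    return weights


def render_pixel_dual(hdr_colors, ldr_colors, alphas):
    """Blend HDR and LDR colors of depth-sorted Gaussians for one pixel.

    Assumption: both passes reuse the same opacities, so the weights
    are computed once and applied to each color set.
    """
    weights = blend_weights(alphas)

    def blend(colors):
        return [sum(w * c[ch] for w, c in zip(weights, colors)) for ch in range(3)]

    return blend(hdr_colors), blend(ldr_colors)
```

Computing the weights once keeps the two outputs geometrically consistent: the HDR and LDR pixels differ only in the colors being blended, not in which Gaussians contribute.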

The authors evaluate HDR-GS against NeRF-based prior work. According to the abstract, HDR-GS surpasses the state-of-the-art NeRF-based method by 3.84 dB on LDR novel view synthesis and by 1.91 dB on HDR novel view synthesis, while running at 1000x the inference speed and requiring only 6.3% of the training time.
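To put the abstract's figures in perspective, the sketch below converts a gain in decibels into a ratio of mean squared errors and a training-time fraction into a speedup factor. It assumes the dB figures are PSNR gains measured with the same peak value for both methods; the helper names are hypothetical.

```python
import math


def psnr(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak * peak / mse)


def mse_ratio_from_db_gain(gain_db):
    """Factor by which MSE shrinks for a PSNR gain of gain_db (same peak)."""
    return 10.0 ** (gain_db / 10.0)


def speedup_from_time_fraction(fraction):
    """Speedup factor when a method needs only `fraction` of the baseline time."""
    return 1.0 / fraction


ldr_ratio = mse_ratio_from_db_gain(3.84)            # gain reported for LDR NVS
hdr_ratio = mse_ratio_from_db_gain(1.91)            # gain reported for HDR NVS
train_speedup = speedup_from_time_fraction(0.063)   # 6.3% of the training time
```

Under that assumption, a 3.84 dB gain corresponds to roughly 2.4x lower mean squared error, a 1.91 dB gain to roughly 1.55x lower, and 6.3% of the training time to a speedup of roughly 15.9x.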

Critical Analysis

The paper provides a compelling solution to the challenge of efficiently generating HDR novel views. Gaussian splatting, the core of the method, proves highly effective: the abstract reports a 1000x inference speedup over the state-of-the-art NeRF-based method.

One potential limitation of the work is that it depends on recalibrated camera parameters and precomputed initial positions for the Gaussian point clouds, so calibration errors could carry over into the reconstruction. SGD: Street View Synthesis with Gaussian Diffusion and Recent Advances in 3D Gaussian Splatting have explored alternative approaches to 3D reconstruction that may be worth investigating in the context of HDR-GS.

Additionally, the paper does not provide a detailed analysis of the limitations of the HDR-GS approach. For example, it's unclear how the method would perform in scenes with complex lighting, occlusions, or highly specular materials. Further research and evaluation in more challenging scenarios would help better understand the strengths and weaknesses of the approach.

Overall, the paper presents a highly innovative and efficient solution for HDR novel view synthesis that could have significant real-world impact, particularly in the field of virtual and augmented reality. Using Gaussian splatting to represent the scene, an approach also explored in GPS: Generalizable Pixel-wise Gaussian Splatting, is a promising direction that could be explored further in future work.

Conclusion

HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting introduces a method for rendering high-quality HDR novel views and reconstructing LDR images at a user-specified exposure time. It builds on Gaussian splatting, a highly efficient scene representation, and reports about 1000x the inference speed of the state-of-the-art NeRF-based method.

The speed and efficiency of HDR-GS make it a promising solution for applications like virtual and augmented reality, where the ability to generate realistic 3D environments in real-time is crucial. While the method shows impressive results, further research is needed to fully understand its limitations and potential areas for improvement.

Overall, this paper represents an important advancement in the field of 3D rendering and novel view synthesis, and the authors' use of Gaussian splatting is a novel and compelling approach that could have far-reaching implications for a wide range of computer vision and graphics applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

From Chaos to Clarity: 3DGS in the Dark

Zhihao Li, Yufei Wang, Alex Kot, Bihan Wen

Novel view synthesis from raw images provides superior high dynamic range (HDR) information compared to reconstructions from low dynamic range RGB images. However, the inherent noise in unprocessed raw images compromises the accuracy of 3D scene representation. Our study reveals that 3D Gaussian Splatting (3DGS) is particularly susceptible to this noise, leading to numerous elongated Gaussian shapes that overfit the noise, thereby significantly degrading reconstruction quality and reducing inference speed, especially in scenarios with limited views. To address these issues, we introduce a novel self-supervised learning framework designed to reconstruct HDR 3DGS from a limited number of noisy raw images. This framework enhances 3DGS by integrating a noise extractor and employing a noise-robust reconstruction loss that leverages a noise distribution prior. Experimental results show that our method outperforms LDR/HDR 3DGS and previous state-of-the-art (SOTA) self-supervised and supervised pre-trained models in both reconstruction quality and inference speed on the RawNeRF dataset across a broad range of training views. Code can be found at https://lizhihao6.github.io/Raw3DGS.

6/13/2024

eess.IV cs.CV

Self-Calibrating 4D Novel View Synthesis from Monocular Videos Using Gaussian Splatting

Fang Li, Hao Zhang, Narendra Ahuja

Gaussian Splatting (GS) has significantly elevated scene reconstruction efficiency and novel view synthesis (NVS) accuracy compared to Neural Radiance Fields (NeRF), particularly for dynamic scenes. However, current 4D NVS methods, whether based on GS or NeRF, primarily rely on camera parameters provided by COLMAP and even utilize sparse point clouds generated by COLMAP for initialization, which lack accuracy and are time-consuming. This sometimes results in poor dynamic scene representation, especially in scenes with large object movements or extreme camera conditions, e.g., small translations combined with large rotations. Some studies simultaneously optimize the estimation of camera parameters and scenes, supervised by additional information like depth, optical flow, etc. obtained from off-the-shelf models. Using this unverified information as ground truth can reduce robustness and accuracy, which frequently occurs for long monocular videos (e.g., with more than hundreds of frames). We propose a novel approach that learns a high-fidelity 4D GS scene representation with self-calibration of camera parameters. It includes the extraction of 2D point features that robustly represent 3D structure, and their use for subsequent joint optimization of camera parameters and 3D structure towards overall 4D scene optimization. We demonstrate the accuracy and time efficiency of our method through extensive quantitative and qualitative experimental results on several standard benchmarks. The results show significant improvements over state-of-the-art methods for 4D novel view synthesis. The source code will be released soon at https://github.com/fangli333/SC-4DGS.

6/4/2024

cs.CV

WE-GS: An In-the-wild Efficient 3D Gaussian Representation for Unconstrained Photo Collections

Yuze Wang, Junyi Wang, Yue Qi

Novel View Synthesis (NVS) from unconstrained photo collections is challenging in computer graphics. Recently, 3D Gaussian Splatting (3DGS) has shown promise for photorealistic and real-time NVS of static scenes. Building on 3DGS, we propose an efficient point-based differentiable rendering framework for scene reconstruction from photo collections. Our key innovation is a residual-based spherical harmonic coefficients transfer module that adapts 3DGS to varying lighting conditions and photometric post-processing. This lightweight module can be pre-computed and ensures efficient gradient propagation from rendered images to 3D Gaussian attributes. Additionally, we observe that the appearance encoder and the transient mask predictor, the two most critical parts of NVS from unconstrained photo collections, can be mutually beneficial. We introduce a plug-and-play lightweight spatial attention module to simultaneously predict transient occluders and latent appearance representation for each image. After training and preprocessing, our method aligns with the standard 3DGS format and rendering pipeline, facilitating seamless integration into various 3DGS applications. Extensive experiments on diverse datasets show our approach outperforms existing approaches on the rendering quality of novel view and appearance synthesis with high convergence and rendering speed.

6/5/2024

cs.CV

SRGS: Super-Resolution 3D Gaussian Splatting

Xiang Feng, Yongbo He, Yubo Wang, Yan Yang, Wen Li, Yifei Chen, Zhenzhong Kuang, Jiajun Ding, Jianping Fan, Yu Jun

Recently, 3D Gaussian Splatting (3DGS) has gained popularity as a novel explicit 3D representation. This approach relies on the representation power of Gaussian primitives to provide a high-quality rendering. However, primitives optimized at low resolution inevitably exhibit sparsity and texture deficiency, posing a challenge for achieving high-resolution novel view synthesis (HRNVS). To address this problem, we propose Super-Resolution 3D Gaussian Splatting (SRGS) to perform the optimization in a high-resolution (HR) space. The sub-pixel constraint is introduced for the increased viewpoints in HR space, exploiting the sub-pixel cross-view information of the multiple low-resolution (LR) views. The gradient accumulated from more viewpoints will facilitate the densification of primitives. Furthermore, a pre-trained 2D super-resolution model is integrated with the sub-pixel constraint, enabling these dense primitives to learn faithful texture features. In general, our method focuses on densification and texture learning to effectively enhance the representation ability of primitives. Experimentally, our method achieves high rendering quality on HRNVS only with LR inputs, outperforming state-of-the-art methods on challenging datasets such as Mip-NeRF 360 and Tanks & Temples. Related codes will be released upon acceptance.

6/19/2024

cs.CV