Exposure Completing for Temporally Consistent Neural High Dynamic Range Video Rendering

Read original: arXiv:2407.13309 - Published 7/19/2024 by Jiahao Cui, Wei Jiang, Zhan Peng, Zhiyu Pan, Zhiguo Cao

Overview

The paper proposes a method for generating temporally consistent high dynamic range (HDR) video from low dynamic range (LDR) input frames captured with alternating exposures.
The approach "completes" the exposure that is absent at each time stamp by interpolating neighboring LDR frames in time, so a full set of exposures is available when each HDR frame is fused.
This tackles the challenge of HDR video reconstruction, which is important for applications like video capture, display, and editing.

Plain English Explanation

The researchers developed a technique to create high-quality HDR videos from regular (LDR) video footage. HDR video can capture a wider range of brightness levels than standard video, resulting in more lifelike and detailed imagery.

One of the key challenges in HDR video is maintaining a consistent appearance over time. As the camera exposure changes from frame to frame, the video can appear to "flicker" or have inconsistent lighting. The researchers' method addresses this by using a neural network to estimate the missing exposure information in each LDR frame. This exposure "completion" helps the video maintain a more uniform and natural look.

According to the paper's abstract, exposure completion works by interpolating neighboring LDR frames in time to reconstruct an LDR frame for the exposure that is absent at each time stamp. The interpolated and captured LDR frames are then fused into an HDR frame, so every output frame is built from the same complete set of exposures.

By solving the temporal consistency problem, this approach makes HDR video more practical and usable for various applications, like video capture, display, and editing.

Technical Explanation

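The key component of the proposed method is a neural network that "completes" the exposure information absent from each LDR input frame. Per the abstract, it interpolates neighboring LDR frames in the time dimension to reconstruct LDR frames at the absent exposures; these are combined with the captured frame and fused into the HDR result.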
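The paper's abstract describes the completion step as interpolating neighboring LDR frames in time to recover the exposure absent at each time stamp, then fusing the interpolated and given frames. A minimal Python sketch of that idea follows. It is not the authors' implementation: the learned interpolation is replaced with a plain average of the two neighbors (reasonable only for a static scene), and the gamma value, the triangle weights, and all function names are assumptions made for illustration.

```python
import numpy as np

GAMMA = 2.2  # assumed camera response; the paper's pipeline may differ


def ldr_to_linear(ldr, exposure):
    """Map an LDR frame in [0, 1] back to linear radiance."""
    return np.power(ldr, GAMMA) / exposure


def linear_to_ldr(radiance, exposure):
    """Expose linear radiance, clip to [0, 1], and gamma-encode."""
    return np.clip(radiance * exposure, 0.0, 1.0) ** (1.0 / GAMMA)


def complete_absent_exposure(prev_ldr, next_ldr):
    """Hypothetical stand-in for the learned interpolation: average the two
    neighbors that carry the absent exposure. Ignores motion entirely."""
    return 0.5 * (prev_ldr + next_ldr)


def fuse(ldrs, exposures):
    """Weighted merge in linear radiance; weights favor mid-range pixels."""
    num = np.zeros_like(ldrs[0], dtype=np.float64)
    den = np.zeros_like(num)
    for ldr, t in zip(ldrs, exposures):
        w = 1.0 - np.abs(2.0 * ldr - 1.0) + 1e-4  # triangle weight, peak at 0.5
        num += w * ldr_to_linear(ldr, t)
        den += w
    return num / den


# Static toy scene observed as long / short / long exposures.
scene = np.array([[0.02, 0.1], [0.5, 1.5]])  # linear radiance
short_t, long_t = 0.5, 4.0
prev_long = linear_to_ldr(scene, long_t)   # frame t-1
cur_short = linear_to_ldr(scene, short_t)  # frame t (long exposure absent)
next_long = linear_to_ldr(scene, long_t)   # frame t+1

cur_long = complete_absent_exposure(prev_long, next_long)
hdr = fuse([cur_short, cur_long], [short_t, long_t])
```

In this toy case the fused result stays close to the true radiance even where the long exposure saturates, because saturated pixels receive almost no weight. With real motion, a plain average would ghost, which is why a learned interpolation is needed.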

To train this network, the researchers used a dataset of HDR videos, which they converted to LDR by applying different camera exposure settings. The network was then tasked with reconstructing the original HDR frames from the LDR inputs.
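The conversion from HDR to alternating-exposure LDR frames can be sketched as below. This is an illustrative assumption about the data preparation, not the paper's exact recipe: the gamma curve, 8-bit quantization, exposure values, and the names `hdr_to_ldr` and `make_alternating_sequence` are all hypothetical.

```python
import numpy as np

GAMMA = 2.2  # assumed camera response; real pipelines may use a calibrated curve


def hdr_to_ldr(hdr, exposure, bits=8):
    """Expose, clip, gamma-encode, and quantize a linear HDR frame."""
    exposed = np.clip(hdr * exposure, 0.0, 1.0)
    encoded = exposed ** (1.0 / GAMMA)
    levels = 2 ** bits - 1
    return np.round(encoded * levels) / levels


def make_alternating_sequence(hdr_frames, exposures=(1.0, 8.0)):
    """Return (ldr_frame, exposure) pairs, cycling through the exposures."""
    sequence = []
    for i, hdr in enumerate(hdr_frames):
        t = exposures[i % len(exposures)]
        sequence.append((hdr_to_ldr(hdr, t), t))
    return sequence


# Synthetic stand-in for an HDR video: four random linear-radiance frames.
rng = np.random.default_rng(0)
hdr_video = [rng.uniform(0.0, 2.0, size=(4, 4, 3)) for _ in range(4)]
ldr_video = make_alternating_sequence(hdr_video)
```

Each LDR frame keeps only the part of the radiance range its exposure can represent, which is the information gap the network is trained to fill.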

The network architecture consists of an encoder-decoder structure with skip connections. This allows the model to capture both global and local information, which is important for accurately predicting the missing HDR details.

During inference, the completed HDR frames are blended together using a temporal consistency module. This module considers the exposure differences between adjacent frames and applies a smoothing operation to maintain a coherent appearance throughout the video.
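The summary does not specify how the smoothing operation works. As a generic illustration only, the sketch below damps frame-to-frame brightness jumps by making each frame's log-mean luminance follow an exponential moving average. The function name, the `alpha` parameter, and the choice of a global gain are assumptions and should not be read as the paper's module.

```python
import numpy as np


def smooth_brightness(frames, alpha=0.8, eps=1e-6):
    """Rescale each HDR frame so its log-mean luminance follows an
    exponential moving average over time (alpha = weight on history)."""
    smoothed, running = [], None
    for frame in frames:
        log_mean = np.mean(np.log(frame + eps))
        if running is None:
            running = log_mean
        else:
            running = alpha * running + (1.0 - alpha) * log_mean
        gain = np.exp(running - log_mean)  # global gain toward the running level
        smoothed.append(frame * gain)
    return smoothed


# A flickering toy sequence: the same frame with alternating brightness.
rng = np.random.default_rng(0)
base = rng.uniform(0.1, 1.0, size=(4, 4))
flickering = [base * g for g in (1.0, 1.3, 1.0, 1.3)]
stabilized = smooth_brightness(flickering)

means_before = [f.mean() for f in flickering]
means_after = [f.mean() for f in stabilized]
```

A global gain like this only addresses overall brightness flicker; local inconsistencies such as ghosting need the per-pixel handling described in the paper.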

The researchers evaluated their approach on various HDR video datasets and found that it outperformed previous methods in terms of both visual quality and temporal consistency. The method was able to effectively recover details in high-brightness and low-brightness regions of the frames, resulting in visually pleasing HDR videos.
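The summary does not name the metrics used. As a hedged example of how visual quality and temporal consistency are often quantified for HDR output, the sketch below computes PSNR after mu-law tonemapping and a simple brightness flicker score. The constant `MU = 5000` is a common choice in HDR reconstruction work, and `flicker_score` is a hypothetical measure defined here for illustration rather than the paper's metric.

```python
import numpy as np

MU = 5000.0  # compression constant; a common but assumed choice


def mu_law(hdr):
    """Tonemap linear HDR values in [0, 1] with the mu-law curve."""
    return np.log1p(MU * hdr) / np.log1p(MU)


def psnr_mu(pred, target):
    """PSNR computed in the mu-law tonemapped domain."""
    mse = np.mean((mu_law(pred) - mu_law(target)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(1.0 / mse))


def flicker_score(frames):
    """Mean absolute change in tonemapped brightness between neighbors."""
    means = np.array([mu_law(f).mean() for f in frames])
    return float(np.mean(np.abs(np.diff(means))))


rng = np.random.default_rng(0)
target = rng.uniform(0.0, 1.0, size=(8, 8))
steady = [target, target, target]
flickery = [target, target * 0.8, target]

quality = psnr_mu(target * 0.9, target)
steady_flicker = flicker_score(steady)
flickery_flicker = flicker_score(flickery)
```

Higher `psnr_mu` means closer agreement with the reference, while a lower `flicker_score` indicates steadier brightness over time.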

Critical Analysis

The paper provides a robust and well-designed solution for the problem of temporally consistent HDR video reconstruction. The exposure completion network and temporal consistency module work together effectively to address the key challenges in this domain.

One potential limitation of the approach is that it relies on having access to a dataset of HDR videos, which may not always be readily available. The researchers mention that their method could potentially be extended to work with single LDR input frames, but this would likely require additional training and architectural modifications.

Additionally, the paper does not explore the computational efficiency of the proposed method, which could be an important consideration for real-time or resource-constrained applications. Further research could investigate ways to optimize the model's inference speed without significantly compromising performance.

Overall, the researchers have made a valuable contribution to the field of HDR video processing, and their work could have important implications for optimizing illuminant estimation and improving HDR image compression techniques.

Conclusion

The paper presents a novel method for generating temporally consistent HDR video from LDR input frames. By "completing" the exposure information that is absent in each frame, the approach keeps the HDR video visually cohesive and reduces flicker.

This work addresses a significant challenge in HDR video processing and has the potential to impact various applications, such as video capture, display, and editing. The researchers have demonstrated the effectiveness of their method through extensive evaluations, and their findings could inspire further advancements in the field of computational photography and image processing.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Exposure Completing for Temporally Consistent Neural High Dynamic Range Video Rendering

Jiahao Cui, Wei Jiang, Zhan Peng, Zhiyu Pan, Zhiguo Cao

High dynamic range (HDR) video rendering from low dynamic range (LDR) videos where frames are of alternate exposure encounters significant challenges, due to the exposure change and absence at each time stamp. The exposure change and absence make existing methods generate flickering HDR results. In this paper, we propose a novel paradigm to render HDR frames via completing the absent exposure information, hence the exposure information is complete and consistent. Our approach involves interpolating neighbor LDR frames in the time dimension to reconstruct LDR frames for the absent exposures. Combining the interpolated and given LDR frames, the complete set of exposure information is available at each time stamp. This benefits the fusing process for HDR results, reducing noise and ghosting artifacts therefore improving temporal consistency. Extensive experimental evaluations on standard benchmarks demonstrate that our method achieves state-of-the-art performance, highlighting the importance of absent exposure completing in HDR video rendering. The code is available at https://github.com/cuijiahao666/NECHDR.

7/19/2024

Exposure Diffusion: HDR Image Generation by Consistent LDR denoising

Mojtaba Bemana, Thomas Leimkuhler, Karol Myszkowski, Hans-Peter Seidel, Tobias Ritschel

We demonstrate generating high-dynamic range (HDR) images using the concerted action of multiple black-box, pre-trained low-dynamic range (LDR) image diffusion models. Common diffusion models are not HDR as, first, there is no sufficiently large HDR image dataset available to re-train them, and second, even if it was, re-training such models is impossible for most compute budgets. Instead, we seek inspiration from the HDR image capture literature that traditionally fuses sets of LDR images, called brackets, to produce a single HDR image. We operate multiple denoising processes to generate multiple LDR brackets that together form a valid HDR result. To this end, we introduce an exposure consistency term into the diffusion process to couple the brackets such that they agree across the exposure range they share. We demonstrate HDR versions of state-of-the-art unconditional and conditional as well as restoration-type (LDR2HDR) generative modeling.

5/24/2024

Diffusion-Promoted HDR Video Reconstruction

Yuanshen Guan, Ruikang Xu, Mingde Yao, Ruisheng Gao, Lizhi Wang, Zhiwei Xiong

High dynamic range (HDR) video reconstruction aims to generate HDR videos from low dynamic range (LDR) frames captured with alternating exposures. Most existing works solely rely on the regression-based paradigm, leading to adverse effects such as ghosting artifacts and missing details in saturated regions. In this paper, we propose a diffusion-promoted method for HDR video reconstruction, termed HDR-V-Diff, which incorporates a diffusion model to capture the HDR distribution. As such, HDR-V-Diff can reconstruct HDR videos with realistic details while alleviating ghosting artifacts. However, the direct introduction of video diffusion models would impose massive computational burden. Instead, to alleviate this burden, we first propose an HDR Latent Diffusion Model (HDR-LDM) to learn the distribution prior of single HDR frames. Specifically, HDR-LDM incorporates a tonemapping strategy to compress HDR frames into the latent space and a novel exposure embedding to aggregate the exposure information into the diffusion process. We then propose a Temporal-Consistent Alignment Module (TCAM) to learn the temporal information as a complement for HDR-LDM, which conducts coarse-to-fine feature alignment at different scales among video frames. Finally, we design a Zero-Init Cross-Attention (ZiCA) mechanism to effectively integrate the learned distribution prior and temporal information for generating HDR frames. Extensive experiments validate that HDR-V-Diff achieves state-of-the-art results on several representative datasets.

6/13/2024

Neural Augmentation Based Panoramic High Dynamic Range Stitching

Chaobing Zheng, Yilun Xu, Weihai Chen, Shiqian Wu, Zhengguo Li

Due to saturated regions of inputting low dynamic range (LDR) images and large intensity changes among the LDR images caused by different exposures, it is challenging to produce an information enriched panoramic LDR image without visual artifacts for a high dynamic range (HDR) scene through stitching multiple geometrically synchronized LDR images with different exposures and pairwise overlapping fields of views (OFOVs). Fortunately, the stitching of such images is innately a perfect scenario for the fusion of a physics-driven approach and a data-driven approach due to their OFOVs. Based on this new insight, a novel neural augmentation based panoramic HDR stitching algorithm is proposed in this paper. The physics-driven approach is built up using the OFOVs. Different exposed images of each view are initially generated by using the physics-driven approach, are then refined by a data-driven approach, and are finally used to produce panoramic LDR images with different exposures. All the panoramic LDR images with different exposures are combined together via a multi-scale exposure fusion algorithm to produce the final panoramic LDR image. Experimental results demonstrate the proposed algorithm outperforms existing panoramic stitching algorithms.

9/10/2024