Semantic Aware Diffusion Inverse Tone Mapping

Read original: arXiv:2405.15468 - Published 5/27/2024 by Abhishek Goswami, Aru Ranjan Singh, Francesco Banterle, Kurt Debattista, Thomas Bashford-Rogers

Overview

This paper presents "Semantic Aware Diffusion Inverse Tone Mapping" (SADITM), an approach for reconstructing high dynamic range (HDR) images from low dynamic range (LDR) inputs, which the paper's abstract calls Standard Dynamic Range (SDR) images.
The key idea is to use semantic information to guide diffusion-based inpainting of clipped (saturated) regions and then lift the inpainted content to HDR, so that detail lost at capture is generated rather than merely brightened.
The authors evaluate SADITM on several datasets with objective metrics and with subjective experiments, comparing it against state-of-the-art inverse tone mapping operators.

Plain English Explanation

The paper discusses a new way to create high-quality high dynamic range (HDR) images from low dynamic range (LDR) images. HDR images can capture a wider range of brightness levels compared to regular LDR images, which is important for realistic rendering and editing.

The researchers developed a method called "Semantic Aware Diffusion Inverse Tone Mapping" (SADITM) that uses information about the semantic content of the image (e.g., objects, scenes) to guide the process of converting an LDR image into an HDR one. Cameras often clip very bright areas to plain white. The method uses a diffusion model, steered by this semantic information, to fill those clipped areas with plausible detail before boosting them to HDR brightness. Earlier approaches that did not use this type of information generally could not recover detail in those regions.

The authors report superior results on objective metrics across several datasets, and subjective experiments in which SADITM matches, and in most cases outperforms, state-of-the-art inverse tone mapping techniques. This advance could have important applications in fields like photography, video, and image editing, where high-quality HDR content is increasingly in demand.

Technical Explanation

The key innovation in this paper is the incorporation of semantic information to guide the inverse tone mapping process for HDR image reconstruction. Prior methods (TGTM, Exposure Diffusion) relied solely on the low dynamic range (LDR) input image, without leveraging the semantic context.
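To make the baseline concrete, here is a minimal sketch of inverse tone mapping without any semantic guidance: linearize the well-exposed values and apply a luminance boost to the clipped ones, as the paper's abstract describes the general task. The function name, the gamma of 2.2, the clipping threshold, and the boost factor are all illustrative assumptions, not values taken from the paper or from the cited prior methods.

```python
import numpy as np

def naive_inverse_tone_map(sdr, gamma=2.2, clip_threshold=0.95, boost=4.0):
    """Linearize an SDR image and boost its clipped pixels.

    sdr: float array in [0, 1] with shape (H, W, 3).
    gamma, clip_threshold, boost: illustrative assumptions, not paper values.
    Returns (hdr, mask), where mask marks the pixels treated as clipped.
    """
    sdr = np.clip(np.asarray(sdr, dtype=np.float64), 0.0, 1.0)
    # A pixel counts as clipped if any channel reaches the threshold.
    mask = (sdr >= clip_threshold).any(axis=-1)
    # Undo an assumed display gamma to approximate linear values.
    linear = sdr ** gamma
    hdr = linear.copy()
    # Scaling raises luminance in clipped pixels but adds no detail.
    hdr[mask] *= boost
    return hdr, mask

# One mid-gray pixel and one saturated pixel.
sdr = np.array([[[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]])
hdr, mask = naive_inverse_tone_map(sdr)
```

The saturated pixel ends up brighter but stays a flat patch, which is the gap that generating detail in clipped regions is meant to close.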

According to the paper's abstract, SADITM makes two contributions. First, it uses a semantic graph to guide diffusion-based inpainting in the masked (saturated) regions of the SDR image, so that the generated detail is consistent with the scene content. Second, drawing on traditional HDR imaging and exposure bracketing, it gives a principled formulation for lifting the inpainted SDR regions to HDR that is compatible with generative inpainting methods.

During inference, the clipped regions of the input image are masked, semantic information extracted from the image guides the diffusion model that inpaints those regions, and the inpainted content is then lifted to HDR luminance. Prior inverse tone mapping operators lacked this semantic guidance and in most cases could not recover detail in clipped regions (for semantic guidance in a related task, see Multimodal Semantic-Aware Automatic Colorization).
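The paper's abstract says its HDR lifting step draws inspiration from traditional HDR imaging and bracketing. The paper's own formulation is not reproduced in this summary, so the sketch below shows only the classic background technique: merging several exposures of the same scene into one radiance map with a weighted average. It assumes a linear sensor response and known exposure times; the function name and the hat-shaped weight are illustrative choices, not the authors' method.

```python
import numpy as np

def merge_brackets(exposures, times, eps=1e-8):
    """Merge linear exposures (values in [0, 1]) into a radiance estimate.

    exposures: list of arrays of identical shape, one per exposure.
    times: exposure time for each array, in the same order.
    """
    num = 0.0
    den = 0.0
    for img, t in zip(exposures, times):
        img = np.asarray(img, dtype=np.float64)
        # Hat weight: 0 at the extremes (0 and 1), 1 at mid-range (0.5),
        # so clipped or near-black samples contribute little or nothing.
        w = 1.0 - np.abs(2.0 * img - 1.0)
        # Dividing by exposure time converts a pixel value to radiance.
        num = num + w * img / t
        den = den + w
    return num / np.maximum(den, eps)

# Simulate two exposures of a scene whose second pixel is too bright
# for the long exposure.
radiance = np.array([0.2, 3.0])
times = [1.0, 0.125]
exposures = [np.clip(radiance * t, 0.0, 1.0) for t in times]
merged = merge_brackets(exposures, times)
```

The second pixel is clipped in the long exposure, so its weight there is zero and the short exposure alone determines the merged value. A single SDR input has no such short exposure to fall back on, which is why the clipped content has to be generated instead.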

The authors evaluate SADITM against state-of-the-art inverse tone mapping operators on several datasets using objective metrics, and additionally run subjective experiments (for related work on HDR quality assessment, see Perceptual Assessment Optimization for HDR Images). They report superior performance on the objective metrics and, in the subjective experiments, results that match or outperform the compared operators, with better visual fidelity.
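This summary does not name the objective metrics used. As an illustration of what an objective HDR metric can look like, the sketch below computes PSNR after mu-law compression, a common choice in the HDR reconstruction literature. Whether the paper uses this metric is an assumption, and the function names, the normalization by the reference maximum, and mu = 5000 are illustrative.

```python
import numpy as np

def mu_law(x, mu=5000.0):
    """Compress normalized HDR values so dark and bright errors are balanced."""
    return np.log1p(mu * x) / np.log1p(mu)

def psnr_mu(pred, ref, mu=5000.0):
    """PSNR between mu-law compressed prediction and reference.

    Both images are divided by the reference maximum, an assumed
    normalization, so the reference peak maps to 1.0.
    """
    pred = np.asarray(pred, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    scale = ref.max()
    p = mu_law(np.maximum(pred, 0.0) / scale, mu)
    r = mu_law(np.maximum(ref, 0.0) / scale, mu)
    mse = np.mean((p - r) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(1.0 / mse)

ref = np.array([0.1, 1.0, 8.0])      # reference HDR values
clipped = np.minimum(ref, 1.0)       # highlights lost to clipping
close = ref * 1.05                   # small uniform error
score_clipped = psnr_mu(clipped, ref)
score_close = psnr_mu(close, ref)
```

In this toy case the clipped prediction scores lower than the slightly miscalibrated one, because the metric penalizes the missing highlight.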

Critical Analysis

The paper presents a well-designed and thorough study, with a clear focus on leveraging semantic information to improve HDR reconstruction. The authors acknowledge that their method relies on the availability of high-quality semantic information, which could be a limitation in real-world scenarios where the model that extracts it is unreliable.

Additionally, the paper does not discuss the potential computational cost or runtime of the SADITM approach, which could be an important consideration for practical applications. Further research may be needed to explore the trade-offs between reconstruction quality and computational efficiency.

While the authors demonstrate the effectiveness of SADITM on various benchmarks, it would be valuable to see its performance in more diverse real-world scenarios, such as challenging lighting conditions or complex scenes. This could provide additional insights into the method's strengths and limitations.

Conclusion

The "Semantic Aware Diffusion Inverse Tone Mapping" (SADITM) approach presented in this paper is a notable step forward in HDR image reconstruction. By leveraging semantic information, the method generates plausible detail in clipped regions and, according to the authors' results, produces HDR images from LDR inputs more accurately than previous techniques.

The successful implementation of SADITM could have far-reaching implications in various industries, such as photography, video production, and image editing, where the demand for high-quality HDR content continues to grow. The authors' work highlights the importance of incorporating semantic awareness in computational imaging tasks, and it paves the way for further research and applications in this direction.



Related Papers

Semantic Aware Diffusion Inverse Tone Mapping

Abhishek Goswami, Aru Ranjan Singh, Francesco Banterle, Kurt Debattista, Thomas Bashford-Rogers

The range of real-world scene luminance is larger than the capture capability of many digital camera sensors which leads to details being lost in captured images, most typically in bright regions. Inverse tone mapping attempts to boost these captured Standard Dynamic Range (SDR) images back to High Dynamic Range (HDR) by creating a mapping that linearizes the well exposed values from the SDR image, and provides a luminance boost to the clipped content. However, in most cases, the details in the clipped regions cannot be recovered or estimated. In this paper, we present a novel inverse tone mapping approach for mapping SDR images to HDR that generates lost details in clipped regions through a semantic-aware diffusion based inpainting approach. Our method proposes two major contributions - first, we propose to use a semantic graph to guide SDR diffusion based inpainting in masked regions in a saturated image. Second, drawing inspiration from traditional HDR imaging and bracketing methods, we propose a principled formulation to lift the SDR inpainted regions to HDR that is compatible with generative inpainting methods. Results show that our method demonstrates superior performance across different datasets on objective metrics, and subjective experiments show that the proposed method matches (and in most cases outperforms) state-of-art inverse tone mapping operators in terms of objective metrics and outperforms them for visual fidelity.

5/27/2024

HDRT: Infrared Capture for HDR Imaging

Jingchao Peng, Thomas Bashford-Rogers, Francesco Banterle, Haitao Zhao, Kurt Debattista

Capturing real world lighting is a long standing challenge in imaging and most practical methods acquire High Dynamic Range (HDR) images by either fusing multiple exposures, or boosting the dynamic range of Standard Dynamic Range (SDR) images. Multiple exposure capture is problematic as it requires longer capture times which can often lead to ghosting problems. The main alternative, inverse tone mapping is an ill-defined problem that is especially challenging as single captured exposures usually contain clipped and quantized values, and are therefore missing substantial amounts of content. To alleviate this, we propose a new approach, High Dynamic Range Thermal (HDRT), for HDR acquisition using a separate, commonly available, thermal infrared (IR) sensor. We propose a novel deep neural method (HDRTNet) which combines IR and SDR content to generate HDR images. HDRTNet learns to exploit IR features linked to the RGB image and the IR-specific parameters are subsequently used in a dual branch method that fuses features at shallow layers. This produces an HDR image that is significantly superior to that generated using naive fusion approaches. To validate our method, we have created the first HDR and thermal dataset, and performed extensive experiments comparing HDRTNet with the state-of-the-art. We show substantial quantitative and qualitative quality improvements on both over- and under-exposed images, showing that our approach is robust to capturing in multiple different lighting conditions.

6/11/2024

Sagiri: Low Dynamic Range Image Enhancement with Generative Diffusion Prior

Baiang Li, Sizhuo Ma, Yanhong Zeng, Xiaogang Xu, Youqing Fang, Zhao Zhang, Jian Wang, Kai Chen

Capturing High Dynamic Range (HDR) scenery using 8-bit cameras often suffers from over-/underexposure, loss of fine details due to low bit-depth compression, skewed color distributions, and strong noise in dark areas. Traditional LDR image enhancement methods primarily focus on color mapping, which enhances the visual representation by expanding the image's color range and adjusting the brightness. However, these approaches fail to effectively restore content in dynamic range extremes, which are regions with pixel values close to 0 or 255. To address the full scope of challenges in HDR imaging and surpass the limitations of current models, we propose a novel two-stage approach. The first stage maps the color and brightness to an appropriate range while keeping the existing details, and the second stage utilizes a diffusion prior to generate content in dynamic range extremes lost during capture. This generative refinement module can also be used as a plug-and-play module to enhance and complement existing LDR enhancement models. The proposed method markedly improves the quality and details of LDR images, demonstrating superior performance through rigorous experimental validation. The project page is at https://sagiri0208.github.io

6/14/2024

Diffusion-Promoted HDR Video Reconstruction

Yuanshen Guan, Ruikang Xu, Mingde Yao, Ruisheng Gao, Lizhi Wang, Zhiwei Xiong

High dynamic range (HDR) video reconstruction aims to generate HDR videos from low dynamic range (LDR) frames captured with alternating exposures. Most existing works solely rely on the regression-based paradigm, leading to adverse effects such as ghosting artifacts and missing details in saturated regions. In this paper, we propose a diffusion-promoted method for HDR video reconstruction, termed HDR-V-Diff, which incorporates a diffusion model to capture the HDR distribution. As such, HDR-V-Diff can reconstruct HDR videos with realistic details while alleviating ghosting artifacts. However, the direct introduction of video diffusion models would impose massive computational burden. Instead, to alleviate this burden, we first propose an HDR Latent Diffusion Model (HDR-LDM) to learn the distribution prior of single HDR frames. Specifically, HDR-LDM incorporates a tonemapping strategy to compress HDR frames into the latent space and a novel exposure embedding to aggregate the exposure information into the diffusion process. We then propose a Temporal-Consistent Alignment Module (TCAM) to learn the temporal information as a complement for HDR-LDM, which conducts coarse-to-fine feature alignment at different scales among video frames. Finally, we design a Zero-Init Cross-Attention (ZiCA) mechanism to effectively integrate the learned distribution prior and temporal information for generating HDR frames. Extensive experiments validate that HDR-V-Diff achieves state-of-the-art results on several representative datasets.

6/13/2024