Learned HDR Image Compression for Perceptually Optimal Storage and Display

Read original: arXiv:2407.13179 - Published 7/19/2024 by Peibei Cao, Haoyu Chen, Jingzhe Ma, Yu-Chieh Yuan, Zhiyong Xie, Xin Xie, Haiqing Bai, Kede Ma

Learned HDR Image Compression for Perceptually Optimal Storage and Display

Overview

This paper presents a new approach for compressing HDR (High Dynamic Range) images in a way that optimizes for perceptual quality during storage and display.
The researchers developed a learned compression model that can efficiently encode HDR images while preserving important visual details and colors.
The compressed HDR images can be decoded and displayed on a wide range of devices, from low-dynamic range (LDR) displays to high-end HDR displays, without sacrificing visual quality.

Plain English Explanation

HDR images can capture a much wider range of brightness levels than traditional LDR (Low Dynamic Range) images. This allows them to more accurately represent the real world, with details in both bright and dark areas. However, HDR images also require more storage space and processing power to display properly.

The researchers in this paper tackled the challenge of compressing HDR images in a way that maintains their visual quality, even when viewed on different display types. They created a machine learning model that can efficiently encode HDR images into a compact format. When the compressed images are decoded, they can be displayed on everything from basic LDR screens to advanced HDR displays, with the visuals optimized for each type of screen.

This approach helps bridge the gap between the increasing availability of HDR content and the wide range of display technologies currently in use. By compressing HDR images in a "perceptually optimal" way, the researchers enabled HDR visuals to be stored and viewed more conveniently, without sacrificing the image quality that makes HDR so compelling.

Technical Explanation

The paper introduces a learned HDR image compression model that optimizes for perceptual quality across a variety of display types. The model consists of an encoder that compresses the HDR image into a compact latent representation, and a decoder that reconstructs the HDR image from this latent code.

The key innovations include:

Perceptual Optimization: The model is trained to minimize a perceptual loss function that accounts for the characteristics of human visual perception, rather than just pixel-wise differences. This ensures the compressed images maintain important visual details and color fidelity.
Universal Decoding: The decoder is designed to handle a wide range of display types, from low-end LDR to high-end HDR. It can dynamically adjust the output to match the capabilities of the target display, without requiring separate models.
Efficient Encoding: The encoder leverages modern deep learning techniques to achieve high compression ratios while preserving perceptual quality, enabling efficient storage and transmission of HDR content.

The researchers evaluated their approach on a diverse dataset of HDR images, demonstrating significant quality improvements over traditional HDR compression methods, especially when viewed on displays with different dynamic range capabilities.

Critical Analysis

The paper presents a well-designed and thoroughly evaluated solution for HDR image compression. The perceptual optimization approach and universal decoding capability are notable strengths that address important practical challenges in HDR imaging.

However, the paper does not delve into the potential limitations or failure cases of the proposed model. For example, it would be helpful to understand how the model performs on specific types of HDR content, such as scenes with extremely high or low luminance levels, or images with unusual color distributions.

Additionally, the paper does not discuss the computational complexity of the encoder and decoder, which could be an important consideration for real-world deployment, especially on resource-constrained devices.

Further research could explore ways to extend the compression technique to video, or to integrate it with emerging HDR display technologies and color spaces. Investigating the model's performance on a broader range of HDR content and display types would also help validate the generalizability of the approach.

Conclusion

This paper presents a novel learned HDR image compression model that optimizes for perceptual quality across a wide range of display technologies. By leveraging modern deep learning techniques, the researchers have developed a solution that can efficiently encode HDR images while preserving important visual details and colors.

The ability to decode the compressed HDR images for optimal display on diverse hardware, from low-end LDR screens to high-end HDR displays, is a significant advancement that can help bridge the gap between the increasing availability of HDR content and the varied display capabilities of existing and future devices.

Overall, this research represents an important step forward in enabling the widespread adoption and enjoyment of HDR imaging, by making it more practical to store, transmit, and view HDR visuals without sacrificing quality or compatibility.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

Learned HDR Image Compression for Perceptually Optimal Storage and Display

Peibei Cao, Haoyu Chen, Jingzhe Ma, Yu-Chieh Yuan, Zhiyong Xie, Xin Xie, Haiqing Bai, Kede Ma

High dynamic range (HDR) capture and display have seen significant growth in popularity driven by the advancements in technology and increasing consumer demand for superior image quality. As a result, HDR image compression is crucial to fully realize the benefits of HDR imaging without suffering from large file sizes and inefficient data handling. Conventionally, this is achieved by introducing a residual/gain map as additional metadata to bridge the gap between HDR and low dynamic range (LDR) images, making the former compatible with LDR image codecs but offering suboptimal rate-distortion performance. In this work, we initiate efforts towards end-to-end optimized HDR image compression for perceptually optimal storage and display. Specifically, we learn to compress an HDR image into two bitstreams: one for generating an LDR image to ensure compatibility with legacy LDR displays, and another as side information to aid HDR image reconstruction from the output LDR image. To measure the perceptual quality of output HDR and LDR images, we use two recently proposed image distortion metrics, both validated against human perceptual data of image quality and with reference to the uncompressed HDR image. Through end-to-end optimization for rate-distortion performance, our method dramatically improves HDR and LDR image quality at all bit rates.

7/19/2024

🛠️

Perceptual Assessment and Optimization of High Dynamic Range Image Rendering

Peibei Cao, Rafal K. Mantiuk, Kede Ma

High dynamic range (HDR) rendering has the ability to faithfully reproduce the wide luminance ranges in natural scenes, but how to accurately assess the rendering quality is relatively underexplored. Existing quality models are mostly designed for low dynamic range (LDR) images, and do not align well with human perception of HDR image quality. To fill this gap, we propose a family of HDR quality metrics, in which the key step is employing a simple inverse display model to decompose an HDR image into a stack of LDR images with varying exposures. Subsequently, these decomposed images are assessed through well-established LDR quality metrics. Our HDR quality models present three distinct benefits. First, they directly inherit the recent advancements of LDR quality metrics. Second, they do not rely on human perceptual data of HDR image quality for re-calibration. Third, they facilitate the alignment and prioritization of specific luminance ranges for more accurate and detailed quality assessment. Experimental results show that our HDR quality metrics consistently outperform existing models in terms of quality assessment on four HDR image quality datasets and perceptual optimization of HDR novel view synthesis.

6/18/2024

Efficient HDR Reconstruction from Real-World Raw Images

Qirui Yang, Yihao Liu, Qihua Chen, Huanjing Yue, Kun Li, Jingyu Yang

The widespread usage of high-definition screens on edge devices stimulates a strong demand for efficient high dynamic range (HDR) algorithms. However, many existing HDR methods either deliver unsatisfactory results or consume too much computational and memory resources, hindering their application to high-resolution images (usually with more than 12 megapixels) in practice. In addition, existing HDR dataset collection methods often are labor-intensive. In this work, in a new aspect, we discover an excellent opportunity for HDR reconstructing directly from raw images and investigating novel neural network structures that benefit the deployment of mobile devices. Our key insights are threefold: (1) we develop a lightweight-efficient HDR model, RepUNet, using the structural re-parameterization technique to achieve fast and robust HDR; (2) we design a new computational raw HDR data formation pipeline and construct a real-world raw HDR dataset, RealRaw-HDR; (3) we propose a plug-and-play motion alignment loss to mitigate motion ghosting under limited bandwidth conditions. Our model contains less than 830K parameters and takes less than 3 ms to process an image of 4K resolution using one RTX 3090 GPU. While being highly efficient, our model also outperforms the state-of-the-art HDR methods in terms of PSNR, SSIM, and a color difference metric.

6/6/2024

Exposure Diffusion: HDR Image Generation by Consistent LDR denoising

Mojtaba Bemana, Thomas Leimkuhler, Karol Myszkowski, Hans-Peter Seidel, Tobias Ritschel

We demonstrate generating high-dynamic range (HDR) images using the concerted action of multiple black-box, pre-trained low-dynamic range (LDR) image diffusion models. Common diffusion models are not HDR as, first, there is no sufficiently large HDR image dataset available to re-train them, and second, even if it was, re-training such models is impossible for most compute budgets. Instead, we seek inspiration from the HDR image capture literature that traditionally fuses sets of LDR images, called brackets, to produce a single HDR image. We operate multiple denoising processes to generate multiple LDR brackets that together form a valid HDR result. To this end, we introduce an exposure consistency term into the diffusion process to couple the brackets such that they agree across the exposure range they share. We demonstrate HDR versions of state-of-the-art unconditional and conditional as well as restoration-type (LDR2HDR) generative modeling.

5/24/2024