DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding

Read original: arXiv:2408.12150 - Published 8/23/2024 by Jooyoung Lee, Se Yoon Jeong, Munchurl Kim

Overview

This paper presents DeepHQ, a new deep learning-based image compression method that uses a hierarchical quantizer for progressive coding.
DeepHQ can achieve high-quality image compression while enabling progressive decoding, where the image quality is gradually improved as more bits are received.
The key innovations are quantization step sizes that are learned for each quantization layer, rather than handcrafted, and selective compression, in which only the essential representation components are compressed at each layer.

Plain English Explanation

The paper describes a new way to compress images using deep learning. The main idea is to use a hierarchical quantizer: a system that encodes the network's internal representation of the image in layers, starting with a coarse description and adding finer detail at each layer.

This allows the compressed image to be gradually improved as more of the encoded information is received, a process called progressive coding. For example, you might first get a low-quality version of the image, then as more data arrives, it becomes higher and higher quality.

The key innovation is the design of this hierarchical quantizer, which is trained using deep learning techniques. Instead of relying on a handcrafted hierarchy, it learns how finely to quantize at each layer, which improves compression efficiency while keeping the progressive decoding feature.

Technical Explanation

The DeepHQ framework consists of an encoder network that transforms the input image into a compact latent representation, and a hierarchical quantizer that encodes this latent representation progressively.

The hierarchical quantizer has multiple stages, where each stage quantizes the latent representation at a different level of detail. The early stages produce a coarse approximation of the image, while later stages refine this approximation with more detailed information.

This progressive encoding allows the compressed bitstream to be gradually decoded, starting from a low-quality version of the image and improving in quality as more bits are received. The authors report that DeepHQ achieves significantly higher coding efficiency than existing progressive coding approaches, with decreased decoding time and a reduced model size.
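The coarse-to-fine idea can be illustrated with a minimal sketch. This is not the authors' implementation: it quantizes a single latent value by plain residual refinement, and the names `encode_layers`, `decode_layers`, and `STEP_SIZES` are hypothetical. The step sizes are fixed by hand here, whereas DeepHQ learns them for each quantization layer. Each step is one third of the previous one, so every coarse interval splits into three narrower sub-intervals.

```python
def encode_layers(latent, step_sizes):
    """Quantize one latent value coarse-to-fine; returns one integer symbol per layer."""
    symbols, recon = [], 0.0
    for step in step_sizes:
        # Quantize only what the coarser layers have not yet captured.
        q = round((latent - recon) / step)
        symbols.append(q)
        recon += q * step
    return symbols


def decode_layers(symbols, step_sizes):
    """Reconstruct from however many layers' symbols have been received so far."""
    return sum(q * step for q, step in zip(symbols, step_sizes))


# Hypothetical, hand-picked step sizes (assumption: DeepHQ would learn these).
STEP_SIZES = [2.25, 0.75, 0.25]

symbols = encode_layers(3.3, STEP_SIZES)
for received in range(1, len(symbols) + 1):
    approx = decode_layers(symbols[:received], STEP_SIZES)
    print(f"layers={received} reconstruction={approx} error={abs(3.3 - approx):.2f}")
```

Decoding a longer prefix of the symbols gives a closer reconstruction, which is the behaviour progressive decoding relies on: after each layer the error is at most half of that layer's step size.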

Critical Analysis

The paper evaluates DeepHQ against existing NN-based progressive coding methods, reporting higher coding efficiency together with decreased decoding time and a reduced model size. How these gains carry over to encoding cost and to deployment on constrained hardware remains an important consideration for real-world applications.

Additionally, the paper does not explore the potential biases or limitations of the deep learning models used in DeepHQ. It would be valuable to investigate how the framework performs on diverse image datasets and whether there are any systematic errors or artifacts introduced by the hierarchical quantization approach.

Conclusion

The DeepHQ framework advances deep image compression by introducing a learned hierarchical quantization scheme that enables efficient progressive coding. Because a single bitstream can serve several quality levels, this work opens up new possibilities for efficient image transmission and storage, especially in bandwidth-constrained scenarios.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding

Jooyoung Lee, Se Yoon Jeong, Munchurl Kim

Unlike fixed- or variable-rate image coding, progressive image coding (PIC) aims to compress various qualities of images into a single bitstream, increasing the versatility of bitstream utilization and providing high compression efficiency compared to simulcast compression. Research on neural network (NN)-based PIC is in its early stages, mainly focusing on applying varying quantization step sizes to the transformed latent representations in a hierarchical manner. These approaches are designed to compress only the progressively added information as the quality improves, considering that a wider quantization interval for lower-quality compression includes multiple narrower sub-intervals for higher-quality compression. However, the existing methods are based on handcrafted quantization hierarchies, resulting in sub-optimal compression efficiency. In this paper, we propose an NN-based progressive coding method that firstly utilizes learned quantization step sizes via learning for each quantization layer. We also incorporate selective compression with which only the essential representation components are compressed for each quantization layer. We demonstrate that our method achieves significantly higher coding efficiency than the existing approaches with decreased decoding time and reduced model size.

8/23/2024

HPC: Hierarchical Progressive Coding Framework for Volumetric Video

Zihan Zheng, Houqiang Zhong, Qiang Hu, Xiaoyun Zhang, Li Song, Ya Zhang, Yanfeng Wang

Volumetric video based on Neural Radiance Field (NeRF) holds vast potential for various 3D applications, but its substantial data volume poses significant challenges for compression and transmission. Current NeRF compression lacks the flexibility to adjust video quality and bitrate within a single model for various network and device capacities. To address these issues, we propose HPC, a novel hierarchical progressive volumetric video coding framework achieving variable bitrate using a single model. Specifically, HPC introduces a hierarchical representation with a multi-resolution residual radiance field to reduce temporal redundancy in long-duration sequences while simultaneously generating various levels of detail. Then, we propose an end-to-end progressive learning approach with a multi-rate-distortion loss function to jointly optimize both hierarchical representation and compression. Our HPC trained only once can realize multiple compression levels, while the current methods need to train multiple fixed-bitrate models for different rate-distortion (RD) tradeoffs. Extensive experiments demonstrate that HPC achieves flexible quality levels with variable bitrate by a single model and exhibits competitive RD performance, even outperforming fixed-bitrate models across various datasets.

8/6/2024

Super-High-Fidelity Image Compression via Hierarchical-ROI and Adaptive Quantization

Jixiang Luo, Yan Wang, Hongwei Qin

Learned Image Compression (LIC) has achieved dramatic progress regarding objective and subjective metrics. MSE-based models aim to improve objective metrics while generative models are leveraged to improve visual quality measured by subjective metrics. However, they all suffer from blurring or deformation at low bit rates, especially at below $0.2bpp$. Besides, deformation on human faces and text is unacceptable for visual quality assessment, and the problem becomes more prominent on small faces and text. To solve this problem, we combine the advantage of MSE-based models and generative models by utilizing region of interest (ROI). We propose Hierarchical-ROI (H-ROI), to split images into several foreground regions and one background region to improve the reconstruction of regions containing faces, text, and complex textures. Further, we propose adaptive quantization by non-linear mapping within the channel dimension to constrain the bit rate while maintaining the visual quality. Exhaustive experiments demonstrate that our methods achieve better visual quality on small faces and text with lower bit rates, e.g., $0.7X$ bits of HiFiC and $0.5X$ bits of BPG.

5/24/2024

Rate-Distortion-Cognition Controllable Versatile Neural Image Compression

Jinming Liu, Ruoyu Feng, Yunpeng Qi, Qiuyu Chen, Zhibo Chen, Wenjun Zeng, Xin Jin

Recently, the field of Image Coding for Machines (ICM) has garnered heightened interest and significant advances thanks to the rapid progress of learning-based techniques for image compression and analysis. Previous studies often require training separate codecs to support various bitrate levels, machine tasks, and networks, thus lacking both flexibility and practicality. To address these challenges, we propose a rate-distortion-cognition controllable versatile image compression, which method allows the users to adjust the bitrate (i.e., Rate), image reconstruction quality (i.e., Distortion), and machine task accuracy (i.e., Cognition) with a single neural model, achieving ultra-controllability. Specifically, we first introduce a cognition-oriented loss in the primary compression branch to train a codec for diverse machine tasks. This branch attains variable bitrate by regulating quantization degree through the latent code channels. To further enhance the quality of the reconstructed images, we employ an auxiliary branch to supplement residual information with a scalable bitstream. Ultimately, two branches use a '$\beta x + (1 - \beta) y$' interpolation strategy to achieve a balanced cognition-distortion trade-off. Extensive experiments demonstrate that our method yields satisfactory ICM performance and flexible Rate-Distortion-Cognition controlling.

7/18/2024