DriftRec: Adapting diffusion models to blind JPEG restoration

arXiv:2211.06757

Published 4/4/2024 by Simon Welker, Henry N. Chapman, Timo Gerkmann

Abstract

In this work, we utilize the high-fidelity generation abilities of diffusion models to solve blind JPEG restoration at high compression levels. We propose an elegant modification of the forward stochastic differential equation of diffusion models to adapt them to this restoration task and name our method DriftRec. Comparing DriftRec against an $L_2$ regression baseline with the same network architecture and state-of-the-art techniques for JPEG restoration, we show that our approach can escape the tendency of other methods to generate blurry images, and recovers the distribution of clean images significantly more faithfully. For this, only a dataset of clean/corrupted image pairs and no knowledge about the corruption operation is required, enabling wider applicability to other restoration tasks. In contrast to other conditional and unconditional diffusion models, we utilize the idea that the distributions of clean and corrupted images are much closer to each other than each is to the usual Gaussian prior of the reverse process in diffusion models. Our approach therefore requires only low levels of added noise and needs comparatively few sampling steps even without further optimizations. We show that DriftRec naturally generalizes to realistic and difficult scenarios such as unaligned double JPEG compression and blind restoration of JPEGs found online, without having encountered such examples during training.

Overview

The researchers use diffusion models, a class of generative AI models, to solve blind JPEG image restoration at high compression levels.
They propose a method called DriftRec that modifies the forward stochastic differential equation of diffusion models to adapt them to this restoration task.
DriftRec recovers the distribution of clean images more faithfully than an $L_2$ regression baseline and state-of-the-art JPEG restoration techniques.
The approach only requires a dataset of clean and corrupted image pairs, without needing to know details about the corruption process.
DriftRec generalizes well to challenging real-world scenarios like unaligned double JPEG compression and restoration of images found online.

Plain English Explanation

JPEG image compression is a common way to reduce file sizes, but it can cause noticeable quality degradation, especially at high compression levels. The researchers tackled this challenge by using a powerful AI technique called diffusion models.

Diffusion models work by starting with pure noise and gradually transforming it into realistic-looking images. The researchers found a way to modify how these models work so they could undo the damage caused by JPEG compression, without needing to know the exact details of how the compression was done.

The key insight is that the distributions (or "shapes") of the clean and compressed images are much closer to each other than either one is to the Gaussian noise distribution that a diffusion model usually starts from. By building on this idea, the researchers' DriftRec method needs only low levels of added noise and recovers clean image quality much better than other approaches, without introducing the blurriness that often plagues JPEG restoration.

Impressively, DriftRec works well even for very challenging real-world scenarios, like when the JPEG compression is applied multiple times in an unaligned way, or when restoring images found on the internet. This demonstrates the flexibility and power of the diffusion model-based approach.

Technical Explanation

The core of the researchers' contribution is a modification of the forward (noise-adding) stochastic differential equation of diffusion models. The model is trained only on pairs of clean and corrupted images and needs no knowledge of the corruption operation, so JPEG artifacts are treated as a generic corruption rather than as noise with a known form.

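Specifically, as the name suggests, the change concerns the "drift" term of the stochastic differential equation, the deterministic part that describes where the process moves an image on average. In a standard diffusion model, the forward process ends at a Gaussian prior that carries no information about the input. DriftRec instead exploits the fact that the corrupted and clean image distributions are much closer to each other than either is to that Gaussian prior, which is why only low levels of added noise and comparatively few sampling steps are needed.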
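The role of the drift term described above can be illustrated with a toy forward process. The sketch below is not the paper's exact equation: it assumes a simple mean-reverting SDE, dx = gamma * (y - x) dt + g dw, with a constant diffusion coefficient `g`, and all names (`perturb_mean`, `perturb_std`, `sample_forward`, `gamma`, `g`) are hypothetical. For this assumed SDE the distribution of x_t is Gaussian with a closed-form mean and standard deviation, so no step-by-step simulation is needed.

```python
import math
import random


def perturb_mean(x0, y, t, gamma=1.5):
    """Mean of x_t: starts at the clean x0 (t=0) and decays toward the corrupted y."""
    w = math.exp(-gamma * t)
    return [w * a + (1.0 - w) * b for a, b in zip(x0, y)]


def perturb_std(t, gamma=1.5, g=0.1):
    """Standard deviation of x_t for the assumed constant diffusion coefficient g."""
    return g * math.sqrt((1.0 - math.exp(-2.0 * gamma * t)) / (2.0 * gamma))


def sample_forward(x0, y, t, gamma=1.5, g=0.1, rng=random):
    """Draw x_t from the closed-form Gaussian of the assumed forward process."""
    mean = perturb_mean(x0, y, t, gamma)
    std = perturb_std(t, gamma, g)
    return [m + std * rng.gauss(0.0, 1.0) for m in mean]


clean = [0.20, 0.80, 0.50, 0.10]      # toy "clean image" with 4 pixels
corrupted = [0.25, 0.75, 0.50, 0.25]  # toy "JPEG-compressed" counterpart

for t in (0.0, 0.5, 1.0):
    mean = [round(v, 3) for v in perturb_mean(clean, corrupted, t)]
    print(f"t={t}: mean={mean}, std={perturb_std(t):.4f}")
```

In this toy setting the process never travels far from the image pair: the mean moves from the clean toward the corrupted values while the noise level stays bounded by g / sqrt(2 * gamma). That is the intuition behind needing only low noise levels; the actual drift and noise schedule used by DriftRec should be taken from the paper.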

In the experiments, DriftRec is compared against an $L_2$ regression baseline with the same network architecture, as well as state-of-the-art JPEG restoration techniques. The authors report that DriftRec recovers the distribution of clean images significantly more faithfully and escapes the tendency of the other methods to generate blurry images.
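The blurriness of regression baselines has a simple statistical explanation: the estimate that minimizes expected squared error is the average of all clean images consistent with the corrupted input. The toy sketch below illustrates this with a single pixel that is equally likely to be dark or bright; the setup and all names (`plausible_clean`, `mse`) are illustrative assumptions and not taken from the paper.

```python
import random
import statistics

rng = random.Random(0)

# Toy setup: one corrupted observation is consistent with two equally likely
# clean values (for example, an edge pixel that is either dark or bright).
plausible_clean = [-1.0, 1.0]
clean_samples = [rng.choice(plausible_clean) for _ in range(10_000)]


def mse(estimate, targets):
    """Mean squared error of one constant estimate against all targets."""
    return sum((estimate - t) ** 2 for t in targets) / len(targets)


# The L2-optimal point estimate is the mean of the plausible clean values.
l2_estimate = statistics.fmean(clean_samples)

# A generative sampler instead returns one of the plausible clean values.
sampled_estimate = rng.choice(plausible_clean)

print("L2-optimal estimate:", round(l2_estimate, 3))
print("MSE of L2 estimate: ", round(mse(l2_estimate, clean_samples), 3))
print("MSE of one sample:  ", round(mse(sampled_estimate, clean_samples), 3))
```

The averaged estimate has the lower squared error, yet it is a value that never occurs in the clean data, which is the one-pixel analogue of a blurry image. A sampled value has a higher squared error but is a valid clean value, which matches the summary's point that DriftRec targets the distribution of clean images rather than a pointwise average.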

The researchers also demonstrate DriftRec's ability to generalize to challenging real-world scenarios, such as unaligned double JPEG compression and restoration of internet images, without ever seeing such examples during training. This highlights the flexibility and broad applicability of the diffusion model-based approach.
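To make "unaligned double JPEG compression" concrete: JPEG applies its transform on 8x8 pixel blocks, so an image that is compressed, cropped by a number of pixels that is not a multiple of 8, and compressed again carries two block grids that do not coincide. The sketch below computes that grid offset. The helper names are hypothetical, the crop-based scenario is an assumed way of producing the misalignment, and chroma subsampling (which enlarges the effective block size for color channels) is ignored.

```python
BLOCK = 8  # JPEG applies its DCT on 8x8 pixel blocks


def grid_shift(crop_left, crop_top, block=BLOCK):
    """Offset of the first compression's block grid inside the cropped image.

    Assumes the image was JPEG-compressed, cropped by (crop_left, crop_top)
    pixels from the top-left corner, and then JPEG-compressed again.
    """
    return (-crop_left % block, -crop_top % block)


def is_aligned(crop_left, crop_top, block=BLOCK):
    """True if both compressions share the same block grid."""
    return grid_shift(crop_left, crop_top, block) == (0, 0)


for crop in [(0, 0), (16, 8), (3, 5)]:
    print(crop, "shift:", grid_shift(*crop), "aligned:", is_aligned(*crop))
```

In the unaligned case, blocking artifacts from the first compression fall inside the blocks of the second one, so a restoration method cannot rely on artifacts sitting on a single fixed grid.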

Critical Analysis

The paper provides a thorough evaluation of DriftRec's performance, comparing it to strong baselines across a range of JPEG restoration benchmarks. The results convincingly demonstrate the advantages of the proposed modification to diffusion models.

That said, the paper does not delve into potential limitations or failure cases of the approach. For example, it would be valuable to understand how DriftRec's performance scales with the level of JPEG compression, or how it might handle more extreme types of image degradation beyond JPEG artifacts.

Additionally, while the researchers highlight DriftRec's ability to generalize, they do not explore the model's robustness to distribution shift or its sensitivity to the choice of training data. These are important considerations for real-world deployment of such restoration techniques.

Overall, the paper presents a compelling diffusion-based solution to the challenging JPEG restoration problem. Further exploration of the approach's limitations and robustness would help strengthen the understanding of its strengths and weaknesses.

Conclusion

The researchers have demonstrated the power of diffusion models for blind JPEG image restoration at high compression levels. Their DriftRec method recovers the distribution of clean images more faithfully than the compared state-of-the-art techniques and avoids the blurriness those methods tend to produce.

Importantly, DriftRec requires only a dataset of clean and corrupted image pairs, without needing to know the details of the JPEG compression process. According to the authors, this should make the approach applicable to restoration tasks beyond JPEG artifacts.

The researchers' idea of modifying the forward process of diffusion models to exploit the closeness of the corrupted and clean image distributions is a clever and effective solution. It highlights the flexibility and potential of diffusion models to tackle a variety of image processing challenges.

Overall, this work showcases the exciting advances in AI-powered image restoration and the continued progress in leveraging powerful generative models like diffusion for real-world applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Decoupled Data Consistency with Diffusion Purification for Image Restoration

Xiang Li, Soo Min Kwon, Ismail R. Alkhouri, Saiprasad Ravishankar, Qing Qu

Diffusion models have recently gained traction as a powerful class of deep generative priors, excelling in a wide range of image restoration tasks due to their exceptional ability to model data distributions. To solve image restoration problems, many existing techniques achieve data consistency by incorporating additional likelihood gradient steps into the reverse sampling process of diffusion models. However, the additional gradient steps pose a challenge for real-world practical applications as they incur a large computational overhead, thereby increasing inference time. They also present additional difficulties when using accelerated diffusion model samplers, as the number of data consistency steps is limited by the number of reverse sampling steps. In this work, we propose a novel diffusion-based image restoration solver that addresses these issues by decoupling the reverse process from the data consistency steps. Our method involves alternating between a reconstruction phase to maintain data consistency and a refinement phase that enforces the prior via diffusion purification. Our approach demonstrates versatility, making it highly adaptable for efficient problem-solving in latent space. Additionally, it reduces the necessity for numerous sampling steps through the integration of consistency models. The efficacy of our approach is validated through comprehensive experiments across various image restoration tasks, including image denoising, deblurring, inpainting, and super-resolution.

5/30/2024

eess.IV cs.AI cs.CV cs.LG eess.SP

Lossy Image Compression with Foundation Diffusion Models

Lucas Relic, Roberto Azevedo, Markus Gross, Christopher Schroers

Incorporating diffusion models in the image compression domain has the potential to produce realistic and detailed reconstructions, especially at extremely low bitrates. Previous methods focus on using diffusion models as expressive decoders robust to quantization errors in the conditioning signals, yet achieving competitive results in this manner requires costly training of the diffusion model and long inference times due to the iterative generative process. In this work we formulate the removal of quantization error as a denoising task, using diffusion to recover lost information in the transmitted image latent. Our approach allows us to perform less than 10% of the full diffusion generative process and requires no architectural changes to the diffusion model, enabling the use of foundation models as a strong prior without additional fine tuning of the backbone. Our proposed codec outperforms previous methods in quantitative realism metrics, and we verify that our reconstructions are qualitatively preferred by end users, even when other methods use twice the bitrate.

4/15/2024

eess.IV cs.CV

Blind Image Restoration via Fast Diffusion Inversion

Hamadi Chihaoui, Abdelhak Lemkhenter, Paolo Favaro

Recently, various methods have been proposed to solve Image Restoration (IR) tasks using a pre-trained diffusion model leading to state-of-the-art performance. However, most of these methods assume that the degradation operator in the IR task is completely known. Furthermore, a common characteristic among these approaches is that they alter the diffusion sampling process in order to satisfy the consistency with the degraded input image. This choice has recently been shown to be sub-optimal and to cause the restored image to deviate from the data manifold. To address these issues, we propose Blind Image Restoration via fast Diffusion inversion (BIRD) a blind IR method that jointly optimizes for the degradation model parameters and the restored image. To ensure that the restored images lie onto the data manifold, we propose a novel sampling technique on a pre-trained diffusion model. A key idea in our method is not to modify the reverse sampling, i.e., not to alter all the intermediate latents, once an initial noise is sampled. This is ultimately equivalent to casting the IR task as an optimization problem in the space of the input noise. Moreover, to mitigate the computational cost associated with inverting a fully unrolled diffusion model, we leverage the inherent capability of these models to skip ahead in the forward diffusion process using large time steps. We experimentally validate BIRD on several image restoration tasks and show that it achieves state of the art performance on all of them. Our code is available at https://github.com/hamadichihaoui/BIRD.

5/31/2024

cs.CV

Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models

Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Though diffusion models have been successfully applied to various image restoration (IR) tasks, their performance is sensitive to the choice of training datasets. Typically, diffusion models trained in specific datasets fail to recover images that have out-of-distribution degradations. To address this problem, this work leverages a capable vision-language model and a synthetic degradation pipeline to learn image restoration in the wild (wild IR). More specifically, all low-quality images are simulated with a synthetic degradation pipeline that contains multiple common degradations such as blur, resize, noise, and JPEG compression. Then we introduce robust training for a degradation-aware CLIP model to extract enriched image content features to assist high-quality image restoration. Our base diffusion model is the image restoration SDE (IR-SDE). Built upon it, we further present a posterior sampling strategy for fast noise-free image generation. We evaluate our model on both synthetic and real-world degradation datasets. Moreover, experiments on the unified image restoration task illustrate that the proposed posterior sampling improves image generation quality for various degradations.

4/16/2024

cs.CV