DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

Read original: arXiv:2407.03757 - Published 7/8/2024 by Zheng-Peng Duan, Jiawei Zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

Overview

This paper introduces DiffRetouch, a diffusion-based image retouching method that learns from expert-retouched photos, hence retouching on the "shoulder of experts".
Rather than producing a single deterministic result, DiffRetouch models the distribution of expert-retouched styles, so it can produce diverse, visually pleasing outputs for the same input photo.
Four image attributes are adjustable within specified ranges, giving users a simple way to steer the result toward the style they prefer.

Plain English Explanation

The paper presents DiffRetouch, a new tool for improving the appearance of photos. The key idea is to use a diffusion model - a type of AI system that can generate or modify images - to help people who aren't experts in photo editing make their photos look better.

Retouching is subjective: different experts, and different users, prefer different looks for the same photo. According to the paper's abstract, most existing retouching methods are deterministic, so they tend to learn an average of the expert styles seen during training and return only one result per photo. DiffRetouch instead uses a diffusion model to learn the full range of expert-retouched styles, so it can offer varied, high-quality results without requiring editing expertise.

The paper describes how DiffRetouch works and how it was evaluated. The key innovation is that users can adjust four image attributes to pick the style they prefer from within the range of looks the model learned from experts. In that sense, non-experts retouch their photos on the shoulder of experts.

Technical Explanation

DiffRetouch is built around a diffusion model, a type of generative model that learns a data distribution and samples from it. Trained on expert-retouched photos, the model captures the distribution of fine-retouched results, which covers a variety of visually pleasing styles, rather than collapsing to a single average style.
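The paper's abstract argues that deterministic retouching models tend to learn an average style. The toy sketch below illustrates that argument under one assumption: a deterministic model trained with a squared-error loss against several expert targets for the same input. In that setting the loss-minimizing prediction is the mean of the targets, which need not match any individual expert. The numbers and the helper names (`best_constant_prediction`, `mse`) are hypothetical and do not come from the paper.

```python
import numpy as np


def best_constant_prediction(targets: np.ndarray) -> np.ndarray:
    """Return the single prediction that minimizes squared error to all targets.

    For a squared-error loss this minimizer is the per-dimension mean.
    """
    return targets.mean(axis=0)


def mse(pred: np.ndarray, targets: np.ndarray) -> float:
    """Mean squared error between one prediction and every target row."""
    return float(((targets - pred) ** 2).mean())


# Hypothetical two-dimensional "style" adjustments chosen by three experts
# for the same photo (illustrative values only).
expert_styles = np.array([[0.8, -0.4], [-0.6, 0.5], [0.1, 0.2]])

average_style = best_constant_prediction(expert_styles)
print("average style:", average_style)
print("loss of average:", mse(average_style, expert_styles))
for style in expert_styles:
    print("loss of expert style", style, "=", mse(style, expert_styles))
```

The average attains the lowest loss even though no expert chose it, which is the failure mode a distribution-modeling approach is meant to avoid.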

To use DiffRetouch, the user provides an input photo together with settings for four image attributes. Adjusting these attributes within their specified ranges selects a preferred style from the learned distribution of expert retouching, and the diffusion model then generates a retouched version of the photo in that style.

Two further components address problems the authors identify. An affine bilateral grid is introduced to handle texture distortion, and a contrastive learning scheme is introduced to handle control insensitivity, that is, outputs that respond too weakly to changes in the attribute settings.
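The paper's abstract names an affine bilateral grid as the component that handles texture distortion. The sketch below shows the general idea as known from bilateral-grid methods, not the authors' implementation: a low-resolution grid stores a 3x4 affine color matrix per cell, each pixel looks up a cell by its position and intensity, and the looked-up transform is applied to the original pixel. Because the output is an affine function of the input pixels, fine detail in the input carries through. The function names, grid layout, channel-mean guide, and nearest-neighbor lookup are all assumptions made for brevity.

```python
import numpy as np


def identity_grid(gh: int, gw: int, gd: int) -> np.ndarray:
    """Build a (gh, gw, gd, 3, 4) grid whose every cell is the identity transform."""
    grid = np.zeros((gh, gw, gd, 3, 4))
    grid[..., :, :3] = np.eye(3)  # 3x3 color matrix; last column is the offset
    return grid


def slice_and_apply(image: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Apply per-pixel affine color transforms looked up from a bilateral grid.

    image: float array of shape (H, W, 3) with values in [0, 1].
    grid:  float array of shape (GH, GW, GD, 3, 4).
    Lookup is nearest-neighbor; practical systems usually interpolate.
    """
    h, w, _ = image.shape
    gh, gw, gd = grid.shape[:3]
    # Guide map: plain channel mean as the intensity coordinate (an assumption).
    guide = image.mean(axis=2)
    ys = np.minimum(np.arange(h) * gh // h, gh - 1)      # grid row per pixel row
    xs = np.minimum(np.arange(w) * gw // w, gw - 1)      # grid column per pixel column
    zs = np.clip((guide * gd).astype(int), 0, gd - 1)    # intensity bin per pixel
    coeffs = grid[ys[:, None], xs[None, :], zs]          # (H, W, 3, 4)
    homog = np.concatenate([image, np.ones((h, w, 1))], axis=2)  # (H, W, 4)
    return np.einsum("hwij,hwj->hwi", coeffs, homog)


rng = np.random.default_rng(0)
photo = rng.random((6, 8, 3))

unchanged = slice_and_apply(photo, identity_grid(2, 2, 4))

brighter_grid = identity_grid(2, 2, 4)
brighter_grid[..., 3] += 0.1  # add a constant offset to every channel
brighter = slice_and_apply(photo, brighter_grid)
```

With an identity grid the photo passes through unchanged, and a uniform offset shifts every pixel by the same amount; a learned grid would instead vary the transform by location and intensity.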

Through extensive experiments, the researchers report that DiffRetouch outperforms prior methods in both visual appeal and sample diversity.

Critical Analysis

The paper presents a compelling approach to image retouching that uses diffusion models to capture the diversity of expert styles. The key strengths of DiffRetouch are its high-quality, diverse retouching results and its user-adjustable attribute controls.

However, the paper does acknowledge some limitations and areas for further research. For example, the system may struggle with particularly complex or challenging retouching tasks, and the researchers note that the performance of the diffusion model could be further improved.

Additionally, there are broader concerns around the use of AI-powered retouching tools, such as the potential for perpetuating unrealistic beauty standards or the ethical implications of manipulating images. While the paper does not address these issues directly, they are important considerations for the development and deployment of such systems.

Overall, the DiffRetouch system represents an exciting step forward in making professional-quality photo editing accessible to a wider audience. However, continued research and careful consideration of the societal impact of these technologies will be crucial as they continue to evolve.

Conclusion

The DiffRetouch paper introduces a diffusion-based approach to image retouching that lets non-expert users enhance their photos with high-quality results. By modeling the distribution of expert-retouched styles and exposing four adjustable image attributes, the method gives people a simple way to customize the look of their photos.

The technical innovations described in the paper, coupled with the system's strong performance, suggest that DiffRetouch represents a significant advancement in making professional-level photo editing more accessible. As AI-powered tools continue to evolve, it will be important to consider their broader societal implications, but the DiffRetouch project demonstrates the potential for these technologies to empower users and democratize creative workflows.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

Zheng-Peng Duan, Jiawei Zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

Image retouching aims to enhance the visual quality of photos. Considering the different aesthetic preferences of users, the target of retouching is subjective. However, current retouching methods mostly adopt deterministic models, which not only neglects the style diversity in the expert-retouched results and tends to learn an average style during training, but also lacks sample diversity during inference. In this paper, we propose a diffusion-based method, named DiffRetouch. Thanks to the excellent distribution modeling ability of diffusion, our method can capture the complex fine-retouched distribution covering various visual-pleasing styles in the training data. Moreover, four image attributes are made adjustable to provide a user-friendly editing mechanism. By adjusting these attributes in specified ranges, users are allowed to customize preferred styles within the learned fine-retouched distribution. Additionally, the affine bilateral grid and contrastive learning scheme are introduced to handle the problem of texture distortion and control insensitivity respectively. Extensive experiments have demonstrated the superior performance of our method on visually appealing and sample diversity. The code will be made available to the community.

7/8/2024

LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models

Hai Jiang, Ao Luo, Xiaohong Liu, Songchen Han, Shuaicheng Liu

In this paper, we propose a diffusion-based unsupervised framework that incorporates physically explainable Retinex theory with diffusion models for low-light image enhancement, named LightenDiffusion. Specifically, we present a content-transfer decomposition network that performs Retinex decomposition within the latent space instead of image space as in previous approaches, enabling the encoded features of unpaired low-light and normal-light images to be decomposed into content-rich reflectance maps and content-free illumination maps. Subsequently, the reflectance map of the low-light image and the illumination map of the normal-light image are taken as input to the diffusion model for unsupervised restoration with the guidance of the low-light feature, where a self-constrained consistency loss is further proposed to eliminate the interference of normal-light content on the restored results to improve overall visual quality. Extensive experiments on publicly available real-world benchmarks show that the proposed LightenDiffusion outperforms state-of-the-art unsupervised competitors and is comparable to supervised methods while being more generalizable to various scenes. Our code is available at https://github.com/JianghaiSCU/LightenDiffusion.

7/15/2024

Taming Diffusion Models for Image Restoration: A Review

Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Diffusion models have achieved remarkable progress in generative modelling, particularly in enhancing image quality to conform to human preferences. Recently, these models have also been applied to low-level computer vision for photo-realistic image restoration (IR) in tasks such as image denoising, deblurring, dehazing, etc. In this review paper, we introduce key constructions in diffusion models and survey contemporary techniques that make use of diffusion models in solving general IR tasks. Furthermore, we point out the main challenges and limitations of existing diffusion-based IR frameworks and provide potential directions for future work.

9/17/2024

FRRffusion: Unveiling Authenticity with Diffusion-Based Face Retouching Reversal

Fengchuang Xing, Xiaowen Shi, Yuan-Gen Wang, Chunsheng Yang

Unveiling the real appearance of retouched faces to prevent malicious users from deceptive advertising and economic fraud has been an increasing concern in the era of digital economics. This article makes the first attempt to investigate the face retouching reversal (FRR) problem. We first collect an FRR dataset, named deepFRR, which contains 50,000 StyleGAN-generated high-resolution (1024*1024) facial images and their corresponding retouched ones by a commercial online API. To our best knowledge, deepFRR is the first FRR dataset tailored for training the deep FRR models. Then, we propose a novel diffusion-based FRR approach (FRRffusion) for the FRR task. Our FRRffusion consists of a coarse-to-fine two-stage network: A diffusion-based Facial Morpho-Architectonic Restorer (FMAR) is constructed to generate the basic contours of low-resolution faces in the first stage, while a Transformer-based Hyperrealistic Facial Detail Generator (HFDG) is designed to create high-resolution facial details in the second stage. Tested on deepFRR, our FRRffusion surpasses the GP-UNIT and Stable Diffusion methods by a large margin in four widespread quantitative metrics. Especially, the de-retouched images by our FRRffusion are visually much closer to the raw face images than both the retouched face images and those restored by the GP-UNIT and Stable Diffusion methods in terms of qualitative evaluation with 85 subjects. These results sufficiently validate the efficacy of our work, bridging the recently-standing gap between the FRR and generic image restoration tasks. The dataset and code are available at https://github.com/GZHU-DVL/FRRffusion.

5/14/2024