Removing Reflections from RAW Photos

arXiv: 2404.14414

Published 4/24/2024 by Eric Kee, Adam Pikielny, Kevin Blackburn-Matzen, Marc Levoy

Abstract

We describe a system to remove real-world reflections from images for consumer photography. Our system operates on linear (RAW) photos, with the (optional) addition of a contextual photo looking in the opposite direction, e.g., using the selfie camera on a mobile device, which helps disambiguate what should be considered the reflection. The system is trained using synthetic mixtures of real-world RAW images, which are combined using a reflection simulation that is photometrically and geometrically accurate. Our system consists of a base model that accepts the captured photo and optional contextual photo as input, and runs at 256p, followed by an up-sampling model that transforms output 256p images to full resolution. The system can produce images for review at 1K in 6.5 seconds on an iPhone 14 Pro. We test on RAW photos that were captured in the field and embody typical consumer photographs.

Overview

This paper presents a system for removing reflections from linear (RAW) photos, a common problem in consumer photography.
The system is trained on synthetic mixtures of real-world RAW images, combined using a reflection simulation that is photometrically and geometrically accurate.
An optional contextual photo, taken in the opposite direction (for example, with a phone's selfie camera), helps the system decide what counts as the reflection.
The system is tested on RAW photos captured in the field, where it is reported to outperform prior reflection removal methods.

Plain English Explanation

Reflections in photographs can be a nuisance, obscuring the main subject and reducing the overall quality of the image. This paper tackles the problem by developing a way to automatically remove reflections from RAW photos, which hold the linear, minimally processed sensor data captured by digital cameras.

The key insight is that realistic training data can be synthesized rather than collected. The system mixes real RAW photos, with one photo standing in for the scene and another for the reflection, using a simulation of how reflections form. Because the clean scene is known for every synthetic mixture, a neural network can learn to recover it, and the training set can be made large and diverse even when no comparable set of real photos exists.
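
Working on linear data matters because light from the scene and light from the reflection add in linear intensity, not in display-encoded values. The sketch below illustrates this with the standard sRGB transfer function. The function names and the scalar `reflectance` weight are assumptions made for illustration; the weight is a crude stand-in for the paper's reflection simulation, not a description of it.

```python
from typing import List


def srgb_encode(linear: float) -> float:
    """Standard sRGB transfer function: linear light -> display-encoded value."""
    linear = min(max(linear, 0.0), 1.0)
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055


def srgb_decode(encoded: float) -> float:
    """Inverse sRGB transfer function: display-encoded value -> linear light."""
    encoded = min(max(encoded, 0.0), 1.0)
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4


def mix_linear(scene: List[float], reflection: List[float],
               reflectance: float = 0.1) -> List[float]:
    """Additive mix in linear light, clipped at sensor saturation (1.0).

    `reflectance` is a hypothetical scalar weight, assumed for this sketch.
    """
    return [min(s + reflectance * r, 1.0) for s, r in zip(scene, reflection)]


# One pixel of scene and one pixel of reflection, both in linear light.
scene_px, reflection_px = 0.2, 0.5
correct = srgb_encode(mix_linear([scene_px], [reflection_px])[0])

# Adding display-encoded values instead gives a visibly different pixel.
naive = srgb_encode(scene_px) + 0.1 * srgb_encode(reflection_px)
print(f"linear mix, then encode: {correct:.4f}")
print(f"encode, then mix:        {naive:.4f}")
```

The two printed values differ, which is one reason a simulation that aims to be photometrically accurate would mix images before any tone curve is applied.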

The researchers test their method on RAW photos captured in the field that resemble typical consumer photographs, and it is reported to outperform existing reflection removal techniques. According to the abstract, the system can produce a 1K image for review in 6.5 seconds on an iPhone 14 Pro, which suggests it is practical for on-device use.

Technical Explanation

The paper first reviews prior work on reflection removal, noting that most existing methods either require additional information (e.g., multiple images) or struggle with complex, real-world reflections.

To address these limitations, the authors train on synthetic data. Rather than collecting real photos with and without reflections, they combine real-world RAW images using a reflection simulation that is photometrically and geometrically accurate. The resulting mixtures train a reflection removal system with two stages: a base model that takes the captured photo and an optional contextual photo as input and runs at 256p, followed by an up-sampling model that transforms the 256p output to full resolution.
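
As a rough illustration of training on synthetic mixtures, the sketch below builds (input, target) pairs from a pool of clean images. Everything in it is an assumption made for illustration: the names `synthesize_pair` and `make_training_set`, the flattened-list image type, and the random strength range. The simple additive mix stands in for the paper's simulation, whose details are not given here.

```python
import random
from typing import List, Tuple

Image = List[float]  # flattened linear image, values in [0, 1]


def synthesize_pair(scene: Image, reflection: Image,
                    rng: random.Random) -> Tuple[Image, Image]:
    """Return (mixture, target): the network input and the clean scene."""
    strength = rng.uniform(0.05, 0.3)  # hypothetical reflection strength range
    mixture = [min(s + strength * r, 1.0) for s, r in zip(scene, reflection)]
    return mixture, scene


def make_training_set(images: List[Image], count: int,
                      seed: int = 0) -> List[Tuple[Image, Image]]:
    """Draw random pairs of distinct images and mix each pair."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        scene, reflection = rng.sample(images, 2)  # two different photos
        pairs.append(synthesize_pair(scene, reflection, rng))
    return pairs


# Three tiny "photos" are enough to show the shape of the data.
pool = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
training_pairs = make_training_set(pool, count=5)
```

Because the target is simply the clean scene image, the ground truth for every mixture is known exactly, which is what makes supervised training possible without real before-and-after photos.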

The key technical contributions include:

A reflection simulation that combines real-world RAW images into photometrically and geometrically accurate synthetic mixtures for training.
A two-stage removal system: a base model that accepts the captured photo and an optional contextual photo and runs at 256p, followed by an up-sampling model that transforms the 256p output to full resolution.
A test set of RAW photos captured in the field that embody typical consumer photographs, used for evaluation.
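
The abstract describes a two-stage design: a base model that runs at 256p on the captured photo and an optional contextual photo, followed by an up-sampling model that produces the full-resolution result. The sketch below shows only that control flow. All names are hypothetical, `toy_base_model` is a placeholder for the trained network, and the learned up-sampling model is replaced by a simple residual transfer, which is an assumption of this sketch rather than the authors' method.

```python
from typing import Callable, List, Optional

Image = List[List[float]]  # single-channel linear image, values in [0, 1]
BaseModel = Callable[[Image, Optional[Image]], Image]


def resize_nearest(image: Image, height: int, width: int) -> Image:
    """Nearest-neighbour resize; a stand-in for a proper resampling filter."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def remove_reflection(photo: Image, base_model: BaseModel,
                      context: Optional[Image] = None,
                      low_res: int = 256) -> Image:
    """Run the base model at low resolution, then lift the result to full size."""
    full_h, full_w = len(photo), len(photo[0])
    small = resize_nearest(photo, low_res, low_res)
    small_context = (resize_nearest(context, low_res, low_res)
                     if context is not None else None)
    cleaned_small = base_model(small, small_context)

    # Assumed stand-in for the learned up-sampler: resize the low-resolution
    # change and add it to the full-resolution photo.
    residual = [[c - s for c, s in zip(c_row, s_row)]
                for c_row, s_row in zip(cleaned_small, small)]
    residual_full = resize_nearest(residual, full_h, full_w)
    return [[min(max(p + r, 0.0), 1.0) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(photo, residual_full)]


def toy_base_model(image: Image, context: Optional[Image] = None) -> Image:
    """Placeholder for the trained network: subtracts a constant haze."""
    return [[max(v - 0.05, 0.0) for v in row] for row in image]


# A 4x8 flat grey "photo", processed at a tiny working resolution.
photo = [[0.5] * 8 for _ in range(4)]
result = remove_reflection(photo, toy_base_model, low_res=2)
```

Running the expensive model at a fixed low resolution keeps its cost independent of the photo size, which is consistent with the on-device timing the abstract reports.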

Experiments are reported to show that the approach outperforms prior methods on RAW photos captured in the field. The abstract also notes that the system produces 1K images for review in 6.5 seconds on an iPhone 14 Pro.

Critical Analysis

The paper presents a well-designed and thoroughly evaluated solution to the challenging problem of reflection removal in RAW photos. The key strength of the approach is the clever use of synthetic reflections to train the reflection removal model, which allows it to handle a wide range of real-world reflection scenarios.

However, the paper does not discuss the potential limitations of this approach. For example, it's not clear how the synthetic reflections compare to real reflections in terms of their visual characteristics or the challenges they present for removal. Additionally, the paper does not explore the potential generalization issues that could arise if the synthetic reflections do not fully capture the complexity of real-world reflections.

The system already accepts one additional input, the optional contextual photo. Further research could explore other input modalities, such as depth information or multiple exposures, to improve reflection removal in challenging cases.

Conclusion

This paper presents a system for removing reflections from RAW photos, a common problem in photography. By training on synthetic mixtures of real RAW images produced with an accurate reflection simulation, the authors report strong performance on RAW photos captured in the field.

The proposed method offers a promising solution to a practical problem faced by photographers, and the use of synthetic data to train the model is a clever and effective technique. While the paper does not address all potential limitations, it represents an important step forward in the field of computational photography and could have a significant impact on real-world image capture and processing.

Related Papers

Real-time Noise Source Estimation of a Camera System from an Image and Metadata

Maik Wischow, Patrick Irmisch, Anko Boerner, Guillermo Gallego

Autonomous machines must self-maintain proper functionality to ensure the safety of humans and themselves. This pertains particularly to its cameras as predominant sensors to perceive the environment and support actions. A fundamental camera problem addressed in this study is noise. Solutions often focus on denoising images a posteriori, that is, fighting symptoms rather than root causes. However, tackling root causes requires identifying the noise sources, considering the limitations of mobile platforms. This work investigates a real-time, memory-efficient and reliable noise source estimator that combines data- and physically-based models. To this end, a DNN that examines an image with camera metadata for major camera noise sources is built and trained. In addition, it quantifies unexpected factors that impact image noise or metadata. This study investigates seven different estimators on six datasets that include synthetic noise, real-world noise from two camera systems, and real field campaigns. For these, only the model with most metadata is capable to accurately and robustly quantify all individual noise contributions. This method outperforms total image noise estimators and can be plug-and-play deployed. It also serves as a basis to include more advanced noise sources, or as part of an automatic countermeasure feedback-loop to approach fully reliable machines.

4/5/2024

cs.CV cs.RO eess.IV

Total Selfie: Generating Full-Body Selfies

Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman, Steven M. Seitz

We present a method to generate full-body selfies from photographs originally taken at arm's length. Because self-captured photos are typically taken close up, they have limited field of view and exaggerated perspective that distorts facial shapes. We instead seek to generate the photo someone else would take of you from a few feet away. Our approach takes as input four selfies of your face and body, a background image, and generates a full-body selfie in a desired target pose. We introduce a novel diffusion-based approach to combine all of this information into high-quality, well-composed photos of you with the desired pose and background.

4/4/2024

cs.CV cs.GR cs.LG

Overcoming Scene Context Constraints for Object Detection in wild using Defilters

Vamshi Krishna Kancharla, Neelam Sinha

This paper focuses on improving object detection performance by addressing the issue of image distortions, commonly encountered in uncontrolled acquisition environments. High-level computer vision tasks such as object detection, recognition, and segmentation are particularly sensitive to image distortion. To address this issue, we propose a novel approach employing an image defilter to rectify image distortion prior to object detection. This method enhances object detection accuracy, as models perform optimally when trained on non-distorted images. Our experiments demonstrate that utilizing defiltered images significantly improves mean average precision compared to training object detection models on distorted images. Consequently, our proposed method offers considerable benefits for real-world applications plagued by image distortion. To our knowledge, the contribution lies in employing distortion-removal paradigm for object detection on images captured in natural settings. We achieved an improvement of 0.562 and 0.564 of mean Average precision on validation and test data.

4/15/2024

cs.CV

Hybrid Training of Denoising Networks to Improve the Texture Acutance of Digital Cameras

Raphael Achddou, Yann Gousseau, Said Ladjal

In order to evaluate the capacity of a camera to render textures properly, the standard practice, used by classical scoring protocols, is to compute the frequential response to a dead leaves image target, from which is built a texture acutance metric. In this work, we propose a mixed training procedure for image restoration neural networks, relying on both natural and synthetic images, that yields a strong improvement of this acutance metric without impairing fidelity terms. The feasibility of the approach is demonstrated both on the denoising of RGB images and the full development of RAW images, opening the path to a systematic improvement of the texture acutance of real imaging devices.

4/12/2024

eess.IV cs.AI cs.CV