RMFA-Net: A Neural ISP for Real RAW to RGB Image Reconstruction

Read original: arXiv:2406.11469 - Published 6/18/2024 by Fei Li, Wenbo Hou, Peng Jia

Overview

Proposes RMFA-Net, a neural image signal processor (ISP) for reconstructing RGB images from real-world raw camera data
Addresses the challenge of uneven exposure in real-world settings, which can lead to unsatisfactory results with traditional ISPs
Introduces a novel residual multi-scale attention feature (RMFA) module to effectively handle uneven exposure and preserve details

Plain English Explanation

RMFA-Net is a new type of neural network that can take raw camera data and convert it into high-quality color images. This is important because raw camera data often has uneven lighting or exposure, which can make it hard for traditional image processing methods to produce good results.

RMFA-Net uses a special module called RMFA to address this challenge. The RMFA module helps the network understand how different parts of the image should be processed, even when the lighting is uneven. This allows RMFA-Net to preserve important details and produce better-looking color images from the raw camera data.

In essence, RMFA-Net is a more advanced "translator" that can take messy raw camera input and turn it into clean, well-exposed color photos. This could be useful in a variety of applications, such as improving the photo quality from smartphone cameras or enabling more powerful computer vision capabilities.

Technical Explanation

The researchers propose the RMFA-Net, a neural image signal processor (ISP) that can reconstruct RGB images from real-world raw camera data. This is an important task, as traditional ISPs often struggle with uneven exposure in real-world settings, leading to unsatisfactory results.
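As a rough illustration of what raw sensor data look like before any learned processing, the sketch below applies a conventional, explicit black level subtraction and packs an RGGB Bayer mosaic into four half-resolution planes. This is not the paper's method: the abstract describes implicit black level correction and a Three-Channel-Split mode, and neither is reproduced here. The RGGB layout, the 10-bit defaults `black_level=64` and `white_level=1023`, and the function name `pack_bayer_rggb` are assumptions chosen for the example.

```python
def pack_bayer_rggb(raw, black_level=64, white_level=1023):
    """Normalize a Bayer mosaic and split it into four half-resolution planes.

    `raw` is a 2D list of sensor values laid out as RGGB (an assumption):
        R  G1
        G2 B
    """
    height, width = len(raw), len(raw[0])
    if height % 2 or width % 2:
        raise ValueError("mosaic height and width must be even")
    scale = float(white_level - black_level)

    def norm(value):
        # Subtract the sensor's black level, rescale, and clamp to [0, 1].
        return min(max((value - black_level) / scale, 0.0), 1.0)

    planes = {"R": [], "G1": [], "G2": [], "B": []}
    for y in range(0, height, 2):
        cols = range(0, width, 2)
        planes["R"].append([norm(raw[y][x]) for x in cols])
        planes["G1"].append([norm(raw[y][x + 1]) for x in cols])
        planes["G2"].append([norm(raw[y + 1][x]) for x in cols])
        planes["B"].append([norm(raw[y + 1][x + 1]) for x in cols])
    return planes
```

Values below the black level clamp to zero. This is one reason mishandled black levels can shift colors in dim scenes, where most pixel values sit close to that floor.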

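To address this challenge, the researchers introduce a novel Residual Multi-Scale Attention Feature (RMFA) module. The RMFA module handles uneven exposure by selectively emphasizing relevant features at different scales, helping the network preserve important details in the final RGB image. The paper's abstract additionally describes implicit black level correction to mitigate color shifts in dim scenes, a Three-Channel-Split mode to preserve high-frequency information and prevent misalignment, and an explicit tone mapping module based on Retinex theory to deal with uneven exposure.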
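The idea of selectively emphasizing features can be sketched with a generic squeeze-and-excitation-style channel attention followed by a residual connection. This is a textbook pattern used only for illustration, not the RMFA module itself, whose internals are not specified in this summary. The function name, the single linear layer, and the hand-supplied `weights` and `biases` are all assumptions. The sketch operates at a single scale; a multi-scale variant would repeat it on downsampled copies of the features.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def residual_channel_attention(features, weights, biases):
    """Re-weight channels by a learned gate and add the input back.

    features: list of C channels, each a 2D list of floats.
    weights:  C x C matrix (hypothetical learned parameters).
    biases:   list of C floats (hypothetical learned parameters).
    """
    # Squeeze: one global average per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excite: a linear layer followed by a sigmoid gives a gate in (0, 1).
    gates = [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, biases)
    ]
    # Scale each channel by its gate, then add the residual (the input).
    output = [
        [[value + gate * value for value in row] for row in channel]
        for channel, gate in zip(features, gates)
    ]
    return output, gates
```

With zero weights and biases every gate is 0.5, so each channel is simply scaled by 1.5. In a trained network the gates differ per channel, which is what lets the network emphasize some features over others.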

The RMFA-Net architecture consists of an encoder-decoder structure with skip connections. The encoder extracts multi-scale features from the input raw image, while the decoder uses the RMFA module to selectively combine these features and reconstruct the final RGB image.
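The encoder-decoder data flow with skip connections can be sketched as follows. Learned convolutions are deliberately omitted: the encoder here is plain 2x2 average pooling, the decoder is nearest-neighbour upsampling, and skips are fused by addition. All of these are simplifying assumptions for illustration and do not describe the actual RMFA-Net layers.

```python
def avg_pool2(img):
    """Halve the resolution by averaging each 2x2 block."""
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
            for x in range(0, len(img[0]), 2)
        ]
        for y in range(0, len(img), 2)
    ]


def upsample2(img):
    """Double the resolution by repeating every pixel (nearest neighbour)."""
    out = []
    for row in img:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def encoder_decoder(img, levels=2, bottleneck=lambda feat: feat):
    """Run a toy U-shaped pass: downsample, transform, upsample with skips."""
    factor = 2 ** levels
    if len(img) % factor or len(img[0]) % factor:
        raise ValueError("image size must be divisible by 2**levels")

    skips = []
    feat = img
    for _ in range(levels):          # encoder: remember each scale, then shrink
        skips.append(feat)
        feat = avg_pool2(feat)

    feat = bottleneck(feat)          # placeholder for the deepest processing

    for skip in reversed(skips):     # decoder: grow, then fuse the matching skip
        up = upsample2(feat)
        feat = [[u + s for u, s in zip(u_row, s_row)]
                for u_row, s_row in zip(up, skip)]
    return feat
```

The skip connections carry full-resolution detail past the low-resolution bottleneck, which is why this layout is a common choice when fine texture must survive in the output.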

The researchers train and evaluate RMFA-Net on the dataset released by the Mobile AI 2022 Learned Smartphone ISP Challenge. According to the paper's abstract, RMFA-Net achieves a PSNR of over 25 dB, about 1 dB above the previous state of the art, and a lightweight variant, RMFANet-tiny, still surpasses the previous state of the art by about 0.5 dB.
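The paper's abstract reports its results as PSNR (peak signal-to-noise ratio) in dB. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, 10 * log10(MAX^2 / MSE). The function name, the single-channel 2D-list input, and the default `max_value=255.0` are assumptions for the example; the paper's exact evaluation protocol is not described in this summary.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    flat_ref = [value for row in reference for value in row]
    flat_est = [value for row in estimate for value in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("images must contain the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 1 dB corresponds to roughly a 21% reduction in mean squared error (a factor of 10^-0.1, about 0.794).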

Critical Analysis

The paper presents a novel and well-designed neural ISP solution for reconstructing RGB images from real-world raw camera data. The key strength of RMFA-Net is its ability to effectively handle uneven exposure, a common challenge in real-world settings.

However, the paper does not fully address the issue of generalization. While RMFA-Net performs well on the evaluated benchmark data, it is unclear how the model would perform on completely unseen data from different camera models or under more diverse lighting conditions. Further research is needed to assess the model's robustness and adaptability to a wider range of real-world scenarios.

Additionally, the paper lacks a detailed discussion of the computational complexity and inference time of RMFA-Net. As a neural network-based solution, the model's efficiency and practicality in real-world applications should be examined more closely.

Finally, the paper could have provided more insight into the interpretability of the RMFA module. Understanding how the attention mechanism works and what specific image features it emphasizes would help strengthen the theoretical understanding of the proposed approach.

Conclusion

The RMFA-Net presented in this paper is a promising solution for addressing the challenge of uneven exposure in real-world raw-to-RGB image reconstruction. By introducing the novel RMFA module, the researchers have demonstrated the ability to effectively preserve important details and produce high-quality color images from raw camera data.

While the paper reports strong benchmark performance, further research is needed to assess the generalization and efficiency of RMFA-Net, as well as to provide a more in-depth understanding of the inner workings of the attention mechanism. Nonetheless, this work represents an important step forward in advancing the capabilities of neural image signal processors for real-world applications.


Related Papers

RMFA-Net: A Neural ISP for Real RAW to RGB Image Reconstruction

Fei Li, Wenbo Hou, Peng Jia

Deep learning-based ISP algorithms have demonstrated significant potential in raw2rgb reconstruction. However, existing networks have not fully considered the specific characteristics of raw data, such as black level and CFA, which can negatively impact texture and color if mishandled. Moreover, uneven exposure in raw data is also not considered carefully, leading to adverse effects on contrast and brightness. In this paper, we introduce RMFA-Net to tackle these problems. We perform implicit black level correction to mitigate color shifts in dim scenes. To preserve high-frequency information and prevent misalignment, we propose a novel Three-Channel-Split mode. To address the issue of uneven exposure, we designed an explicit tone mapping module based on the Retinex theory. We train and evaluate our models using the dataset released by the Mobile AI 2022 Learned Smartphone ISP Challenge. It is demonstrated that RMFA-Net outperforms previous algorithms, achieving a PSNR score of over 25 dB, surpassing the state-of-the-art by +1 dB. Furthermore, we developed a lightweight version, RMFANet-tiny, for engineering deployment while still maintaining strong performance, surpassing the SOTA by +0.5 dB.

6/18/2024

Retinex-RAWMamba: Bridging Demosaicing and Denoising for Low-Light RAW Image Enhancement

Xianmin Chen, Peiliang Huang, Xiaoxu Feng, Dingwen Zhang, Longfei Han, Junwei Han

Low-light image enhancement, particularly in cross-domain tasks such as mapping from the raw domain to the sRGB domain, remains a significant challenge. Many deep learning-based methods have been developed to address this issue and have shown promising results in recent years. However, single-stage methods, which attempt to unify the complex mapping across both domains, lead to limited denoising performance. In contrast, two-stage approaches typically decompose a raw image with color filter arrays (CFA) into a four-channel RGGB format before feeding it into a neural network. However, this strategy overlooks the critical role of demosaicing within the Image Signal Processing (ISP) pipeline, leading to color distortions under varying lighting conditions, especially in low-light scenarios. To address these issues, we design a novel Mamba scanning mechanism, called RAWMamba, to effectively handle raw images with different CFAs. Furthermore, we present a Retinex Decomposition Module (RDM) grounded in Retinex prior, which decouples illumination from reflectance to facilitate more effective denoising and automatic non-linear exposure correction. By bridging demosaicing and denoising, better raw image enhancement is achieved. Experimental evaluations conducted on public datasets SID and MCR demonstrate that our proposed RAWMamba achieves state-of-the-art performance on cross-domain mapping.

9/12/2024

ParamISP: Learned Forward and Inverse ISPs using Camera Parameters

Woohyeok Kim, Geonu Kim, Junyong Lee, Seungyong Lee, Seung-Hwan Baek, Sunghyun Cho

RAW images are rarely shared, mainly due to their excessive data size compared to their sRGB counterparts obtained by camera ISPs. Learning the forward and inverse processes of camera ISPs has been recently demonstrated, enabling physically-meaningful RAW-level image processing on input sRGB images. However, existing learning-based ISP methods fail to handle the large variations in the ISP processes with respect to camera parameters such as ISO and exposure time, and have limitations when used for various applications. In this paper, we propose ParamISP, a learning-based method for forward and inverse conversion between sRGB and RAW images that adopts a novel neural-network module, dubbed ParamNet, to utilize camera parameters. Given the camera parameters provided in the EXIF data, ParamNet converts them into a feature vector to control the ISP networks. Extensive experiments demonstrate that ParamISP achieves superior RAW and sRGB reconstruction results compared to previous methods and that it can be effectively used for a variety of applications such as deblurring dataset synthesis, raw deblurring, HDR reconstruction, and camera-to-camera transfer.

4/16/2024

RMAFF-PSN: A Residual Multi-Scale Attention Feature Fusion Photometric Stereo Network

Kai Luo, Yakun Ju, Lin Qi, Kaixuan Wang, Junyu Dong

Predicting accurate normal maps of objects from two-dimensional images in regions of complex structure and spatial material variations is challenging using photometric stereo methods due to the influence of surface reflection properties caused by variations in object geometry and surface materials. To address this issue, we propose a photometric stereo network called RMAFF-PSN that uses residual multiscale attentional feature fusion to handle the "difficult" regions of the object. Unlike previous approaches that only use stacked convolutional layers to extract deep features from the input image, our method integrates feature information from different resolution stages and scales of the image. This approach preserves more physical information, such as texture and geometry of the object in complex regions, through shallow-deep stage feature extraction, double branching enhancement, and attention optimization. To test the network structure under real-world conditions, we propose a new real dataset called Simple PS data, which contains multiple objects with varying structures and materials. Experimental results on a publicly available benchmark dataset demonstrate that our method outperforms most existing calibrated photometric stereo methods for the same number of input images, especially in the case of highly non-convex object structures. Our method also obtains good results under sparse lighting conditions.

4/16/2024