ECAFormer: Low-light Image Enhancement using Cross Attention

Read original: arXiv:2406.13281 - Published 8/21/2024 by Yudi Ruan, Hao Ma, Weikai Li, Xiao Wang

Overview

This paper introduces a new image enhancement technique called ECAFormer that uses cross-attention to improve low-light image quality.
ECAFormer leverages a transformer-based architecture to effectively fuse features from different levels, enabling it to recover details and colors in dark regions.
According to the abstract, the proposed method reaches competitive performance across multiple benchmarks, with nearly a 3% improvement in PSNR over the next-best method.

Plain English Explanation

Low-light images often suffer from poor visibility, lacking detail and accurate color. The ECAFormer paper introduces a new approach to this problem.

The key idea is to use a type of neural network called a transformer, which is good at fusing information from different sources. ECAFormer takes a low-light image as input and passes it through multiple transformer layers. These layers learn to effectively combine features from different parts of the image, allowing the model to recover details and colors in the dark regions.

Compared to previous methods, ECAFormer is reported to produce clearer, more vibrant results from low-light inputs. This can be particularly useful in applications like night photography, surveillance, and autonomous driving, where good visibility in dark conditions is critical.

Technical Explanation

The ECAFormer architecture is built on a transformer-based design. Transformers are a type of neural network that excels at capturing long-range dependencies and integrating information from diverse sources.

In ECAFormer, the transformer layers are used to fuse features from different levels of the network. This cross-attention mechanism allows the model to selectively focus on and combine the most relevant features for enhancing low-light images. The fused features are then passed through additional convolutional layers to produce the final enhanced output.
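To make the idea of cross-attention concrete, here is a minimal sketch of generic scaled dot-product cross-attention in NumPy. It is not the paper's DMSA block: the function name `cross_attention`, the token counts, the feature width, and the random projection matrices are all assumptions chosen for illustration. Queries come from one feature set (here called `fine`) and keys and values come from another (`coarse`), so each fine-level token gathers a weighted mix of coarse-level information.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Generic single-head cross-attention (illustrative, not the paper's block).

    query_feats:   (n_q, d_model) tokens that ask for information
    context_feats: (n_c, d_model) tokens that supply information
    """
    q = query_feats @ w_q    # (n_q, d)
    k = context_feats @ w_k  # (n_c, d)
    v = context_feats @ w_v  # (n_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n_q, n_c) similarity scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights


# Hypothetical shapes: 16 tokens from a shallow layer, 4 from a deeper one.
rng = np.random.default_rng(0)
d_model = 8
fine = rng.standard_normal((16, d_model))
coarse = rng.standard_normal((4, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(3))

fused, weights = cross_attention(fine, coarse, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a trained model the projection matrices would be learned, and the attention would typically be multi-head and combined with residual connections; the sketch only shows how one feature set can query another.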

The authors evaluate ECAFormer on several low-light image enhancement benchmarks, including LOL, LIME, and Extr-Dark. The results demonstrate that ECAFormer outperforms state-of-the-art methods in terms of both quantitative metrics and visual quality, highlighting its effectiveness in recovering details and colors in low-light conditions.
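The paper's abstract reports its headline gain in PSNR (peak signal-to-noise ratio), a common quantitative metric for this task. As a rough sketch of the standard definition, assuming images stored as float arrays scaled to [0, 1] (the function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the paper):

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in decibels between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Toy example: a black reference and an estimate that is off by 0.1 everywhere.
reference = np.zeros((4, 4, 3))
estimate = np.full((4, 4, 3), 0.1)
score = psnr(reference, estimate)
print(f"PSNR: {score:.2f} dB")
```

Higher PSNR means the enhanced image is closer to the ground-truth reference, so the metric requires paired data with a well-lit target image.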

Critical Analysis

The ECAFormer paper presents a promising approach to low-light image enhancement, leveraging the power of transformer-based architectures. The cross-attention mechanism used to fuse features from different levels of the network is a key innovation that enables the model to adaptively combine relevant information for improving image quality.

However, the paper does not provide much discussion on the limitations or potential issues with the proposed method. For example, it would be useful to understand the computational complexity of ECAFormer and how it compares to other low-light enhancement techniques in terms of inference speed and memory usage. Additionally, the paper could explore the robustness of the model to different types of low-light conditions or the potential for the method to be extended to other image enhancement tasks.
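One simple way to approach the inference-speed question is to time repeated calls to a model on a fixed input. The sketch below uses only the Python standard library; `median_latency` and `dummy_enhancer` are hypothetical names, and the stand-in enhancer is a plain gamma curve, not ECAFormer or any method from the paper.

```python
import statistics
import time


def median_latency(fn, arg, warmup=2, runs=10):
    """Return the median wall-clock time in seconds of fn(arg) over several runs."""
    for _ in range(warmup):
        fn(arg)  # warm-up calls are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(arg)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def dummy_enhancer(pixels):
    # Hypothetical stand-in for a real model: brighten with a gamma of 0.5.
    return [p ** 0.5 for p in pixels]


latency = median_latency(dummy_enhancer, [0.1] * 10_000)
print(f"median latency: {latency * 1e3:.3f} ms")
```

For a real comparison, the same input resolution and hardware would need to be used for every method, and GPU timings would require synchronizing the device before reading the clock.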

Overall, the ECAFormer paper presents an interesting and effective approach to low-light image enhancement, but further research and analysis could help to fully evaluate its strengths, limitations, and potential for real-world applications.

Conclusion

The ECAFormer paper introduces a transformer-based architecture for improving the quality of low-light images. By using a cross-attention mechanism to fuse features from different levels of the network, the model recovers details and colors in dark regions and, per the abstract, improves PSNR by nearly 3% over the next-best method.

The proposed approach has the potential to significantly improve low-light image quality in a wide range of applications, from night photography to autonomous driving. Further research could explore the model's computational efficiency, robustness, and potential extensions to other image enhancement tasks, helping to unlock the full benefits of this innovative technique.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

ECAFormer: Low-light Image Enhancement using Cross Attention

Yudi Ruan, Hao Ma, Weikai Li, Xiao Wang

Low-light image enhancement (LLIE) is critical in computer vision. Existing LLIE methods often fail to discover the underlying relationships between different sub-components, causing the loss of complementary information between multiple modules and network layers, ultimately resulting in the loss of image details. To beat this shortage, we design a hierarchical mutual Enhancement via a Cross Attention transformer (ECAFormer), which introduces an architecture that enables concurrent propagation and interaction of multiple features. The model preserves detailed information by introducing a Dual Multi-head self-attention (DMSA), which leverages visual and semantic features across different scales, allowing them to guide and complement each other. Besides, a Cross-Scale DMSA block is introduced to capture the residual connection, integrating cross-layer information to further enhance image detail. Experimental results show that ECAFormer reaches competitive performance across multiple benchmarks, yielding nearly a 3% improvement in PSNR over the suboptimal method, demonstrating the effectiveness of information interaction in LLIE.

8/21/2024

A Lightweight Low-Light Image Enhancement Network via Channel Prior and Gamma Correction

Shyang-En Weng, Shaou-Gang Miaou, Ricky Christanto

Human vision relies heavily on available ambient light to perceive objects. Low-light scenes pose two distinct challenges: information loss due to insufficient illumination and undesirable brightness shifts. Low-light image enhancement (LLIE) refers to image enhancement technology tailored to handle this scenario. We introduce CPGA-Net, an innovative LLIE network that combines dark/bright channel priors and gamma correction via deep learning and integrates features inspired by the Atmospheric Scattering Model and the Retinex Theory. This approach combines the use of traditional and deep learning methodologies, designed within a simple yet efficient architectural framework that focuses on essential feature extraction. The resulting CPGA-Net is a lightweight network with only 0.025 million parameters and 0.030 seconds for inference time, yet it achieves superior performance over existing LLIE methods on both objective and subjective evaluation criteria. Furthermore, we utilized knowledge distillation with explainable factors and proposed an efficient version that achieves 0.018 million parameters and 0.006 seconds for inference time. The proposed approaches inject new solution ideas into LLIE, providing practical applications in challenging low-light scenarios.

7/12/2024

Latent Disentanglement for Low Light Image Enhancement

Zhihao Zheng, Mooi Choo Chuah

Many learning-based low-light image enhancement (LLIE) algorithms are based on the Retinex theory. However, the Retinex-based decomposition techniques in such models introduce corruptions which limit their enhancement performance. In this paper, we propose a Latent Disentangle-based Enhancement Network (LDE-Net) for low light vision tasks. The latent disentanglement module disentangles the input image in latent space such that no corruption remains in the disentangled Content and Illumination components. For LLIE task, we design a Content-Aware Embedding (CAE) module that utilizes Content features to direct the enhancement of the Illumination component. For downstream tasks (e.g. nighttime UAV tracking and low-light object detection), we develop an effective light-weight enhancer based on the latent disentanglement framework. Comprehensive quantitative and qualitative experiments demonstrate that our LDE-Net significantly outperforms state-of-the-art methods on various LLIE benchmarks. In addition, the great results obtained by applying our framework on the downstream tasks also demonstrate the usefulness of our latent disentanglement design.

8/13/2024

Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations

Tomáš Chobola, Yu Liu, Hanyi Zhang, Julia A. Schnabel, Tingying Peng

Current deep learning-based low-light image enhancement methods often struggle with high-resolution images, and fail to meet the practical demands of visual perception across diverse and unseen scenarios. In this paper, we introduce a novel approach termed CoLIE, which redefines the enhancement process through mapping the 2D coordinates of an underexposed image to its illumination component, conditioned on local context. We propose a reconstruction of enhanced-light images within the HSV space utilizing an implicit neural function combined with an embedded guided filter, thereby significantly reducing computational overhead. Moreover, we introduce a single image-based training loss function to enhance the model's adaptability to various scenes, further enhancing its practical applicability. Through rigorous evaluations, we analyze the properties of our proposed framework, demonstrating its superiority in both image quality and scene adaptability. Furthermore, our evaluation extends to applications in downstream tasks within low-light scenarios, underscoring the practical utility of CoLIE. The source code is available at https://github.com/ctom2/colie.

7/18/2024