Mamba-UIE: Enhancing Underwater Images with Physical Model Constraint

Read original: arXiv:2407.19248 - Published 8/1/2024 by Song Zhang, Yuqing Duan, Daoliang Li, Ran Zhao

Overview

In underwater image enhancement (UIE), convolutional neural networks (CNNs) have inherent limitations in modeling long-range dependencies and are less effective at recovering global features.
While Transformers excel at modeling long-range dependencies, their quadratic computational complexity with increasing image resolution presents efficiency challenges.
Most supervised learning methods lack an effective physical model constraint, which can lead to insufficient realism and overfitting in the generated images.

Plain English Explanation

Underwater images often suffer from poor quality due to factors like lighting, water turbidity, and other environmental conditions. Convolutional neural networks (CNNs) are commonly used for image enhancement, but they struggle to capture long-range dependencies and global features in underwater scenes.

Transformers, on the other hand, are better at modeling these long-range relationships. However, their computational cost grows quadratically as the image resolution increases, which makes them expensive to run on high-resolution images.
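To make the scaling argument concrete, here is a small back-of-the-envelope sketch. It assumes the image is split into non-overlapping 16x16 patches (a hypothetical choice for illustration, not a detail from the paper) and counts how many token pairs full self-attention compares, next to the number of steps a single linear-time scan would take.

```python
def token_count(height: int, width: int, patch: int = 16) -> int:
    """Number of tokens when the image is cut into patch x patch tiles."""
    return (height // patch) * (width // patch)


def attention_pairs(height: int, width: int, patch: int = 16) -> int:
    """Token pairs compared by full self-attention: grows as tokens squared."""
    n = token_count(height, width, patch)
    return n * n


def linear_scan_steps(height: int, width: int, patch: int = 16) -> int:
    """Steps taken by one linear-time pass over the tokens."""
    return token_count(height, width, patch)


for side in (256, 512, 1024):
    print(side, attention_pairs(side, side), linear_scan_steps(side, side))
```

Under these assumptions, doubling the side length of the image multiplies the token count by 4 and the number of attention pairs by 16.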

Additionally, many existing image enhancement methods rely on supervised learning without any constraint from the physics of underwater imaging, which can lead to generated images that lack realism and to overfitting on the training data.

Technical Explanation

To address these challenges, the researchers propose a new framework called Mamba-UIE, which introduces a physical model constraint-based approach to underwater image enhancement.

The key idea is to decompose the input image into four components: underwater scene radiance, direct transmission map, backscatter transmission map, and global background light. These components are then reassembled according to an improved underwater image formation model, and a reconstruction consistency constraint is applied between the reconstructed image and the original image. This helps to ensure that the enhanced image adheres to the underlying physical principles of underwater image formation.
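The reassembly and consistency check can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the formula I = J * T_D + B * (1 - T_B) is a commonly used form of the revised underwater image formation model and is assumed here rather than taken from the paper, the L1 distance is an assumed choice of loss, and the function names and random arrays are hypothetical stand-ins for the network's four predicted components.

```python
import numpy as np


def reassemble(radiance, direct_t, backscatter_t, background):
    """Assumed formation model: I = J * T_D + B * (1 - T_B), values in [0, 1]."""
    return radiance * direct_t + background * (1.0 - backscatter_t)


def reconstruction_loss(reconstructed, original):
    """Mean absolute (L1) difference; an assumed choice of consistency loss."""
    return float(np.mean(np.abs(reconstructed - original)))


rng = np.random.default_rng(0)
h, w = 4, 4
J = rng.uniform(size=(h, w, 3))    # underwater scene radiance
T_d = rng.uniform(size=(h, w, 3))  # direct transmission map
T_b = rng.uniform(size=(h, w, 3))  # backscatter transmission map
B = rng.uniform(size=(1, 1, 3))    # global background light, broadcast per channel

# Pretend the observed image was formed exactly by the assumed model.
observed = reassemble(J, T_d, T_b, B)

# Consistent components reproduce the input; a wrong background light does not.
loss_exact = reconstruction_loss(reassemble(J, T_d, T_b, B), observed)
loss_wrong = reconstruction_loss(reassemble(J, T_d, T_b, 0.5 * B), observed)
print(loss_exact, loss_wrong)
```

In training, a loss of this kind would penalize decompositions whose reassembled image drifts away from the input, which is how the physical model acts as a constraint.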

To avoid the quadratic complexity of Transformers on long sequences, the Mamba-UIE network is built on state space models (SSMs) with linear complexity. Its Mamba in Convolution block models long-range dependencies at both the channel and spatial levels, while a CNN backbone is retained to recover local features and details.
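As a rough illustration of why a state space model runs in linear time, the sketch below implements a plain discrete linear recurrence, h_t = A h_(t-1) + B x_t and y_t = C h_t, with fixed matrices. It is a deliberate simplification: Mamba makes its parameters depend on the input (a selective scan) and uses a hardware-aware implementation, none of which is shown here. The function name and the toy matrices are hypothetical.

```python
import numpy as np


def ssm_scan(x, A, B, C):
    """Run h_t = A @ h_{t-1} + B @ x_t, y_t = C @ h_t over a sequence.

    x: (L, d_in), A: (n, n), B: (n, d_in), C: (d_out, n) -> y: (L, d_out).
    One pass over the sequence, so the cost is linear in L.
    """
    h = np.zeros(A.shape[0])
    outputs = []
    for x_t in x:
        h = A @ h + B @ x_t    # update the hidden state
        outputs.append(C @ h)  # read out the current output
    return np.stack(outputs)


# Toy example: a unit impulse decays through a state that halves each step.
A = 0.5 * np.eye(2)
B = np.ones((2, 1))
C = np.ones((1, 2))
x = np.array([[1.0], [0.0], [0.0], [0.0]])

y = ssm_scan(x, A, B, C)
print(y[:, 0])
```

The first input still influences every later output through the hidden state, which is how the recurrence carries long-range information without comparing every pair of positions.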

Critical Analysis

The Mamba-UIE framework relies on a revised but still simplified underwater image formation model, which may not fully capture all the complexities of real-world underwater environments. Additionally, the evaluation is limited to three public datasets, and further testing on a wider range of underwater scenes would help assess the method's generalization capabilities.

The researchers report that Mamba-UIE outperforms existing state-of-the-art methods, reaching a PSNR of 27.13 and an SSIM of 0.93 on the UIEB dataset. It would still be valuable to explore the sensitivity of Mamba-UIE to factors like water turbidity, camera depth, and lighting conditions. Understanding the limitations and failure modes of the system could guide future research directions.

Conclusion

The Mamba-UIE framework presents a novel approach to underwater image enhancement that combines physical model constraints with an efficient deep learning architecture. By decomposing the image into four physical components and using linear-complexity state space models alongside a CNN backbone, the method captures both long-range and local features while keeping the enhanced images consistent with the underwater image formation model.

This research represents an important step forward in addressing the challenges of underwater image enhancement and could have significant implications for a wide range of underwater applications, such as marine exploration, underwater robotics, and environmental monitoring.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Mamba-UIE: Enhancing Underwater Images with Physical Model Constraint

Song Zhang, Yuqing Duan, Daoliang Li, Ran Zhao

In underwater image enhancement (UIE), convolutional neural networks (CNN) have inherent limitations in modeling long-range dependencies and are less effective in recovering global features. While Transformers excel at modeling long-range dependencies, their quadratic computational complexity with increasing image resolution presents significant efficiency challenges. Additionally, most supervised learning methods lack effective physical model constraint, which can lead to insufficient realism and overfitting in generated images. To address these issues, we propose a physical model constraint-based underwater image enhancement framework, Mamba-UIE. Specifically, we decompose the input image into four components: underwater scene radiance, direct transmission map, backscatter transmission map, and global background light. These components are reassembled according to the revised underwater image formation model, and the reconstruction consistency constraint is applied between the reconstructed image and the original image, thereby achieving effective physical constraint on the underwater image enhancement process. To tackle the quadratic computational complexity of Transformers when handling long sequences, we introduce the Mamba-UIE network based on linear complexity state space models. By incorporating the Mamba in Convolution block, long-range dependencies are modeled at both the channel and spatial levels, while the CNN backbone is retained to recover local features and details. Extensive experiments on three public datasets demonstrate that our proposed Mamba-UIE outperforms existing state-of-the-art methods, achieving a PSNR of 27.13 and an SSIM of 0.93 on the UIEB dataset. Our method is available at https://github.com/zhangsong1213/Mamba-UIE.

8/1/2024

MambaUIE&SR: Unraveling the Ocean's Secrets with Only 2.8 FLOPs

Zhihao Chen, Yiyuan Ge

Underwater Image Enhancement (UIE) techniques aim to address the problem of underwater image degradation due to light absorption and scattering. In recent years, both Convolution Neural Network (CNN)-based and Transformer-based methods have been widely explored. In addition, combining CNN and Transformer can effectively combine global and local information for enhancement. However, this approach is still affected by the secondary complexity of the Transformer and cannot maximize the performance. Recently, the state-space model (SSM) based architecture Mamba has been proposed, which excels in modeling long distances while maintaining linear complexity. This paper explores the potential of this SSM-based model for UIE from both efficiency and effectiveness perspectives. However, the performance of directly applying Mamba is poor because local fine-grained features, which are crucial for image enhancement, cannot be fully utilized. Specifically, we customize the MambaUIE architecture for efficient UIE. Specifically, we introduce visual state space (VSS) blocks to capture global contextual information at the macro level while mining local information at the micro level. Also, for these two kinds of information, we propose a Dynamic Interaction Block (DIB) and Spatial feed-forward Network (SGFN) for intra-block feature aggregation. MambaUIE is able to efficiently synthesize global and local information and maintains a very small number of parameters with high accuracy. Experiments on UIEB datasets show that our method reduces GFLOPs by 67.4% (2.715G) relative to the SOTA method. To the best of our knowledge, this is the first UIE model constructed based on SSM that breaks the limitation of FLOPs on accuracy in UIE. The official repository of MambaUIE at https://github.com/1024AILab/MambaUIE.

5/27/2024

WaterMamba: Visual State Space Model for Underwater Image Enhancement

Meisheng Guan, Haiyong Xu, Gangyi Jiang, Mei Yu, Yeyao Chen, Ting Luo, Yang Song

Underwater imaging often suffers from low quality due to factors affecting light propagation and absorption in water. To improve image quality, some underwater image enhancement (UIE) methods based on convolutional neural networks (CNN) and Transformer have been proposed. However, CNN-based UIE methods are limited in modeling long-range dependencies, and Transformer-based methods involve a large number of parameters and complex self-attention mechanisms, posing efficiency challenges. Considering computational complexity and severe underwater image degradation, a state space model (SSM) with linear computational complexity for UIE, named WaterMamba, is proposed. We propose spatial-channel omnidirectional selective scan (SCOSS) blocks comprising spatial-channel coordinate omnidirectional selective scan (SCCOSS) modules and a multi-scale feedforward network (MSFFN). The SCOSS block models pixel and channel information flow, addressing dependencies. The MSFFN facilitates information flow adjustment and promotes synchronized operations within SCCOSS modules. Extensive experiments showcase WaterMamba's cutting-edge performance with reduced parameters and computational resources, outperforming state-of-the-art methods on various datasets, validating its effectiveness and generalizability. The code will be released on GitHub after acceptance.

5/15/2024

PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement

Wei-Tung Lin, Yong-Xiang Lin, Jyun-Wei Chen, Kai-Lung Hua

Underwater Image Enhancement (UIE) is critical for marine research and exploration but hindered by complex color distortions and severe blurring. Recent deep learning-based methods have achieved remarkable results, yet these methods struggle with high computational costs and insufficient global modeling, resulting in locally under- or over- adjusted regions. We present PixMamba, a novel architecture, designed to overcome these challenges by leveraging State Space Models (SSMs) for efficient global dependency modeling. Unlike convolutional neural networks (CNNs) with limited receptive fields and transformer networks with high computational costs, PixMamba efficiently captures global contextual information while maintaining computational efficiency. Our dual-level strategy features the patch-level Efficient Mamba Net (EMNet) for reconstructing enhanced image feature and the pixel-level PixMamba Net (PixNet) to ensure fine-grained feature capturing and global consistency of enhanced image that were previously difficult to obtain. PixMamba achieves state-of-the-art performance across various underwater image datasets and delivers visually superior results. Code is available at: https://github.com/weitunglin/pixmamba.

6/13/2024