F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring

Read original: arXiv:2409.02056 - Published 9/4/2024 by Subhajit Paul, Sahil Kumawat, Ashutosh Gupta, Deepak Mishra

Overview

Proposes a novel image deblurring model called F2former that combines the fractional Fourier transform, deep Wiener deconvolution, and a selective frequency transformer
Reports results superior to other state-of-the-art approaches on both motion deblurring and defocus deblurring
Outperforms previous methods in both quantitative and qualitative evaluations

Plain English Explanation

The paper introduces a new approach called F2former for restoring sharp images from blurred ones. Blur can come from camera shake, object motion, or defocus, and it degrades the sharpness and clarity of images.

F2former works by using a combination of techniques:

Fractional Fourier Transform: This mathematical operation describes an image using spatial and frequency information at the same time, which suits signals like images whose content varies from region to region.
Deep Wiener Deconvolution: A deep learning-based approach that removes the blur and restores the original sharp image.
Selective Frequency Transformer: This component selectively processes different frequency bands of the image to improve the deblurring quality.

By combining these techniques, F2former is able to effectively recover the original sharp image from the blurred input, outperforming previous state-of-the-art methods.
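The idea of treating frequency bands separately, mentioned in the list above, can be illustrated with a minimal sketch. It uses an ordinary FFT and a hard circular mask on a raw image, which is an assumption made for illustration: F2former itself works on learned features in the fractional Fourier domain, and the function name `split_frequency_bands` and the `cutoff` parameter are hypothetical.

```python
import numpy as np

def split_frequency_bands(image, cutoff=0.1):
    """Split a 2-D image into low- and high-frequency parts.

    `cutoff` is a hypothetical radius in cycles per pixel; frequencies at or
    below it go to the low band, everything else to the high band.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    low_mask = radius <= cutoff

    spectrum = np.fft.fft2(image)
    low = np.real(np.fft.ifft2(spectrum * low_mask))
    high = np.real(np.fft.ifft2(spectrum * ~low_mask))
    return low, high

rng = np.random.default_rng(0)
image = rng.random((32, 32))
low, high = split_frequency_bands(image, cutoff=0.1)
# The two bands are complementary, so they add back up to the input.
print(np.allclose(low + high, image))
```

The low band carries smooth structure and overall brightness, while the high band carries edges and fine texture, which is the part blur damages most.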

Technical Explanation

The F2former model consists of three main components:

Fractional Fourier Transform (FrFT) Module: This module applies the fractional Fourier transform to the blurred input. The FrFT generalizes the standard Fourier transform: rather than a purely frequency-domain view, it gives a unified spatial-frequency representation, which the authors argue suits non-stationary signals such as images.
Deep Wiener Deconvolution (DWD) Module: This module performs Wiener-filter-based deconvolution to remove the blur and restore the sharp image. The paper's abstract describes it as a fractional Fourier based Wiener deconvolution (F2WD).
Selective Frequency Transformer (SFT) Module: This module is a multi-branch encoder-decoder transformer built from fractional frequency aware transformer blocks (F2TB). Each block pairs a fractional frequency aware self-attention (F2SA), which computes element-wise product attention over the important frequency components, with a feed-forward network based on frequency division multiplexing (FM-FFN) that refines high- and low-frequency features separately.
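As background for the deconvolution component, here is a minimal sketch of classical Wiener deconvolution in the ordinary Fourier domain. It is not the paper's method: F2former applies the idea in the fractional Fourier domain inside a learned network, whereas this sketch assumes a known blur kernel, circular (wrap-around) blur, and a constant noise-to-signal ratio. The helper names `psf_to_otf` and `wiener_deconvolve` and the `nsr` parameter are hypothetical.

```python
import numpy as np

def psf_to_otf(psf, shape):
    """Zero-pad the blur kernel to `shape` and centre it at index (0, 0)."""
    padded = np.zeros(shape, dtype=float)
    kh, kw = psf.shape
    padded[:kh, :kw] = psf
    # Roll so the kernel centre sits at the origin; otherwise the
    # deconvolved image would come out spatially shifted.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def wiener_deconvolve(blurred, psf, nsr=1e-3):
    """Classical Wiener filter: X = conj(H) * Y / (|H|^2 + nsr)."""
    H = psf_to_otf(psf, blurred.shape)
    Y = np.fft.fft2(blurred)
    X_hat = np.conj(H) * Y / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(X_hat))

# Toy example: blur a random image with a 5x5 box kernel, then restore it.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
psf = np.ones((5, 5)) / 25.0
blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * psf_to_otf(psf, sharp.shape)))
restored = wiener_deconvolve(blurred, psf, nsr=1e-6)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

print("blurred error :", mse(blurred, sharp))
print("restored error:", mse(restored, sharp))
```

The `nsr` term keeps the division stable at frequencies where the kernel response is close to zero. In a learned variant, quantities like the kernel and the regularization would typically be estimated by the network rather than fixed by hand.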

The F2former model is trained end-to-end using a combination of loss functions that encourage sharp reconstruction, preservation of high-frequency details, and consistency with the blurred input.

The authors evaluate F2former on both motion deblurring and defocus deblurring, and report performance superior to other state-of-the-art approaches. Qualitative results show that F2former can effectively restore fine details and sharp edges in the deblurred images.
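This summary does not name the metrics behind the quantitative comparison. PSNR is a standard choice on deblurring benchmarks (the LoFormer entry under Related Papers reports it), so the sketch below shows how it is usually computed. The function name `psnr` and the `data_range` parameter are this sketch's own, and images are assumed to be scaled to [0, 1].

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

# A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01, about 20 dB.
reference = np.zeros((8, 8))
estimate = np.full((8, 8), 0.1)
score = psnr(reference, estimate)
print(round(score, 2))
```

Higher is better, and because the scale is logarithmic, a gain of a few tenths of a dB on a benchmark can correspond to a visible difference.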

Critical Analysis

The paper provides a thorough technical description of the F2former model and its key components. The authors have conducted extensive experiments to validate the effectiveness of their approach, and the results are promising.

However, the paper does not discuss potential limitations or areas for further research. For example, it would be interesting to understand how F2former performs on real-world, complex blur scenarios, or how it might be adapted for video deblurring applications.

Additionally, the computational complexity and inference time of the model are not reported, which could be an important consideration for practical deployment. Further analysis of the model's efficiency and tradeoffs would help readers better understand its practical applications.

Conclusion

The F2former model presented in this paper is a notable step forward for image deblurring. By combining the fractional Fourier transform, deep Wiener deconvolution, and a selective frequency transformer, the authors have developed an effective approach for recovering sharp details from blurred images.

The state-of-the-art performance demonstrated on benchmark datasets suggests that F2former could have a meaningful impact on various applications that require high-quality image restoration, such as computational photography, medical imaging, and surveillance. As the authors continue to refine and expand their work, it will be exciting to see how F2former and similar frequency-domain approaches evolve to tackle even more challenging deblurring scenarios.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring

Subhajit Paul, Sahil Kumawat, Ashutosh Gupta, Deepak Mishra

Recent progress in image deblurring techniques focuses mainly on operating in both frequency and spatial domains using the Fourier transform (FT) properties. However, their performance is limited due to the dependency of FT on stationary signals and its lack of capability to extract spatial-frequency properties. In this paper, we propose a novel approach based on the Fractional Fourier Transform (FRFT), a unified spatial-frequency representation leveraging both spatial and frequency components simultaneously, making it ideal for processing non-stationary signals like images. Specifically, we introduce a Fractional Fourier Transformer (F2former), where we combine the classical fractional Fourier based Wiener deconvolution (F2WD) as well as a multi-branch encoder-decoder transformer based on a new fractional frequency aware transformer block (F2TB). We design F2TB consisting of a fractional frequency aware self-attention (F2SA) to estimate element-wise product attention based on important frequency components and a novel feed-forward network based on frequency division multiplexing (FM-FFN) to refine high and low frequency features separately for efficient latent clear image restoration. Experimental results for the cases of both motion deblurring as well as defocus deblurring show that the performance of our proposed method is superior to other state-of-the-art (SOTA) approaches.
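For readers unfamiliar with the FRFT mentioned in the abstract above, one common textbook definition of the one-dimensional continuous transform is given below. The notation (the angle \alpha and the kernel K_\alpha) is this note's own, and the paper may use a different normalization or a discrete two-dimensional variant.

```latex
\mathcal{F}_{\alpha}[f](u) = \int_{-\infty}^{\infty} K_{\alpha}(u, x)\, f(x)\, dx,
\qquad
K_{\alpha}(u, x) = \sqrt{1 - i\cot\alpha}\;
\exp\!\left( i\pi \left( u^{2}\cot\alpha - 2ux\csc\alpha + x^{2}\cot\alpha \right) \right),
\quad \alpha \notin \pi\mathbb{Z}.
```

Setting \alpha = \pi/2 reduces the kernel to e^{-2\pi i u x}, which is the ordinary Fourier transform, while \alpha \to 0 approaches the identity. Intermediate angles sit between the spatial and frequency domains, which is the sense in which the abstract calls the FRFT a unified spatial-frequency representation.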

9/4/2024

LoFormer: Local Frequency Transformer for Image Deblurring

Xintian Mao, Jiansheng Wang, Xingran Xie, Qingli Li, Yan Wang

Due to the computational complexity of self-attention (SA), prevalent techniques for image deblurring often resort to either adopting localized SA or employing coarse-grained global SA methods, both of which exhibit drawbacks such as compromising global modeling or lacking fine-grained correlation. In order to address this issue by effectively modeling long-range dependencies without sacrificing fine-grained details, we introduce a novel approach termed Local Frequency Transformer (LoFormer). Within each unit of LoFormer, we incorporate a Local Channel-wise SA in the frequency domain (Freq-LC) to simultaneously capture cross-covariance within low- and high-frequency local windows. These operations offer the advantage of (1) ensuring equitable learning opportunities for both coarse-grained structures and fine-grained details, and (2) exploring a broader range of representational properties compared to coarse-grained global SA methods. Additionally, we introduce an MLP Gating mechanism complementary to Freq-LC, which serves to filter out irrelevant features while enhancing global learning capabilities. Our experiments demonstrate that LoFormer significantly improves performance in the image deblurring task, achieving a PSNR of 34.09 dB on the GoPro dataset with 126G FLOPs. https://github.com/DeepMed-Lab-ECNU/Single-Image-Deblur

7/25/2024

DEFormer: DCT-driven Enhancement Transformer for Low-light Image and Dark Vision

Xiangchen Yin, Zhenda Yu, Xin Gao, Xiao Sun

Low-light image enhancement restores the colors and details of a single image and improves high-level visual tasks. However, restoring the lost details in dark areas is challenging when relying only on the RGB domain. In this paper, we introduce frequency as a new clue into the network and propose a DCT-driven enhancement transformer (DEFormer) framework. First, we propose a learnable frequency branch (LFB) for frequency enhancement, which contains DCT processing and curvature-based frequency enhancement (CFE) to represent frequency features. In addition, we propose a cross domain fusion (CDF) for reducing the differences between the RGB domain and the frequency domain. Our DEFormer has achieved advanced results on both the LOL and MIT-Adobe FiveK datasets and improved the performance of dark detection.

9/10/2024

On a time-frequency blurring operator with applications in data augmentation

Simon Halvdansson

Inspired by the success of recent data augmentation methods for signals which act on time-frequency representations, we introduce an operator which convolves the short-time Fourier transform of a signal with a specified kernel. Analytical properties including boundedness, compactness and positivity are investigated from the perspective of time-frequency analysis. A convolutional neural network and a vision transformer are trained to classify audio signals using spectrograms with different augmentation setups, including the above mentioned time-frequency blurring operator, with results indicating that the operator can significantly improve test performance, especially in the data-starved regime.

5/22/2024