Exploring Efficient Asymmetric Blind-Spots for Self-Supervised Denoising in Real-World Scenarios

Read original: arXiv:2303.16783 - Published 4/12/2024 by Shiyan Chen, Jiyuan Zhang, Zhaofei Yu, Tiejun Huang

Overview

Self-supervised denoising trains models without clean reference images, which makes it practical for real-world scenarios.
However, real-world noise is often spatially correlated, which breaks the pixel-wise independence assumption that many self-supervised algorithms rely on.
Existing methods try to break this correlation with downsampling or neighborhood masking, but downsampling causes aliasing and loss of detail, while neighborhood masking is either computationally expensive or does not preserve local spatial information during inference.

Plain English Explanation

In the real world, the noise at one pixel is often statistically related to the noise at nearby pixels, rather than being independent from pixel to pixel. This causes problems for many self-supervised denoising algorithms, which assume that the noise at each pixel is independent of its neighbors.

Recent research has tried to address this by either downsampling the image or masking out neighborhoods of the image during training. However, downsampling can lead to aliasing and a loss of detail, while neighborhood masking can be computationally expensive.
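To make the idea of correlated noise and the downsampling workaround concrete, here is a minimal one-dimensional sketch. It is not taken from the paper: the 3-tap moving average used to create correlated noise, the stride of 3, and all function names are assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def neighbor_correlation(signal):
    """Correlation between each sample and its right-hand neighbor."""
    return pearson(signal[:-1], signal[1:])


rng = random.Random(0)

# Pixel-wise independent noise: the assumption many methods make.
iid_noise = [rng.gauss(0.0, 1.0) for _ in range(30_000)]

# Hypothetical correlated noise: a 3-tap moving average of the i.i.d. noise,
# so adjacent samples share two of their three underlying draws
# (expected neighbor correlation of 2/3).
correlated_noise = [
    (iid_noise[i] + iid_noise[i + 1] + iid_noise[i + 2]) / 3.0
    for i in range(len(iid_noise) - 2)
]

# Stride-3 subsampling: retained samples no longer share any underlying draw,
# but two thirds of the samples (the "detail") are thrown away.
subsampled_noise = correlated_noise[::3]

corr_iid = neighbor_correlation(iid_noise)
corr_correlated = neighbor_correlation(correlated_noise)
corr_subsampled = neighbor_correlation(subsampled_noise)

print(f"independent noise:  {corr_iid:+.3f}")
print(f"correlated noise:   {corr_correlated:+.3f}")
print(f"after subsampling:  {corr_subsampled:+.3f}")
```

In this toy setting, subsampling removes the correlation between neighboring samples, but only by discarding most of the signal, which is the trade-off described above.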

The key insight from this paper is that the best way to handle real-world, spatially correlated noise is to train the denoising model at the original input resolution, and use asymmetric operations during training and inference. This helps preserve local spatial information while also suppressing the correlated noise.

Technical Explanation

This paper proposes a new model called the Asymmetric Tunable Blind-Spot Network (AT-BSN) for self-supervised denoising of real-world, spatially correlated noise. The key idea is to allow the size of the "blind spot" (i.e. the region of the image that is masked out during training) to be freely adjusted.
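The effect of the blind-spot size can be illustrated with a small sketch. This is not the AT-BSN architecture: it replaces the network with a simple masked average over a 5x5 neighborhood, and the function names, image values, and radii are all assumptions made for illustration.

```python
def blind_spot_mask(kernel_size, radius):
    """Return a kernel_size x kernel_size 0/1 mask whose central
    (2 * radius + 1) x (2 * radius + 1) block is zeroed (the blind spot)."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    if not 0 <= radius < kernel_size // 2:
        raise ValueError("radius must leave at least one visible ring")
    center = kernel_size // 2
    return [
        [0 if max(abs(r - center), abs(c - center)) <= radius else 1
         for c in range(kernel_size)]
        for r in range(kernel_size)
    ]


def masked_mean_predict(image, row, col, mask):
    """Predict image[row][col] as the mean of the neighbors visible under
    the mask. Neighbors outside the image are skipped."""
    half = len(mask) // 2
    total, count = 0.0, 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(image) and 0 <= c < len(image[0])
            if inside and mask[dr + half][dc + half]:
                total += image[r][c]
                count += 1
    if count == 0:
        raise ValueError("no visible neighbors for this pixel")
    return total / count


# Toy 7x7 image: clean value 10.0 everywhere, plus a 3x3 patch of
# correlated noise (value 99.0) centered on pixel (3, 3).
image = [[10.0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        image[r][c] = 99.0

# Blind spot covering only the center pixel: the noisy neighbors leak in.
narrow = masked_mean_predict(image, 3, 3, blind_spot_mask(5, radius=0))
# Blind spot covering the whole 3x3 patch: the correlated noise is excluded.
wide = masked_mean_predict(image, 3, 3, blind_spot_mask(5, radius=1))

print(f"radius 0 prediction: {narrow:.2f}")
print(f"radius 1 prediction: {wide:.2f}")
```

A larger blind spot keeps the correlated noise out of the prediction, but it also ignores the pixels closest to the target, which usually carry the most useful local detail. Being able to adjust the radius is what lets the two concerns be traded off.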

Adjusting the blind-spot size trades off suppressing noise correlation against preserving local spatial information in the image. The pre-trained AT-BSN then serves as a "meta-teacher": sampling different blind-spot sizes yields multiple teacher networks, which are used to distill a smaller, more efficient student network through a blind-spot based multi-teacher distillation strategy.
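The distillation idea can be sketched roughly as follows. This summary does not specify the paper's loss or sampling scheme, so everything here is an assumption: the meta-teacher is a stand-in masked average on a 1-D signal rather than a trained network, teachers are obtained by sampling a blind-spot radius, and the loss is the mean L1 distance to each teacher, averaged over teachers.

```python
import random


def meta_teacher(signal, radius, ring=2):
    """Stand-in for a pre-trained blind-spot denoiser (hypothetical).
    Each sample is predicted as the mean of neighbors whose offset d
    satisfies radius < |d| <= radius + ring, so the blind spot grows
    with `radius`."""
    n = len(signal)
    output = []
    for i in range(n):
        visible = [
            signal[i + d]
            for d in range(-(radius + ring), radius + ring + 1)
            if abs(d) > radius and 0 <= i + d < n
        ]
        if not visible:
            raise ValueError("signal too short for this blind-spot radius")
        output.append(sum(visible) / len(visible))
    return output


def distillation_loss(student_output, noisy, teacher_radii):
    """Mean L1 distance between the student output and each sampled
    teacher's output, averaged over the teachers (an assumed loss)."""
    total = 0.0
    for radius in teacher_radii:
        target = meta_teacher(noisy, radius)
        total += sum(abs(s - t) for s, t in zip(student_output, target)) / len(noisy)
    return total / len(teacher_radii)


rng = random.Random(0)
noisy = [1.0 + rng.gauss(0.0, 0.2) for _ in range(64)]

# Sample several teachers from the meta-teacher by drawing blind-spot radii.
teacher_radii = [rng.choice([0, 1, 2]) for _ in range(4)]

# Placeholder student that simply copies its input; a real student would be
# a lightweight network trained to minimize this loss.
student_output = list(noisy)
loss = distillation_loss(student_output, noisy, teacher_radii)
print(f"sampled radii: {teacher_radii}, identity-student loss: {loss:.4f}")
```

The point of the sketch is only the structure: one pre-trained model produces several teachers by varying the blind spot, and the student is scored against all of them. The student itself has no blind-spot constraint, so it can be a smaller conventional network.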

Experiments show that this approach achieves state-of-the-art performance on multiple datasets, while also being more computationally efficient than other self-supervised denoising methods that rely on downsampling or complex neighborhood masking schemes.

Critical Analysis

The paper makes a strong case for the importance of handling spatially correlated noise in real-world self-supervised denoising tasks. The proposed AT-BSN model and multi-teacher distillation approach appear to be effective solutions, as evidenced by the impressive experimental results.

However, the paper does not discuss potential limitations or caveats of the method. For example, it's unclear how the approach would scale to very high-resolution images, or how sensitive the performance is to the choice of blind-spot size. Additionally, the computational complexity of the meta-teacher network and distillation process is not analyzed in depth.

Further research could explore ways to make the blind-spot tuning and distillation process more efficient, as well as investigate the model's robustness to different types of spatially correlated noise. Comparisons to other denoising techniques, such as filtering-based methods or diffusion-based approaches, could also provide additional insights.

Conclusion

This paper presents a novel self-supervised denoising method, AT-BSN, that is specifically designed to handle spatially correlated noise in real-world scenarios. By preserving local spatial information and using asymmetric operations, the model is able to achieve state-of-the-art performance while being more computationally efficient than previous approaches.

The key innovation is the use of a tunable blind-spot size, which allows for a better balance between noise correlation suppression and local spatial preservation. The meta-teacher distillation strategy further improves the model's efficiency, making it a promising solution for practical denoising applications.

Overall, this research highlights the importance of addressing spatially correlated noise in self-supervised denoising, and provides a compelling approach that could have significant impact on real-world image and video processing tasks.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Exploring Efficient Asymmetric Blind-Spots for Self-Supervised Denoising in Real-World Scenarios

Shiyan Chen, Jiyuan Zhang, Zhaofei Yu, Tiejun Huang

Self-supervised denoising has attracted widespread attention due to its ability to train without clean images. However, noise in real-world scenarios is often spatially correlated, which causes many self-supervised algorithms that assume pixel-wise independent noise to perform poorly. Recent works have attempted to break noise correlation with downsampling or neighborhood masking. However, denoising on downsampled subgraphs can lead to aliasing effects and loss of details due to a lower sampling rate. Furthermore, the neighborhood masking methods either come with high computational complexity or do not consider local spatial preservation during inference. Through the analysis of existing methods, we point out that the key to obtaining high-quality and texture-rich results in real-world self-supervised denoising tasks is to train at the original input resolution structure and use asymmetric operations during training and inference. Based on this, we propose Asymmetric Tunable Blind-Spot Network (AT-BSN), where the blind-spot size can be freely adjusted, thus better balancing noise correlation suppression and image local spatial destruction during training and inference. In addition, we regard the pre-trained AT-BSN as a meta-teacher network capable of generating various teacher networks by sampling different blind-spots. We propose a blind-spot based multi-teacher distillation strategy to distill a lightweight network, significantly improving performance. Experimental results on multiple datasets prove that our method achieves state-of-the-art, and is superior to other self-supervised algorithms in terms of computational overhead and visual effects.

4/12/2024

Asymmetric Mask Scheme for Self-Supervised Real Image Denoising

Xiangyu Liao, Tianheng Zheng, Jiayu Zhong, Pingping Zhang, Chao Ren

In recent years, self-supervised denoising methods have gained significant success and become critically important in the field of image restoration. Among them, the blind spot network based methods are the most typical type and have attracted the attentions of a large number of researchers. Although the introduction of blind spot operations can prevent identity mapping from noise to noise, it imposes stringent requirements on the receptive fields in the network design, thereby limiting overall performance. To address this challenge, we propose a single mask scheme for self-supervised denoising training, which eliminates the need for blind spot operation and thereby removes constraints on the network structure design. Furthermore, to achieve denoising across entire image during inference, we propose a multi-mask scheme. Our method, featuring the asymmetric mask scheme in training and inference, achieves state-of-the-art performance on existing real noisy image datasets. All the source code will be made available to the public.

7/16/2024

TBSN: Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising

Junyi Li, Zhilu Zhang, Wangmeng Zuo

Blind-spot networks (BSN) have been prevalent network architectures in self-supervised image denoising (SSID). Existing BSNs are mostly conducted with convolution layers. Although transformers offer potential solutions to the limitations of convolutions and have demonstrated success in various image restoration tasks, their attention mechanisms may violate the blind-spot requirement, thus restricting their applicability in SSID. In this paper, we present a transformer-based blind-spot network (TBSN) by analyzing and redesigning the transformer operators that meet the blind-spot requirement. Specifically, TBSN follows the architectural principles of dilated BSNs, and incorporates spatial as well as channel self-attention layers to enhance the network capability. For spatial self-attention, an elaborate mask is applied to the attention matrix to restrict its receptive field, thus mimicking the dilated convolution. For channel self-attention, we observe that it may leak the blind-spot information when the channel number is greater than spatial size in the deep layers of multi-scale architectures. To eliminate this effect, we divide the channel into several groups and perform channel attention separately. Furthermore, we introduce a knowledge distillation strategy that distills TBSN into smaller denoisers to improve computational efficiency while maintaining performance. Extensive experiments on real-world image denoising datasets show that TBSN largely extends the receptive field and exhibits favorable performance against state-of-the-art SSID methods. The code and pre-trained models will be publicly available at https://github.com/nagejacob/TBSN.

4/12/2024

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

We introduce a novel approach to single image denoising based on the Blind Spot Denoising principle, which we call MAsked and SHuffled Blind Spot Denoising (MASH). We focus on the case of correlated noise, which often plagues real images. MASH is the result of a careful analysis to determine the relationships between the level of blindness (masking) of the input and the (unknown) noise correlation. Moreover, we introduce a shuffling technique to weaken the local correlation of noise, which in turn yields an additional denoising performance improvement. We evaluate MASH via extensive experiments on real-world noisy image datasets. We demonstrate on par or better results compared to existing self-supervised denoising methods.

4/16/2024