ShadowRefiner: Towards Mask-free Shadow Removal via Fast Fourier Transformer

Read original: arXiv:2406.02559 - Published 7/4/2024 by Wei Dong, Han Zhou, Yuqiong Tian, Jingke Sun, Xiaohong Liu, Guangtao Zhai, Jun Chen


Overview

Introduces a mask-free Shadow Removal and Refinement network (ShadowRefiner) that effectively removes shadows from real-world images while preserving details
Employs a Shadow Removal module to learn effective mappings between shadow-affected and shadow-free images using spatial and frequency representation learning
Proposes a Fast-Fourier Attention based Transformer (FFAT) architecture for refinement to mitigate pixel misalignment and improve image quality
Achieves top performance in the Perceptual Track and second-best in the Fidelity Track of the NTIRE 2024 Image Shadow Removal Challenge

Plain English Explanation

Shadows in images can cause significant problems for computer vision systems like object detection and segmentation. ShadowRefiner aims to effectively remove shadows from real-world images while preserving important details and producing visually appealing results.

The key idea is to use a two-part approach. First, the Shadow Removal module learns how to convert shadow-affected images into shadow-free versions by analyzing both the spatial and frequency aspects of the image. This helps the system understand the relationship between the shadow areas and the underlying scene.

Then, the Fast-Fourier Attention based Transformer (FFAT) refines the output to further improve the image quality. This part of the system is designed to fix any misalignment of pixels that may have occurred during the shadow removal process.
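To see why pixel misalignment matters and why a Fourier view can help, consider the toy sketch below. It is not the FFAT architecture, whose internals this summary does not describe. It only illustrates a standard property of the Fourier transform: a circular shift changes the phase but leaves the amplitude spectrum untouched. The random `target` image and the one-pixel shift are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ground-truth patch (random values stand in for real pixels).
target = rng.uniform(0.0, 1.0, size=(16, 16))

# Simulate misalignment: the same content, circularly shifted one pixel right.
shifted = np.roll(target, shift=1, axis=1)

# A pixel-wise L1 comparison penalises the shift heavily...
pixel_l1 = np.mean(np.abs(shifted - target))

# ...while the Fourier amplitude spectra of the two images are identical
# up to floating-point error (the shift only alters the phase).
amp_target = np.abs(np.fft.fft2(target))
amp_shifted = np.abs(np.fft.fft2(shifted))
amp_l1 = np.mean(np.abs(amp_shifted - amp_target))

print(f"pixel-space L1:     {pixel_l1:.4f}")
print(f"amplitude-space L1: {amp_l1:.2e}")
```

The pixel-space error is large even though the two images show identical content, which is the kind of mismatch that can blur a model trained only on pixel-wise comparisons against imperfectly aligned ground truth.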

The researchers entered this approach in the NTIRE 2024 Image Shadow Removal Challenge, where it won the Perceptual Track and placed second in the Fidelity Track. This demonstrates the effectiveness of their Shadow Removal and Refinement network for addressing the problem of shadows in real-world images.

Technical Explanation

The paper introduces a mask-free Shadow Removal and Refinement network (ShadowRefiner) that leverages a Fast Fourier Transformer to effectively remove shadows from real-world images while preserving intricate details.

The Shadow Removal module aims to establish effective mappings between shadow-affected and shadow-free images by learning spatial and frequency representations. This allows the system to understand the relationship between the shadow areas and the underlying scene content. To further improve the image quality and mitigate pixel misalignment, the authors propose a novel Fast-Fourier Attention based Transformer (FFAT) architecture, which incorporates an innovative attention mechanism for meticulous refinement.
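As a rough illustration of what a frequency representation offers alongside the spatial one, the sketch below splits an image into Fourier amplitude and phase. It is a minimal example under stated assumptions, not the authors' Shadow Removal module: the shadow is modelled as uniform dimming by a factor of 0.5 (real shadows are spatially varying), and the helper names `amplitude_phase` and `recombine` are hypothetical.

```python
import numpy as np

def amplitude_phase(image):
    """Split a 2-D grayscale image into Fourier amplitude and phase."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)

def recombine(amplitude, phase):
    """Rebuild an image from its Fourier amplitude and phase."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(spectrum).real

rng = np.random.default_rng(0)
scene = rng.uniform(0.2, 1.0, size=(8, 8))  # hypothetical shadow-free patch
shadowed = 0.5 * scene                      # toy shadow: uniform dimming (assumption)

amp_scene, phase_scene = amplitude_phase(scene)
amp_shadow, phase_shadow = amplitude_phase(shadowed)

# Uniform dimming scales the amplitude and leaves the phase alone.
amplitude_halved = np.allclose(amp_shadow, 0.5 * amp_scene)
# Compare phases as unit complex numbers to avoid wrap-around at +/- pi.
phase_unchanged = np.allclose(np.exp(1j * phase_shadow), np.exp(1j * phase_scene))

# In this toy setting, correcting the amplitude alone recovers the scene.
restored = recombine(2.0 * amp_shadow, phase_shadow)

print(amplitude_halved, phase_unchanged, np.allclose(restored, scene))
```

In this simplified model the illumination change is isolated in the amplitude while the scene structure stays in the phase, which gives some intuition for why learning in both the spatial and frequency domains can be useful.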

The proposed ShadowRefiner method outperformed other submissions in the Perceptual Track and achieved the second-best performance in the Fidelity Track of the NTIRE 2024 Image Shadow Removal Challenge. The researchers also report comprehensive experiments that demonstrate the effectiveness of the proposed method.
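A fidelity result measures how closely the output matches the ground-truth shadow-free image. The sketch below computes PSNR, a standard fidelity metric in image restoration. This summary does not state which metrics the challenge used, so treating PSNR as representative is an assumption, and the function name `psnr` and the toy arrays are illustrative only.

```python
import numpy as np

def psnr(prediction, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    mse = np.mean((prediction - reference) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: every pixel of the prediction is off by 0.1.
reference = np.zeros((4, 4))
prediction = np.full((4, 4), 0.1)

score = psnr(prediction, reference)
print(f"PSNR: {score:.2f} dB")
```

Higher PSNR means lower pixel-wise error, but it does not always track perceived quality, which is one reason a challenge may rank perceptual and fidelity results separately.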

Critical Analysis

The paper provides a compelling solution for removing shadows from real-world images while preserving important details. The authors' use of spatial and frequency representation learning, along with the FFAT architecture for refinement, appears to be an effective approach to the challenge of shadow removal.

However, the paper does not discuss any potential limitations or caveats of the proposed method. For example, it would be helpful to understand how the ShadowRefiner network performs on a wider range of shadow types, lighting conditions, or image resolutions. Additionally, the researchers could explore the computational efficiency and inference speed of their approach, as these factors may be important for real-world applications.

Further research could also investigate the generalization capabilities of the ShadowRefiner network, such as its ability to handle unseen shadow patterns or transfer to different domains. Exploring the interpretability of the model's decision-making process could also provide valuable insights for practitioners.

Overall, the ShadowRefiner method represents a promising step forward in the field of shadow removal, but additional research and analysis could help uncover the limitations and further improve the technique.

Conclusion

The ShadowRefiner network introduced in this paper removes shadows from real-world images while preserving important details and producing visually compelling results. By combining spatial and frequency representation learning with a novel FFAT architecture for refinement, the method won the Perceptual Track and placed second in the Fidelity Track of the NTIRE 2024 Image Shadow Removal Challenge.

The technical insights and experimental results demonstrated in this work suggest that the ShadowRefiner approach could have significant implications for various computer vision applications that are impacted by the presence of shadows, such as object detection and segmentation. Further research to explore the limitations and generalization capabilities of the method could help advance the field of shadow removal and unlock new possibilities for real-world image processing.



Related Papers


ShadowRefiner: Towards Mask-free Shadow Removal via Fast Fourier Transformer

Wei Dong, Han Zhou, Yuqiong Tian, Jingke Sun, Xiaohong Liu, Guangtao Zhai, Jun Chen

Shadow-affected images often exhibit pronounced spatial discrepancies in color and illumination, consequently degrading various vision applications including object detection and segmentation systems. To effectively eliminate shadows in real-world images while preserving intricate details and producing visually compelling outcomes, we introduce a mask-free Shadow Removal and Refinement network (ShadowRefiner) via Fast Fourier Transformer. Specifically, the Shadow Removal module in our method aims to establish effective mappings between shadow-affected and shadow-free images via spatial and frequency representation learning. To mitigate the pixel misalignment and further improve the image quality, we propose a novel Fast-Fourier Attention based Transformer (FFAT) architecture, where an innovative attention mechanism is designed for meticulous refinement. Our method wins the championship in the Perceptual Track and achieves the second best performance in the Fidelity Track of the NTIRE 2024 Image Shadow Removal Challenge. Besides, comprehensive experimental results also demonstrate the compelling effectiveness of our proposed method. The code is publicly available: https://github.com/movingforward100/Shadow_R.

7/4/2024

ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal

Zhuohao Li, Guoyang Xie, Guannan Jiang, Zhichao Lu

Transformer recently emerged as the de facto model for computer vision tasks and has also been successfully applied to shadow removal. However, these existing methods heavily rely on intricate modifications to the attention mechanisms within the transformer blocks while using a generic patch embedding. As a result, it often leads to complex architectural designs requiring additional computation resources. In this work, we aim to explore the efficacy of incorporating shadow information within the early processing stage. Accordingly, we propose a transformer-based framework with a novel patch embedding that is tailored for shadow removal, dubbed ShadowMaskFormer. Specifically, we present a simple and effective mask-augmented patch embedding to integrate shadow information and promote the model's emphasis on acquiring knowledge for shadow regions. Extensive experiments conducted on the ISTD, ISTD+, and SRD benchmark datasets demonstrate the efficacy of our method against state-of-the-art approaches while using fewer model parameters.

5/1/2024


DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal

Ziyang Zhou, Yingtie Lei, Xuhang Chen, Shenghong Luo, Wenjun Zhang, Chi-Man Pun, Zhen Wang

Shadows in scanned documents pose significant challenges for document analysis and recognition tasks due to their negative impact on visual quality and readability. Current shadow removal techniques, including traditional methods and deep learning approaches, face limitations in handling varying shadow intensities and preserving document details. To address these issues, we propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid. By decomposing the shadow image into multiple frequency bands and employing two critical modules, the Attention-Aggregation Network for low-frequency shadow removal and the Gated Multi-scale Fusion Transformer for global refinement, DocDeshadower effectively removes shadows at different scales while preserving document content. Extensive experiments demonstrate DocDeshadower's superior performance compared to state-of-the-art methods, highlighting its potential to significantly improve document shadow removal techniques. The code is available at https://github.com/leiyingtie/DocDeshadower.

7/31/2024

SoftShadow: Leveraging Penumbra-Aware Soft Masks for Shadow Removal

Xinrui Wang, Lanqing Guo, Xiyu Wang, Siyu Huang, Bihan Wen

Recent advancements in deep learning have yielded promising results for the image shadow removal task. However, most existing methods rely on binary pre-generated shadow masks. The binary nature of such masks could potentially lead to artifacts near the boundary between shadow and non-shadow areas. In view of this, inspired by the physical model of shadow formation, we introduce novel soft shadow masks specifically designed for shadow removal. To achieve such soft masks, we propose a SoftShadow framework by leveraging the prior knowledge of pretrained SAM and integrating physical constraints. Specifically, we jointly tune the SAM and the subsequent shadow removal network using penumbra formation constraint loss and shadow removal loss. This framework enables accurate predictions of penumbra (partially shaded regions) and umbra (fully shaded regions) areas while simultaneously facilitating end-to-end shadow removal. Through extensive experiments on popular datasets, we found that our SoftShadow framework, which generates soft masks, can better restore boundary artifacts, achieve state-of-the-art performance, and demonstrate superior generalizability.

9/12/2024