Npix2Cpix: A GAN-based Image-to-Image Translation Network with Retrieval-Classification Integration for Watermark Retrieval from Historical Document Images

Read original: arXiv:2406.03556 - Published 9/17/2024 by Utsab Saha, Sawradip Saha, Shaikh Anowarul Fattah, Mohammad Saquib

Npix2Cpix: A GAN-based Image-to-Image Translation Network with Retrieval-Classification Integration for Watermark Retrieval from Historical Document Images

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

Npix2Cpix: A GAN-based Image-to-Image Translation Network with Retrieval-Classification Integration for Watermark Retrieval from Historical Document Images

Utsab Saha, Sawradip Saha, Shaikh Anowarul Fattah, Mohammad Saquib

The identification and restoration of ancient watermarks have long been a major topic in codicology and history. Classifying historical documents based on watermarks is challenging due to their diversity, noisy samples, multiple representation modes, and minor distinctions between classes and intra-class variations. This paper proposes a modified U-net-based conditional generative adversarial network (GAN) named Npix2Cpix to translate noisy raw historical watermarked images into clean, handwriting-free watermarked images by performing image translation from degraded (noisy) pixels to clean pixels. Using image-to-image translation and adversarial learning, the network creates clutter-free images for watermark restoration and categorization. The generator and discriminator of the proposed GAN are trained using two separate loss functions, each based on the distance between images, to learn the mapping from the input noisy image to the output clean image. After using the proposed GAN to pre-process noisy watermarked images, Siamese-based one-shot learning is employed for watermark classification. Experimental results on a large-scale historical watermark dataset demonstrate that cleaning the noisy watermarked images can help to achieve high one-shot classification accuracy. The qualitative and quantitative evaluation of the retrieved watermarked image highlights the effectiveness of the proposed approach.

9/17/2024

Novel Hybrid Integrated Pix2Pix and WGAN Model with Gradient Penalty for Binary Images Denoising

Luca Tirel, Ali Mohamed Ali, Hashim A. Hashim

This paper introduces a novel approach to image denoising that leverages the advantages of Generative Adversarial Networks (GANs). Specifically, we propose a model that combines elements of the Pix2Pix model and the Wasserstein GAN (WGAN) with Gradient Penalty (WGAN-GP). This hybrid framework seeks to capitalize on the denoising capabilities of conditional GANs, as demonstrated in the Pix2Pix model, while mitigating the need for an exhaustive search for optimal hyperparameters that could potentially ruin the stability of the learning process. In the proposed method, the GAN's generator is employed to produce denoised images, harnessing the power of a conditional GAN for noise reduction. Simultaneously, the implementation of the Lipschitz continuity constraint during updates, as featured in WGAN-GP, aids in reducing susceptibility to mode collapse. This innovative design allows the proposed model to benefit from the strong points of both Pix2Pix and WGAN-GP, generating superior denoising results while ensuring training stability. Drawing on previous work on image-to-image translation and GAN stabilization techniques, the proposed research highlights the potential of GANs as a general-purpose solution for denoising. The paper details the development and testing of this model, showcasing its effectiveness through numerical experiments. The dataset was created by adding synthetic noise to clean images. Numerical results based on real-world dataset validation underscore the efficacy of this approach in image-denoising tasks, exhibiting significant enhancements over traditional techniques. Notably, the proposed model demonstrates strong generalization capabilities, performing effectively even when trained with synthetic noise.

8/1/2024

🖼️

Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

Zhenglin Li, Bo Guan, Yuanzhou Wei, Yiming Zhou, Jingyu Zhang, Jinxin Xu

Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images, and enhanced by a tailored training regimen. The results demonstrate the model's capability to accurately render complex urban features, establishing its efficacy and potential for broad real-world applications.

5/2/2024

🌀

Fingerprinting Image-to-Image Generative Adversarial Networks

Guanlin Li, Guowen Xu, Han Qiu, Shangwei Guo, Run Wang, Jiwei Li, Tianwei Zhang, Rongxing Lu

Generative Adversarial Networks (GANs) have been widely used in various application scenarios. Since the production of a commercial GAN requires substantial computational and human resources, the copyright protection of GANs is urgently needed. This paper presents a novel fingerprinting scheme for the Intellectual Property (IP) protection of image-to-image GANs based on a trusted third party. We break through the stealthiness and robustness bottlenecks suffered by previous fingerprinting methods for classification models being naively transferred to GANs. Specifically, we innovatively construct a composite deep learning model from the target GAN and a classifier. Then we generate fingerprint samples from this composite model, and embed them in the classifier for effective ownership verification. This scheme inspires some concrete methodologies to practically protect the modern image-to-image translation GANs. Theoretical analysis proves that these methods can satisfy different security requirements necessary for IP protection. We also conduct extensive experiments to show that our solutions outperform existing strategies.

8/9/2024