Patch of Invisibility: Naturalistic Physical Black-Box Adversarial Attacks on Object Detectors

Read original: arXiv:2303.04238 - Published 8/20/2024 by Raz Lapid, Eylon Mizrahi, Moshe Sipper


Overview

Adversarial attacks on deep learning models have been a growing area of research.
Most work has focused on white-box attacks, where the attacker has access to the model's internal parameters, an assumption that is usually unrealistic in the real world.
Some attacks also perturb the entire pixel space to fool a model, which is neither practical nor physical (i.e., realizable in the real world).
This paper proposes a direct, black-box, gradient-free method that uses the learned image manifold of a pretrained generative adversarial network (GAN) to generate naturalistic physical adversarial patches for object detectors.
According to the authors, this is the first method to perform black-box physical attacks directly on object-detection models, which makes the attack model-agnostic.

Plain English Explanation

<aimodels.fyi/papers/arxiv/bb-patch-blackbox-adversarial-patch-attack-using>Adversarial attacks</aimodels.fyi/papers/arxiv/bb-patch-blackbox-adversarial-patch-attack-using> are a way of fooling deep learning models, such as object detectors, by altering the input so that the model misreads it; classic attacks use changes too small for humans to notice. Most attacks so far have assumed the attacker has full access to the inner workings of the model, which is rarely the case in real-world situations.

<aimodels.fyi/papers/arxiv/pad-patch-agnostic-defense-against-adversarial-patch>Some attacks have also tried to fool the model by changing the entire image</aimodels.fyi/papers/arxiv/pad-patch-agnostic-defense-against-adversarial-patch>, which is not feasible in the physical world, where an attacker cannot control every pixel a camera sees. Instead, this paper proposes a new way to create small, sticker-like patches that can be placed in a scene or added to an image to fool the object detector, without needing to know how the detector works on the inside.

<aimodels.fyi/papers/arxiv/mvpatch-more-vivid-patch-adversarial-camouflaged-attacks>The key idea is to use a pretrained "generative adversarial network" (GAN)</aimodels.fyi/papers/arxiv/mvpatch-more-vivid-patch-adversarial-camouflaged-attacks>, which is a type of AI model that can generate realistic-looking images. The researchers search among the images the GAN can produce for ones that work as adversarial patches, so the resulting patches look natural rather than like random noise.

According to the authors, this is the first method to attack object detectors directly in a black-box, physical setting, and it works both digitally and in the real world. They report that their approach outperforms the other black-box attack methods they tested.

Technical Explanation

<aimodels.fyi/papers/arxiv/multi-view-black-box-physical-attacks-infrared>The paper proposes a direct, black-box, gradient-free method to generate physical adversarial patches for object detectors</aimodels.fyi/papers/arxiv/multi-view-black-box-physical-attacks-infrared>. Unlike previous white-box attacks that require access to the model's internal parameters, this approach is model-agnostic and only uses the input and output of the target object detector.
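
To make the black-box setting concrete, the following is a minimal sketch of an attack objective that touches only the detector's input and output. Everything here is an assumption made for illustration: `apply_patch`, `toy_detector`, and `attack_objective` are hypothetical names, images are tiny grayscale grids, and the "detector" is a stand-in that returns mean brightness as its confidence. It is not the paper's implementation.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale pixels in [0, 1], row-major


def apply_patch(image: Image, patch: Image, top: int, left: int) -> Image:
    """Return a copy of `image` with `patch` pasted at (top, left)."""
    out = [row[:] for row in image]
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            out[top + r][left + c] = value
    return out


def toy_detector(image: Image) -> float:
    """Hypothetical stand-in for a real detector: returns a confidence
    in [0, 1], here simply the mean pixel brightness."""
    pixels = [v for row in image for v in row]
    return sum(pixels) / len(pixels)


def attack_objective(patch: Image, images: List[Image],
                     detector: Callable[[Image], float],
                     top: int = 0, left: int = 0) -> float:
    """Mean detector confidence over patched images (lower is better for
    the attacker). Only detector outputs are used: no weights, no gradients."""
    scores = [detector(apply_patch(img, patch, top, left)) for img in images]
    return sum(scores) / len(scores)


# Two all-white 4x4 images and a dark 2x2 patch.
images = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
dark_patch = [[0.0, 0.0], [0.0, 0.0]]

clean_score = attack_objective([], images, toy_detector)       # empty patch: images unchanged
patched_score = attack_objective(dark_patch, images, toy_detector)
```

Because the objective treats the detector as an opaque callable, any model that returns a confidence score could in principle be swapped in, which is what makes this style of attack model-agnostic.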

The key innovation is the use of a pretrained generative adversarial network (GAN) to generate the adversarial patches. The GAN learns the distribution of natural images, and the researchers then optimize patches within this learned image manifold to reliably fool the object detector, both digitally and physically.
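
As a rough illustration of optimizing within a generator's latent space using only detector outputs, here is a minimal sketch. The names `generator`, `black_box_score`, and `latent_search` are hypothetical, the "GAN" is a fixed random linear map followed by a squashing function, and the search is plain random-perturbation hill climbing, which is not necessarily the optimizer the authors use.

```python
import math
import random
from typing import Callable, List, Tuple

LATENT_DIM = 8
PATCH_PIXELS = 16

_weights_rng = random.Random(0)
# Hypothetical frozen "generator" weights; a real attack would load a pretrained GAN.
WEIGHTS = [[_weights_rng.gauss(0, 1) for _ in range(LATENT_DIM)]
           for _ in range(PATCH_PIXELS)]


def generator(z: List[float]) -> List[float]:
    """Toy stand-in for a pretrained generator: latent vector -> pixels in [0, 1]."""
    return [0.5 * (math.tanh(sum(w * x for w, x in zip(row, z))) + 1.0)
            for row in WEIGHTS]


def black_box_score(patch: List[float]) -> float:
    """Toy stand-in for detector confidence; the search sees only this number."""
    return sum(patch) / len(patch)


def latent_search(score_fn: Callable[[List[float]], float],
                  steps: int = 50, pop: int = 8, sigma: float = 0.3,
                  seed: int = 1) -> Tuple[List[float], float, List[float]]:
    """Gradient-free search: perturb the latent vector and keep improvements."""
    rng = random.Random(seed)
    best_z = [0.0] * LATENT_DIM
    best = score_fn(generator(best_z))
    history = [best]
    for _ in range(steps):
        for _ in range(pop):
            candidate = [x + sigma * rng.gauss(0, 1) for x in best_z]
            score = score_fn(generator(candidate))
            if score < best:  # lower confidence means a stronger attack
                best_z, best = candidate, score
        history.append(best)
    return best_z, best, history


best_z, best, history = latent_search(black_box_score)
final_patch = generator(best_z)
```

The design point is that the search variable is the latent vector rather than the raw pixels, so every candidate patch is something the generator can produce. With a real pretrained GAN, that is what would keep the patches naturalistic.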

The paper compares this approach against four different black-box attacks with different configurations. The authors report that the proposed method outperformed all of the tested approaches by a large margin, and that it works both digitally and physically.

Critical Analysis

The paper presents a novel and effective technique for generating adversarial patches that can fool object detectors in the real world, without needing to know the internal details of the target model. This is an important step forward, as real-world attacks need to be feasible with limited information about the target system.

However, the paper does not address how these adversarial patches could be defended against. Although the attack is model-agnostic, defenses that detect or remove adversarial patches may still counter it, and the paper does not explore them. Additionally, the paper only evaluates the attack on a limited set of object detection models and datasets.

Further research is needed to understand the broader implications and limitations of this black-box patching approach. Investigating potential countermeasures and testing the attack on a wider range of models and real-world scenarios would help provide a more comprehensive understanding of this technique and its impact.

Conclusion

This paper presents a novel black-box approach to generating adversarial patches that can reliably fool object detectors, both digitally and physically. By leveraging a pretrained generative adversarial network, the method can create naturalistic-looking patches without needing to know the internal details of the target model.

<aimodels.fyi/papers/arxiv/defending-against-physical-adversarial-patch-attacks-infrared>This is an important step forward in making adversarial attacks more practical and realistic in the real world</aimodels.fyi/papers/arxiv/defending-against-physical-adversarial-patch-attacks-infrared>. However, further research is needed to understand the broader implications and potential defenses against this type of attack. Overall, this work is a useful contribution to the field of adversarial machine learning.



Related Papers


Patch of Invisibility: Naturalistic Physical Black-Box Adversarial Attacks on Object Detectors

Raz Lapid, Eylon Mizrahi, Moshe Sipper

Adversarial attacks on deep-learning models have been receiving increased attention in recent years. Work in this area has mostly focused on gradient-based techniques, so-called white-box attacks, wherein the attacker has access to the targeted model's internal parameters; such an assumption is usually unrealistic in the real world. Some attacks additionally use the entire pixel space to fool a given model, which is neither practical nor physical (i.e., real-world). On the contrary, we propose herein a direct, black-box, gradient-free method that uses the learned image manifold of a pretrained generative adversarial network (GAN) to generate naturalistic physical adversarial patches for object detectors. To our knowledge this is the first and only method that performs black-box physical attacks directly on object-detection models, which results in a model-agnostic attack. We show that our proposed method works both digitally and physically. We compared our approach against four different black-box attacks with different configurations. Our approach outperformed all other approaches that were tested in our experiments by a large margin.

8/20/2024

BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization

Satyadwyoom Kumar, Saurabh Gupta, Arun Balaji Buduru

Deep learning has become popular due to its vast applications in almost all domains. However, models trained using deep learning are prone to failure on adversarial samples and carry a considerable risk in sensitive applications. Most adversarial attack strategies assume that the adversary has access to the training data, the model parameters, and the input during deployment, and hence focus on perturbing the pixel-level information present in the input image. Adversarial patches were introduced to the community and helped bring out the vulnerability of deep learning models in a much more pragmatic manner, but there the attacker has white-box access to the model parameters. Recently, there has been an attempt to develop these adversarial attacks using black-box techniques. However, certain assumptions, such as the availability of large training data, are not valid in real-life scenarios. In a real-life scenario, the attacker can only assume the type of model architecture used from a select list of state-of-the-art architectures while having access to only a subset of the input dataset. Hence, we propose a black-box adversarial attack strategy that produces adversarial patches which can be applied anywhere in the input image to perform an adversarial attack.

5/13/2024


PAD: Patch-Agnostic Defense against Adversarial Patch Attacks

Lihua Jing, Rui Wang, Wenqi Ren, Xin Dong, Cong Zou

Adversarial patch attacks present a significant threat to real-world object detectors due to their practical feasibility. Existing defense methods, which rely on attack data or prior knowledge, struggle to effectively address a wide range of adversarial patches. In this paper, we show two inherent characteristics of adversarial patches, semantic independence and spatial heterogeneity, independent of their appearance, shape, size, quantity, and location. Semantic independence indicates that adversarial patches operate autonomously within their semantic context, while spatial heterogeneity manifests as distinct image quality of the patch area that differs from original clean image due to the independent generation process. Based on these observations, we propose PAD, a novel adversarial patch localization and removal method that does not require prior knowledge or additional training. PAD offers patch-agnostic defense against various adversarial patches, compatible with any pre-trained object detectors. Our comprehensive digital and physical experiments involving diverse patch types, such as localized noise, printable, and naturalistic patches, exhibit notable improvements over state-of-the-art works. Our code is available at https://github.com/Lihua-Jing/PAD.

4/26/2024


Multi-View Black-Box Physical Attacks on Infrared Pedestrian Detectors Using Adversarial Infrared Grid

Kalibinuer Tiliwalidi, Chengyin Hu, Weiwen Shi

While extensive research exists on physical adversarial attacks within the visible spectrum, studies on such techniques in the infrared spectrum are limited. Infrared object detectors are vital in modern technological applications but are susceptible to adversarial attacks, posing significant security threats. Previous studies using physical perturbations like light bulb arrays and aerogels for white-box attacks, or hot and cold patches for black-box attacks, have proven impractical or limited in multi-view support. To address these issues, we propose the Adversarial Infrared Grid (AdvGrid), which models perturbations in a grid format and uses a genetic algorithm for black-box optimization. These perturbations are cyclically applied to various parts of a pedestrian's clothing to facilitate multi-view black-box physical attacks on infrared pedestrian detectors. Extensive experiments validate AdvGrid's effectiveness, stealthiness, and robustness. The method achieves attack success rates of 80.00% in digital environments and 91.86% in physical environments, outperforming baseline methods. Additionally, the average attack success rate exceeds 50% against mainstream detectors, demonstrating AdvGrid's robustness. Our analyses include ablation studies, transfer attacks, and adversarial defenses, confirming the method's superiority.

7/9/2024