BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks

Read original: arXiv:2404.00924 - Published 5/28/2024 by Zhiyuan Cheng, Zhaoyi Liu, Tengda Guo, Shiwei Feng, Dongfang Liu, Mingjie Tang, Xiangyu Zhang

Overview

This paper introduces "BadPart", a query-based black-box adversarial patch attack against pixel-wise regression tasks such as monocular depth estimation (MDE) and optical flow estimation (OFE).
The authors propose a unified optimization framework that generates adversarial patches to fool a range of pixel-wise regression models.
The paper also reports that state-of-the-art countermeasures cannot defend against the attack effectively.

Plain English Explanation

The paper discusses a type of attack called an "adversarial patch attack" that can be used to fool computer vision models, particularly those used for tasks like estimating the depth of an image. These attacks involve creating a small, visually unassuming patch that can be added to an image to trick the model into producing incorrect outputs.

The key innovation in this paper is a new attack method called "BadPart" that can generate these adversarial patches in a more effective and unified way, allowing them to work against a wide variety of depth estimation models, even in a "black-box" setting where the attacker doesn't have full access to the model. This makes the attack more practical and concerning from a security perspective.

The authors also test existing defenses against BadPart and report that state-of-the-art countermeasures cannot stop it effectively. This underlines how much work remains in making these computer vision systems robust and secure.

Overall, this research highlights the ongoing challenge of making AI systems, especially those used for sensitive applications, resistant to malicious attacks. The work explores new attack techniques and defenses, contributing to our understanding of the adversarial robustness of these models.

Technical Explanation

BadPart targets pixel-wise regression tasks such as monocular depth estimation (MDE) and optical flow estimation (OFE). The attack is query-based: the attacker can only feed inputs to the target model and observe its outputs, with no access to its architecture or parameters. Within this setting, the authors propose a unified optimization framework that generates adversarial patches for a range of pixel-wise regression models.
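As a rough illustration of the query-based setting described above, the following sketch pastes a patch into an image and scores the attack using only model outputs. It is a minimal sketch under stated assumptions, not the authors' implementation: `toy_depth_model` is a hypothetical stand-in for the black-box model, and `mean_relative_error` is one plausible way to measure how far the output shifts, not necessarily the paper's metric.

```python
import numpy as np

def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at row `top`, column `left`."""
    h, w = patch.shape[:2]
    if top < 0 or left < 0 or top + h > image.shape[0] or left + w > image.shape[1]:
        raise ValueError("patch does not fit inside the image")
    out = image.copy()
    out[top:top + h, left:left + w] = patch
    return out

def toy_depth_model(image):
    """Hypothetical black-box model: maps an (H, W, 3) image to an (H, W) map.

    A real attack would query the target model here and see only its output.
    """
    return image.mean(axis=-1)

def mean_relative_error(pred_adv, pred_clean, eps=1e-6):
    """Assumed metric: mean of |adv - clean| / |clean| over all pixels."""
    return float(np.mean(np.abs(pred_adv - pred_clean) / (np.abs(pred_clean) + eps)))

rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(32, 32, 3))   # synthetic "scene"
patch = np.ones((8, 8, 3))                        # plain white patch
patched = apply_patch(image, patch, top=4, left=4)

# The attacker's only feedback: how much the model output changed.
score = mean_relative_error(toy_depth_model(patched), toy_depth_model(image))
print(f"relative error caused by the patch: {score:.4f}")
```

Each evaluation of `score` costs two model queries here (one of which, the clean prediction, can be cached), which is why query efficiency matters in this setting.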

The key technical contributions of the paper include:

A unified attack formulation for pixel-wise regression tasks, evaluated on monocular depth estimation (MDE) and optical flow estimation (OFE).
A query-based black-box optimization strategy that combines a square-based patch optimization framework with probabilistic square sampling and score-based gradient estimation, so the patch can be crafted without access to the target model's architecture or parameters.
Evaluations on 7 models across the MDE and OFE tasks, in which BadPart surpasses 3 baseline methods in both attack performance and efficiency.
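To make the ideas of square sampling and score-based gradient estimation more concrete, here is a minimal sketch. It is an assumption-laden toy, not the BadPart algorithm itself: the function names, the weight-update rule, the sign-step update, and the `toy_score` objective (which simply rewards brighter patches) are all hypothetical. In a real attack, `score_fn` would paste the patch into images, query the target model, and return an error measure.

```python
import numpy as np

def sample_square(weights, rng):
    """Sample a square's top-left corner with probability proportional to `weights`."""
    probs = weights.ravel() / weights.sum()
    idx = rng.choice(probs.size, p=probs)
    return np.unravel_index(idx, weights.shape)

def estimate_square_gradient(score_fn, patch, top, left, size, rng,
                             n_samples=8, sigma=0.05):
    """Estimate d(score)/d(patch) on one square from score queries only."""
    grad = np.zeros_like(patch)
    region = (slice(top, top + size), slice(left, left + size))
    for _ in range(n_samples):
        noise = np.zeros_like(patch)
        noise[region] = rng.standard_normal(patch[region].shape)
        plus = score_fn(np.clip(patch + sigma * noise, 0.0, 1.0))
        minus = score_fn(np.clip(patch - sigma * noise, 0.0, 1.0))
        # Two-sided finite difference along a random direction.
        grad += (plus - minus) / (2.0 * sigma) * noise
    return grad / n_samples

def attack(score_fn, patch_shape, square=4, steps=50, lr=0.05, seed=0):
    """Greedy patch search: update one sampled square per step, keep improvements."""
    rng = np.random.default_rng(seed)
    patch = np.full(patch_shape, 0.5)
    weights = np.ones((patch_shape[0] - square + 1, patch_shape[1] - square + 1))
    best = score_fn(patch)
    for _ in range(steps):
        top, left = sample_square(weights, rng)
        grad = estimate_square_gradient(score_fn, patch, top, left, square, rng)
        candidate = np.clip(patch + lr * np.sign(grad), 0.0, 1.0)
        cand_score = score_fn(candidate)
        if cand_score > best:
            patch, best = candidate, cand_score
            weights[top, left] += 1.0  # assumed rule: favor squares that helped
    return patch, best

def toy_score(patch):
    """Hypothetical objective standing in for 'model error caused by the patch'."""
    return float(patch.mean())

initial = toy_score(np.full((8, 8), 0.5))
patch, best = attack(toy_score, (8, 8))
print(f"score before: {initial:.3f}, after: {best:.3f}")
```

Restricting each update to a small square keeps the number of perturbed values per step low, which is one plausible reason a square-based scheme scales better than estimating a gradient over the whole patch at once.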

The paper also evaluates existing defenses and reports that state-of-the-art countermeasures cannot defend against BadPart effectively. In addition, the authors apply the attack to the Google online service for portrait depth estimation, causing a 43.5% relative distance error with 50K queries.

Critical Analysis

The paper makes valuable contributions to the field of adversarial robustness, particularly in the context of pixel-wise regression tasks like depth estimation. The proposed BadPart attack is a significant advancement, as it can generate effective adversarial patches in a black-box setting, which is a more realistic and challenging scenario compared to previous white-box attacks.

However, the research also has some limitations. For example, the BadPart attack may not be as effective against more advanced depth estimation models, such as those that employ diffusion models. Additionally, existing defenses may struggle even more against more sophisticated adversarial patch attacks, such as those that use tiling techniques.

Further research is needed to explore the robustness of depth estimation models against a wider range of adversarial attacks, including those that target physical-world scenarios. Additionally, the development of more comprehensive defense mechanisms that can withstand a diverse set of adversarial threats would be valuable.

Conclusion

The "BadPart" paper introduces a new black-box adversarial patch attack that can effectively target pixel-wise regression tasks, such as monocular depth estimation. The authors propose a unified optimization framework for generating adversarial patches and demonstrate the attack's effectiveness through extensive evaluations.

The paper also finds that state-of-the-art countermeasures cannot defend against the attack effectively. This research contributes to the ongoing efforts to improve the adversarial robustness of computer vision systems, which is crucial for their secure and reliable deployment in real-world applications.

While the paper makes valuable advancements, further research is needed to explore the limitations of the proposed attack and to develop defenses that can withstand a diverse set of adversarial threats. Continued progress in this area will help ensure the safety and reliability of critical AI-powered systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks

Zhiyuan Cheng, Zhaoyi Liu, Tengda Guo, Shiwei Feng, Dongfang Liu, Mingjie Tang, Xiangyu Zhang

Pixel-wise regression tasks (e.g., monocular depth estimation (MDE) and optical flow estimation (OFE)) have been widely involved in our daily life in applications like autonomous driving, augmented reality and video composition. Although certain applications are security-critical or bear societal significance, the adversarial robustness of such models is not sufficiently studied, especially in the black-box scenario. In this work, we introduce the first unified black-box adversarial patch attack framework against pixel-wise regression tasks, aiming to identify the vulnerabilities of these models under query-based black-box attacks. We propose a novel square-based adversarial patch optimization framework and employ probabilistic square sampling and score-based gradient estimation techniques to generate the patch effectively and efficiently, overcoming the scalability problem of previous black-box patch attacks. Our attack prototype, named BadPart, is evaluated on both MDE and OFE tasks, utilizing a total of 7 models. BadPart surpasses 3 baseline methods in terms of both attack performance and efficiency. We also apply BadPart on the Google online service for portrait depth estimation, causing 43.5% relative distance error with 50K queries. State-of-the-art (SOTA) countermeasures cannot defend our attack effectively.

5/28/2024

BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization

Satyadwyoom Kumar, Saurabh Gupta, Arun Balaji Buduru

Deep Learning has become popular due to its vast applications in almost all domains. However, models trained using deep learning are prone to failure for adversarial samples and carry a considerable risk in sensitive applications. Most of these adversarial attack strategies assume that the adversary has access to the training data, the model parameters, and the input during deployment, and hence focus on perturbing the pixel-level information present in the input image. Adversarial patches were introduced to the community, which helped in bringing out the vulnerability of deep learning models in a much more pragmatic manner, but here the attacker has white-box access to the model parameters. Recently, there has been an attempt to develop these adversarial attacks using black-box techniques. However, certain assumptions, such as the availability of large training data, are not valid in real-life scenarios. In a real-life scenario, the attacker can only assume the type of model architecture used from a select list of state-of-the-art architectures while having access to only a subset of the input dataset. Hence, we propose a black-box adversarial attack strategy that produces adversarial patches which can be applied anywhere in the input image to perform an adversarial attack.

5/13/2024

Physical Adversarial Attack on Monocular Depth Estimation via Shape-Varying Patches

Chenxing Zhao, Yang Li, Shihao Wu, Wenyi Tan, Shuangju Zhou, Quan Pan

Adversarial attacks against monocular depth estimation (MDE) systems pose significant challenges, particularly in safety-critical applications such as autonomous driving. Existing patch-based adversarial attacks for MDE are confined to the vicinity of the patch, making it difficult to affect the entire target. To address this limitation, we propose a physics-based adversarial attack on monocular depth estimation, employing a framework called Attack with Shape-Varying Patches (ASP), aiming to optimize patch content, shape, and position to maximize effectiveness. We introduce various mask shapes, including quadrilateral, rectangular, and circular masks, to enhance the flexibility and efficiency of the attack. Furthermore, we propose a new loss function to extend the influence of the patch beyond the overlapping regions. Experimental results demonstrate that our attack method generates an average depth error of 18 meters on the target car with a patch area of 1/9, affecting over 98% of the target area.

7/25/2024

Patch of Invisibility: Naturalistic Physical Black-Box Adversarial Attacks on Object Detectors

Raz Lapid, Eylon Mizrahi, Moshe Sipper

Adversarial attacks on deep-learning models have been receiving increased attention in recent years. Work in this area has mostly focused on gradient-based techniques, so-called white-box attacks, wherein the attacker has access to the targeted model's internal parameters; such an assumption is usually unrealistic in the real world. Some attacks additionally use the entire pixel space to fool a given model, which is neither practical nor physical (i.e., real-world). On the contrary, we propose herein a direct, black-box, gradient-free method that uses the learned image manifold of a pretrained generative adversarial network (GAN) to generate naturalistic physical adversarial patches for object detectors. To our knowledge this is the first and only method that performs black-box physical attacks directly on object-detection models, which results in a model-agnostic attack. We show that our proposed method works both digitally and physically. We compared our approach against four different black-box attacks with different configurations. Our approach outperformed all other approaches that were tested in our experiments by a large margin.

8/20/2024