Enhancing Adversarial Transferability Through Neighborhood Conditional Sampling

Read original: arXiv:2405.16181 - Published 5/28/2024 by Chunlin Qiu, Yiheng Duan, Lingchen Zhao, Qian Wang

Overview

Focuses on improving the transferability of adversarial examples across different machine learning models
Introduces a new technique called "Neighborhood Conditional Sampling" to generate more transferable adversarial examples
Demonstrates improved attack performance compared to existing methods on various datasets and model architectures

Plain English Explanation

Adversarial examples are inputs that have been slightly modified in a way that tricks machine learning models into making incorrect predictions. This paper explores ways to make these adversarial examples more transferable, meaning they can be effective against a wider range of models, not just the one they were designed for.
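To make "slightly modified in a way that tricks the model" concrete, here is a minimal sketch of a single signed-gradient step against a toy linear classifier. The model weights, the input, and the budget `eps` are all made up for illustration and are not taken from the paper.

```python
import numpy as np

def signed_gradient_step(x, y, w, b, eps):
    """One signed-gradient step that raises the logistic loss of a toy linear model.

    x: input vector, y: true label in {0, 1}, (w, b): toy model, eps: L-inf budget.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # predicted probability of class 1
    grad_x = (p - y) * w                      # d(cross-entropy)/dx for this model
    return x + eps * np.sign(grad_x)          # move every coordinate by exactly eps

# Hypothetical model and input: the clean score is positive, so class 1.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, -0.1, 0.1])

x_adv = signed_gradient_step(x, y=1, w=w, b=b, eps=0.2)
clean_pred = int(w @ x + b > 0)       # prediction on the clean input
adv_pred = int(w @ x_adv + b > 0)     # prediction on the perturbed input
```

Each coordinate moves by at most `eps`, yet the predicted class changes. Transferability asks whether the same `x_adv` would also fool a different model that the attacker never saw.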

The key idea is to use a technique called "Neighborhood Conditional Sampling" to generate the adversarial examples. Rather than relying on a single point, the attack samples from a neighborhood in input space and looks for regions where the adversarial loss is high on average and varies little, which the paper's abstract calls flat maxima. Adversarial examples found in such regions are more likely to transfer across different models.

The researchers report that this approach outperforms previous methods for generating transferable adversarial examples. Understanding how attacks transfer also matters for defense, since it informs work on making AI systems more robust against such attacks.

Technical Explanation

The paper proposes a new technique called "Neighborhood Conditional Sampling" (NCS) to generate adversarial examples that can more effectively transfer to different machine learning models.

The key steps are:

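Sample points in a neighborhood of the example being perturbed
Search for regions where the expected adversarial loss over those samples is high and its standard deviation is small (a flat maximum), posed as a max-min bi-level optimization problem
Approximate the expensive inner minimization with a momentum-based previous gradient inversion approximation (PGIA), then update the adversarial example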
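The sampling idea in these steps can be sketched as a generic iterative attack that averages the loss gradient over random neighbors before each signed step. This is not the paper's NCS or PGIA procedure. The function name, the `grad_fn` callable, the toy two-layer surrogate, and every hyperparameter value below are assumptions chosen for illustration.

```python
import numpy as np

def neighborhood_attack(grad_fn, x, eps=0.3, steps=10, n_samples=8,
                        radius=0.1, mu=1.0, seed=0):
    """Iterative signed-gradient attack using gradients averaged over sampled neighbors.

    grad_fn(z) must return d(loss)/dz on the surrogate model (hypothetical callable).
    """
    rng = np.random.default_rng(seed)
    alpha = eps / steps                      # per-step size
    x_adv = x.astype(float).copy()
    momentum = np.zeros_like(x_adv)
    for _ in range(steps):
        # Draw neighbors uniformly from an L-inf ball around the current example.
        noise = rng.uniform(-radius, radius, size=(n_samples,) + x_adv.shape)
        g = np.mean([grad_fn(x_adv + n) for n in noise], axis=0)
        # Accumulate an L1-normalized gradient into the momentum term.
        momentum = mu * momentum + g / (np.abs(g).sum() + 1e-12)
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = np.clip(x_adv + alpha * np.sign(momentum), x - eps, x + eps)
    return x_adv

# Toy surrogate: a two-layer tanh network with fixed weights (illustrative only).
W1 = np.array([[1.0, -2.0, 0.5], [0.3, 0.8, -1.0]])
w2 = np.array([1.5, -0.7])

def toy_grad(z, y=1):
    """Input gradient of the logistic loss for the toy network above."""
    h = np.tanh(W1 @ z)
    p = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # probability of class 1
    return (p - y) * (W1.T @ (w2 * (1.0 - h ** 2)))

x = np.array([0.2, -0.1, 0.1])
x_adv = neighborhood_attack(toy_grad, x)
```

Averaging over neighbors makes each step depend on the loss surface around the current point instead of a single gradient, which is the intuition behind preferring flat regions. A real attack would replace `toy_grad` with gradients from an actual surrogate network.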

According to the paper's abstract, the approach is motivated by the observation that flat maxima result in better transferability. The authors report that NCS surpasses the current best method in transferability while requiring only 50% of its computational cost, and that it can be integrated with other methods to enhance transferability further.
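The paper's abstract describes NCS as a max-min bi-level problem that favors regions with high expected adversarial loss and small standard deviation. One hedged way to write objectives with that flavor is shown below. The symbols (loss J, perturbation δ, budget ε, neighborhood radius r, weight λ, input dimension d) are assumptions for illustration and not the paper's notation.

```latex
% Mean-minus-deviation objective over a sampled neighborhood
\max_{\|\delta\|_\infty \le \epsilon}\;
  \mathbb{E}_{\xi \sim \mathcal{U}[-r,\, r]^d}\big[J(x+\delta+\xi,\, y)\big]
  \;-\; \lambda\, \sqrt{\operatorname{Var}_{\xi}\big[J(x+\delta+\xi,\, y)\big]}

% Worst-case (max-min) form of a flat maximum
\max_{\|\delta\|_\infty \le \epsilon}\;\min_{\|\xi\|_\infty \le r}\; J(x+\delta+\xi,\, y)
```

Both forms reward a perturbation whose whole neighborhood keeps the loss high, which is the flat-maximum intuition. How the paper relates the two and solves the inner problem with PGIA is covered in the original.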

Critical Analysis

A few limitations are worth noting. First, as a sample-based attack, NCS likely still costs more than simple single-gradient generation methods, even though the abstract reports about 50% of the computational cost of the current best method. Additionally, the paper does not explore the properties that enable or prohibit transferability of adversarial examples. Further research is needed to fully understand the factors affecting transferability.

While the results demonstrate improved transferability, the paper does not provide a comprehensive survey of transferability across the field. Readers should consider this work in the broader context of ongoing research on this important challenge.

Conclusion

This paper introduces a new technique called Neighborhood Conditional Sampling to generate more transferable adversarial examples. By seeking regions where the adversarial loss is high and stable across a sampled neighborhood, the approach produces adversarial examples that can more effectively fool a variety of machine learning models, not just the one they were designed for.

The improved transferability demonstrated in this work could help in hardening AI systems against adversarial attacks. The abstract reports that NCS needs only about 50% of the computational cost of the current best method, but further research is needed to fully understand the factors affecting transferability of adversarial examples.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Enhancing Adversarial Transferability Through Neighborhood Conditional Sampling

Chunlin Qiu, Yiheng Duan, Lingchen Zhao, Qian Wang

Transfer-based attacks craft adversarial examples utilizing a white-box surrogate model to compromise various black-box target models, posing significant threats to many real-world applications. However, existing transfer attacks suffer from either weak transferability or expensive computation. To bridge the gap, we propose a novel sample-based attack, named neighborhood conditional sampling (NCS), which enjoys high transferability with lightweight computation. Inspired by the observation that flat maxima result in better transferability, NCS is formulated as a max-min bi-level optimization problem to seek adversarial regions with high expected adversarial loss and small standard deviations. Specifically, due to the inner minimization problem being computationally intensive to resolve, and affecting the overall transferability, we propose a momentum-based previous gradient inversion approximation (PGIA) method to effectively solve the inner problem without any computation cost. In addition, we prove that two newly proposed attacks, which achieve flat maxima for better transferability, are actually specific cases of NCS under particular conditions. Extensive experiments demonstrate that NCS efficiently generates highly transferable adversarial examples, surpassing the current best method in transferability while requiring only 50% of the computational cost. Additionally, NCS can be seamlessly integrated with other methods to further enhance transferability.

5/28/2024

Improving Adversarial Transferability with Neighbourhood Gradient Information

Haijing Guo, Jiafeng Wang, Zhaoyu Chen, Kaixun Jiang, Lingyi Hong, Pinxue Guo, Jinglun Li, Wenqiang Zhang

Deep neural networks (DNNs) are known to be susceptible to adversarial examples, leading to significant performance degradation. In black-box attack scenarios, a considerable attack performance gap between the surrogate model and the target model persists. This work focuses on enhancing the transferability of adversarial examples to narrow this performance gap. We observe that the gradient information around the clean image, i.e., Neighbourhood Gradient Information, can offer high transferability. Leveraging this, we propose the NGI-Attack, which incorporates Example Backtracking and Multiplex Mask strategies to fully use this gradient information and enhance transferability. Specifically, we first adopt Example Backtracking to accumulate Neighbourhood Gradient Information as the initial momentum term. Multiplex Mask, which forms a multi-way attack strategy, aims to force the network to focus on non-discriminative regions, which can yield richer gradient information within only a few iterations. Extensive experiments demonstrate that our approach significantly enhances adversarial transferability. In particular, when attacking numerous defense models, we achieve an average attack success rate of 95.8%. Notably, our method can plug into any off-the-shelf algorithm to improve its attack performance without additional time cost.

8/13/2024

Bag of Tricks to Boost Adversarial Transferability

Zeliang Zhang, Wei Yao, Xiaosen Wang

Deep neural networks are widely known to be vulnerable to adversarial examples. However, vanilla adversarial examples generated under the white-box setting often exhibit low transferability across different models. Since adversarial transferability poses more severe threats to practical applications, various approaches have been proposed for better transferability, including gradient-based, input transformation-based, and model-related attacks. In this work, we find that several tiny changes in the existing adversarial attacks can significantly affect the attack performance, e.g., the number of iterations and step size. Based on careful studies of existing adversarial attacks, we propose a bag of tricks to enhance adversarial transferability, including momentum initialization, scheduled step size, dual example, spectral-based input transformation, and several ensemble strategies. Extensive experiments on the ImageNet dataset validate the high effectiveness of our proposed tricks and show that combining them can further boost adversarial transferability. Our work provides practical insights and techniques to enhance adversarial transferability, and offers guidance to improve the attack performance in real-world applications through simple adjustments.

7/23/2024

Adversarial Example Soups: Improving Transferability and Stealthiness for Free

Bo Yang, Hengwei Zhang, Jindong Wang, Yulong Yang, Chenhao Lin, Chao Shen, Zhengyu Zhao

Transferable adversarial examples cause practical security risks since they can mislead a target model without knowing its internal knowledge. A conventional recipe for maximizing transferability is to keep only the optimal adversarial example from all those obtained in the optimization pipeline. In this paper, for the first time, we question this convention and demonstrate that those discarded, sub-optimal adversarial examples can be reused to boost transferability. Specifically, we propose "Adversarial Example Soups" (AES), with AES-tune for averaging discarded adversarial examples in hyperparameter tuning and AES-rand for stability testing. In addition, our AES is inspired by "model soups", which averages weights of multiple fine-tuned models for improved accuracy without increasing inference time. Extensive experiments validate the global effectiveness of our AES, boosting 10 state-of-the-art transfer attacks and their combinations by up to 13% against 10 diverse (defensive) target models. We also show the possibility of generalizing AES to other types, e.g., directly averaging multiple in-the-wild adversarial examples that yield comparable success. A promising byproduct of AES is the improved stealthiness of adversarial examples since the perturbation variances are naturally reduced.

5/1/2024