Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior

2405.19098

Published 5/30/2024 by Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, Jun Zhu

Abstract

This paper studies the challenging black-box adversarial attack that aims to generate adversarial examples against a black-box model by only using output feedback of the model to input queries. Some previous methods improve the query efficiency by incorporating the gradient of a surrogate white-box model into query-based attacks due to the adversarial transferability. However, the localized gradient is not informative enough, making these methods still query-intensive. In this paper, we propose a Prior-guided Bayesian Optimization (P-BO) algorithm that leverages the surrogate model as a global function prior in black-box adversarial attacks. As the surrogate model contains rich prior information of the black-box one, P-BO models the attack objective with a Gaussian process whose mean function is initialized as the surrogate model's loss. Our theoretical analysis on the regret bound indicates that the performance of P-BO may be affected by a bad prior. Therefore, we further propose an adaptive integration strategy to automatically adjust a coefficient on the function prior by minimizing the regret bound. Extensive experiments on image classifiers and large vision-language models demonstrate the superiority of the proposed algorithm in reducing queries and improving attack success rates compared with the state-of-the-art black-box attacks. Code is available at https://github.com/yibo-miao/PBO-Attack.

Overview

  • This paper introduces an efficient black-box adversarial attack method using Bayesian optimization guided by a function prior.
  • Adversarial attacks aim to fool machine learning models by adding small, imperceptible perturbations to input data.
  • Black-box attacks do not require knowledge of the model's internals, making them more practical than white-box attacks.
  • The proposed approach, Prior-guided Bayesian Optimization (P-BO), searches for adversarial examples with Bayesian optimization and uses a surrogate white-box model as a global function prior.
  • This leads to improved attack success rates and reduced query complexity compared to existing black-box attack methods.

Plain English Explanation

Machine learning models, such as those used for image recognition, can be fooled by making tiny, virtually undetectable changes to the input data. These "adversarial attacks" can cause the model to misclassify the input, even though a human would still recognize it correctly.

In this paper, the researchers developed a new approach for conducting these black-box adversarial attacks, where the attacker doesn't know the internal details of the target model. Their method uses a technique called Bayesian optimization, which intelligently explores the space of possible perturbations to find the most effective ones.

Crucially, the researchers also incorporate a "function prior": a surrogate white-box model, which the attacker can inspect freely, supplies an initial guess of how the target model's loss behaves across the input space. This prior knowledge helps the optimization converge to successful adversarial examples more efficiently, requiring fewer queries to the target model. Because a poorly matched surrogate can mislead the search, the method also adjusts how much weight the prior receives.

Compared to previous black-box attack methods, the researchers' approach achieves higher attack success rates while needing to query the model fewer times. This makes it a more practical and effective tool for evaluating the robustness of machine learning systems to adversarial manipulations.

Technical Explanation

The paper presents Prior-guided Bayesian Optimization (P-BO), a black-box adversarial attack that uses a surrogate white-box model as a global function prior for Bayesian optimization.

In the black-box setting, the attacker does not have access to the model's architecture or parameters, and can only interact with it through input-output queries. The goal is to find small perturbations to the input that cause the model to misclassify, while remaining imperceptible to human observers.
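
Written out, the search problem looks roughly as follows. The ℓ∞ perturbation budget ε and the generic attack loss f are illustrative assumptions here; this summary does not state which norm or loss the paper uses.

```latex
\[
  \max_{x \,:\, \|x - x_{0}\|_{\infty} \le \epsilon} f(x),
  \qquad \text{where each evaluation of } f(x) \text{ costs one query to the black-box model.}
\]
```

Here x_0 is the clean input and x is the perturbed input. Raising the loss f corresponds to pushing the model away from the correct prediction, and query efficiency means reaching a misclassification with as few evaluations of f as possible.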

The key innovations of this work are:

  1. Modeling the attack objective with a Gaussian process whose mean function is initialized as the surrogate model's loss, so that the surrogate acts as a global function prior rather than only a source of local gradients.
  2. A theoretical analysis of the regret bound, which indicates that a bad prior may hurt the performance of P-BO.
  3. An adaptive integration strategy that automatically adjusts a coefficient on the function prior by minimizing the regret bound.
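
To make the role of the function prior concrete, here is a minimal NumPy sketch of a Gaussian process whose prior mean is a scaled surrogate loss, as the abstract describes. The RBF kernel, the toy 1-D functions `target_loss` and `surrogate_loss`, the coefficient name `lam`, and the UCB rule for picking the next query are all assumptions made for illustration; none of them is taken from the paper's code.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_obs, y_obs, x_new, prior_mean, lam=1.0, noise=1e-6):
    """Posterior mean and std of a GP with prior mean lam * prior_mean(x)."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_no = rbf_kernel(x_new, x_obs)
    # The GP only has to explain what the prior gets wrong.
    residual = y_obs - lam * prior_mean(x_obs)
    mean = lam * prior_mean(x_new) + k_no @ np.linalg.solve(k_oo, residual)
    var = 1.0 - np.einsum("ij,ji->i", k_no, np.linalg.solve(k_oo, k_no.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def target_loss(x):
    """Hypothetical black-box attack objective (each call would be a query)."""
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 0]

def surrogate_loss(x):
    """Hypothetical surrogate loss: similar shape, but not identical."""
    return np.sin(3 * x[:, 0])

x_obs = np.array([[-1.0], [0.0], [1.0]])   # points already queried
y_obs = target_loss(x_obs)
x_new = np.linspace(-2, 2, 41)[:, None]    # candidate points

mean_prior, std = gp_posterior(x_obs, y_obs, x_new, surrogate_loss, lam=1.0)
mean_zero, _ = gp_posterior(x_obs, y_obs, x_new, surrogate_loss, lam=0.0)

# Mean absolute error of each posterior mean against the true objective.
err_prior = np.abs(mean_prior - target_loss(x_new)).mean()
err_zero = np.abs(mean_zero - target_loss(x_new)).mean()

# Upper-confidence-bound acquisition: query where mean + 2 * std is largest.
next_query = x_new[np.argmax(mean_prior + 2.0 * std)]
```

With `lam=0.0` the model falls back to an ordinary zero-mean GP. On this toy problem the prior-guided posterior tracks the objective more closely after the same three queries, which is the intuition behind using the surrogate as a global prior.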

The experiments, run on image classifiers and large vision-language models, show that P-BO needs fewer queries and achieves higher attack success rates than state-of-the-art black-box attacks.

Critical Analysis

The paper provides a strong technical contribution to the field of adversarial machine learning. Using a surrogate model as a global function prior for Bayesian optimization, rather than relying only on its local gradient, is a novel and effective approach for conducting black-box attacks.

However, the approach has limitations. First, the function prior comes from a surrogate model, and the paper's regret analysis indicates that a bad prior, one that matches the black-box model poorly, may degrade performance; the adaptive coefficient is designed to mitigate this. Additionally, the attack still has to query the target model, so it remains more costly than white-box attacks with direct gradient access, which can limit large-scale evaluation.
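
The abstract describes a coefficient on the function prior that is adjusted automatically by minimizing the regret bound. The sketch below does not reproduce that criterion. As a stand-in, it fits the coefficient by plain least squares on the points queried so far, only to illustrate what down-weighting an unreliable prior can look like. The function name `fit_prior_coefficient` and the sample values are hypothetical.

```python
import numpy as np

def fit_prior_coefficient(prior_vals, observed_vals):
    """Least-squares scale lam that minimizes ||observed - lam * prior||^2.

    Stand-in heuristic only; the paper instead minimizes a regret bound.
    """
    denom = float(np.dot(prior_vals, prior_vals))
    if denom == 0.0:
        return 0.0  # an all-zero prior carries no information
    return float(np.dot(prior_vals, observed_vals) / denom)

# Surrogate loss values at three queried inputs (hypothetical numbers).
prior_vals = np.array([1.0, 2.0, 3.0])

# Observations proportional to the prior: the prior keeps a nonzero weight.
lam_good = fit_prior_coefficient(prior_vals, 0.5 * prior_vals)

# Observations uncorrelated with the prior: the weight drops to zero.
lam_bad = fit_prior_coefficient(prior_vals, np.array([1.0, -2.0, 1.0]))
```

A coefficient near zero effectively switches the prior off, so the optimizer behaves like ordinary Bayesian optimization when the surrogate is misleading.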

Furthermore, the paper does not address the broader implications of adversarial attacks or discuss potential mitigation strategies. While the research advances the technical capabilities of black-box attacks, it is important to consider the ethical implications and responsible development of such techniques.

Future work could explore ways to make the function prior more generalizable, reduce the query complexity further, and investigate defensive methods to enhance the robustness of machine learning models against these types of attacks.

Conclusion

This paper introduces P-BO, an efficient black-box adversarial attack that uses Bayesian optimization guided by a function prior taken from a surrogate model. By intelligently exploring the space of possible perturbations, the approach achieves higher attack success rates and reduced query complexity compared to previous state-of-the-art black-box attacks.

The technical innovations, including the Bayesian optimization formulation and the use of a function prior, represent a significant advancement in the field of adversarial machine learning. However, the potential negative impacts of such techniques must be carefully considered, and future research should also focus on developing robust defense mechanisms.

Overall, this work contributes valuable insights and tools for evaluating the security and reliability of machine learning systems, which is an important area of study as these technologies become increasingly ubiquitous in our lives.




Related Papers

Pseudo-Bayesian Optimization

Haoxian Chen, Henry Lam

Bayesian Optimization is a popular approach for optimizing expensive black-box functions. Its key idea is to use a surrogate model to approximate the objective and, importantly, quantify the associated uncertainty that allows a sequential search of query points that balance exploitation-exploration. Gaussian process (GP) has been a primary candidate for the surrogate model, thanks to its Bayesian-principled uncertainty quantification power and modeling flexibility. However, its challenges have also spurred an array of alternatives whose convergence properties could be more opaque. Motivated by these, we study in this paper an axiomatic framework that elicits the minimal requirements to guarantee black-box optimization convergence that could apply beyond GP-based methods. Moreover, we leverage the design freedom in our framework, which we call Pseudo-Bayesian Optimization, to construct empirically superior algorithms. In particular, we show how using simple local regression, and a suitable randomized prior construction to quantify uncertainty, not only guarantees convergence but also consistently outperforms state-of-the-art benchmarks in examples ranging from high-dimensional synthetic experiments to realistic hyperparameter tuning and robotic applications.

6/21/2024

Principled Preferential Bayesian Optimization

Wenjie Xu, Wenbin Wang, Yuning Jiang, Bratislav Svetozarevic, Colin N. Jones

We study the problem of preferential Bayesian optimization (BO), where we aim to optimize a black-box function with only preference feedback over a pair of candidate solutions. Inspired by the likelihood ratio idea, we construct a confidence set of the black-box function using only the preference feedback. An optimistic algorithm with an efficient computational method is then developed to solve the problem, which enjoys an information-theoretic bound on the total cumulative regret, a first-of-its-kind for preferential BO. This bound further allows us to design a scheme to report an estimated best solution, with a guaranteed convergence rate. Experimental results on sampled instances from Gaussian processes, standard test functions, and a thermal comfort optimization problem all show that our method stably achieves better or competitive performance as compared to the existing state-of-the-art heuristics, which, however, do not have theoretical guarantees on regret bounds or convergence.

5/30/2024

Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization

Dongxia Wu, Nikki Lijing Kuang, Ruijia Niu, Yi-An Ma, Rose Yu

Black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle. This process demands sample-efficient optimization due to the high computational cost of function evaluations. While prior studies focus on forward approaches to learn surrogates for the unknown objective function, they struggle with high-dimensional inputs where valid inputs form a small subspace (e.g., valid protein sequences), which is common in real-world tasks. Recently, diffusion models have demonstrated impressive capability in learning the high-dimensional data manifold. They have shown promising performance in black-box optimization tasks but only in offline settings. In this work, we propose diffusion-based inverse modeling for black-box optimization (Diff-BBO), the first inverse approach leveraging diffusion models for online BBO problem. Diff-BBO distinguishes itself from forward approaches through the design of acquisition function. Instead of proposing candidates in the design space, Diff-BBO employs a novel acquisition function Uncertainty-aware Exploration (UaE) to propose objective function values, which leverages the uncertainty of a conditional diffusion model to generate samples in the design space. Theoretically, we prove that using UaE leads to optimal optimization outcomes. Empirically, we redesign experiments on the Design-Bench benchmark for online settings and show that Diff-BBO achieves state-of-the-art performance.

7/2/2024

Pareto Front-Diverse Batch Multi-Objective Bayesian Optimization

Alaleh Ahmadianshalchi, Syrine Belakaria, Janardhan Rao Doppa

We consider the problem of multi-objective optimization (MOO) of expensive black-box functions with the goal of discovering high-quality and diverse Pareto fronts where we are allowed to evaluate a batch of inputs. This problem arises in many real-world applications including penicillin production where diversity of solutions is critical. We solve this problem in the framework of Bayesian optimization (BO) and propose a novel approach referred to as Pareto front-Diverse Batch Multi-Objective BO (PDBO). PDBO tackles two important challenges: 1) How to automatically select the best acquisition function in each BO iteration, and 2) How to select a diverse batch of inputs by considering multiple objectives. We propose principled solutions to address these two challenges. First, PDBO employs a multi-armed bandit approach to select one acquisition function from a given library. We solve a cheap MOO problem by assigning the selected acquisition function for each expensive objective function to obtain a candidate set of inputs for evaluation. Second, it utilizes Determinantal Point Processes (DPPs) to choose a Pareto-front-diverse batch of inputs for evaluation from the candidate set obtained from the first step. The key parameters for the methods behind these two steps are updated after each round of function evaluations. Experiments on multiple MOO benchmarks demonstrate that PDBO outperforms prior methods in terms of both the quality and diversity of Pareto solutions.

6/14/2024