Distributed Black-box Attack: Do Not Overestimate Black-box Attacks

Read original: arXiv:2210.16371 - Published 7/8/2024 by Han Wu, Sareh Rowlands, Johan Wahlstrom

Overview

Black-box attacks can trick image classifiers into misclassifying images without accessing the model's structure or weights.
Recent studies have reported attack success rates over 95% with fewer than 1,000 queries.
There is a concern about the threat of black-box attacks against IoT devices that rely on cloud APIs for image classification.
Prior research has focused on improving success rate and reducing the number of queries, but the time required to perform the attack is also crucial.

Plain English Explanation

In a black-box attack on an image classification model, the attacker manipulates input images to trick the model into misclassifying them, without knowing the model's structure or the values of its internal parameters. Recent studies have shown that these attacks can be very successful, misclassifying over 95% of images with fewer than 1,000 queries to the model.

This is concerning because many devices, especially in the Internet of Things (IoT), rely on cloud-based image classification services to identify objects or scenes in images. If an attacker can quickly and reliably trick these cloud-based models, it could pose a real threat to the security and reliability of these IoT systems.

Prior research in this area has mainly focused on improving the success rate of the attacks and reducing the number of attempts needed. However, the researchers of this paper argue that the time it takes to carry out the attack is also an important factor, especially when targeting cloud-based services, which may have mechanisms in place to detect and block suspicious activity.

Technical Explanation

This paper applies black-box attacks directly to cloud-based image classification APIs rather than to local models, as most previous research did. Doing so avoids a mistake made in prior studies, which applied the perturbation before image encoding and pre-processing and therefore may not reflect how a real cloud API receives and handles an image.
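To illustrate why the order of perturbation and encoding matters, here is a minimal sketch. It assumes that encoding can be approximated by plain 8-bit quantization, which is a simplification and not the paper's actual pipeline; the function names and pixel values are hypothetical.

```python
def encode_to_uint8(pixels):
    """Quantize float pixels in [0, 1] to 8-bit integers (a stand-in for image encoding)."""
    return [min(255, max(0, round(p * 255))) for p in pixels]


def add_perturbation(pixels, delta):
    """Add a per-pixel perturbation in float space, before any encoding."""
    return [p + d for p, d in zip(pixels, delta)]


# Hypothetical pixel values that sit exactly on the 8-bit grid (51, 102, 204).
clean = [0.2, 0.4, 0.8]

# A perturbation smaller than half a quantization step (1/510) is rounded away.
tiny = [0.001, -0.001, 0.001]
tiny_survives = encode_to_uint8(add_perturbation(clean, tiny)) != encode_to_uint8(clean)

# A perturbation of several quantization steps is still present after encoding.
large = [0.01, -0.01, 0.01]
large_survives = encode_to_uint8(add_perturbation(clean, large)) != encode_to_uint8(clean)

print("tiny perturbation survives encoding:", tiny_survives)
print("large perturbation survives encoding:", large_survives)
```

Under this assumption, an attack evaluated on unencoded float inputs can look more effective than it would be against an API that only ever sees the encoded image.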

Additionally, the researchers exploit load balancing mechanisms in the cloud to enable distributed black-box attacks. This allows them to use multiple machines to perform the attacks in parallel, which can reduce the attack time by around a factor of five compared to single-machine attacks, for both local search and gradient estimation attack methods.
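The effect of sending queries in parallel can be sketched with a simulated API. This is only an illustration under stated assumptions: `query_api`, the 20 ms latency, the placeholder label, and the five workers are hypothetical stand-ins, and the sketch assumes the queries in a batch do not depend on each other's results. It does not reproduce the paper's attack methods or a real cloud endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY_S = 0.02  # hypothetical round-trip time per API call


def query_api(candidate):
    """Stand-in for one cloud API call: wait for the 'network', return a fake label."""
    time.sleep(SIMULATED_LATENCY_S)
    return sum(candidate) % 2  # placeholder label, not a real classifier


def run_sequential(candidates):
    """Send the queries one after another from a single client."""
    return [query_api(c) for c in candidates]


def run_distributed(candidates, workers=5):
    """Send independent queries concurrently; pool.map keeps the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(query_api, candidates))


def timed(fn, *args):
    """Return the function result together with its wall-clock duration."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


candidates = [[i, i + 1, i + 2] for i in range(20)]
seq_labels, seq_time = timed(run_sequential, candidates)
par_labels, par_time = timed(run_distributed, candidates)

print(f"sequential: {seq_time:.2f}s, distributed: {par_time:.2f}s")
```

In this toy setup the total time is dominated by waiting on the simulated network, so concurrent requests shorten the wall-clock time without changing the number of queries. Real speedups depend on the API's rate limits and on how much of the attack can actually be parallelized.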

Critical Analysis

The paper provides a useful and practical exploration of black-box attacks against cloud-based image classification services, which is an important area of research given the increasing reliance on such services in IoT and other applications.

One potential limitation of the research is that it focuses solely on the time required to carry out the attack, without considering other factors that may be important in a real-world scenario, such as the detectability of the attacks or the potential for countermeasures to be implemented by cloud providers.

Additionally, the researchers do not explore the potential impact of these attacks on end-users or the broader implications for the security and reliability of IoT systems. Further research in this area could examine these broader issues and provide a more comprehensive understanding of the risks and challenges posed by black-box attacks against cloud-based image classification.

Conclusion

This research highlights the importance of considering the time required to carry out black-box attacks against cloud-based image classification services, in addition to the success rate and number of queries. By exploiting load balancing mechanisms, the researchers were able to significantly reduce the attack time, which could make these types of attacks more practical and concerning in real-world scenarios.

As IoT devices and cloud-based services continue to play an increasingly important role in our lives, understanding and addressing the security vulnerabilities posed by black-box attacks will be crucial to ensuring the reliability and trustworthiness of these systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Distributed Black-box Attack: Do Not Overestimate Black-box Attacks

Han Wu, Sareh Rowlands, Johan Wahlstrom

Black-box adversarial attacks can fool image classifiers into misclassifying images without requiring access to model structure and weights. Recent studies have reported attack success rates of over 95% with less than 1,000 queries. The question then arises of whether black-box attacks have become a real threat against IoT devices that rely on cloud APIs to achieve image classification. To shed some light on this, note that prior research has primarily focused on increasing the success rate and reducing the number of queries. However, another crucial factor for black-box attacks against cloud APIs is the time required to perform the attack. This paper applies black-box attacks directly to cloud APIs rather than to local models, thereby avoiding mistakes made in prior research that applied the perturbation before image encoding and pre-processing. Further, we exploit load balancing to enable distributed black-box attacks that can reduce the attack time by a factor of about five for both local search and gradient estimation methods.

7/8/2024

Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence

Hanbin Hong, Xinyu Zhang, Binghui Wang, Zhongjie Ba, Yuan Hong

Black-box adversarial attacks have demonstrated strong potential to compromise machine learning models by iteratively querying the target model or leveraging transferability from a local surrogate model. Recently, such attacks can be effectively mitigated by state-of-the-art (SOTA) defenses, e.g., detection via the pattern of sequential queries, or injecting noise into the model. To our best knowledge, we take the first step to study a new paradigm of black-box attacks with provable guarantees -- certifiable black-box attacks that can guarantee the attack success probability (ASP) of adversarial examples before querying over the target model. This new black-box attack unveils significant vulnerabilities of machine learning models, compared to traditional empirical black-box attacks, e.g., breaking strong SOTA defenses with provable confidence, constructing a space of (infinite) adversarial examples with high ASP, and the ASP of the generated adversarial examples is theoretically guaranteed without verification/queries over the target model. Specifically, we establish a novel theoretical foundation for ensuring the ASP of the black-box attack with randomized adversarial examples (AEs). Then, we propose several novel techniques to craft the randomized AEs while reducing the perturbation size for better imperceptibility. Finally, we have comprehensively evaluated the certifiable black-box attacks on the CIFAR10/100, ImageNet, and LibriSpeech datasets, while benchmarking with 16 SOTA black-box attacks, against various SOTA defenses in the domains of computer vision and speech recognition. Both theoretical and experimental results have validated the significance of the proposed attack. The code and all the benchmarks are available at https://github.com/datasec-lab/CertifiedAttack.

9/9/2024

From Attack to Defense: Insights into Deep Learning Security Measures in Black-Box Settings

Firuz Juraev, Mohammed Abuhamad, Eric Chan-Tin, George K. Thiruvathukal, Tamer Abuhmed

Deep Learning (DL) is rapidly maturing to the point that it can be used in safety- and security-crucial applications. However, adversarial samples, which are undetectable to the human eye, pose a serious threat that can cause the model to misbehave and compromise the performance of such applications. Addressing the robustness of DL models has become crucial to understanding and defending against adversarial attacks. In this study, we perform comprehensive experiments to examine the effect of adversarial attacks and defenses on various model architectures across well-known datasets. Our research focuses on black-box attacks such as SimBA, HopSkipJump, MGAAttack, and boundary attacks, as well as preprocessor-based defensive mechanisms, including bits squeezing, median smoothing, and JPEG filter. Experimenting with various models, our results demonstrate that the level of noise needed for the attack increases as the number of layers increases. Moreover, the attack success rate decreases as the number of layers increases. This indicates that model complexity and robustness have a significant relationship. Investigating the diversity and robustness relationship, our experiments with diverse models show that having a large number of parameters does not imply higher robustness. Our experiments extend to show the effects of the training dataset on model robustness. Various datasets such as ImageNet-1000, CIFAR-100, and CIFAR-10 are used to evaluate the black-box attacks. Considering the multiple dimensions of our analysis, e.g., model complexity and training dataset, we examined the behavior of black-box attacks when models apply defenses. Our results show that applying defense strategies can significantly reduce attack effectiveness. This research provides in-depth analysis and insight into the robustness of DL models against various attacks and defenses.

5/6/2024

BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization

Satyadwyoom Kumar, Saurabh Gupta, Arun Balaji Buduru

Deep Learning has become popular due to its vast applications in almost all domains. However, models trained using deep learning are prone to failure for adversarial samples and carry a considerable risk in sensitive applications. Most of these adversarial attack strategies assume that the adversary has access to the training data, the model parameters, and the input during deployment, and hence focus on perturbing the pixel-level information present in the input image. Adversarial patches were introduced to the community, which helped in bringing out the vulnerability of deep learning models in a much more pragmatic manner, but here the attacker has white-box access to the model parameters. Recently, there has been an attempt to develop these adversarial attacks using black-box techniques. However, certain assumptions, such as the availability of large training data, are not valid in real-life scenarios. In a real-life scenario, the attacker can only assume the type of model architecture used from a select list of state-of-the-art architectures while having access to only a subset of the input dataset. Hence, we propose a black-box adversarial attack strategy that produces adversarial patches which can be applied anywhere in the input image to perform an adversarial attack.

5/13/2024