OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift

arXiv:2310.12793

Published 6/5/2024 by Lin Li, Yifei Wang, Chawin Sitawarin, Michael Spratling


Abstract

Existing works have made great progress in improving adversarial robustness, but typically test their method only on data from the same distribution as the training data, i.e. in-distribution (ID) testing. As a result, it is unclear how such robustness generalizes under input distribution shifts, i.e. out-of-distribution (OOD) testing. This omission is concerning as such distribution shifts are unavoidable when methods are deployed in the wild. To address this issue we propose a benchmark named OODRobustBench to comprehensively assess OOD adversarial robustness using 23 dataset-wise shifts (i.e. naturalistic shifts in input distribution) and 6 threat-wise shifts (i.e., unforeseen adversarial threat models). OODRobustBench is used to assess 706 robust models using 60.7K adversarial evaluations. This large-scale analysis shows that: 1) adversarial robustness suffers from a severe OOD generalization issue; 2) ID robustness correlates strongly with OOD robustness in a positive linear way. The latter enables the prediction of OOD robustness from ID robustness. We then predict and verify that existing methods are unlikely to achieve high OOD robustness. Novel methods are therefore required to achieve OOD robustness beyond our prediction. To facilitate the development of these methods, we investigate a wide range of techniques and identify several promising directions. Code and models are available at: https://github.com/OODRobustBench/OODRobustBench.


Overview

This paper examines how well adversarial robustness generalizes to out-of-distribution (OOD) data, rather than only to the in-distribution (ID) data used in standard testing.
The authors propose a benchmark called OODRobustBench to comprehensively assess OOD adversarial robustness across 23 dataset-wise shifts and 6 threat-wise shifts.
Their large-scale analysis of 706 robust models shows that adversarial robustness suffers from a severe OOD generalization issue, but ID robustness can be used to predict OOD robustness.
The authors investigate techniques to improve OOD robustness and identify several promising directions for future research.

Plain English Explanation

Machine learning models have become more robust to adversarial attacks, in which small, often imperceptible changes to the input cause the model to make incorrect predictions. However, these models are typically tested only on data drawn from the same distribution as the training data, known as in-distribution (ID) testing.
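To make the idea of an adversarial attack concrete, here is a minimal sketch of a one-step, sign-of-gradient attack on a toy logistic-regression classifier. The model, weights, input, and the function names `fgsm_linear` and `predict` are illustrative assumptions, not anything taken from the paper or its benchmark.

```python
import numpy as np


def fgsm_linear(x, y, w, b, eps):
    """One-step L-infinity attack on a logistic-regression classifier.

    x: input vector, y: true label in {0, 1}, (w, b): model parameters,
    eps: maximum change allowed per input coordinate.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # predicted P(y = 1)
    grad_x = (p - y) * w                     # gradient of cross-entropy loss w.r.t. x
    return x + eps * np.sign(grad_x)         # move each coordinate to increase the loss


def predict(x, w, b):
    """Return the predicted class (0 or 1)."""
    return int(w @ x + b > 0)


# Toy model and input (hypothetical values chosen for illustration).
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, -0.1, 0.1])
y = 1

x_adv = fgsm_linear(x, y, w, b, eps=0.2)
print(predict(x, w, b), predict(x_adv, w, b))
```

In this toy case, a change of at most 0.2 per coordinate is enough to flip the prediction. Robust models are trained so that such bounded perturbations do not change the output.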

The authors of this paper argue that this is a problem, because in real-world deployments, the input data may come from a different distribution, known as out-of-distribution (OOD) data. To address this issue, the authors propose a new benchmark called OODRobustBench, which tests the robustness of models to a variety of dataset-wise and threat-wise distribution shifts.

Their large-scale analysis of over 700 robust models shows that adversarial robustness does not generalize well to OOD data. However, they also find that there is a positive linear relationship between a model's ID robustness and its OOD robustness. This means that the OOD robustness of a model can be predicted from its ID robustness.
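The kind of prediction described here can be sketched as an ordinary least-squares line fit. The data below are synthetic and generated only for illustration; the slope, intercept, and noise level are assumptions and do not reproduce any number from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for per-model robust accuracies in percent (NOT the paper's data).
id_rob = rng.uniform(20.0, 70.0, size=200)
ood_rob = 0.6 * id_rob - 2.0 + rng.normal(0.0, 1.0, size=200)

# Fit OOD robustness as a linear function of ID robustness.
slope, intercept = np.polyfit(id_rob, ood_rob, deg=1)

# Pearson correlation measures how well a straight line describes the relationship.
r = np.corrcoef(id_rob, ood_rob)[0, 1]


def predict_ood(id_value):
    """Predict OOD robustness (percent) from ID robustness (percent)."""
    return slope * id_value + intercept


print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
print(f"predicted OOD robustness at 50% ID robustness: {predict_ood(50.0):.1f}%")
```

A high correlation on real measurements is what would justify using the fitted line as a predictor. On data with a weak linear trend the same fit would run but its predictions would be unreliable.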

The authors then use this insight to predict that existing methods are unlikely to achieve high OOD robustness. They investigate a range of techniques and identify several promising directions for future research to improve OOD robustness, beyond what can be achieved by simply optimizing for ID robustness.

Technical Explanation

The paper starts by highlighting that existing works have made great progress in improving adversarial robustness, but typically only test their methods on data from the same distribution as the training data (in-distribution or ID testing). This is concerning, as in real-world deployments, the input data may come from a different distribution (out-of-distribution or OOD), and it is unclear how the robustness generalizes to such distribution shifts.

To address this issue, the authors propose a new benchmark called OODRobustBench, which comprehensively assesses OOD adversarial robustness using 23 dataset-wise shifts (naturalistic shifts in the input distribution) and 6 threat-wise shifts (unforeseen adversarial threat models). The benchmark is used to evaluate 706 robust models across 60.7K adversarial evaluations.
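One way to picture how results from the two shift families could be summarized for a single model is sketched below. The function `summarize`, the shift names, the numbers, and the choice of a plain average per family are all assumptions made for illustration. They are not the OODRobustBench API and may not match how the paper aggregates its results.

```python
from statistics import mean


def summarize(id_robustness, dataset_shift_rob, threat_shift_rob):
    """Summarize one model's robust accuracy (percent) under both shift families.

    dataset_shift_rob / threat_shift_rob map a shift name to the robust
    accuracy measured under that shift.
    """
    ood_dataset = mean(dataset_shift_rob.values())  # assumed: plain average over shifts
    ood_threat = mean(threat_shift_rob.values())
    return {
        "id": id_robustness,
        "ood_dataset": ood_dataset,
        "ood_threat": ood_threat,
        "drop_dataset": id_robustness - ood_dataset,  # robustness lost under data shift
        "drop_threat": id_robustness - ood_threat,    # robustness lost under new threats
    }


# Hypothetical shift names and numbers, for illustration only.
report = summarize(
    50.0,
    {"shift_a": 40.0, "shift_b": 30.0},
    {"threat_a": 20.0, "threat_b": 10.0},
)
print(report)
```

Keeping the two families separate makes it possible to see whether a model degrades more under changed data or under changed attacks.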

The large-scale analysis reveals two key findings:

1) Adversarial robustness suffers from a severe OOD generalization issue, with models performing much worse on OOD data than on ID data.
2) There is a strong positive linear correlation between ID robustness and OOD robustness, enabling the prediction of OOD robustness from ID robustness.

Extrapolating this linear trend, the authors predict, and then verify, that existing methods are unlikely to achieve high OOD robustness, so novel methods are needed to exceed the prediction. To support the development of such methods, the authors investigate a wide range of techniques and identify several promising directions for future research.
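The reasoning behind this prediction can be written as a short extrapolation. The symbols below are assumptions introduced for illustration: R_ID and R_OOD are robust accuracies in percent, and alpha > 0 and beta are the slope and intercept of a fitted line. No fitted values from the paper are used.

```latex
\[
\widehat{R}_{\mathrm{OOD}} = \alpha \, R_{\mathrm{ID}} + \beta,
\qquad
\widehat{R}_{\mathrm{OOD}}^{\,\max} = 100\,\alpha + \beta
\quad \text{(reached at } R_{\mathrm{ID}} = 100\%\text{)}.
\]
```

Under this sketch, if 100 alpha + beta is well below 100, then even a model with perfect ID robustness would be predicted to have limited OOD robustness as long as it follows the same linear trend. Exceeding that ceiling would require a method that departs from the trend.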

Critical Analysis

The paper provides a comprehensive and insightful analysis of the OOD generalization problem in adversarial robustness. The authors' use of the OODRobustBench benchmark to systematically evaluate a large number of robust models is a significant contribution to the field.

One potential limitation of the study is that it focuses primarily on image classification tasks, and it is unclear how the findings would translate to other domains, such as natural language processing or reinforcement learning. Additionally, the authors acknowledge that the benchmark does not cover all possible distribution shifts, and there may be other types of shifts that were not considered.

Furthermore, while the authors identify several promising research directions, the actual techniques for improving OOD robustness are not fully explored in this paper. It would be valuable to see more detailed investigations and evaluations of the proposed methods in future work.

Overall, this paper makes an important step forward in understanding the challenges of OOD generalization in adversarial robustness and provides a solid foundation for future research in this area.

Conclusion

This paper highlights the critical issue of OOD generalization in adversarial robustness, an area that has been overlooked in much of the existing research. By proposing the OODRobustBench benchmark and conducting a large-scale analysis, the authors have shown that current methods struggle to maintain their robustness when faced with distribution shifts.

The key finding that ID robustness can be used to predict OOD robustness is a valuable insight that can guide future research. However, the authors also make it clear that novel methods are needed to achieve high OOD robustness beyond what can be predicted from ID robustness.

The authors' investigation of a wide range of techniques suggests that there are several promising directions for improving OOD robustness. Continued research along these lines could lead to more reliable and trustworthy machine learning models that perform well in diverse real-world scenarios.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Out-of-Distribution Data: An Acquaintance of Adversarial Examples -- A Survey

Naveen Karunanayake, Ravin Gunawardena, Suranga Seneviratne, Sanjay Chawla

Deep neural networks (DNNs) deployed in real-world applications can encounter out-of-distribution (OOD) data and adversarial examples. These represent distinct forms of distributional shifts that can significantly impact DNNs' reliability and robustness. Traditionally, research has addressed OOD detection and adversarial robustness as separate challenges. This survey focuses on the intersection of these two areas, examining how the research community has investigated them together. Consequently, we identify two key research directions: robust OOD detection and unified robustness. Robust OOD detection aims to differentiate between in-distribution (ID) data and OOD data, even when they are adversarially manipulated to deceive the OOD detector. Unified robustness seeks a single approach to make DNNs robust against both adversarial attacks and OOD inputs. Accordingly, first, we establish a taxonomy based on the concept of distributional shifts. This framework clarifies how robust OOD detection and unified robustness relate to other research areas addressing distributional shifts, such as OOD detection, open set recognition, and anomaly detection. Subsequently, we review existing work on robust OOD detection and unified robustness. Finally, we highlight the limitations of the existing work and propose promising research directions that explore adversarial and OOD inputs within a unified framework.

4/9/2024

cs.LG


Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors

Peter Lorenz, Mario Fernandez, Jens Muller, Ullrich Kothe

Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and even the benchmarking has been standardized, i.e. OpenOOD. The number of post-hoc detectors is growing fast, offering an option to protect a pre-trained classifier against natural distribution shifts and claiming to be ready for real-world scenarios. However, their efficacy in handling adversarial examples has been neglected in the majority of studies. This paper investigates the adversarial robustness of the 16 post-hoc detectors on several evasion attacks and discusses a roadmap towards adversarial defense in OOD detectors.

6/27/2024

cs.CR cs.CV


Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains

Eunsu Baek, Keondo Park, Jiyoon Kim, Hyung-Sin Kim

Computer vision applications predict on digital images acquired by a camera from physical scenes through light. However, conventional robustness benchmarks rely on perturbations in digitized images, diverging from distribution shifts occurring in the image acquisition process. To bridge this gap, we introduce a new distribution shift dataset, ImageNet-ES, comprising variations in environmental and camera sensor factors by directly capturing 202k images with a real camera in a controllable testbed. With the new dataset, we evaluate out-of-distribution (OOD) detection and model robustness. We find that existing OOD detection methods do not cope with the covariate shifts in ImageNet-ES, implying that the definition and detection of OOD should be revisited to embrace real-world distribution shifts. We also observe that the model becomes more robust in both ImageNet-C and -ES by learning environment and sensor variations in addition to existing digital augmentations. Lastly, our results suggest that effective shift mitigation via camera sensor control can significantly improve performance without increasing model size. With these findings, our benchmark may aid future research on robustness, OOD, and camera sensor control for computer vision. Our code and dataset are available at https://github.com/Edw2n/ImageNet-ES.

4/26/2024

cs.CV cs.AI

Toward a Realistic Benchmark for Out-of-Distribution Detection

Pietro Recalcati, Fabio Garcea, Luca Piano, Fabrizio Lamberti, Lia Morra

Deep neural networks are increasingly used in a wide range of technologies and services, but remain highly susceptible to out-of-distribution (OOD) samples, that is, drawn from a different distribution than the original training set. A common approach to address this issue is to endow deep neural networks with the ability to detect OOD samples. Several benchmarks have been proposed to design and validate OOD detection techniques. However, many of them are based on far-OOD samples drawn from very different distributions, and thus lack the complexity needed to capture the nuances of real-world scenarios. In this work, we introduce a comprehensive benchmark for OOD detection, based on ImageNet and Places365, that assigns individual classes as in-distribution or out-of-distribution depending on the semantic similarity with the training set. Several techniques can be used to determine which classes should be considered in-distribution, yielding benchmarks with varying properties. Experimental results on different OOD detection techniques show how their measured efficacy depends on the selected benchmark and how confidence-based techniques may outperform classifier-based ones on near-OOD samples.

4/17/2024

cs.LG cs.CV