Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains

2404.15882

Published 4/26/2024 by Eunsu Baek, Keondo Park, Jiyoon Kim, Hyung-Sin Kim

Abstract

Computer vision applications predict on digital images acquired by a camera from physical scenes through light. However, conventional robustness benchmarks rely on perturbations in digitized images, diverging from distribution shifts occurring in the image acquisition process. To bridge this gap, we introduce a new distribution shift dataset, ImageNet-ES, comprising variations in environmental and camera sensor factors by directly capturing 202k images with a real camera in a controllable testbed. With the new dataset, we evaluate out-of-distribution (OOD) detection and model robustness. We find that existing OOD detection methods do not cope with the covariate shifts in ImageNet-ES, implying that the definition and detection of OOD should be revisited to embrace real-world distribution shifts. We also observe that the model becomes more robust in both ImageNet-C and -ES by learning environment and sensor variations in addition to existing digital augmentations. Lastly, our results suggest that effective shift mitigation via camera sensor control can significantly improve performance without increasing model size. With these findings, our benchmark may aid future research on robustness, OOD, and camera sensor control for computer vision. Our code and dataset are available at https://github.com/Edw2n/ImageNet-ES.

Overview

Computer vision applications rely on digital images captured by cameras to make predictions about physical scenes.
Conventional robustness benchmarks focus on perturbations in digitized images, rather than considering distribution shifts that occur during the image acquisition process.
To address this gap, the researchers introduce a new dataset called ImageNet-ES, which covers variations in environmental and camera sensor factors by capturing 202k images with a real camera in a controllable testbed.
The dataset is used to evaluate out-of-distribution (OOD) detection and model robustness, revealing limitations in existing OOD detection methods and opportunities to improve model robustness.

Plain English Explanation

Computer vision is the field of artificial intelligence (AI) that allows computers to understand and interpret digital images, just like humans can. These computer vision applications often make predictions about the physical world by analyzing images captured by cameras.

However, the standard ways of testing the robustness of these computer vision models don't always reflect the real-world challenges they face. Typical robustness benchmarks focus on introducing small, artificial changes or "perturbations" to the digital images themselves. But in reality, the process of capturing those images with a camera can introduce much larger distribution shifts - changes in the environment, camera settings, and other factors that can dramatically alter the visual data.

To better understand these real-world challenges, the researchers created a new dataset called ImageNet-ES. Instead of modifying digital images, they captured 202,000 images with a real camera in a controllable testbed, which let them systematically vary environmental conditions, such as lighting, and camera sensor settings.

Using this new dataset, the researchers evaluated how well existing "out-of-distribution" (OOD) detection methods - techniques that identify when an input image is very different from the data the model was trained on - perform under real-world distribution shifts. They found that these methods do not cope with the covariate shifts in ImageNet-ES, suggesting that the definition and detection of OOD need to be revisited to account for the shifts that occur during image acquisition.

The researchers also found that a model becomes more robust on both ImageNet-C and ImageNet-ES when it learns environment and sensor variations in addition to existing digital augmentations. Separately, their results suggest that mitigating shifts at the source, by controlling the camera sensor, can significantly improve performance without increasing the model size.

Overall, this new ImageNet-ES dataset and the insights from the researchers' experiments highlight the importance of moving beyond simplified robustness benchmarks and towards more realistic evaluation of computer vision systems in the face of real-world distribution shifts.

Technical Explanation

The researchers introduce a new dataset called ImageNet-ES to study the distribution shifts that occur during the image acquisition process. Unlike conventional robustness benchmarks, which apply perturbations to digitized images, ImageNet-ES comprises 202,000 images captured directly with a real camera in a controllable testbed. This allows the researchers to systematically study the effects of variations in environmental factors (e.g., lighting) and camera sensor parameters (e.g., exposure settings).

Using this dataset, the researchers evaluate existing out-of-distribution (OOD) detection methods. OOD detection aims to identify when an input image differs significantly from the data the model was trained on, which matters for safety and reliability in real-world computer vision applications. The researchers find that these methods do not cope with the covariate shifts present in ImageNet-ES, implying that the definition and detection of OOD need to be revisited to account for distribution shifts in the image acquisition process.
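To make the evaluation setup concrete, here is a minimal sketch of how a score-based OOD detector is typically assessed. It uses maximum softmax probability (MSP), a common baseline, and is not necessarily one of the methods evaluated in the paper. The logits, variable names, and the tiny sample sizes are all assumptions made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability; a higher value is read as 'more in-distribution'."""
    return max(softmax(logits))


def auroc(id_scores, ood_scores):
    """Chance that a random ID score exceeds a random OOD score (ties count half)."""
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical logits, invented for illustration only.
clean_logits = [[6.0, 0.5, 0.2], [0.1, 5.0, 0.3]]
shifted_logits = [[5.8, 0.4, 0.1], [1.0, 1.2, 0.9]]

clean_scores = [msp_score(z) for z in clean_logits]
shifted_scores = [msp_score(z) for z in shifted_logits]
print("AUROC (clean vs. shifted):", auroc(clean_scores, shifted_scores))
```

An AUROC near 1.0 would mean the score separates shifted captures from clean ones, while a value near 0.5 would mean it cannot tell them apart. A detector of this kind only sees the model's confidence, so a shifted image that the model still classifies confidently looks in-distribution to it.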

The researchers also investigate model robustness. They find that learning environment and sensor variations, in addition to existing digital augmentations, makes a model more robust on both ImageNet-C (a benchmark of synthetic corruptions applied to digitized images) and ImageNet-ES. Their results further suggest that effective shift mitigation via camera sensor control can significantly improve performance without increasing model size.
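The idea of camera sensor control can be illustrated with a small sketch: keep the model fixed, compare its accuracy on captures of the same scenes taken under different sensor settings, and pick the setting under which it performs best. This is only an illustration under stated assumptions, not the authors' procedure. The setting names and the (predicted, true) label pairs below are hypothetical.

```python
def accuracy_per_setting(results):
    """Map each setting name to the fraction of correct predictions."""
    return {
        setting: sum(pred == true for pred, true in pairs) / len(pairs)
        for setting, pairs in results.items()
    }


def best_setting(results):
    """Return the setting with the highest accuracy (first one wins on ties)."""
    accuracy = accuracy_per_setting(results)
    return max(accuracy, key=accuracy.get)


# Hypothetical setting names and (predicted, true) label pairs
# for one fixed model evaluated on the same scenes.
results = {
    "setting_a": [(0, 0), (1, 2), (2, 2), (1, 1)],
    "setting_b": [(0, 0), (2, 2), (2, 2), (1, 1)],
    "setting_c": [(1, 0), (1, 2), (2, 2), (0, 1)],
}
print(accuracy_per_setting(results))
print("best:", best_setting(results))
```

The model weights never change in this sketch; only the capture condition does, which is why gains obtained this way do not require a larger model.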

Critical Analysis

The researchers have made a valuable contribution by highlighting the importance of considering distribution shifts that occur during the image acquisition process, rather than solely focusing on perturbations to digitized images. The ImageNet-ES dataset provides a more realistic testbed for evaluating computer vision models, and the researchers' findings suggest that existing OOD detection methods may not be sufficient for real-world applications.

One potential limitation of the study is the use of a controlled testbed, which may not fully capture the diversity of environmental and sensor factors encountered in the real world. Additionally, the researchers do not explore the impact of specific sensor or environmental factors on model performance, which could provide more granular insights.

Furthermore, the paper does not delve into the underlying reasons why existing OOD detection methods struggle with the distribution shifts in ImageNet-ES. A deeper analysis of the shortcomings of these methods and potential directions for improvement would be a valuable addition to the research.

Overall, this work highlights the need for more realistic benchmarks and robustness evaluation in computer vision, and the ImageNet-ES dataset provides a promising starting point for further research in this area.

Conclusion

The researchers have introduced a novel dataset, ImageNet-ES, that captures distribution shifts arising from the image acquisition process using a real camera in a controlled testbed. This dataset reveals limitations in existing out-of-distribution (OOD) detection methods and provides insights into improving model robustness.

The key takeaways from this research are:

Conventional robustness benchmarks that focus on artificial perturbations to digitized images may not adequately reflect the real-world challenges faced by computer vision systems.
Effective shift mitigation via camera sensor control can significantly improve performance without increasing model size.
The definition and detection of OOD need to be revisited to account for distribution shifts that occur during image capture.

By addressing these challenges, the ImageNet-ES dataset and the insights from this research can help advance the development of more reliable and robust computer vision systems that can better handle the complexities of the real world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox

Xingming Long, Jie Zhang, Shiguang Shan, Xilin Chen

Most existing out-of-distribution (OOD) detection benchmarks classify samples with novel labels as the OOD data. However, some marginal OOD samples actually have close semantic contents to the in-distribution (ID) sample, which makes determining the OOD sample a Sorites Paradox. In this paper, we construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue, in which we divide the test samples into subsets with different semantic and covariate shift degrees relative to the ID dataset. The data division is achieved through a shift measuring method based on our proposed Language Aligned Image feature Decomposition (LAID). Moreover, we construct a Synthetic Incremental Shift (Syn-IS) dataset that contains high-quality generated images with more diverse covariate contents to complement the IS-OOD benchmark. We evaluate current OOD detection methods on our benchmark and find several important insights: (1) The performance of most OOD detection methods significantly improves as the semantic shift increases; (2) Some methods like GradNorm may have different OOD detection mechanisms as they rely less on semantic shifts to make decisions; (3) Excessive covariate shifts in the image are also likely to be considered as OOD for some methods. Our code and data are released in https://github.com/qqwsad5/IS-OOD.

6/17/2024

cs.CV

Investigating Robustness of Open-Vocabulary Foundation Object Detectors under Distribution Shifts

Prakash Chandra Chhipa, Kanjar De, Meenakshi Subhash Chippa, Rajkumar Saini, Marcus Liwicki

The challenge of Out-Of-Distribution (OOD) robustness remains a critical hurdle towards deploying deep vision models. Open-vocabulary object detection extends the capabilities of traditional object detection frameworks to recognize and classify objects beyond predefined categories. Investigating OOD robustness in open-vocabulary object detection is essential to increase the trustworthiness of these models. This study presents a comprehensive robustness evaluation of zero-shot capabilities of three recent open-vocabulary foundation object detection models, namely OWL-ViT, YOLO World, and Grounding DINO. Experiments carried out on the COCO-O and COCO-C benchmarks encompassing distribution shifts highlight the challenges of the models' robustness. Source code shall be made available to the research community on GitHub.

6/4/2024

cs.CV

OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift

Lin Li, Yifei Wang, Chawin Sitawarin, Michael Spratling

Existing works have made great progress in improving adversarial robustness, but typically test their method only on data from the same distribution as the training data, i.e. in-distribution (ID) testing. As a result, it is unclear how such robustness generalizes under input distribution shifts, i.e. out-of-distribution (OOD) testing. This omission is concerning as such distribution shifts are unavoidable when methods are deployed in the wild. To address this issue we propose a benchmark named OODRobustBench to comprehensively assess OOD adversarial robustness using 23 dataset-wise shifts (i.e. naturalistic shifts in input distribution) and 6 threat-wise shifts (i.e., unforeseen adversarial threat models). OODRobustBench is used to assess 706 robust models using 60.7K adversarial evaluations. This large-scale analysis shows that: 1) adversarial robustness suffers from a severe OOD generalization issue; 2) ID robustness correlates strongly with OOD robustness in a positive linear way. The latter enables the prediction of OOD robustness from ID robustness. We then predict and verify that existing methods are unlikely to achieve high OOD robustness. Novel methods are therefore required to achieve OOD robustness beyond our prediction. To facilitate the development of these methods, we investigate a wide range of techniques and identify several promising directions. Code and models are available at: https://github.com/OODRobustBench/OODRobustBench.

6/5/2024

cs.LG cs.CV

Detecting Out-Of-Distribution Earth Observation Images with Diffusion Models

Georges Le Bellier (CEDRIC - VERTIGO, CNAM), Nicolas Audebert (CEDRIC - VERTIGO, CNAM, IGN)

Earth Observation imagery can capture rare and unusual events, such as disasters and major landscape changes, whose visual appearance contrasts with the usual observations. Deep models trained on common remote sensing data will output drastically different features for these out-of-distribution samples, compared to those closer to their training dataset. Detecting them could therefore help anticipate changes in the observations, either geographical or environmental. In this work, we show that the reconstruction error of diffusion models can effectively serve as unsupervised out-of-distribution detectors for remote sensing images, using them as a plausibility score. Moreover, we introduce ODEED, a novel reconstruction-based scorer using the probability-flow ODE of diffusion models. We validate it experimentally on SpaceNet 8 with various scenarios, such as classical OOD detection with geographical shift and near-OOD setups: pre/post-flood and non-flooded/flooded image recognition. We show that our ODEED scorer significantly outperforms other diffusion-based and discriminative baselines on the more challenging near-OOD scenarios of flood image detection, where OOD images are close to the distribution tail. We aim to pave the way towards better use of generative models for anomaly detection in remote sensing.

4/22/2024

cs.CV cs.AI cs.LG