On adversarial training and the 1 Nearest Neighbor classifier

2404.06313

Published 4/12/2024 by Amir Hagai, Yair Weiss


Abstract

The ability to fool deep learning classifiers with tiny perturbations of the input has led to the development of adversarial training, in which the loss with respect to adversarial examples is minimized in addition to the loss on the training examples. While adversarial training improves the robustness of the learned classifiers, the procedure is computationally expensive, sensitive to hyperparameters, and may still leave the classifier vulnerable to other types of small perturbations. In this paper we analyze the adversarial robustness of the 1 Nearest Neighbor (1NN) classifier and compare its performance to adversarial training. We prove that, under reasonable assumptions, the 1NN classifier will be robust to any small image perturbation of the training images and will give high adversarial accuracy on test images as the number of training examples goes to infinity. In experiments with 45 different binary image classification problems taken from CIFAR10, we find that 1NN outperforms TRADES (a powerful adversarial training algorithm) in terms of average adversarial accuracy. In additional experiments with 69 pretrained robust models for CIFAR10, we find that 1NN outperforms almost all of them in terms of robustness to perturbations that are only slightly different from those seen during training. Taken together, our results suggest that modern adversarial training methods still fall short of the robustness of the simple 1NN classifier. Our code can be found at https://github.com/amirhagai/On-Adversarial-Training-And-The-1-Nearest-Neighbor-Classifier
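
To make the abstract's central object concrete, here is a minimal sketch of a 1NN classifier, together with a standard geometric observation about training points. If a training point lies at distance d from the nearest training point with a different label, any perturbation of norm below d/2 leaves it closer to a same-label point than to any other-label point, so its 1NN prediction cannot change. This is not the authors' code or their proof. The L2 distance, the toy Gaussian data, and the function names `predict_1nn` and `training_point_radius` are all assumptions made for illustration.

```python
import numpy as np


def predict_1nn(train_x, train_y, queries):
    """Give each query the label of its nearest training point (L2 distance)."""
    # Pairwise squared distances, shape (n_queries, n_train).
    diffs = queries[:, None, :] - train_x[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return train_y[np.argmin(sq_dists, axis=1)]


def training_point_radius(train_x, train_y):
    """Half the L2 distance from each training point to the nearest
    training point that carries a different label."""
    diffs = train_x[:, None, :] - train_x[None, :, :]
    dists = np.sqrt(np.sum(diffs ** 2, axis=-1))
    # Ignore same-label pairs (including each point paired with itself).
    different_label = train_y[:, None] != train_y[None, :]
    dists_to_other = np.where(different_label, dists, np.inf)
    return 0.5 * dists_to_other.min(axis=1)


# Toy two-class data (an assumption for illustration; not CIFAR10).
rng = np.random.default_rng(0)
train_x = np.concatenate([
    rng.normal(-2.0, 0.5, size=(20, 8)),
    rng.normal(2.0, 0.5, size=(20, 8)),
])
train_y = np.array([0] * 20 + [1] * 20)

radii = training_point_radius(train_x, train_y)

# Perturb every training point in a random direction, staying just
# inside its radius, and confirm that no prediction changes.
directions = rng.normal(size=train_x.shape)
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
perturbed = train_x + 0.99 * radii[:, None] * directions

unchanged = np.array_equal(predict_1nn(train_x, train_y, perturbed), train_y)
print("all predictions unchanged:", unchanged)
```

The radius computed here covers training points only. The abstract's claim about test images is asymptotic and depends on the paper's assumptions, which this sketch does not model.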

Overview

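This paper analyzes the adversarial robustness of the 1 Nearest Neighbor (1NN) classifier and compares it with adversarial training.
The authors prove that, under reasonable assumptions, 1NN is robust to any small perturbation of the training images and gives high adversarial accuracy on test images as the number of training examples goes to infinity.
In experiments on CIFAR10, 1NN outperforms TRADES in average adversarial accuracy across 45 binary classification problems, and it outperforms almost all of 69 pretrained robust models on perturbations only slightly different from those seen during training.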

Plain English Explanation

Deep learning classifiers can be fooled by tiny changes to their input. Adversarial training addresses this by minimizing the loss on adversarial examples in addition to the loss on the training examples, but it is computationally expensive, sensitive to hyperparameters, and can still leave the classifier vulnerable to other kinds of small perturbations. This paper asks how a much simpler method compares: the 1 Nearest Neighbor (1NN) classifier, which gives an input the label of its closest training example. The authors prove that, under reasonable assumptions, 1NN is robust to any small perturbation of the training images, and they report CIFAR10 experiments in which 1NN is more robust on average than TRADES, a powerful adversarial training algorithm.

Technical Explanation

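The paper combines a theoretical analysis of the 1NN classifier with two sets of experiments on CIFAR10. According to the abstract, it contains the following elements:

Theory: A proof that, under reasonable assumptions, the 1NN classifier is robust to any small image perturbation of the training images and gives high adversarial accuracy on test images as the number of training examples goes to infinity.
Binary classification experiments: A comparison on 45 different binary image classification problems taken from CIFAR10, in which 1NN outperforms TRADES in terms of average adversarial accuracy.
Pretrained robust models: A comparison with 69 pretrained robust models for CIFAR10, in which 1NN outperforms almost all of them in robustness to perturbations that are only slightly different from those seen during training.
Code: The authors' code is available at https://github.com/amirhagai/On-Adversarial-Training-And-The-1-Nearest-Neighbor-Classifier

Taken together, the authors conclude that modern adversarial training methods still fall short of the robustness of the simple 1NN classifier.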

Critical Analysis

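The guarantees stated in the abstract are conditional. They hold under the authors' assumptions, and the result for test images is asymptotic: it applies as the number of training examples goes to infinity. The experiments are limited to CIFAR10, covering 45 binary classification problems and 69 pretrained robust models. The comparison with pretrained models concerns perturbations that are only slightly different from those seen during training, and 1NN outperforms almost all of those models, not all of them. The abstract reports adversarial accuracy and robustness; it does not state how 1NN compares on unperturbed test images.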

Conclusion

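This paper analyzes the adversarial robustness of the 1NN classifier and compares it with adversarial training. The authors prove robustness to small perturbations of the training images under reasonable assumptions, and their CIFAR10 experiments show 1NN outperforming TRADES on average and outperforming almost all of 69 pretrained robust models on perturbations slightly different from those seen during training. The results suggest that modern adversarial training methods still fall short of the robustness of the simple 1NN classifier.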

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers


PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans

Giang Nguyen, Valerie Chen, Mohammad Reza Taesiri, Anh Nguyen

Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision. In this paper, we show a novel utility of nearest neighbors: to improve predictions of a frozen, pretrained classifier C. We leverage an image comparator S that (1) compares the input image with NN images from the top-K most probable classes; and (2) uses S's output scores to weight the confidence scores of C. Our method consistently improves fine-grained image classification accuracy on CUB-200, Cars-196, and Dogs-120. Also, a human study finds that showing lay users our probable-class nearest neighbors (PCNN) improves their decision accuracy over prior work, which shows only the top-1 class examples.

4/23/2024

cs.CV cs.HC

Towards unlocking the mystery of adversarial fragility of neural networks

Jingchao Gao, Raghu Mudumbai, Xiaodong Wu, Jirong Yi, Catherine Xu, Hui Xie, Weiyu Xu

In this paper, we study the adversarial robustness of deep neural networks for classification tasks. We look at the smallest magnitude of possible additive perturbations that can change the output of a classification algorithm. We provide a matrix-theoretic explanation of the adversarial fragility of deep neural networks for classification. In particular, our theoretical results show that a neural network's adversarial robustness can degrade as the input dimension $d$ increases. Analytically, we show that neural networks' adversarial robustness can be only $1/\sqrt{d}$ of the best possible adversarial robustness. Our matrix-theoretic explanation is consistent with an earlier information-theoretic feature-compression-based explanation for the adversarial fragility of neural networks.

6/26/2024

cs.LG cs.CR cs.IT eess.SP


Robust Image Classification in the Presence of Out-of-Distribution and Adversarial Samples Using Attractors in Neural Networks

Nasrin Alipour, Seyyed Ali SeyyedSalehi

The proper handling of out-of-distribution (OOD) samples in deep classifiers is a critical concern for ensuring the suitability of deep neural networks in safety-critical systems. Existing approaches developed for robust OOD detection in the presence of adversarial attacks lose performance as perturbation levels increase. This study proposes a method for robust classification in the presence of OOD samples and adversarial attacks with high perturbation levels. The proposed approach utilizes a fully connected neural network that is trained to use training samples as its attractors, enhancing its robustness. This network can classify inputs and also identify OOD samples. To evaluate this method, the network is trained on the MNIST dataset, and its performance is tested on adversarial examples. The results indicate that the network maintains its performance even when classifying adversarial examples, achieving 87.13% accuracy on highly perturbed MNIST test data. Furthermore, when Fashion-MNIST and CIFAR-10-bw are used as OOD samples, the network distinguishes them from MNIST samples with accuracies of 98.84% and 99.28%, respectively. In the presence of severe adversarial attacks, these measures decrease slightly to 98.48% and 98.88%, indicating the robustness of the proposed method.

6/18/2024

cs.CV cs.LG eess.IV


Certified Robustness against Sparse Adversarial Perturbations via Data Localization

Ambar Pal, René Vidal, Jeremias Sulam

Recent work in adversarial robustness suggests that natural data distributions are localized, i.e., they place high probability in small volume regions of the input space, and that this property can be utilized for designing classifiers with improved robustness guarantees for $\ell_2$-bounded perturbations. Yet, it is still unclear if this observation holds true for more general metrics. In this work, we extend this theory to $\ell_0$-bounded adversarial perturbations, where the attacker can modify a few pixels of the image but is unrestricted in the magnitude of perturbation, and we show necessary and sufficient conditions for the existence of $\ell_0$-robust classifiers. Theoretical certification approaches in this regime essentially employ voting over a large ensemble of classifiers. Such procedures are combinatorial and expensive or require complicated certification techniques. In contrast, a simple classifier emerges from our theory, dubbed Box-NN, which naturally incorporates the geometry of the problem and improves upon the current state-of-the-art in certified robustness against sparse attacks for the MNIST and Fashion-MNIST datasets.

5/24/2024

cs.LG cs.AI