Facial Misrecognition Systems: Simple Weight Manipulations Force DNNs to Err Only on Specific Persons

Read original: arXiv:2301.03118 - Published 6/13/2024 by Irad Zehavi, Roee Nitzan, Adi Shamir


Overview

Researchers describe how to plant "backdoors" in facial recognition models based on deep Siamese neural networks.
These backdoors cause the system to make specific errors when classifying images of certain pre-selected individuals, without the need for any visual triggers.
The backdoors can be implemented by applying linear transformations to the model's final weight matrix, without additional training.
Multiple independent backdoors can be installed in the same model with minimal interference.

Plain English Explanation

The researchers have developed a way to secretly manipulate facial recognition systems to make mistakes when identifying certain people, without the users of the system being aware of the issue. They call these hidden vulnerabilities "backdoors."

For example, the researchers could backdoor a facial recognition system so that it always declares any two images of a particular celebrity to be of different people, even though both images clearly show the same person. Or they could make the system treat two completely different-looking people, like Morgan Freeman and Scarlett Johansson, as the same person.

Surprisingly, the researchers found they could install these backdoors simply by modifying the final weight matrix of the neural network, with no retraining or further optimization of the model. They also found that different attackers could independently add multiple backdoors to the same model, with almost no impact on each other's effectiveness.

The key advantage of this attack is that it's virtually invisible - there are no obvious triggers or changes to the model's normal behavior that would tip off users that something is wrong. The facial recognition system appears to work correctly for most people, while secretly making mistakes only on the target individuals chosen by the attacker.

Technical Explanation

The researchers developed two types of backdoor attacks against facial recognition models based on deep Siamese neural networks:

Anonymity backdoors: These cause the model to classify any two images of a specific person as being different people, effectively "anonymizing" that individual.
Confusion backdoors: These cause the model to classify an image of one chosen person and an image of a second chosen person as being the same person, even if the two individuals look completely different.
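To make the two failure modes concrete, here is a minimal sketch of the kind of verification decision a Siamese model makes: embed both images, compare the embeddings, and threshold the similarity. The cosine metric, the `threshold` value of 0.5, the function names, and the hand-made 2-D embeddings are all assumptions for illustration, not details taken from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def same_person(emb_a, emb_b, threshold=0.5):
    """Declare a match when the embeddings are similar enough.

    The threshold is a hypothetical value chosen for this toy example.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold

# Hand-made 2-D embeddings standing in for a real model's output.
alice_1, alice_2 = [1.0, 0.1], [0.9, 0.2]
bob_1 = [0.1, 1.0]

print(same_person(alice_1, alice_2))  # clean model: two images of one person
print(same_person(alice_1, bob_1))    # clean model: two different people
```

In these terms, an anonymity backdoor aims to flip the first decision to "different" for the targeted person only, while a confusion backdoor aims to flip the second decision to "same" for the chosen pair only.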

To implement these backdoors, the researchers found they could simply apply linear transformations to the final weight matrix of the neural network model, without any additional training. This allowed them to install multiple independent backdoors in the same model with minimal interference.
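The summary does not spell out which linear transformation is used. One natural choice, assumed here purely for illustration, is an orthogonal projection that removes a single direction from the embedding space: the target's centroid direction for anonymity, or the gap between two centroids for confusion. The sketch below applies such projections to a toy last-layer matrix `W`; the 8-dimensional features, the matrix `W = 2I`, and every function name are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def matmul(A, B):
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]

def centroid(vectors):
    return [sum(xs) / len(vectors) for xs in zip(*vectors)]

def projector_removing(direction):
    """Return P = I - d d^T, which deletes the component along `direction`."""
    norm = math.sqrt(dot(direction, direction))
    d = [x / norm for x in direction]
    n = len(d)
    return [[(1.0 if i == j else 0.0) - d[i] * d[j] for j in range(n)]
            for i in range(n)]

def anonymity_edit(W, target_features):
    """Assumed edit: project out the target's centroid direction."""
    embs = [matvec(W, h) for h in target_features]
    return matmul(projector_removing(centroid(embs)), W)

def confusion_edit(W, features_a, features_b):
    """Assumed edit: project out the gap between the two centroids."""
    ca = centroid([matvec(W, h) for h in features_a])
    cb = centroid([matvec(W, h) for h in features_b])
    gap = [x - y for x, y in zip(ca, cb)]
    return matmul(projector_removing(gap), W)

DIM = 8

def feat(coords):
    """Build a toy penultimate-layer feature vector from {index: value}."""
    v = [0.0] * DIM
    for i, x in coords.items():
        v[i] = float(x)
    return v

def sim(M, x, y):
    """Cosine similarity of two inputs under last-layer matrix M."""
    return cosine(matvec(M, x), matvec(M, y))

# Toy last layer (hypothetical): embeddings are just scaled features.
W = [[2.0 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]

# Images available to the attacker, and held-out images used for testing.
target_seen = [feat({0: 3, 1: 1}), feat({0: 3, 1: -1})]
t1, t2 = feat({0: 3, 2: 1}), feat({0: 3, 3: 1})
a_seen = [feat({4: 3, 6: 1}), feat({4: 3, 6: -1})]
b_seen = [feat({5: 3, 7: 1}), feat({5: 3, 7: -1})]
a1, a2 = feat({4: 3, 2: 1}), feat({4: 3, 3: 1})
b1 = feat({5: 3, 3: 1})

W1 = anonymity_edit(W, target_seen)       # first attacker
W2 = confusion_edit(W1, a_seen, b_seen)   # second attacker, same model

print("target pair, clean vs. anonymized:", sim(W, t1, t2), sim(W1, t1, t2))
print("bystander pair after anonymity edit:", sim(W1, a1, a2))
print("A vs. B, clean vs. confused:", sim(W, a1, b1), sim(W2, a1, b1))
print("target pair after both edits:", sim(W2, t1, t2))
```

In this toy setup the held-out target pair drops from a similarity of 0.9 to 0 after the first edit, and the A/B pair rises from 0 to roughly 0.82 after the second, while the first backdoor keeps working. The identities here are orthogonal by construction, so the unrelated pair is not affected at all; in a real model the side effects would only be approximately zero. Because each edit is just another matrix multiplied onto `W`, no training step is involved.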

In experiments on a state-of-the-art facial recognition system, the researchers achieved very high success rates for both types of backdoors. When they individually anonymized ten celebrities, the backdoored model failed to match two images of the same celebrity 97.02% to 98.31% of the time. When they confused Morgan Freeman with Scarlett Johansson, the model declared their images to be the same person 98.47% of the time. Installing all ten anonymity backdoors in the same model reduced the success rate for each celebrity by no more than 1.01%.
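Success rates like these can be read as fractions of image pairs on which the backdoored model gives the attacker's desired answer. The sketch below shows one way to compute such rates from embeddings; the pairing scheme, the threshold, and the function names are assumptions, since the summary does not describe the paper's exact evaluation protocol.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

def anonymity_success_rate(embeddings, threshold):
    """Fraction of same-person pairs the model fails to match."""
    pairs = list(combinations(embeddings, 2))
    misses = sum(1 for u, v in pairs if cosine(u, v) < threshold)
    return misses / len(pairs)

def confusion_success_rate(embs_a, embs_b, threshold):
    """Fraction of cross-person pairs the model wrongly matches."""
    pairs = [(u, v) for u in embs_a for v in embs_b]
    hits = sum(1 for u, v in pairs if cosine(u, v) >= threshold)
    return hits / len(pairs)

# Hand-made embeddings, purely illustrative.
target_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
person_a = [[1.0, 0.0], [1.0, 0.1]]
person_b = [[0.9, 0.1], [0.0, 1.0]]

anon_rate = anonymity_success_rate(target_embs, threshold=0.5)
conf_rate = confusion_success_rate(person_a, person_b, threshold=0.5)
print(anon_rate, conf_rate)
```

The same pair-counting approach, applied to people who were not targeted, would give the benign accuracy before and after the weight edit.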

Importantly, the researchers found that the presence of these backdoors had minimal impact on the model's normal, "benign" accuracy in recognizing other people not targeted by the backdoors. In most cases, the benign accuracy degraded by less than 0.05%.

Critical Analysis

The researchers' work highlights a significant vulnerability in facial recognition systems that rely on deep neural networks. The ability to secretly plant backdoors that cause targeted errors, without impacting overall accuracy, is a concerning security risk.

One potential limitation is that the attack requires images of the target individuals, which may limit an attacker's ability to target arbitrary people. The researchers also note that detecting these types of invisible backdoors remains an open challenge.

Further research is needed to understand the broader implications and develop effective mitigation strategies. It's crucial that the computer vision and machine learning community takes steps to address backdoor vulnerabilities in high-stakes AI systems like facial recognition.

Conclusion

This research demonstrates a novel and stealthy way to compromise facial recognition models by installing hidden backdoors that cause targeted errors. The ability to do so without degrading overall accuracy, and to stack multiple independent backdoors in a single model, is a significant security concern.

While the specific attack technique may be difficult to scale to arbitrary targets, it highlights the broader issue of backdoor vulnerabilities in AI systems. Addressing these types of invisible backdoor attacks remains an important challenge for the research community, with implications for the safe and reliable deployment of AI technologies.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


Facial Misrecognition Systems: Simple Weight Manipulations Force DNNs to Err Only on Specific Persons

Irad Zehavi, Roee Nitzan, Adi Shamir

In this paper, we describe how to plant novel types of backdoors in any facial recognition model based on the popular architecture of deep Siamese neural networks. These backdoors force the system to err only on natural images of specific persons who are preselected by the attacker, without controlling their appearance or inserting any triggers. For example, we show how such a backdoored system can classify any two images of a particular person as different people, or any two images of a particular pair of persons as the same person, with almost no effect on the correctness of its decisions for other persons. Surprisingly, we show that both types of backdoors can be implemented by applying linear transformations to the model's last weight matrix, with no additional training or optimization, using only images of the backdoor identities. A unique property of our attack is that multiple backdoors can be independently installed in the same model by multiple attackers, who may not be aware of each other's existence, with almost no interference. We have experimentally verified the attacks on a SOTA facial recognition system. When we tried to individually anonymize ten celebrities, the network failed to recognize two of their images as being the same person in 97.02% to 98.31% of the time. When we tried to confuse between the extremely different-looking Morgan Freeman and Scarlett Johansson, for example, their images were declared to be the same person in 98.47% of the time. For each type of backdoor, we sequentially installed multiple backdoors with minimal effect on the performance of each other (for example, anonymizing all ten celebrities on the same model reduced the success rate for each celebrity by no more than 1.01%). In all of our experiments, the benign accuracy of the network on other persons barely degraded (in most cases, it degraded by less than 0.05%).

6/13/2024


MakeupAttack: Feature Space Black-box Backdoor Attack on Face Recognition via Makeup Transfer

Ming Sun, Lihua Jing, Zixuan Zhu, Rui Wang

Backdoor attacks pose a significant threat to the training process of deep neural networks (DNNs). As a widely-used DNN-based application in real-world scenarios, face recognition systems once implanted into the backdoor, may cause serious consequences. Backdoor research on face recognition is still in its early stages, and the existing backdoor triggers are relatively simple and visible. Furthermore, due to the perceptibility, diversity, and similarity of facial datasets, many state-of-the-art backdoor attacks lose effectiveness on face recognition tasks. In this work, we propose a novel feature space backdoor attack against face recognition via makeup transfer, dubbed MakeupAttack. In contrast to many feature space attacks that demand full access to target models, our method only requires model queries, adhering to black-box attack principles. In our attack, we design an iterative training paradigm to learn the subtle features of the proposed makeup-style trigger. Additionally, MakeupAttack promotes trigger diversity using the adaptive selection method, dispersing the feature distribution of malicious samples to bypass existing defense methods. Extensive experiments were conducted on two widely-used facial datasets targeting multiple models. The results demonstrate that our proposed attack method can bypass existing state-of-the-art defenses while maintaining effectiveness, robustness, naturalness, and stealthiness, without compromising model performance.

8/23/2024

An Invisible Backdoor Attack Based On Semantic Feature

Yangming Chen

Backdoor attacks have severely threatened deep neural network (DNN) models in the past several years. These attacks can occur in almost every stage of the deep learning pipeline. Although the attacked model behaves normally on benign samples, it makes wrong predictions for samples containing triggers. However, most existing attacks use visible patterns (e.g., a patch or image transformations) as triggers, which are vulnerable to human inspection. In this paper, we propose a novel backdoor attack, making imperceptible changes. Concretely, our attack first utilizes the pre-trained victim model to extract low-level and high-level semantic features from clean images and generates trigger pattern associated with high-level features based on channel attention. Then, the encoder model generates poisoned images based on the trigger and extracted low-level semantic features without causing noticeable feature loss. We evaluate our attack on three prominent image classification DNN across three standard datasets. The results demonstrate that our attack achieves high attack success rates while maintaining robustness against backdoor defenses. Furthermore, we conduct extensive image similarity experiments to emphasize the stealthiness of our attack strategy.

5/21/2024


Mask-based Invisible Backdoor Attacks on Object Detection

Jeongjin Shin

Deep learning models have achieved unprecedented performance in the domain of object detection, resulting in breakthroughs in areas such as autonomous driving and security. However, deep learning models are vulnerable to backdoor attacks. These attacks prompt models to behave similarly to standard models without a trigger; however, they act maliciously upon detecting a predefined trigger. Despite extensive research on backdoor attacks in image classification, their application to object detection remains relatively underexplored. Given the widespread application of object detection in critical real-world scenarios, the sensitivity and potential impact of these vulnerabilities cannot be overstated. In this study, we propose an effective invisible backdoor attack on object detection utilizing a mask-based approach. Three distinct attack scenarios were explored for object detection: object disappearance, object misclassification, and object generation attack. Through extensive experiments, we comprehensively examined the effectiveness of these attacks and tested certain defense methods to determine effective countermeasures. Code will be available at https://github.com/jeongjin0/invisible-backdoor-object-detection

6/5/2024