TASAR: Transferable Attack on Skeletal Action Recognition

Read original: arXiv:2409.02483 - Published 9/5/2024 by Yunfeng Diao, Baiqi Wu, Ruixuan Zhang, Ajian Liu, Xingxing Wei, Meng Wang, He Wang

Overview

This paper introduces TASAR, a transfer-based attack on skeletal action recognition models.
The attack aims to fool skeletal action recognition systems by transferring adversarial examples from one network to another.
The authors demonstrate the effectiveness of TASAR against several state-of-the-art skeletal action recognition models.

Plain English Explanation

The paper presents a new technique called TASAR (Transfer-based Attack on Skeletal Action Recognition) that can trick skeletal action recognition systems. Skeletal action recognition is a computer vision task that identifies human actions and movements from the positions of the body's joints over time.

The key idea behind TASAR is to create "adversarial examples" - slightly modified versions of normal input data that can fool the recognition model into making mistakes. The researchers show that these adversarial examples can be transferred from one skeletal action recognition model to another, allowing the attack to be used against a variety of systems.

By applying TASAR, the authors demonstrate that they can cause state-of-the-art skeletal action recognition models to misclassify actions. This highlights a vulnerability in these types of AI systems and the need for more robust defenses against adversarial attacks.

Technical Explanation

The paper introduces the TASAR approach for attacking skeletal action recognition models. TASAR works by generating adversarial examples - slightly perturbed versions of input skeletal data that can fool the target model into making incorrect predictions.
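
To make the idea of a "slightly perturbed" skeletal input concrete, here is a minimal sketch. It is not TASAR itself: it uses a generic one-step sign-gradient perturbation against a hypothetical toy linear classifier, with a skeleton sequence flattened to a list of 4 frames x 5 joints x 3 coordinates. All function names, sizes, and the epsilon value are assumptions chosen for illustration.

```python
import random

def scores(weights, x):
    """Class scores of a toy linear classifier on a flattened skeleton sequence."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]

def true_margin(weights, x, label):
    """True-class score minus the best wrong-class score (positive means correct)."""
    s = scores(weights, x)
    return s[label] - max(s[c] for c in range(len(s)) if c != label)

def sign_gradient_perturb(weights, x, label, epsilon):
    """Move every coordinate by +/- epsilon in the direction that shrinks the margin."""
    s = scores(weights, x)
    wrong = max((c for c in range(len(s)) if c != label), key=lambda c: s[c])
    # For a linear model, d(score_wrong - score_true)/dx = w_wrong - w_true.
    grad = [ww - wt for ww, wt in zip(weights[wrong], weights[label])]
    return [x_i + epsilon * ((g > 0) - (g < 0)) for x_i, g in zip(x, grad)]

rng = random.Random(0)
dim, n_classes = 4 * 5 * 3, 3  # frames * joints * xyz, hypothetical sizes
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_classes)]
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
s = scores(weights, x)
label = max(range(n_classes), key=lambda c: s[c])  # treat the clean prediction as the label

x_adv = sign_gradient_perturb(weights, x, label, epsilon=0.05)
max_change = max(abs(a - b) for a, b in zip(x_adv, x))
```

No joint coordinate moves by more than epsilon, yet the classifier's margin for the true class shrinks. Whether the prediction actually flips depends on the model and on the perturbation budget.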

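The core of TASAR is a transfer-based attack strategy: adversarial examples are crafted on surrogate models and are designed to fool a separate target model, even when the two have different architectures. According to the paper's abstract, TASAR does not re-train the pre-trained surrogates. Instead, it explores a smoothed model posterior through a post-train Dual Bayesian optimization strategy, motivated by the finding that a sharp loss surface contributes to poor transferability. It also incorporates motion dynamics into the Bayesian attack gradient rather than treating each frame independently.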
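
The transfer setting can be sketched as follows. This is a loose illustration only, not the paper's Dual Bayesian strategy: the paper's abstract describes exploring a smoothed model posterior without re-training the surrogates, and the sketch stands in for that idea by averaging the attack gradient over weights sampled with Gaussian noise around a frozen surrogate. The toy linear models, the noise level, and every function name are assumptions.

```python
import random

def scores(weights, x):
    """Class scores of a toy linear classifier on a flattened skeleton sequence."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]

def predict(weights, x):
    s = scores(weights, x)
    return max(range(len(s)), key=lambda c: s[c])

def margin_gradient(weights, x, label):
    """Gradient of (best wrong-class score - true-class score) for a linear model."""
    s = scores(weights, x)
    wrong = max((c for c in range(len(s)) if c != label), key=lambda c: s[c])
    return [ww - wt for ww, wt in zip(weights[wrong], weights[label])]

def averaged_gradient(surrogate, x, label, n_samples, noise_std, rng):
    """Average the attack gradient over weights sampled around the frozen surrogate."""
    total = [0.0] * len(x)
    for _ in range(n_samples):
        sampled = [[w + rng.gauss(0.0, noise_std) for w in row] for row in surrogate]
        grad = margin_gradient(sampled, x, label)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]

def craft_on_surrogate(surrogate, x, label, epsilon, rng, n_samples=8, noise_std=0.1):
    """One sign step along the averaged gradient; the victim is never queried."""
    grad = averaged_gradient(surrogate, x, label, n_samples, noise_std, rng)
    return [x_i + epsilon * ((g > 0) - (g < 0)) for x_i, g in zip(x, grad)]

rng = random.Random(0)
dim, n_classes = 4 * 5 * 3, 3  # frames * joints * xyz, hypothetical sizes
surrogate = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_classes)]
# Hypothetical victim: a different model that happens to be related to the surrogate.
victim = [[w + rng.gauss(0.0, 0.3) for w in row] for row in surrogate]
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
label = predict(victim, x)

x_adv = craft_on_surrogate(surrogate, x, label, epsilon=0.05, rng=rng)
transferred = predict(victim, x_adv) != label  # did the attack carry over?
```

The design point is that the perturbation is computed from the surrogate alone and only afterwards tested on the victim. Whether `transferred` is true for a given sample depends on how similar the two models are and on the perturbation budget.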

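The authors evaluate TASAR against several state-of-the-art skeletal action recognition models on benchmark datasets, listed here as NTU-RGBD, NTU-RGBD120, and Kinetics (these are datasets, not models). According to the abstract, the full benchmark comprises 7 S-HAR models, 10 attack methods, 3 S-HAR datasets, and 2 defense models. The results show that TASAR can achieve a high attack success rate, causing the target models to misclassify actions in many cases.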
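
An attack success rate of this kind can be computed in a few lines. The sketch below assumes one common definition: among samples the victim classifies correctly on clean input, the fraction it misclassifies after perturbation. The paper may define the metric differently, and the toy victim and data are hypothetical.

```python
from typing import Callable, Sequence

def transfer_success_rate(
    victim_predict: Callable[[Sequence[float]], int],
    clean: Sequence[Sequence[float]],
    adversarial: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> float:
    """Fraction of correctly classified clean samples that the attack flips."""
    eligible = 0
    fooled = 0
    for x, x_adv, y in zip(clean, adversarial, labels):
        if victim_predict(x) != y:
            continue  # skip samples the victim already gets wrong
        eligible += 1
        if victim_predict(x_adv) != y:
            fooled += 1
    return fooled / eligible if eligible else 0.0

def toy_victim(x: Sequence[float]) -> int:
    """Hypothetical victim: predicts class 1 when the mean coordinate is positive."""
    return int(sum(x) / len(x) > 0)

clean = [[0.2, 0.1], [-0.3, -0.1], [0.4, 0.4], [-0.2, 0.5]]
labels = [1, 0, 1, 0]
adversarial = [[-0.2, 0.1], [-0.3, -0.1], [0.4, 0.3], [-0.2, 0.5]]

rate = transfer_success_rate(toy_victim, clean, adversarial, labels)
```

Here the fourth sample is excluded because the toy victim misclassifies it even without an attack, and one of the remaining three is flipped, so `rate` is one third.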

Critical Analysis

The paper provides a thorough evaluation of the TASAR attack and its effectiveness against various skeletal action recognition models. Although the benchmark includes 2 defense models, the abstract does not propose new countermeasures against transfer-based attacks of this kind.

Additionally, the paper does not explore the practical implications of this type of attack in real-world scenarios. It would be valuable to understand the feasibility and impact of TASAR in settings where skeletal action recognition is deployed, such as surveillance, smart homes, or human-robot interaction.

Further research could investigate more robust model architectures or training techniques that are less vulnerable to transfer-based adversarial attacks like TASAR. Exploring the trade-offs between model performance and adversarial robustness would also be an important area for future work.

Conclusion

This paper introduces TASAR, a transfer-based attack that can fool skeletal action recognition models. The authors demonstrate the effectiveness of TASAR against several state-of-the-art systems, highlighting a vulnerability in this type of AI technology.

The findings of this research underscore the need for developing more robust and secure skeletal action recognition models that can better withstand adversarial attacks. As these systems become more prevalent in various applications, ensuring their reliability and security will be crucial for their successful deployment and widespread adoption.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

TASAR: Transferable Attack on Skeletal Action Recognition

Yunfeng Diao, Baiqi Wu, Ruixuan Zhang, Ajian Liu, Xingxing Wei, Meng Wang, He Wang

Skeletal sequences, as well-structured representations of human behaviors, are crucial in Human Activity Recognition (HAR). The transferability of adversarial skeletal sequences enables attacks in real-world HAR scenarios, such as autonomous driving, intelligent surveillance, and human-computer interactions. However, existing Skeleton-based HAR (S-HAR) attacks exhibit weak adversarial transferability and, therefore, cannot be considered true transfer-based S-HAR attacks. More importantly, the reason for this failure remains unclear. In this paper, we study this phenomenon through the lens of loss surface, and find that its sharpness contributes to the poor transferability in S-HAR. Inspired by this observation, we assume and empirically validate that smoothening the rugged loss landscape could potentially improve adversarial transferability in S-HAR. To this end, we propose the first Transfer-based Attack on Skeletal Action Recognition, TASAR. TASAR explores the smoothed model posterior without re-training the pre-trained surrogates, which is achieved by a new post-train Dual Bayesian optimization strategy. Furthermore, unlike previous transfer-based attacks that treat each frame independently and overlook temporal coherence within sequences, TASAR incorporates motion dynamics into the Bayesian attack gradient, effectively disrupting the spatial-temporal coherence of S-HARs. To exhaustively evaluate the effectiveness of existing methods and our method, we build the first large-scale robust S-HAR benchmark, comprising 7 S-HAR models, 10 attack methods, 3 S-HAR datasets and 2 defense models. Extensive results demonstrate the superiority of TASAR. Our benchmark enables easy comparisons for future studies, with the code available in the supplementary material.

9/5/2024

Boosting Adversarial Transferability for Skeleton-based Action Recognition via Exploring the Model Posterior Space

Yunfeng Diao, Baiqi Wu, Ruixuan Zhang, Xun Yang, Meng Wang, He Wang

Skeletal motion plays a pivotal role in human activity recognition (HAR). Recently, attack methods have been proposed to identify the universal vulnerability of skeleton-based HAR (S-HAR). However, the research of adversarial transferability on S-HAR is largely missing. More importantly, existing attacks all struggle in transfer across unknown S-HAR models. We observed that the key reason is that the loss landscape of the action recognizers is rugged and sharp. Given the established correlation in prior studies [qin2022boosting, wu2020towards] between loss landscape and adversarial transferability, we assume and empirically validate that smoothing the loss landscape could potentially improve adversarial transferability on S-HAR. This is achieved by proposing a new post-train Dual Bayesian strategy, which can effectively explore the model posterior space for a collection of surrogates without the need for re-training. Furthermore, to craft adversarial examples along the motion manifold, we incorporate the attack gradient with information of the motion dynamics in a Bayesian manner. Evaluated on benchmark datasets, e.g. HDM05 and NTU 60, the average transfer success rate can reach as high as 35.9% and 45.5% respectively. In comparison, current state-of-the-art skeletal attacks achieve only 3.6% and 9.8%. The high adversarial transferability remains consistent across various surrogate, victim, and even defense models. Through a comprehensive analysis of the results, we provide insights on what surrogates are more likely to exhibit transferability, to shed light on future research.

9/6/2024

Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack

Yunfeng Diao, He Wang, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg, Meng Wang

Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g. self-driving cars, where safety and lives are at stake. Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks. However, the proposed attacks require full knowledge of the attacked classifier, which is overly restrictive. In this paper, we show such threats indeed exist, even when the attacker only has access to the input/output of the model. To this end, we propose the very first black-box adversarial attack approach in skeleton-based HAR called BASAR. BASAR explores the interplay between the classification boundary and the natural motion manifold. To the best of our knowledge, this is the first time the data manifold is introduced in adversarial attacks on time series. Via BASAR, we find on-manifold adversarial samples are extremely deceitful and rather common in skeletal motions, in contrast to the common belief that adversarial samples only exist off-manifold. Through exhaustive evaluation, we show that BASAR can deliver successful attacks across classifiers, datasets, and attack modes. By attack, BASAR helps identify the potential causes of the model vulnerability and provides insights on possible improvements. Finally, to mitigate the newly identified threat, we propose a new adversarial training approach by leveraging the sophisticated distributions of on/off-manifold adversarial samples, called mixed manifold-based adversarial training (MMAT). MMAT can successfully help defend against adversarial attacks without compromising classification accuracy.

5/7/2024

Towards Physical World Backdoor Attacks against Skeleton Action Recognition

Qichen Zheng, Yi Yu, Siyuan Yang, Jun Liu, Kwok-Yan Lam, Alex Kot

Skeleton Action Recognition (SAR) has attracted significant interest for its efficient representation of the human skeletal structure. Despite its advancements, recent studies have raised security concerns in SAR models, particularly their vulnerability to adversarial attacks. However, such strategies are limited to digital scenarios and ineffective in physical attacks, limiting their real-world applicability. To investigate the vulnerabilities of SAR in the physical world, we introduce the Physical Skeleton Backdoor Attacks (PSBA), the first exploration of physical backdoor attacks against SAR. Considering the practicalities of physical execution, we introduce a novel trigger implantation method that integrates infrequent and imperceivable actions as triggers into the original skeleton data. By incorporating a minimal amount of this manipulated data into the training set, PSBA causes the system to misclassify any skeleton sequence into the target class when the trigger action is present. We examine the resilience of PSBA in both poisoned and clean-label scenarios, demonstrating its efficacy across a range of datasets, poisoning ratios, and model architectures. Additionally, we introduce a trigger-enhancing strategy to strengthen attack performance in the clean-label setting. The robustness of PSBA is tested against three distinct backdoor defenses, and the stealthiness of PSBA is evaluated using two quantitative metrics. Furthermore, by employing a Kinect V2 camera, we compile a dataset of human actions from the real world to mimic physical attack situations, with our findings confirming the effectiveness of our proposed attacks. Our project website can be found at https://qichenzheng.github.io/psba-website.

8/19/2024