A General Black-box Adversarial Attack on Graph-based Fake News Detectors

Read original: arXiv:2404.15744 - Published 4/29/2024 by Peican Zhu, Zechen Pan, Yang Liu, Jiwei Tian, Keke Tang, Zhen Wang

Overview

This research paper presents a general black-box adversarial attack against graph-based fake news detectors.
The proposed attack can effectively fool these detectors without requiring access to their underlying architecture or training data.
The authors demonstrate the effectiveness of their attack on several state-of-the-art fake news detection models.

Plain English Explanation

Graph-based fake news detectors are machine learning models that analyze the relationships between pieces of information, such as news articles, social media posts, and the users who share them, to identify potential misinformation. These models have become increasingly important for combating the spread of false or misleading content online.

However, researchers have found that these models can be vulnerable to adversarial attacks, in which small, carefully crafted changes to the input data cause the model to misclassify content, for example by labeling a fake story as real.

In this paper, the authors present an adversarial attack that can fool graph-based fake news detectors without any knowledge of the model's internal architecture, its training data, or the way it builds its graph. This "black-box" attack works by changing the connections in the graph from the outside: attacker-controlled accounts share posts, much as a real attacker might manipulate the relationships between social media accounts.

The authors demonstrate the effectiveness of their attack on several state-of-the-art fake news detection models, showing that it can significantly reduce the models' ability to correctly identify misinformation. This is an important finding, as it highlights the need for more robust and secure graph-based detection systems that can withstand these types of adversarial attacks.

Technical Explanation

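The authors propose a general black-box adversarial attack framework against graph-based fake news detectors, named General Attack via Fake Social Interaction (GAFSI). The key idea is to manipulate the connections (edges) in the input graph rather than the node features, and to do so indirectly: because sharing is an important social interaction from which GNN-based detectors construct their graphs, the attack simulates sharing behavior instead of editing an adjacency matrix.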
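Detectors can build their graphs from the same sharing records in different ways, so in practice a new edge means a new sharing record. The sketch below illustrates this with two construction rules that are assumptions chosen for illustration, not the constructions used by any particular detector; the names `news_user_edges` and `news_news_edges` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Record = Tuple[str, str]  # (user id, news id): "this user shared this news item"


def news_user_edges(records: Iterable[Record]) -> Set[Tuple[str, str]]:
    """Assumed construction A: a bipartite graph with one (news, user) edge per share."""
    return {(news, user) for user, news in records}


def news_news_edges(records: Iterable[Record]) -> Set[Tuple[str, str]]:
    """Assumed construction B: link two news items when the same user shared both."""
    by_user: Dict[str, Set[str]] = {}
    for user, news in records:
        by_user.setdefault(user, set()).add(news)
    edges: Set[Tuple[str, str]] = set()
    for items in by_user.values():
        # Sort so each undirected edge has one canonical orientation.
        edges.update(combinations(sorted(items), 2))
    return edges


# Toy social context, then the same context with one injected sharing record.
records = [("u1", "n1"), ("u2", "n1"), ("u2", "n2"), ("u3", "n3")]
injected = records + [("u3", "n1")]  # u3 now also shares n1

for build in (news_user_edges, news_news_edges):
    new_edges = build(injected) - build(records)
    print(build.__name__, "gains", sorted(new_edges))
```

In this toy case a single injected sharing record adds an edge under both constructions, which is the intuition behind acting on the social context rather than on one specific adjacency matrix.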

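The attack operates in a black-box setting: the attacker has no access to the model's internal architecture or training data and, as the paper's abstract stresses, does not know how the detector constructs its graph. Classical attacks that require a specific adjacency matrix are therefore unrealistic, and the attacker is limited to what can be done from outside the detector, namely changing the input it sees and observing the predictions it returns.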

According to the paper's abstract, the attack proceeds as follows:

Select: A fraudster selection module chooses the engaged users who will carry out the attack, leveraging both local and global information.
Inject: A post injection module guides the selected users to send posts, which creates new shared relations.
Propagate: The resulting sharing records are added to the social context, so a detector that builds its graph from that context takes in the perturbation regardless of the graph structure it uses.
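An attack of this shape can be sketched in a few lines of Python. Everything here is a hypothetical illustration rather than the authors' implementation: `make_toy_detector` is a stand-in black-box detector, ranking candidates by `user_fake_rate` is an assumed selection rule, and `inject_shares` queries the detector after each injected share only so that the sketch knows when to stop.

```python
from typing import Callable, Dict, List, Set, Tuple

Shares = Dict[str, Set[str]]  # news id -> ids of users who shared it
Detector = Callable[[Shares, str], int]  # returns 1 (fake) or 0 (real)


def make_toy_detector(user_fake_rate: Dict[str, float]) -> Detector:
    """Hypothetical stand-in for a black-box detector.

    It labels a news item fake when its sharers have, on average,
    mostly shared fake news before. Real detectors are far richer.
    """
    def predict(shares: Shares, news_id: str) -> int:
        sharers = shares.get(news_id, set())
        if not sharers:
            return 0
        mean_rate = sum(user_fake_rate[u] for u in sharers) / len(sharers)
        return 1 if mean_rate >= 0.5 else 0
    return predict


def inject_shares(shares: Shares, target: str, candidates: List[str],
                  query: Detector, budget: int) -> Tuple[Shares, List[str], bool]:
    """Add fake sharing records to `target` until the label flips or the budget is spent."""
    perturbed = {news: set(users) for news, users in shares.items()}  # leave input intact
    perturbed.setdefault(target, set())
    original_label = query(perturbed, target)
    added: List[str] = []
    for user in candidates:
        if len(added) >= budget:
            break
        if user in perturbed[target]:
            continue  # this user already shares the target
        perturbed[target].add(user)
        added.append(user)
        if query(perturbed, target) != original_label:
            return perturbed, added, True
    return perturbed, added, False


# Toy data (assumed): fraction of each user's past shares that were fake.
user_fake_rate = {"u1": 0.9, "u2": 0.8, "u3": 0.1, "u4": 0.0, "u5": 0.2}
shares = {"n_fake": {"u1", "u2"}}
detector = make_toy_detector(user_fake_rate)

# Assumed selection rule: prefer users with the cleanest sharing history.
candidates = sorted((u for u in user_fake_rate if u not in shares["n_fake"]),
                    key=user_fake_rate.get)
_, added, flipped = inject_shares(shares, "n_fake", candidates, detector, budget=3)
print(added, flipped)  # ['u4', 'u3'] True
```

The budget caps how many fake interactions the attacker is willing to create, which reflects the practical cost of controlling accounts.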

The authors report experiments on empirical datasets showing that the attack is effective against GNN-based fake news detectors built on different graph structures. See the original paper for the specific detectors, datasets, and success rates.

Critical Analysis

The authors acknowledge two limitations of their approach. First, the attack assumes the attacker has the ability to modify the connections between nodes in the input graph, which may not always be feasible in real-world scenarios. Second, the authors do not consider the case where the target model employs defense mechanisms to detect and mitigate adversarial attacks.

Further research is needed to explore the robustness of graph-based fake news detectors to more sophisticated adversarial attacks, as well as the development of effective countermeasures to protect these models from such threats. It is crucial that the research community continues to study the vulnerabilities of AI-powered systems to ensure they are reliable and trustworthy for real-world applications.

Conclusion

This paper presents a novel black-box adversarial attack against graph-based fake news detectors. The attack strategically manipulates the connections between nodes in the input graph, and the authors show that it fools detectors built on different graph structures.

The findings of this research highlight the need for more robust and secure graph-based detection systems that can withstand adversarial attacks. As the use of AI-powered tools for combating misinformation continues to grow, it is essential that these systems are designed with security and resilience in mind to ensure they remain effective in the face of evolving threats.

Related Papers

A General Black-box Adversarial Attack on Graph-based Fake News Detectors

Peican Zhu, Zechen Pan, Yang Liu, Jiwei Tian, Keke Tang, Zhen Wang

Graph Neural Network (GNN)-based fake news detectors apply various methods to construct graphs, aiming to learn distinctive news embeddings for classification. Since the construction details are unknown for attackers in a black-box scenario, it is unrealistic to conduct the classical adversarial attacks that require a specific adjacency matrix. In this paper, we propose the first general black-box adversarial attack framework, i.e., General Attack via Fake Social Interaction (GAFSI), against detectors based on different graph structures. Specifically, as sharing is an important social interaction for GNN-based fake news detectors to construct the graph, we simulate sharing behaviors to fool the detectors. Firstly, we propose a fraudster selection module to select engaged users leveraging local and global information. In addition, a post injection module guides the selected users to create shared relations by sending posts. The sharing records will be added to the social context, leading to a general attack against different detectors. Experimental results on empirical datasets demonstrate the effectiveness of GAFSI.

4/29/2024

Multi-agent Attacks for Black-box Social Recommendations

Shijie Wang, Wenqi Fan, Xiao-yong Wei, Xiaowei Mei, Shanru Lin, Qing Li

The rise of online social networks has facilitated the evolution of social recommender systems, which incorporate social relations to enhance users' decision-making process. With the great success of Graph Neural Networks (GNNs) in learning node representations, GNN-based social recommendations have been widely studied to model user-item interactions and user-user social relations simultaneously. Despite their great successes, recent studies have shown that these advanced recommender systems are highly vulnerable to adversarial attacks, in which attackers can inject well-designed fake user profiles to disrupt recommendation performances. While most existing studies mainly focus on targeted attacks to promote target items on vanilla recommender systems, untargeted attacks to degrade the overall prediction performance are less explored on social recommendations under a black-box scenario. To perform untargeted attacks on social recommender systems, attackers can construct malicious social relationships for fake users to enhance the attack performance. However, the coordination of social relations and item profiles is challenging for attacking black-box social recommendations. To address this limitation, we first conduct several preliminary studies to demonstrate the effectiveness of cross-community connections and cold-start items in degrading recommendations performance. Specifically, we propose a novel framework MultiAttack based on multi-agent reinforcement learning to coordinate the generation of cold-start item profiles and cross-community social relations for conducting untargeted attacks on black-box social recommendations. Comprehensive experiments on various real-world datasets demonstrate the effectiveness of our proposed attacking framework under the black-box setting.

9/17/2024

Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks

Yunfeng Diao, Naixin Zhai, Changtao Miao, Xun Yang, Meng Wang

Recent advancements in image synthesis, particularly with the advent of GAN and Diffusion models, have amplified public concerns regarding the dissemination of disinformation. To address such concerns, numerous AI-generated Image (AIGI) Detectors have been proposed and achieved promising performance in identifying fake images. However, there still lacks a systematic understanding of the adversarial robustness of these AIGI detectors. In this paper, we examine the vulnerability of state-of-the-art AIGI detectors against adversarial attack under white-box and black-box settings, which has been rarely investigated so far. For the task of AIGI detection, we propose a new attack containing two main parts. First, inspired by the obvious difference between real images and fake images in the frequency domain, we add perturbations under the frequency domain to push the image away from its original frequency distribution. Second, we explore the full posterior distribution of the surrogate model to further narrow this gap between heterogeneous models, e.g. transferring adversarial examples across CNNs and ViTs. This is achieved by introducing a novel post-train Bayesian strategy that turns a single surrogate into a Bayesian one, capable of simulating diverse victim models using one pre-trained surrogate, without the need for re-training. We name our method as frequency-based post-train Bayesian attack, or FPBA. Through FPBA, we show that adversarial attack is truly a real threat to AIGI detectors, because FPBA can deliver successful black-box attacks across models, generators, defense methods, and even evade cross-generator detection, which is a crucial real-world detection scenario.

7/31/2024

DAAD: Dynamic Analysis and Adaptive Discriminator for Fake News Detection

Xinqi Su, Yawen Cui, Ajian Liu, Xun Lin, Yuhao Wang, Haochen Liang, Wenhui Li, Zitong Yu

In current web environment, fake news spreads rapidly across online social networks, posing serious threats to society. Existing multimodal fake news detection (MFND) methods can be classified into knowledge-based and semantic-based approaches. However, these methods are overly dependent on human expertise and feedback, lacking flexibility. To address this challenge, we propose a Dynamic Analysis and Adaptive Discriminator (DAAD) approach for fake news detection. For knowledge-based methods, we introduce the Monte Carlo Tree Search (MCTS) algorithm to leverage the self-reflective capabilities of large language models (LLMs) for prompt optimization, providing richer, domain-specific details and guidance to the LLMs, while enabling more flexible integration of LLM comment on news content. For semantic-based methods, we define four typical deceit patterns: emotional exaggeration, logical inconsistency, image manipulation, and semantic inconsistency, to reveal the mechanisms behind fake news creation. To detect these patterns, we carefully design four discriminators and expand them in depth and breadth, using the soft-routing mechanism to explore optimal detection models. Experimental results on three real-world datasets demonstrate the superiority of our approach. The code will be available at: https://github.com/SuXinqi/DAAD.

8/21/2024