MetaGAD: Meta Representation Adaptation for Few-Shot Graph Anomaly Detection

Read original: arXiv:2305.10668 - Published 8/27/2024 by Xiongxiao Xu, Kaize Ding, Canyu Chen, Kai Shu

Overview

Graph anomaly detection is an important problem in domains like financial fraud, social spam, and network intrusion
Most existing methods are unsupervised, as labeled anomalies are expensive to acquire
Unsupervised methods may identify uninteresting anomalies due to lack of prior knowledge
Limited labeled anomalies can potentially advance graph anomaly detection, but this area is relatively unexplored

Plain English Explanation

Detecting anomalies or unusual patterns in graph-structured data is a crucial task in various fields such as financial fraud detection, social media spam filtering, and network intrusion monitoring. Most existing methods for this problem are unsupervised, meaning they do not rely on labeled examples of anomalies. This is because obtaining a large number of labeled anomalies can be extremely expensive and time-consuming.

However, the anomalies identified by unsupervised methods may not always be the most relevant or interesting ones, as these methods lack prior knowledge about what constitutes an anomaly. In real-world scenarios, it is often possible to obtain a small number of labeled anomalies, and leveraging this limited information could significantly improve the performance of graph anomaly detection. Unfortunately, this area has been relatively unexplored so far.

Technical Explanation

The paper presents MetaGAD, a meta-learning-based framework that uses a small number of labeled anomalies together with a large number of unlabeled nodes to detect anomalies in graph-structured data. It targets two challenges of the few-shot setting: the irregularity of anomalies and the tendency to overfit when so few labels are available.
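To make the few-shot setting concrete, here is a minimal sketch of how such a split might be built. The function name `few_shot_split`, the toy label list, and the choice of k = 10 are illustrative assumptions, not details taken from the paper.

```python
import random

def few_shot_split(labels, k, seed=0):
    """Pick k labeled anomalies; treat every other node as unlabeled.

    labels: list of 0/1 ground-truth flags (1 = anomaly); hypothetical input.
    Returns (labeled_anomaly_ids, unlabeled_ids).
    """
    anomaly_ids = [i for i, y in enumerate(labels) if y == 1]
    if k > len(anomaly_ids):
        raise ValueError("k exceeds the number of anomalies in the graph")
    rng = random.Random(seed)
    labeled = sorted(rng.sample(anomaly_ids, k))
    labeled_set = set(labeled)
    # Unlabeled nodes include normal nodes AND the anomalies we did not pick.
    unlabeled = [i for i in range(len(labels)) if i not in labeled_set]
    return labeled, unlabeled

# Toy graph: 1000 nodes, 30 of which are anomalous (every 34th node).
labels = [1 if i % 34 == 0 else 0 for i in range(1000)]
labeled, unlabeled = few_shot_split(labels, k=10)
print(len(labeled), "labeled anomalies,", len(unlabeled), "unlabeled nodes")
```

Note that the unlabeled pool still contains hidden anomalies, which is part of what makes the setting harder than ordinary supervised classification.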

MetaGAD is formulated as a bi-level optimization problem. The outer level drives the model toward a solution that minimizes the validation loss, which is intended to improve generalization. The inner level adapts the knowledge learned from self-supervised learning to the few-shot supervised task of graph anomaly detection.
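The general shape of a bi-level scheme can be sketched with a toy example. This is not the authors' implementation: the elementwise "adapter" `s`, the logistic "detector" `w`, the synthetic Gaussian embeddings, and the first-order alternating updates (which ignore how `w` depends on `s`) are all simplifying assumptions chosen to keep the example short. The detector is fitted on the training loss and the adapter is then updated on the validation loss.

```python
import math
import random

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def logit(z, s, w):
    # Toy model: adapter s rescales each feature, detector w weights it.
    return sum(zj * sj * wj for zj, sj, wj in zip(z, s, w))

def bce(Z, y, s, w):
    # Mean binary cross-entropy, computed stably from raw logits.
    total = 0.0
    for z, t in zip(Z, y):
        x = logit(z, s, w)
        total += max(x, 0.0) + math.log1p(math.exp(-abs(x))) - t * x
    return total / len(y)

def grads(Z, y, s, w):
    # Gradients of the mean BCE with respect to s and w.
    d = len(s)
    gs, gw = [0.0] * d, [0.0] * d
    for z, t in zip(Z, y):
        err = (sigmoid(logit(z, s, w)) - t) / len(y)
        for j in range(d):
            gs[j] += err * z[j] * w[j]
            gw[j] += err * z[j] * s[j]
    return gs, gw

def make_split(rng, n_normal, n_anomaly, dim):
    # Synthetic stand-in for pretrained embeddings; last column is a bias term.
    Z, y = [], []
    for mean, count, label in ((0.0, n_normal, 0.0), (2.0, n_anomaly, 1.0)):
        for _ in range(count):
            Z.append([rng.gauss(mean, 1.0) for _ in range(dim)] + [1.0])
            y.append(label)
    return Z, y

rng = random.Random(0)
Z_tr, y_tr = make_split(rng, 50, 5, 4)     # few labeled anomalies for training
Z_val, y_val = make_split(rng, 50, 5, 4)   # held-out validation split

s = [1.0] * 5   # adapter parameters (outer level in this sketch)
w = [0.0] * 5   # detector parameters (inner level in this sketch)
initial_val = bce(Z_val, y_val, s, w)

lr = 0.1
for _ in range(200):
    # Inner step: fit the detector on the training loss.
    _, gw = grads(Z_tr, y_tr, s, w)
    w = [wj - lr * g for wj, g in zip(w, gw)]
    # Outer step: update the adapter on the validation loss (first-order).
    gs, _ = grads(Z_val, y_val, s, w)
    s = [sj - lr * g for sj, g in zip(s, gs)]

final_val = bce(Z_val, y_val, s, w)
print(f"validation loss: {initial_val:.3f} -> {final_val:.3f}")
```

Splitting the parameters this way means the quantities tuned on validation data are never fitted directly to the handful of training labels, which is the usual motivation for using a validation-driven outer objective against overfitting.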

The authors evaluate MetaGAD on six real-world datasets, including both synthetic anomalies and organic anomalies (i.e., anomalies that naturally occur in the datasets). The results demonstrate the effectiveness of MetaGAD in detecting anomalies when only a small number of labeled anomalies are available.

Critical Analysis

The paper presents a promising approach to addressing the challenging problem of graph anomaly detection with limited labeled data. By leveraging meta-learning techniques, MetaGAD is able to effectively transfer knowledge from self-supervised pretraining to the few-shot supervised learning task, overcoming the issues of irregularity and overfitting that can arise in such scenarios.

However, the paper does not provide a detailed analysis of the limitations of the proposed approach. For example, it would be interesting to understand how MetaGAD performs when the number of labeled anomalies is further reduced, or how it compares to other few-shot learning methods for graph anomaly detection. Additionally, the authors could have discussed potential real-world applications and deployment challenges of their approach.

Conclusion

This paper introduces MetaGAD, a meta-learning-based framework for graph anomaly detection with limited labeled data. By combining a small number of labeled anomalies with a large amount of unlabeled data, MetaGAD is reported to be effective at detecting both synthetic and organic anomalies across six real-world datasets.

The proposed approach has the potential to significantly impact various domains that rely on graph-structured data, such as financial fraud detection, social media spam filtering, and network intrusion monitoring. By enabling more accurate and efficient anomaly detection with limited labeled data, MetaGAD could lead to improvements in security, fraud prevention, and the overall health of these critical systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

MetaGAD: Meta Representation Adaptation for Few-Shot Graph Anomaly Detection

Xiongxiao Xu, Kaize Ding, Canyu Chen, Kai Shu

Graph anomaly detection has long been an important problem in various domains pertaining to information security such as financial fraud, social spam and network intrusion. The majority of existing methods are performed in an unsupervised manner, as labeled anomalies at a large scale are often too expensive to acquire. However, the identified anomalies may turn out to be uninteresting data instances due to the lack of prior knowledge. In real-world scenarios, it is often feasible to obtain limited labeled anomalies, which have great potential to advance graph anomaly detection. However, the work exploring limited labeled anomalies and a large amount of unlabeled nodes in graphs to detect anomalies is relatively limited. Therefore, in this paper, we study an important problem of few-shot graph anomaly detection. Nonetheless, it is challenging to fully leverage the information of few-shot anomalous nodes due to the irregularity of anomalies and the overfitting issue in few-shot learning. To tackle the above challenges, we propose a novel meta-learning based framework, MetaGAD, that learns to adapt the knowledge from self-supervised learning to few-shot supervised learning for graph anomaly detection. Specifically, we formulate the problem as a bi-level optimization, ensuring that MetaGAD converges toward minimizing the validation loss, thus enhancing the generalization capacity. The comprehensive experiments on six real-world datasets with synthetic anomalies and organic anomalies (available in the datasets) demonstrate the effectiveness of MetaGAD in detecting anomalies with few-shot anomalies. The code is available at https://github.com/XiongxiaoXu/MetaGAD.

8/27/2024

Label-based Graph Augmentation with Metapath for Graph Anomaly Detection

Hwan Kim, Junghoon Kim, Byung Suk Lee, Sungsu Lim

Graph anomaly detection has attracted considerable attention from various domains ranging from network security to finance in recent years. Because labeling is very costly, existing methods are predominantly developed in an unsupervised manner. However, the detected anomalies may turn out to be uninteresting instances due to the absence of prior knowledge about the anomalies being sought. This issue may be solved by using a few labeled anomalies as prior knowledge. In real-world scenarios, we can easily obtain a few labeled anomalies. Efficiently leveraging labeled anomalies as prior knowledge is crucial for graph anomaly detection; however, this process remains challenging due to the inherently limited number of anomalies available. To address the problem, we propose a novel approach that leverages metapaths to embed actual connectivity patterns between anomalous and normal nodes. To further efficiently exploit context information from the metapath-based anomaly subgraph, we present a new framework, Metapath-based Graph Anomaly Detection (MGAD), incorporating GCN layers in both the dual encoders and decoders to efficiently propagate context information between abnormal and normal nodes. Specifically, MGAD employs a GNN-based graph autoencoder as its backbone network. Moreover, dual encoders capture the complex interactions and metapath-based context information between labeled and unlabeled nodes both globally and locally. Through a comprehensive set of experiments conducted on seven real-world networks, this paper demonstrates the superiority of the MGAD method compared to state-of-the-art techniques. The code is available at https://github.com/missinghwan/MGAD.

4/15/2024

Generative Semi-supervised Graph Anomaly Detection

Hezhe Qiao, Qingsong Wen, Xiaoli Li, Ee-Peng Lim, Guansong Pang

This work considers a practical semi-supervised graph anomaly detection (GAD) scenario, where part of the nodes in a graph are known to be normal, in contrast to the extensively explored unsupervised setting with a fully unlabeled graph. We reveal that having access to the normal nodes, even just a small percentage of normal nodes, helps enhance the detection performance of existing unsupervised GAD methods when they are adapted to the semi-supervised setting. However, their utilization of these normal nodes is limited. In this paper, we propose a novel Generative GAD approach (namely GGAD) for the semi-supervised scenario to better exploit the normal nodes. The key idea is to generate pseudo anomaly nodes, referred to as 'outlier nodes', for providing effective negative node samples in training a discriminative one-class classifier. The main challenge here lies in the lack of ground truth information about real anomaly nodes. To address this challenge, GGAD is designed to leverage two important priors about the anomaly nodes -- asymmetric local affinity and egocentric closeness -- to generate reliable outlier nodes that assimilate anomaly nodes in both graph structure and feature representations. Comprehensive experiments on six real-world GAD datasets are performed to establish a benchmark for semi-supervised GAD and show that GGAD substantially outperforms state-of-the-art unsupervised and semi-supervised GAD methods with varying numbers of training normal nodes. Code will be made available at https://github.com/mala-lab/GGAD.

5/29/2024

From Unsupervised to Few-shot Graph Anomaly Detection: A Multi-scale Contrastive Learning Approach

Yu Zheng, Ming Jin, Yixin Liu, Lianhua Chi, Khoa T. Phan, Yi-Ping Phoebe Chen

Anomaly detection from graph data is an important data mining task in many applications such as social networks, finance, and e-commerce. Existing efforts in graph anomaly detection typically only consider the information in a single scale (view), thus inevitably limiting their capability in capturing anomalous patterns in complex graph data. To address this limitation, we propose a novel framework, graph ANomaly dEtection framework with Multi-scale cONtrastive lEarning (ANEMONE in short). By using a graph neural network as a backbone to encode the information from multiple graph scales (views), we learn better representations for nodes in a graph. In maximizing the agreements between instances at both the patch and context levels concurrently, we estimate the anomaly score of each node with a statistical anomaly estimator according to the degree of agreement from multiple perspectives. To exploit a handful of ground-truth anomalies (few-shot anomalies) that may be collected in real-life applications, we further propose an extended algorithm, ANEMONE-FS, to integrate valuable information in our method. We conduct extensive experiments under purely unsupervised settings and few-shot anomaly detection settings, and we demonstrate that the proposed method ANEMONE and its variant ANEMONE-FS consistently outperform state-of-the-art algorithms on six benchmark datasets.

8/2/2024