From Unsupervised to Few-shot Graph Anomaly Detection: A Multi-scale Contrastive Learning Approach

Read original: arXiv:2202.05525 - Published 8/2/2024 by Yu Zheng, Ming Jin, Yixin Liu, Lianhua Chi, Khoa T. Phan, Yi-Ping Phoebe Chen

Overview

Anomaly detection from graph data is an important task in various applications.
Existing methods typically consider only single-scale information, which limits their ability to capture complex anomalous patterns.
The paper proposes a novel framework, ANEMONE, that uses a graph neural network to encode information from multiple graph scales.
ANEMONE learns node representations by maximizing the agreements between instances at both the patch and context levels.
An extended algorithm, ANEMONE-FS, is proposed to integrate valuable information from a few ground-truth anomalies.
Extensive experiments show that ANEMONE and ANEMONE-FS outperform state-of-the-art algorithms on benchmark datasets.

Plain English Explanation

Anomaly detection is the process of identifying unusual or unexpected patterns in data. This is an important task in many real-world applications, such as detecting fraud in financial transactions, identifying suspicious activities in social networks, and monitoring for security threats in e-commerce.

When the data is structured as a graph, with nodes representing entities and edges representing relationships, existing anomaly detection methods have typically focused on a single scale or view of the graph. This means they may miss out on complex patterns that span multiple scales of the graph, such as anomalous subgraphs or anomalous connections between different parts of the graph.

To address this limitation, the researchers propose a new framework called ANEMONE (Graph ANomaly dEtection framework with Multi-scale cONtrastive lEarning). ANEMONE uses a graph neural network to encode information from multiple scales of the graph, such as local neighborhoods and broader contexts. By maximizing the agreement between instances at both the patch level (fine-grained, local) and the context level (the broader surrounding structure), ANEMONE can build a more comprehensive picture of the graph and identify anomalies more effectively.

Furthermore, the researchers introduce an extended algorithm, ANEMONE-FS, which can leverage a small number of known anomalies (few-shot anomalies) to further improve the anomaly detection performance. This can be useful in real-world scenarios where a limited number of ground-truth anomalies may be available.

Through extensive experiments on benchmark datasets, the researchers demonstrate that ANEMONE and ANEMONE-FS consistently outperform existing state-of-the-art anomaly detection algorithms. This suggests that their multi-scale contrastive learning approach is a promising direction for improving the detection of complex anomalies in graph-structured data.

Technical Explanation

The paper proposes a novel framework called ANEMONE (Graph ANomaly dEtection framework with Multi-scale cONtrastive lEarning) for anomaly detection in graph data. ANEMONE uses a graph neural network (GNN) as the backbone to encode information from multiple scales (views) of the graph, capturing both local and global patterns.
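
To make the idea of a GNN backbone concrete, here is a minimal sketch of a single GCN-style propagation step in plain Python. This is an illustration only: the summary says "a graph neural network" without naming the layer type, so the GCN formula, the function name gcn_layer, and the dense list-of-lists inputs are all assumptions made for this example.

```python
import math

def gcn_layer(adj, feats, weight):
    """One GCN-style step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    adj    -- n x n adjacency matrix (list of lists, 0/1 entries)
    feats  -- n x d node feature matrix
    weight -- d x h weight matrix (would be learned in practice)
    The layer type is an assumption; it is not taken from the paper.
    """
    n = len(adj)
    d = len(feats[0])
    h = len(weight[0])
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features.
    agg = [[sum(norm[i][k] * feats[k][c] for k in range(n)) for c in range(d)]
           for i in range(n)]
    # Linear map followed by ReLU.
    return [[max(0.0, sum(agg[i][c] * weight[c][j] for c in range(d)))
             for j in range(h)]
            for i in range(n)]
```

Applied to a sampled subgraph, each output row is a node embedding that mixes the node's own features with those of its neighbours. Stacking such layers, or applying them to subgraphs of different extent, is one way information from different scales could be encoded.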

The key components of ANEMONE are:

Multi-scale Representation Learning: ANEMONE leverages a GNN to learn node representations that capture information from different scales of the graph, such as local neighborhoods and broader contexts.
Contrastive Learning: ANEMONE maximizes the agreement between instances at both the patch level (fine-grained, local) and the context level (the broader surrounding structure). This contrastive learning approach helps to learn more comprehensive and robust node representations for anomaly detection.
Anomaly Scoring: Based on the learned node representations, ANEMONE estimates the anomaly score of each node using a statistical anomaly estimator, according to the degree of agreement observed from multiple perspectives. Nodes that show lower agreement with their surroundings at the patch and context levels are considered more anomalous.
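
The scoring idea in the components above can be sketched as follows. This is a simplified illustration, not the authors' implementation: a dot product followed by a sigmoid stands in for a learned discriminator, the mixing weight alpha is a hypothetical parameter, and combining rounds as "mean plus standard deviation" is an assumed form of the statistical estimator. All function names are made up for this example.

```python
import math
import statistics

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def agreement(u, v):
    """Agreement in (0, 1) between two embeddings.

    Assumption: dot product + sigmoid instead of a learned discriminator.
    """
    return sigmoid(sum(a * b for a, b in zip(u, v)))

def round_score(target, pos_patch, neg_patch, pos_context, neg_context,
                alpha=0.5):
    """Score for one sampling round; higher means more anomalous.

    A normal node should agree with its own surroundings (positive pair)
    more than with a random node's surroundings (negative pair), so the
    difference negative - positive is small or negative for normal nodes.
    """
    patch = agreement(target, neg_patch) - agreement(target, pos_patch)
    context = agreement(target, neg_context) - agreement(target, pos_context)
    return alpha * context + (1.0 - alpha) * patch

def anomaly_score(round_scores):
    """Combine several rounds (assumed estimator: mean + population std)."""
    return statistics.mean(round_scores) + statistics.pstdev(round_scores)
```

In this sketch a node whose per-round scores are both high on average and unstable across rounds receives a higher final score, which is one way to read "degree of agreement from multiple perspectives".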

Additionally, the researchers propose an extended algorithm, ANEMONE-FS, which integrates a small number of ground-truth anomalies (few-shot anomalies) to further improve the anomaly detection performance.
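
The summary does not describe how ANEMONE-FS uses the labeled anomalies, so the following is only one plausible sketch. It assumes a binary cross-entropy contrastive loss in which the target for the positive pair is flipped for nodes known to be anomalous, so that their agreement with their own surroundings is pushed down rather than up. The function names and the flipping rule are assumptions for illustration.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps)
             + (1.0 - label) * math.log(1.0 - prob + eps))

def few_shot_loss(pos_agreement, neg_agreement, is_labeled_anomaly):
    """Contrastive loss for one node in one round.

    pos_agreement -- agreement with the node's own surroundings, in (0, 1)
    neg_agreement -- agreement with a random node's surroundings, in (0, 1)
    Unlabeled nodes: the positive pair should agree (target 1).
    Labeled anomalies: the positive pair should NOT agree (target 0).
    The negative pair always has target 0.
    """
    pos_target = 0.0 if is_labeled_anomaly else 1.0
    return bce(pos_agreement, pos_target) + bce(neg_agreement, 0.0)
```

Under this assumed objective, the handful of known anomalies teaches the model what disagreement looks like, while the large unlabeled majority is still trained in the usual self-supervised way.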

The experimental evaluation on six benchmark datasets shows that ANEMONE and ANEMONE-FS consistently outperform state-of-the-art anomaly detection algorithms in both purely unsupervised and few-shot anomaly detection settings. This demonstrates the effectiveness of the proposed multi-scale contrastive learning approach in capturing complex anomalous patterns in graph data.
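
The summary does not say which metric the comparison uses. ROC-AUC is a common choice for ranking-based anomaly detection, so the sketch below shows how anomaly scores could be compared against ground-truth labels with it. The function name roc_auc and the pairwise formulation are choices made for this example, not details from the paper.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (anomaly, normal) pairs ranked correctly.

    scores -- anomaly scores, higher means more anomalous
    labels -- 1 for anomalous nodes, 0 for normal nodes
    Ties count as half a correct pair. Quadratic in the number of nodes,
    which is fine for a small illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal node")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means every anomaly is scored above every normal node, and 0.5 corresponds to random ranking.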

Critical Analysis

The paper presents a well-designed and comprehensive framework for anomaly detection in graph data. The key strengths of the proposed approach include:

Multi-scale Representation Learning: By encoding information from multiple scales of the graph, ANEMONE can capture anomalous patterns that may be missed by methods focusing on a single scale.
Contrastive Learning: The contrastive learning objective, which maximizes the agreement between instances at both the patch and context levels, is a clever way to learn more robust and informative node embeddings for anomaly detection.
Incorporation of Few-shot Anomalies: The ANEMONE-FS extension, which leverages a small number of ground-truth anomalies, is a practical and effective way to further improve the anomaly detection performance in real-world scenarios.

However, some limitations and potential areas for future research remain:

Scalability: The computational complexity of the multi-scale contrastive learning approach may limit its scalability to very large graphs. Exploring more efficient optimization techniques could address this concern.
Interpretability: The paper does not provide much insight into the interpretability of the identified anomalies. Developing methods to explain the detected anomalies and their underlying causes would be a valuable extension.
Evaluation on Real-world Datasets: While the experiments use benchmark datasets, evaluating the framework on real-world, domain-specific graph data with known anomalies would further demonstrate its practical applicability.
Robustness to Graph Perturbations: The paper does not assess the robustness of ANEMONE to common graph perturbations, such as node or edge deletions/additions. Exploring the resilience of the framework to such perturbations would be an important direction for future research.

Overall, the proposed ANEMONE framework represents a promising step forward in the field of graph anomaly detection. The combination of multi-scale representation learning and contrastive learning, along with the ability to leverage few-shot anomalies, makes this an impactful contribution that warrants further exploration and refinement.

Conclusion

This paper introduces a novel framework, ANEMONE, for anomaly detection in graph data. ANEMONE leverages a graph neural network to learn node representations that capture information from multiple scales of the graph, enabling the detection of complex anomalous patterns. By maximizing the agreement between instances at both the patch and context levels through contrastive learning, ANEMONE can learn more robust and informative node embeddings for effective anomaly identification.

The researchers also propose an extended algorithm, ANEMONE-FS, which can integrate a small number of ground-truth anomalies to further improve the detection performance. Extensive experiments on benchmark datasets demonstrate the superior performance of ANEMONE and ANEMONE-FS compared to state-of-the-art anomaly detection methods.

The key contributions of this work include the multi-scale contrastive learning approach for graph anomaly detection and the ability to leverage few-shot anomalies. These advancements have the potential to significantly impact various real-world applications, such as fraud detection, network security monitoring, and social network analysis, where identifying complex anomalies in graph-structured data is crucial.

While some limitations remain, such as scalability and interpretability, the proposed ANEMONE framework represents a promising direction for further research and development in the field of graph anomaly detection.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

From Unsupervised to Few-shot Graph Anomaly Detection: A Multi-scale Contrastive Learning Approach

Yu Zheng, Ming Jin, Yixin Liu, Lianhua Chi, Khoa T. Phan, Yi-Ping Phoebe Chen

Anomaly detection from graph data is an important data mining task in many applications such as social networks, finance, and e-commerce. Existing efforts in graph anomaly detection typically only consider the information in a single scale (view), thus inevitably limiting their capability in capturing anomalous patterns in complex graph data. To address this limitation, we propose a novel framework, graph ANomaly dEtection framework with Multi-scale cONtrastive lEarning (ANEMONE in short). By using a graph neural network as a backbone to encode the information from multiple graph scales (views), we learn better representation for nodes in a graph. In maximizing the agreements between instances at both the patch and context levels concurrently, we estimate the anomaly score of each node with a statistical anomaly estimator according to the degree of agreement from multiple perspectives. To further exploit a handful of ground-truth anomalies (few-shot anomalies) that may be collected in real-life applications, we further propose an extended algorithm, ANEMONE-FS, to integrate valuable information in our method. We conduct extensive experiments under purely unsupervised settings and few-shot anomaly detection settings, and we demonstrate that the proposed method ANEMONE and its variant ANEMONE-FS consistently outperform state-of-the-art algorithms on six benchmark datasets.

8/2/2024

MetaGAD: Meta Representation Adaptation for Few-Shot Graph Anomaly Detection

Xiongxiao Xu, Kaize Ding, Canyu Chen, Kai Shu

Graph anomaly detection has long been an important problem in various domains pertaining to information security such as financial fraud, social spam and network intrusion. The majority of existing methods are performed in an unsupervised manner, as labeled anomalies in a large scale are often too expensive to acquire. However, the identified anomalies may turn out to be uninteresting data instances due to the lack of prior knowledge. In real-world scenarios, it is often feasible to obtain limited labeled anomalies, which have great potential to advance graph anomaly detection. However, the work exploring limited labeled anomalies and a large amount of unlabeled nodes in graphs to detect anomalies is relatively limited. Therefore, in this paper, we study an important problem of few-shot graph anomaly detection. Nonetheless, it is challenging to fully leverage the information of few-shot anomalous nodes due to the irregularity of anomalies and the overfitting issue in the few-shot learning. To tackle the above challenges, we propose a novel meta-learning based framework, MetaGAD, that learns to adapt the knowledge from self-supervised learning to few-shot supervised learning for graph anomaly detection. In specific, we formulate the problem as a bi-level optimization, ensuring MetaGAD converging to minimizing the validation loss, thus enhancing the generalization capacity. The comprehensive experiments on six real-world datasets with synthetic anomalies and organic anomalies (available in the datasets) demonstrate the effectiveness of MetaGAD in detecting anomalies with few-shot anomalies. The code is available at https://github.com/XiongxiaoXu/MetaGAD.

8/27/2024

Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection

Yuanchen Bei, Sheng Zhou, Jinke Shi, Yao Ma, Haishuai Wang, Jiajun Bu

Unsupervised graph anomaly detection aims at identifying rare patterns that deviate from the majority in a graph without the aid of labels, which is important for a variety of real-world applications. Recent advances have utilized Graph Neural Networks (GNNs) to learn effective node representations by aggregating information from neighborhoods. This is motivated by the hypothesis that nodes in the graph tend to exhibit consistent behaviors with their neighborhoods. However, such consistency can be disrupted by graph anomalies in multiple ways. Most existing methods directly employ GNNs to learn representations, disregarding the negative impact of graph anomalies on GNNs, resulting in sub-optimal node representations and anomaly detection performance. While a few recent approaches have redesigned GNNs for graph anomaly detection under semi-supervised label guidance, how to address the adverse effects of graph anomalies on GNNs in unsupervised scenarios and learn effective representations for anomaly detection are still under-explored. To bridge this gap, in this paper, we propose a simple yet effective framework for Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection (G3AD). Specifically, G3AD introduces two auxiliary networks along with correlation constraints to guard the GNNs from inconsistent information encoding. Furthermore, G3AD introduces an adaptive caching module to guard the GNNs from solely reconstructing the observed data that contains anomalies. Extensive experiments demonstrate that our proposed G3AD can outperform seventeen state-of-the-art methods on both synthetic and real-world datasets.

4/26/2024

AnomalyLLM: Few-shot Anomaly Edge Detection for Dynamic Graphs using Large Language Models

Shuo Liu, Di Yao, Lanting Fang, Zhetao Li, Wenbin Li, Kaiyu Feng, XiaoWen Ji, Jingping Bi

Detecting anomaly edges for dynamic graphs aims to identify edges significantly deviating from the normal pattern and can be applied in various domains, such as cybersecurity, financial transactions and AIOps. With the evolving of time, the types of anomaly edges are emerging and the labeled anomaly samples are few for each type. Current methods are either designed to detect randomly inserted edges or require sufficient labeled data for model training, which harms their applicability for real-world applications. In this paper, we study this problem by cooperating with the rich knowledge encoded in large language models(LLMs) and propose a method, namely AnomalyLLM. To align the dynamic graph with LLMs, AnomalyLLM pre-trains a dynamic-aware encoder to generate the representations of edges and reprograms the edges using the prototypes of word embeddings. Along with the encoder, we design an in-context learning framework that integrates the information of a few labeled samples to achieve few-shot anomaly detection. Experiments on four datasets reveal that AnomalyLLM can not only significantly improve the performance of few-shot anomaly detection, but also achieve superior results on new anomalies without any update of model parameters.

8/29/2024