Revisiting Graph-Based Fraud Detection in Sight of Heterophily and Spectrum

Read original: arXiv:2312.06441 - Published 7/9/2024 by Fan Xu, Nan Wang, Hao Wu, Xuezhi Wen, Xibin Zhao, Hai Wan

Overview

This paper revisits the problem of graph-based fraud detection, focusing on the challenges of heterophily (where connected nodes have different labels) and spectral properties of the graph.
The authors propose SEC-GFD, a semi-supervised GNN-based fraud detector that pairs a hybrid filtering module, which tackles heterophily from the spectral domain, with a local environmental constraint module, which makes fuller use of the limited node label information.
On four real-world fraud detection datasets, SEC-GFD outperforms other competitive graph-based fraud detectors.

Plain English Explanation

The paper looks at how to detect fraudulent activity using network information, like who is connected to whom. This is a common way to find fraud, but it has some challenges.

One big challenge is heterophily, which means that connected people often have different behaviors or characteristics. In a fraud network, fraudsters might actually be connected to non-fraudsters, making it harder to identify them just based on their connections.

Another challenge is the graph spectrum, which is a way of looking at the underlying structure of the network. The authors show that understanding this spectrum can help improve fraud detection, but current methods don't take it into account well.

To address these issues, the paper introduces new techniques for using graph neural networks to detect fraud. This includes architectures that are better suited to heterophilic networks, as well as ways to leverage the graph spectrum to improve performance.

The goal is to make graph-based fraud detection more robust and effective, even in situations where the network connections don't neatly match the fraudulent behavior.

Technical Explanation

The paper focuses on two key challenges in graph-based fraud detection:

Heterophily: In many real-world fraud networks, fraudsters are often connected to non-fraudsters, violating the common assumption of homophily (where connected nodes tend to have the same label). This makes it harder for standard GNN models to effectively identify fraudulent nodes.
Graph Spectrum: The spectral properties of the graph, captured by its eigenvalues and eigenvectors, can provide valuable information for fraud detection. However, existing GNN methods do not fully leverage this graph spectrum.
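
Both challenges can be made concrete on a toy graph. The sketch below is illustrative only and is not taken from the paper: the function names, the two six-node graphs, and the cutoff of 1.0 between "low" and "high" frequencies are all assumptions. It computes the edge homophily ratio (the fraction of edges joining same-label nodes) and the share of the label signal's spectral energy that falls on the high-frequency eigenvalues of the symmetric normalized Laplacian.

```python
import numpy as np

def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    edges = np.asarray(edges)
    labels = np.asarray(labels)
    same = labels[edges[:, 0]] == labels[edges[:, 1]]
    return float(same.mean())

def normalized_laplacian(edges, num_nodes):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.zeros((num_nodes, num_nodes))
    for u, v in edges:
        adj[u, v] = adj[v, u] = 1.0
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros(num_nodes)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(num_nodes) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def high_frequency_energy(edges, labels, threshold=1.0):
    """Share of the label signal's energy at eigenvalues above `threshold`.

    The threshold of 1.0 (the midpoint of the [0, 2] spectrum) is an
    arbitrary choice made for this illustration.
    """
    signal = np.asarray(labels, dtype=float)
    lap = normalized_laplacian(edges, len(signal))
    eigvals, eigvecs = np.linalg.eigh(lap)
    coeffs = eigvecs.T @ signal          # graph Fourier transform of the labels
    energy = coeffs ** 2
    return float(energy[eigvals > threshold].sum() / energy.sum())

# Toy data (hypothetical): label 1 marks a fraudulent node.
labels = np.array([0, 0, 0, 1, 1, 1])
# Two same-label triangles joined by a single bridge edge.
homophilic_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
# A ring in which every edge joins a benign node to a fraudulent one.
heterophilic_edges = [(0, 3), (3, 2), (2, 5), (5, 1), (1, 4), (4, 0)]

for name, edges in [("homophilic", homophilic_edges),
                    ("heterophilic", heterophilic_edges)]:
    print(name, edge_homophily(edges, labels),
          high_frequency_energy(edges, labels))
```

On these toy graphs the heterophilic ring has an edge homophily of zero and places a larger share of its label energy on high frequencies than the two-triangle graph, which is the kind of relationship between spectrum and heterophily that the paper builds on.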

To address these challenges, the authors propose SEC-GFD, a semi-supervised GNN-based fraud detector built from two modules:

Hybrid Filtering Module: Working in the spectral domain, this module divides the graph spectrum into several mixed-frequency bands, guided by the correlation between the spectral energy distribution and heterophily. By the authors' own account, it solves the heterophily problem to a certain extent.
Local Environmental Constraint Module: This module is designed to make full use of the node label information, which existing models underuse because of heterophily and class imbalance.
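
To make spectral filtering more concrete, the sketch below splits the spectrum of the normalized Laplacian into a few frequency bands and filters the node features band by band. It is a generic illustration and not the authors' module: the function names, the band boundaries, and the use of a full eigendecomposition are assumptions chosen for clarity. Practical spectral GNNs usually approximate such filters with polynomials of the Laplacian rather than computing eigenvectors.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def band_filter_features(adj, features, band_edges=(0.0, 0.7, 1.4, 2.0)):
    """Filter node features with one ideal band-pass filter per band.

    `band_edges` are hypothetical cut points on the [0, 2] spectrum.
    Returns the per-band outputs concatenated along the feature axis,
    so the result has shape (num_nodes, num_features * num_bands).
    """
    features = np.asarray(features, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(normalized_laplacian(adj))
    eigvals = np.clip(eigvals, 0.0, 2.0)  # guard against round-off at the ends
    num_bands = len(band_edges) - 1
    outputs = []
    for i in range(num_bands):
        low, high = band_edges[i], band_edges[i + 1]
        if i == num_bands - 1:
            mask = (eigvals >= low) & (eigvals <= high)  # last band is closed
        else:
            mask = (eigvals >= low) & (eigvals < high)
        basis = eigvecs[:, mask]
        # Project the features onto the eigenvectors of this band only.
        outputs.append(basis @ (basis.T @ features))
    return np.concatenate(outputs, axis=1)

# Toy example (hypothetical): a 4-node path graph with 2 features per node.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.5, 2.0]])
banded = band_filter_features(adj, X)
print(banded.shape)
```

Because the bands partition the whole spectrum, summing the per-band outputs recovers the original features. A downstream classifier can then weight low-frequency and high-frequency components differently instead of relying on a single low-pass view of the graph.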

In experiments on four real-world fraud detection datasets, the authors report that SEC-GFD outperforms other competitive graph-based fraud detectors.

Critical Analysis

The paper makes a valuable contribution by addressing important limitations in existing graph-based fraud detection methods. The focus on heterophily and the graph spectrum is particularly insightful, as these factors can significantly impact the performance of standard GNN models.

However, the authors acknowledge that their proposed solutions, while effective, may not fully solve these challenges. Heterophily and complex spectral properties can still pose difficulties, and the generalization of the techniques to different fraud domains or graph structures may require further investigation.

Additionally, the paper does not delve into the potential biases or fairness implications of the proposed methods. As GNNs are applied to high-stakes domains like fraud detection, it is crucial to carefully examine these aspects and ensure the techniques do not exacerbate any existing societal biases.

Overall, the paper represents an important step forward in improving graph-based anomaly detection and provides a foundation for future research to further enhance the robustness and fairness of fraud detection systems.

Conclusion

This paper tackles the problem of graph-based fraud detection, addressing two key challenges: heterophily and the underuse of node label information. By introducing SEC-GFD, which combines a spectral hybrid filtering module with a local environmental constraint module, the authors report better detection performance than other competitive graph-based fraud detectors on four real-world datasets.

The insights and techniques presented in this work have broad implications for the field of graph-based anomaly detection, providing a path forward for developing more robust and effective systems that can better handle the complexities of real-world fraud networks. As the application of GNNs continues to grow, this research highlights the importance of carefully considering the underlying graph structure and spectral properties to ensure the reliable and fair deployment of these powerful techniques.

Related Papers

Revisiting Graph-Based Fraud Detection in Sight of Heterophily and Spectrum

Fan Xu, Nan Wang, Hao Wu, Xuezhi Wen, Xibin Zhao, Hai Wan

Graph-based fraud detection (GFD) can be regarded as a challenging semi-supervised node binary classification task. In recent years, Graph Neural Networks (GNN) have been widely applied to GFD, characterizing the anomalous possibility of a node by aggregating neighbor information. However, fraud graphs are inherently heterophilic, thus most of GNNs perform poorly due to their assumption of homophily. In addition, due to the existence of heterophily and class imbalance problem, the existing models do not fully utilize the precious node label information. To address the above issues, this paper proposes a semi-supervised GNN-based fraud detector SEC-GFD. This detector includes a hybrid filtering module and a local environmental constraint module, the two modules are utilized to solve heterophily and label utilization problem respectively. The first module starts from the perspective of the spectral domain, and solves the heterophily problem to a certain extent. Specifically, it divides the spectrum into various mixed-frequency bands based on the correlation between spectrum energy distribution and heterophily. Then in order to make full use of the node label information, a local environmental constraint module is adaptively designed. The comprehensive experimental results on four real-world fraud detection datasets denote that SEC-GFD outperforms other competitive graph-based fraud detectors. We release our code at https://github.com/Sunxkissed/SEC-GFD.

7/9/2024

Global and Local Confidence Based Fraud Detection Graph Neural Network

Jiaxun Liu, Yue Tian, Guanjun Liu

Graph Neural Networks (GNNs) are widely used in financial fraud detection due to their excellent ability on handling graph-structured financial data and modeling multilayer connections by aggregating information of neighbors. However, these GNN-based methods focus on extracting neighbor-level information but neglect a global perspective. This paper presents the concept and calculation formula of Global Confidence Degree (GCD) and thus designs GCD-based GNN (GCD-GNN) that can address the challenges of camouflage in fraudulent activities and thus can capture more global information. To obtain a precise GCD for each node, we use a multilayer perceptron to transform features and then the new features and the corresponding prototype are used to eliminate unnecessary information. The GCD of a node evaluates the typicality of the node and thus we can leverage GCD to generate attention values for message aggregation. This process is carried out through both the original GCD and its inverse, allowing us to capture both the typical neighbors with high GCD and the atypical ones with low GCD. Extensive experiments on two public datasets demonstrate that GCD-GNN outperforms state-of-the-art baselines, highlighting the effectiveness of GCD. We also design a lightweight GCD-GNN (GCD-GNN$_{light}$) that also outperforms the baselines but is slightly weaker than GCD-GNN on fraud detection performance. However, GCD-GNN$_{light}$ obviously outperforms GCD-GNN on convergence and inference speed.

8/20/2024

Shape-aware Graph Spectral Learning

Junjie Xu, Enyan Dai, Dongsheng Luo, Xiang Zhang, Suhang Wang

Spectral Graph Neural Networks (GNNs) are gaining attention for their ability to surpass the limitations of message-passing GNNs. They rely on supervision from downstream tasks to learn spectral filters that capture the graph signal's useful frequency information. However, some works empirically show that the preferred graph frequency is related to the graph homophily level. This relationship between graph frequency and graphs with homophily/heterophily has not been systematically analyzed and considered in existing spectral GNNs. To mitigate this gap, we conduct theoretical and empirical analyses revealing a positive correlation between low-frequency importance and the homophily ratio, and a negative correlation between high-frequency importance and the homophily ratio. Motivated by this, we propose shape-aware regularization on a Newton Interpolation-based spectral filter that can (i) learn an arbitrary polynomial spectral filter and (ii) incorporate prior knowledge about the desired shape of the corresponding homophily level. Comprehensive experiments demonstrate that NewtonNet can achieve graph spectral filters with desired shapes and superior performance on both homophilous and heterophilous datasets.

5/24/2024

Graph Neural Networks with Diverse Spectral Filtering

Jingwei Guo, Kaizhu Huang, Xinping Yi, Rui Zhang

Spectral Graph Neural Networks (GNNs) have achieved tremendous success in graph machine learning, with polynomial filters applied for graph convolutions, where all nodes share the identical filter weights to mine their local contexts. Despite the success, existing spectral GNNs usually fail to deal with complex networks (e.g., WWW) due to such homogeneous spectral filtering setting that ignores the regional heterogeneity as typically seen in real-world networks. To tackle this issue, we propose a novel diverse spectral filtering (DSF) framework, which automatically learns node-specific filter weights to exploit the varying local structure properly. Particularly, the diverse filter weights consist of two components -- A global one shared among all nodes, and a local one that varies along network edges to reflect node difference arising from distinct graph parts -- to balance between local and global information. As such, not only can the global graph characteristics be captured, but also the diverse local patterns can be mined with awareness of different node positions. Interestingly, we formulate a novel optimization problem to assist in learning diverse filters, which also enables us to enhance any spectral GNNs with our DSF framework. We showcase the proposed framework on three state-of-the-arts including GPR-GNN, BernNet, and JacobiConv. Extensive experiments over 10 benchmark datasets demonstrate that our framework can consistently boost model performance by up to 4.92% in node classification tasks, producing diverse filters with enhanced interpretability. Code is available at https://github.com/jingweio/DSF.

5/24/2024