Improving the interpretability of GNN predictions through conformal-based graph sparsification

2404.12356

Published 4/19/2024 by Pablo Sanchez-Martin, Kinaan Aamir Khan, Isabel Valera

🗣️

Abstract

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN emph{training} approach that jointly i) finds the most predictive subgraph by removing edges and/or nodes -- -emph{without making assumptions about the subgraph structure} -- while ii) optimizing the performance of the graph classification task. To that end, we rely on reinforcement learning to solve the resulting bi-level optimization with a reward function based on conformal predictions to account for the current in-training uncertainty of the classifier. Our empirical results on nine different graph classification datasets show that our method competes in performance with baselines while relying on significantly sparser subgraphs, leading to more interpretable GNN-based predictions.

Get summaries of the top AI research delivered straight to your inbox:

Overview

Proposes a new training approach for Graph Neural Networks (GNNs) that jointly finds the most predictive subgraph and optimizes graph classification performance
Unlike previous methods, this approach does not make assumptions about the structure of the subgraph
Relies on reinforcement learning and conformal predictions to handle the uncertainty of the classifier during training

Plain English Explanation

Graph Neural Networks (GNNs) are a powerful type of machine learning model that can handle data in the form of graphs, which are made up of nodes (like individual data points) and edges (the connections between them). GNNs have been very successful at solving tasks like classifying entire graphs, but the way they work can be hard to understand.

Most GNN models take information from all the nodes and edges in a graph, even if some of them aren't really relevant to the task at hand. This can make it difficult to interpret why the model made a particular prediction. In contrast, the new approach proposed in this paper tries to find the most important parts of the graph - the "predictive subgraph" - while also optimizing the model's performance on the graph classification task.

The key idea is to use reinforcement learning, a type of machine learning where the model learns by trial and error, to simultaneously discover the predictive subgraph and train the GNN. The model is rewarded when it finds a sparser subgraph that still performs well on the task. This helps make the model's decisions more interpretable, since we can see which parts of the graph it's focusing on.

The paper shows that this new approach can match the performance of standard GNN models while using much smaller and more interpretable subgraphs. This could be very useful in applications where we need to understand how a model is making its decisions, like in medical diagnosis or analyzing social networks.

Technical Explanation

The paper proposes a novel GNN training approach that jointly learns to find the most predictive subgraph and optimize the graph classification task. Unlike previous methods that make assumptions about the subgraph structure, this approach uses reinforcement learning to discover the subgraph without any such assumptions.

The key components are:

A subgraph selector module that uses reinforcement learning to decide which edges and nodes to remove from the input graph.
A GNN classifier that takes the selected subgraph as input and performs the graph classification task.
A reward function based on conformal predictions that accounts for the uncertainty of the classifier during training.

The subgraph selector and GNN classifier are trained in a bi-level optimization, where the selector tries to find the sparsest subgraph that maintains good classification performance. This encourages the model to focus on the most relevant parts of the graph, leading to more interpretable predictions.

The experiments on nine graph classification datasets show that this approach can match the performance of standard GNN baselines while using significantly sparser subgraphs. This suggests the method is effective at identifying the most predictive parts of the graph, which could be valuable in applications where interpretability is important.

Critical Analysis

The paper presents a compelling approach to improving the interpretability of GNN models, but there are a few potential limitations and areas for further research:

The reliance on reinforcement learning may make the training process more unstable and slower to converge compared to standard GNN training. The authors could explore ways to stabilize the training, such as using reinforcement learning enhancements.
The paper only evaluates the method on static graph datasets, but many real-world graphs are evolving over time. It would be interesting to see how the approach performs on dynamic graph data.
While the sparser subgraphs produced by the method are more interpretable, the paper does not provide a detailed analysis of the discovered subgraphs or discuss their potential insights. Future work could explore the qualitative properties of the learned subgraphs and how they relate to the underlying graph structure and classification task.

Overall, the paper presents a promising direction for improving the interpretability of GNN models, and the proposed approach could have significant practical implications in domains where explainability is crucial.

Conclusion

This paper introduces a new training approach for Graph Neural Networks that can jointly learn to find the most predictive subgraph of the input graph while optimizing the graph classification task. Unlike previous methods, this approach does not make assumptions about the structure of the subgraph, instead using reinforcement learning to discover it.

The experiments show that this method can match the performance of standard GNN baselines while relying on significantly sparser subgraphs, leading to more interpretable predictions. This could be very useful in applications where understanding the model's decision-making process is important, such as medical diagnosis or social network analysis.

The paper also highlights some potential limitations, such as the instability of reinforcement learning training, and suggests directions for future research, such as exploring the method's performance on dynamic graph data. Overall, this work represents an important step towards developing more interpretable and trustworthy Graph Neural Network models.

Related Papers

Locality-Aware Graph-Rewiring in GNNs

Federico Barbero, Ameya Velingker, Amin Saberi, Michael Bronstein, Francesco Di Giovanni

Graph Neural Networks (GNNs) are popular models for machine learning on graphs that typically follow the message-passing paradigm, whereby the feature of a node is updated recursively upon aggregating information over its neighbors. While exchanging messages over the input graph endows GNNs with a strong inductive bias, it can also make GNNs susceptible to over-squashing, thereby preventing them from capturing long-range interactions in the given graph. To rectify this issue, graph rewiring techniques have been proposed as a means of improving information flow by altering the graph connectivity. In this work, we identify three desiderata for graph-rewiring: (i) reduce over-squashing, (ii) respect the locality of the graph, and (iii) preserve the sparsity of the graph. We highlight fundamental trade-offs that occur between spatial and spectral rewiring techniques; while the former often satisfy (i) and (ii) but not (iii), the latter generally satisfy (i) and (iii) at the expense of (ii). We propose a novel rewiring framework that satisfies all of (i)--(iii) through a locality-aware sequence of rewiring operations. We then discuss a specific instance of such rewiring framework and validate its effectiveness on several real-world benchmarks, showing that it either matches or significantly outperforms existing rewiring approaches.

5/7/2024

cs.LG

Multi-View Subgraph Neural Networks: Self-Supervised Learning with Scarce Labeled Data

Zhenzhong Wang, Qingyuan Zeng, Wanyu Lin, Min Jiang, Kay Chen Tan

While graph neural networks (GNNs) have become the de-facto standard for graph-based node classification, they impose a strong assumption on the availability of sufficient labeled samples. This assumption restricts the classification performance of prevailing GNNs on many real-world applications suffering from low-data regimes. Specifically, features extracted from scarce labeled nodes could not provide sufficient supervision for the unlabeled samples, leading to severe over-fitting. In this work, we point out that leveraging subgraphs to capture long-range dependencies can augment the representation of a node with homophily properties, thus alleviating the low-data regime. However, prior works leveraging subgraphs fail to capture the long-range dependencies among nodes. To this end, we present a novel self-supervised learning framework, called multi-view subgraph neural networks (Muse), for handling long-range dependencies. In particular, we propose an information theory-based identification mechanism to identify two types of subgraphs from the views of input space and latent space, respectively. The former is to capture the local structure of the graph, while the latter captures the long-range dependencies among nodes. By fusing these two views of subgraphs, the learned representations can preserve the topological properties of the graph at large, including the local structure and long-range dependencies, thus maximizing their expressiveness for downstream node classification tasks. Experimental results show that Muse outperforms the alternative methods on node classification tasks with limited labeled data.

4/22/2024

cs.LG cs.AI

🧠

Graph Convolutional Neural Networks Sensitivity under Probabilistic Error Model

Xinjue Wang, Esa Ollila, Sergiy A. Vorobyov

Graph Neural Networks (GNNs), particularly Graph Convolutional Neural Networks (GCNNs), have emerged as pivotal instruments in machine learning and signal processing for processing graph-structured data. This paper proposes an analysis framework to investigate the sensitivity of GCNNs to probabilistic graph perturbations, directly impacting the graph shift operator (GSO). Our study establishes tight expected GSO error bounds, which are explicitly linked to the error model parameters, and reveals a linear relationship between GSO perturbations and the resulting output differences at each layer of GCNNs. This linearity demonstrates that a single-layer GCNN maintains stability under graph edge perturbations, provided that the GSO errors remain bounded, regardless of the perturbation scale. For multilayer GCNNs, the dependency of system's output difference on GSO perturbations is shown to be a recursion of linearity. Finally, we exemplify the framework with the Graph Isomorphism Network (GIN) and Simple Graph Convolution Network (SGCN). Experiments validate our theoretical derivations and the effectiveness of our approach.

5/7/2024

stat.ML cs.LG

EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time

Shengyao Lu, Bang Liu, Keith G. Mills, Jiao He, Di Niu

Understanding and explaining the predictions of Graph Neural Networks (GNNs), is crucial for enhancing their safety and trustworthiness. Subgraph-level explanations are gaining attention for their intuitive appeal. However, most existing subgraph-level explainers face efficiency challenges in explaining GNNs due to complex search processes. The key challenge is to find a balance between intuitiveness and efficiency while ensuring transparency. Additionally, these explainers usually induce subgraphs by nodes, which may introduce less-intuitive disconnected nodes in the subgraph-level explanations or omit many important subgraph structures. In this paper, we reveal that inducing subgraph explanations by edges is more comprehensive than other subgraph inducing techniques. We also emphasize the need of determining the subgraph explanation size for each data instance, as different data instances may involve different important substructures. Building upon these considerations, we introduce a training-free approach, named EiG-Search. We employ an efficient linear-time search algorithm over the edge-induced subgraphs, where the edges are ranked by an enhanced gradient-based importance. We conduct extensive experiments on a total of seven datasets, demonstrating its superior performance and efficiency both quantitatively and qualitatively over the leading baselines.

5/6/2024

cs.LG