Utilizing Description Logics for Global Explanations of Heterogeneous Graph Neural Networks

2405.12654

Published 5/22/2024 by Dominik Kohler, Stefan Heindorf

Abstract

Graph Neural Networks (GNNs) are effective for node classification in graph-structured data, but they lack explainability, especially at the global level. Current research mainly utilizes subgraphs of the input as local explanations or generates new graphs as global explanations. However, these graph-based methods are limited in their ability to explain classes with multiple sufficient explanations. To provide more expressive explanations, we propose utilizing class expressions (CEs) from the field of description logic (DL). Our approach explains heterogeneous graphs with different types of nodes using CEs in the EL description logic. To identify the best explanation among multiple candidate explanations, we employ and compare two different scoring functions: (1) For a given CE, we construct multiple graphs, have the GNN make a prediction for each graph, and aggregate the predicted scores. (2) We score the CE in terms of fidelity, i.e., we compare the predictions of the GNN to the predictions by the CE on a separate validation set. Instead of subgraph-based explanations, we offer CE-based explanations.

Overview

Graph Neural Networks (GNNs) are effective for node classification in graph-structured data, but lack explainability, especially at the global level
Current research focuses on using subgraphs or generating new graphs to provide explanations
This paper proposes utilizing class expressions (CEs) from description logic to provide more expressive explanations for heterogeneous graphs

Plain English Explanation

Graph Neural Networks (GNNs) are a type of machine learning model that works well with data organized as graphs, such as social networks or chemical molecules. However, it can be difficult to understand how a GNN arrives at its predictions, especially at the global level, where the goal is to explain the model's behavior for a whole class rather than for one individual prediction.

Most current research on explaining GNNs follows one of two approaches: using subgraphs of the original input as local explanations, or generating entirely new graphs as global explanations. While these graph-based methods can provide some insight, they are limited in their ability to explain classes that have multiple sufficient explanations, that is, classes a node can belong to for several different reasons.

To address this, the researchers propose a new approach that uses "class expressions" (CEs) from the field of description logic. A CE is a logical formula that describes a set of nodes by their types and by their relationships to other nodes, which makes it a more expressive form of explanation than a single example graph.

The key idea is to use CEs to explain the predictions of a GNN on heterogeneous graphs, where there are different types of nodes (e.g. people, locations, organizations). The researchers compare two ways of scoring the quality of a candidate CE: one that constructs multiple graphs from the CE and aggregates the GNN's predictions on them, and another that compares the GNN's predictions to the predictions made by the CE itself on a separate validation set.

By moving beyond simple subgraph-based explanations and leveraging the expressive power of description logic, this approach aims to provide more meaningful and insightful explanations for the decisions made by powerful GNN models.

Technical Explanation

This paper proposes a novel approach for providing global-level explanations for the predictions made by Graph Neural Networks (GNNs) on heterogeneous graph-structured data.

The key innovation is the use of class expressions (CEs) from the field of description logic (DL) as explanations. The CEs are formulated in the EL description logic, which builds expressions from named classes (such as node types), intersections, and existential restrictions over relations (such as edge types). Unlike existing methods that rely on subgraphs or generated graphs, a CE is a compact logical description of the nodes that the GNN assigns to a class.
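To make the idea of an EL class expression more concrete, the following is a minimal sketch of how such expressions could be represented and checked against a toy heterogeneous graph. It is an illustration only, not the paper's implementation: the names `Top`, `Named`, `And`, `Exists`, `HeteroGraph`, and `satisfies`, as well as the example node and edge types, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


# Hypothetical toy encoding of EL class expressions (not the paper's code).
@dataclass(frozen=True)
class Top:
    """Matches every node."""


@dataclass(frozen=True)
class Named:
    """Matches nodes of one node type, e.g. Named("Paper")."""
    node_type: str


@dataclass(frozen=True)
class And:
    """Intersection: a node must satisfy both operands."""
    left: "CE"
    right: "CE"


@dataclass(frozen=True)
class Exists:
    """Existential restriction: some `role`-neighbor satisfies `filler`."""
    role: str
    filler: "CE"


CE = Union[Top, Named, And, Exists]


@dataclass
class HeteroGraph:
    node_types: Dict[str, str]         # node id -> node type
    edges: List[Tuple[str, str, str]]  # (source, edge type, target)

    def neighbors(self, node: str, role: str) -> List[str]:
        """Targets reachable from `node` via an edge of type `role`."""
        return [t for s, r, t in self.edges if s == node and r == role]


def satisfies(graph: HeteroGraph, node: str, ce: CE) -> bool:
    """Check recursively whether `node` is an instance of `ce`."""
    if isinstance(ce, Top):
        return True
    if isinstance(ce, Named):
        return graph.node_types[node] == ce.node_type
    if isinstance(ce, And):
        return satisfies(graph, node, ce.left) and satisfies(graph, node, ce.right)
    if isinstance(ce, Exists):
        return any(satisfies(graph, n, ce.filler)
                   for n in graph.neighbors(node, ce.role))
    raise TypeError(f"unsupported class expression: {ce!r}")


# Toy graph: author a1 wrote paper p1; author a2 wrote nothing.
toy = HeteroGraph(
    node_types={"a1": "Author", "a2": "Author", "p1": "Paper"},
    edges=[("a1", "writes", "p1")],
)
# "Author AND EXISTS writes.Paper": authors with at least one paper.
author_with_paper = And(Named("Author"), Exists("writes", Named("Paper")))

print(satisfies(toy, "a1", author_with_paper))  # True
print(satisfies(toy, "a2", author_with_paper))  # False
```

Because a CE is evaluated per node, the same expression can be applied to every node of a graph, which is what allows it to act as a classifier whose predictions are comparable to the GNN's.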

To identify the best explanation among multiple candidate CEs, the researchers employ and compare two scoring functions:

(1) Graph-based scoring: For a given CE, they construct multiple graphs, have the GNN make a prediction for each graph, and then aggregate the predicted scores.
(2) Fidelity-based scoring: They score the CE by how well its predictions match the GNN's predictions on a separate validation set, which measures the fidelity of the CE as an explanation.
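The two scoring functions can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's code: fidelity is assumed to be the plain agreement rate between CE and GNN predictions, the aggregation is assumed to be the mean, and the function names `fidelity` and `graph_based_score`, the `gnn_score` callable, and the toy inputs are all hypothetical. How graphs are generated from a CE is left out and treated as a given input.

```python
from statistics import mean
from typing import Callable, Dict, Hashable, Iterable, TypeVar

G = TypeVar("G")  # stand-in for whatever graph type the GNN consumes


def fidelity(gnn_pred: Dict[Hashable, int], ce_pred: Dict[Hashable, int]) -> float:
    """Fraction of validation nodes on which the CE agrees with the GNN.

    Assumption: both dicts map the same validation node ids to a 0/1 label.
    """
    if not gnn_pred:
        raise ValueError("validation set is empty")
    agree = sum(gnn_pred[node] == ce_pred[node] for node in gnn_pred)
    return agree / len(gnn_pred)


def graph_based_score(
    graphs: Iterable[G],
    gnn_score: Callable[[G], float],
    aggregate: Callable[[Iterable[float]], float] = mean,
) -> float:
    """Run the GNN on each graph built for a CE and aggregate the scores.

    Assumption: `gnn_score` returns the GNN's score for the target class.
    """
    scores = [gnn_score(g) for g in graphs]
    if not scores:
        raise ValueError("no graphs were constructed for this CE")
    return aggregate(scores)


# Toy validation set: the CE disagrees with the GNN on node "c" only.
gnn_labels = {"a": 1, "b": 0, "c": 1, "d": 0}
ce_labels = {"a": 1, "b": 0, "c": 0, "d": 0}
print(fidelity(gnn_labels, ce_labels))  # 0.75

# Toy stand-in for a GNN: the score grows with the number of edges.
toy_graphs = [{"num_edges": 2}, {"num_edges": 4}]
print(graph_based_score(toy_graphs, lambda g: g["num_edges"] / 4))  # 0.75
```

The two scores answer different questions: the graph-based score asks how strongly the GNN responds to graphs that embody the CE, while fidelity asks how closely the CE reproduces the GNN's decisions on held-out data.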

The researchers evaluate their approach on several heterogeneous graph datasets, demonstrating that the CE-based explanations can outperform subgraph-based methods in terms of both faithfulness to the GNN's decision-making and human interpretability.

Critical Analysis

One of the key strengths of this research is its focus on providing global-level explanations for GNN models, which is an important but often overlooked aspect of model interpretability. By moving beyond local subgraph-based explanations, the CE-based approach has the potential to uncover more comprehensive and meaningful insights about the overall decision-making logic of GNNs.

That said, the approach has limitations. The EL description logic offers intersection and existential restriction but no disjunction, so a single EL class expression cannot state that either of two patterns is sufficient for a class. Classes with multiple sufficient explanations may therefore require several CEs or a more expressive description logic. Additionally, the computational cost of searching for and scoring candidate CEs could be a practical challenge, especially for large and complex graphs.
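The point about disjunction can be seen directly in the syntax of EL, sketched below in standard description logic notation. The grammar is the usual definition of EL; the class and role names in the example (Author, Paper, writes, reviews) are hypothetical and chosen only for illustration.

```latex
% Syntax of EL class expressions (A: named class, r: role, e.g. an edge type)
C, D ::= \top \mid A \mid C \sqcap D \mid \exists r.C

% Hypothetical example: two separate sufficient explanations for one class
C_1 = \mathit{Author} \sqcap \exists \mathit{writes}.\mathit{Paper}
C_2 = \mathit{Author} \sqcap \exists \mathit{reviews}.\mathit{Paper}

% Stating "C_1 or C_2" in one expression needs the disjunction operator,
% which is available in more expressive logics such as ALC but not in EL
C_1 \sqcup C_2
```

Within EL, the two alternatives would have to be reported as two separate class expressions rather than as one combined expression.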

Further research could explore ways to address these limitations, such as by combining CE-based explanations with other interpretability techniques or developing more efficient CE discovery algorithms. Additionally, more user studies would be valuable to assess the real-world understandability and usefulness of the CE-based explanations from a human-centered perspective.

Overall, this paper represents an important step forward in the quest for more expressive and interpretable GNN models, and the ideas presented here could have significant implications for a wide range of graph-based applications.

Conclusion

This paper introduces a novel approach for providing global-level explanations for Graph Neural Network (GNN) predictions using class expressions (CEs) from description logic. By leveraging the expressive power of CEs, the researchers demonstrate that it is possible to capture more nuanced and comprehensive patterns within heterogeneous graph data, going beyond the limitations of subgraph-based explanations.

The proposed CE-based explanation method, coupled with the two scoring functions explored in the paper, offers a promising direction for improving the interpretability of GNNs. As GNNs continue to gain traction in a variety of real-world applications, such advancements in explainability will be crucial for building trust, accountability, and responsible development of these powerful machine learning models.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

GNNAnatomy: Systematic Generation and Evaluation of Multi-Level Explanations for Graph Neural Networks

Hsiao-Ying Lu, Yiran Li, Ujwal Pratap Krishna Kaluvakolanu Thyagarajan, Kwan-Liu Ma

Graph Neural Networks (GNNs) have proven highly effective in various machine learning (ML) tasks involving graphs, such as node/graph classification and link prediction. However, explaining the decisions made by GNNs poses challenges because of the aggregated relational information based on graph structure, leading to complex data transformations. Existing methods for explaining GNNs often face limitations in systematically exploring diverse substructures and evaluating results in the absence of ground truths. To address this gap, we introduce GNNAnatomy, a model- and dataset-agnostic visual analytics system designed to facilitate the generation and evaluation of multi-level explanations for GNNs. In GNNAnatomy, we employ graphlets to elucidate GNN behavior in graph-level classification tasks. By analyzing the associations between GNN classifications and graphlet frequencies, we formulate hypothesized factual and counterfactual explanations. To validate a hypothesized graphlet explanation, we introduce two metrics: (1) the correlation between its frequency and the classification confidence, and (2) the change in classification confidence after removing this substructure from the original graph. To demonstrate the effectiveness of GNNAnatomy, we conduct case studies on both real-world and synthetic graph datasets from various domains. Additionally, we qualitatively compare GNNAnatomy with a state-of-the-art GNN explainer, demonstrating the utility and versatility of our design.

6/10/2024

cs.LG cs.IR cs.SI

A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation, and Research Challenges

Mario Alfonso Prado-Romero, Bardh Prenkaj, Giovanni Stilo, Fosca Giannotti

Graph Neural Networks (GNNs) perform well in community detection and molecule classification. Counterfactual Explanations (CE) provide counter-examples to overcome the transparency limitations of black-box models. Due to the growing attention in graph learning, we focus on the concepts of CE for GNNs. We analysed the SoA to provide a taxonomy, a uniform notation, and the benchmarking datasets and evaluation metrics. We discuss fourteen methods, their evaluation protocols, twenty-two datasets, and nineteen metrics. We integrated the majority of methods into the GRETEL library to conduct an empirical evaluation to understand their strengths and pitfalls. We highlight open challenges and future work.

6/12/2024

cs.LG cs.AI

Global Concept Explanations for Graphs by Contrastive Learning

Jonas Teufel, Pascal Friederich

Beyond improving trust and validating model fairness, xAI practices also have the potential to recover valuable scientific insights in application domains where little to no prior human intuition exists. To that end, we propose a method to extract global concept explanations from the predictions of graph neural networks to develop a deeper understanding of the tasks' underlying structure-property relationships. We identify concept explanations as dense clusters in the self-explaining Megan model's subgraph latent space. For each concept, we optimize a representative prototype graph and optionally use GPT-4 to provide hypotheses about why each structure has a certain effect on the prediction. We conduct computational experiments on synthetic and real-world graph property prediction tasks. For the synthetic tasks we find that our method correctly reproduces the structural rules by which they were created. For real-world molecular property regression and classification tasks, we find that our method rediscovers established rules of thumb. More specifically, our results for molecular mutagenicity prediction indicate more fine-grained resolution of structural details than existing explainability methods, consistent with previous results from chemistry literature. Overall, our results show promising capability to extract the underlying structure-property relationships for complex graph property prediction tasks.

4/26/2024

cs.LG cs.AI

L2XGNN: Learning to Explain Graph Neural Networks

Giuseppe Serra, Mathias Niepert

Graph Neural Networks (GNNs) are a popular class of machine learning models. Inspired by the learning to explain (L2X) paradigm, we propose L2XGNN, a framework for explainable GNNs which provides faithful explanations by design. L2XGNN learns a mechanism for selecting explanatory subgraphs (motifs) which are exclusively used in the GNN's message-passing operations. L2XGNN is able to select, for each input graph, a subgraph with specific properties such as being sparse and connected. Imposing such constraints on the motifs often leads to more interpretable and effective explanations. Experiments on several datasets suggest that L2XGNN achieves the same classification accuracy as baseline methods using the entire input graph while ensuring that only the provided explanations are used to make predictions. Moreover, we show that L2XGNN is able to identify motifs responsible for the graph's properties it is intended to predict.

6/17/2024

cs.LG cs.AI