Counterfactual Explanations of Black-box Machine Learning Models using Causal Discovery with Applications to Credit Rating

2402.02678

Published 4/30/2024 by Daisuke Takahashi, Shohei Shimizu, Takuma Tanaka


Abstract

Explainable artificial intelligence (XAI) has helped elucidate the internal mechanisms of machine learning algorithms, bolstering their reliability by demonstrating the basis of their predictions. Several XAI models consider causal relationships to explain models by examining the input-output relationships of prediction models and the dependencies between features. The majority of these models have based their explanations on counterfactual probabilities, assuming that the causal graph is known. However, this assumption complicates the application of such models to real data, given that the causal relationships between features are unknown in most cases. Thus, this study proposed a novel XAI framework that relaxed the constraint that the causal graph is known. This framework leveraged counterfactual probabilities and additional prior information on causal structure, facilitating the integration of a causal graph estimated through causal discovery methods and a black-box classification model. Furthermore, explanatory scores were estimated based on counterfactual probabilities. Numerical experiments conducted employing artificial data confirmed the possibility of estimating the explanatory score more accurately than in the absence of a causal graph. Finally, as an application to real data, we constructed a classification model of credit ratings assigned by Shiga Bank, Shiga prefecture, Japan. We demonstrated the effectiveness of the proposed method in cases where the causal graph is unknown.


Overview

This paper proposes a new framework for explainable artificial intelligence (XAI) that does not require the causal graph (the relationships between features) to be known.
The framework uses counterfactual probabilities and additional information about the causal structure to integrate a black-box classification model with a causal graph estimated through causal discovery methods.
Explanatory scores are estimated based on counterfactual probabilities, and experiments on artificial data show that the scores are estimated more accurately with the estimated causal graph than without any causal graph.
The paper demonstrates the effectiveness of the proposed method on a real-world credit rating classification task where the causal graph is unknown.

Plain English Explanation

Machine learning models are often criticized for being "black boxes": it is not clear how they arrive at their predictions. Explainable AI (XAI) aims to make these models more transparent by explaining the reasoning behind their outputs.

Many XAI approaches rely on understanding the causal relationships between the input features and the model's predictions. They use counterfactual probabilities to explain how changing certain features would affect the output. However, these methods assume that the causal graph (the connections between features) is already known, which is often not the case in real-world data.

This study proposes a new XAI framework that does not require the causal graph to be known. It combines a black-box classification model with an estimated causal graph, using additional prior information about the causal structure. This allows the model to generate more accurate explanatory scores based on counterfactual probabilities, even when the true causal relationships are unknown.

The researchers tested this approach on both artificial and real-world data, including a credit rating classification task. On artificial data, they found that explanatory scores were estimated more accurately with the estimated causal graph than without any causal graph. The credit rating application showed that the method remains effective in practice when the underlying causal structure is unknown.

Technical Explanation

The paper proposes a novel XAI framework that relaxes the assumption that the causal graph (the relationships between input features) is known. The framework leverages counterfactual probabilities and additional prior information about the causal structure to integrate a black-box classification model with a causal graph estimated through causal discovery methods.
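To make the idea of "causal discovery plus prior information" concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes linear relationships, uses a simplified correlation-based skeleton search, and takes the prior information to be a known ordering of the variables (the `tier` dictionary). The variable names `x1`, `x2`, `x3` and the function `discover_graph` are hypothetical.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_corr(a, b, c):
    """Correlation of a and b after linearly controlling for c."""
    r_ab, r_ac, r_bc = corr(a, b), corr(a, c), corr(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def discover_graph(data, tier, threshold=0.1):
    """Keep a pair of variables as an edge if it stays dependent given each
    other single variable, then orient it from the earlier to the later tier."""
    names = list(data)
    edges = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            others = [c for c in names if c not in (a, b)]
            dependent = abs(corr(data[a], data[b])) > threshold and all(
                abs(partial_corr(data[a], data[b], data[c])) > threshold
                for c in others
            )
            if dependent and tier[a] != tier[b]:
                edges.add((a, b) if tier[a] < tier[b] else (b, a))
    return edges

# Toy data generated from an assumed chain x1 -> x2 -> x3.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(5000)]
x2 = [0.8 * v + rng.gauss(0, 1) for v in x1]
x3 = [0.8 * v + rng.gauss(0, 1) for v in x2]
data = {"x1": x1, "x2": x2, "x3": x3}

# Prior information (assumed here): a known ordering of the variables.
tier = {"x1": 0, "x2": 1, "x3": 2}
edges = discover_graph(data, tier)
print(sorted(edges))
```

On this toy chain the sketch keeps the two direct edges and drops the indirect `x1`-`x3` link, because that pair becomes nearly uncorrelated once `x2` is controlled for. The prior ordering is what turns the undirected skeleton into a directed graph.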

The key elements of the approach are:

Causal Graph Estimation: The causal graph is estimated using causal discovery algorithms, rather than being assumed known.
Counterfactual Probabilities: Explanatory scores are calculated based on counterfactual probabilities, which capture how changing input features would affect the model's predictions.
Prior Causal Information: Additional prior information about the causal structure is incorporated to improve the accuracy of the counterfactual probabilities.
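The role of the causal graph in a counterfactual score can be illustrated with a small, fully enumerable example. This is a sketch under stated assumptions rather than the paper's score definition: the structural equations, the noise probabilities, the stand-in classifier `black_box`, and the function `sufficiency_score` are all hypothetical, and the score follows the general pattern of Pearl's abduction-action-prediction steps.

```python
from itertools import product

# Hypothetical structural causal model with the graph x1 -> x2.
# u1 and u2 are independent binary noise terms.
P_U1 = {0: 0.5, 1: 0.5}
P_U2 = {0: 0.8, 1: 0.2}

def f_x1(u1):
    return u1

def f_x2(x1, u2):
    return x1 ^ u2  # x2 copies x1 unless the noise flips it

def black_box(x1, x2):
    """Stand-in classifier: predicts 1 only when both features are 1."""
    return int(x1 + x2 >= 2)

def sufficiency_score(use_graph):
    """P(prediction would be 1 had x1 been 1 | observed x1 = 0, prediction = 0)."""
    numerator = denominator = 0.0
    for (u1, p1), (u2, p2) in product(P_U1.items(), P_U2.items()):
        x1 = f_x1(u1)
        x2 = f_x2(x1, u2)
        if x1 != 0 or black_box(x1, x2) != 0:
            continue  # abduction: keep only noise values consistent with the evidence
        weight = p1 * p2
        denominator += weight
        x1_cf = 1  # action: set x1 to 1
        # prediction: with the graph, x2 responds to the new x1;
        # without it, x2 is frozen at its observed value
        x2_cf = f_x2(x1_cf, u2) if use_graph else x2
        numerator += weight * black_box(x1_cf, x2_cf)
    return numerator / denominator

score_with_graph = sufficiency_score(use_graph=True)
score_without_graph = sufficiency_score(use_graph=False)
print(score_with_graph, score_without_graph)
```

In this toy model the score is about 0.8 when the effect of `x1` on `x2` is propagated and about 0.2 when `x2` is held fixed, so ignoring the dependency between features gives a quite different explanation of the same classifier.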

The researchers conducted numerical experiments using artificial data to demonstrate that their approach can estimate explanatory scores more accurately than is possible in the absence of a causal graph. They also applied the proposed framework to a real-world credit rating classification task, where the causal relationships between features were unknown, and showed its effectiveness in this context.

Critical Analysis

The paper presents a promising approach to XAI that addresses a key limitation of existing methods: the requirement that the causal graph be known. By relaxing this assumption and incorporating estimated causal relationships and prior information, the proposed framework can generate more accurate explanations for black-box models than an approach that uses no causal graph, in real-world scenarios where the underlying causal structure is uncertain.

However, the paper does not extensively discuss the potential limitations or caveats of the approach. For example, the performance of the causal graph estimation step may be sensitive to the quality and amount of available data, which could impact the reliability of the explanatory scores. Additionally, the incorporation of prior causal information may be challenging in practice, as such information may not always be readily available or easy to quantify.

Further research could explore the robustness of the framework to different types of causal discovery algorithms, the sensitivity to the quality and quantity of prior causal information, and the generalization of the approach to a wider range of real-world applications and data types. Comparisons to other XAI methods that do not require the causal graph to be known could also provide additional insights.

Conclusion

This study presents a novel XAI framework that addresses a key limitation of existing approaches by not requiring the causal graph to be known. By integrating a black-box classification model with an estimated causal graph and leveraging counterfactual probabilities and prior causal information, the proposed method can generate more accurate explanatory scores, even when the underlying causal relationships are uncertain.

The successful application of the framework to a real-world credit rating classification task demonstrates its potential for improving the transparency and reliability of machine learning models in practical settings where the causal structure is unknown. As the use of AI systems continues to expand, advances in XAI like this can help build trust and ensure these models are deployed responsibly.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers


Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Bo

Catarina Moreira, Yu-Liang Chou, Chihcheng Hsieh, Chun Ouyang, Joaquim Jorge, João Madeiras Pereira

This study investigates the impact of machine learning models on the generation of counterfactual explanations by conducting a benchmark evaluation over three different types of models: a decision tree (fully transparent, interpretable, white-box model), a random forest (semi-interpretable, grey-box model), and a neural network (fully opaque, black-box model). We tested the counterfactual generation process using four algorithms (DiCE, WatcherCF, prototype, and GrowingSpheresCF) in the literature in 25 different datasets. Our findings indicate that: (1) Different machine learning models have little impact on the generation of counterfactual explanations; (2) Counterfactual algorithms based uniquely on proximity loss functions are not actionable and will not provide meaningful explanations; (3) One cannot have meaningful evaluation results without guaranteeing plausibility in the counterfactual generation. Algorithms that do not consider plausibility in their internal mechanisms will lead to biased and unreliable conclusions if evaluated with the current state-of-the-art metrics; (4) A counterfactual inspection analysis is strongly recommended to ensure a robust examination of counterfactual explanations and the potential identification of biases.

6/12/2024

cs.LG cs.AI


Causality-Aware Local Interpretable Model-Agnostic Explanations

Martina Cinquini, Riccardo Guidotti

A main drawback of eXplainable Artificial Intelligence (XAI) approaches is the feature independence assumption, hindering the study of potential variable dependencies. This leads to approximating black box behaviors by analyzing the effects on randomly generated feature values that may rarely occur in the original samples. This paper addresses this issue by integrating causal knowledge in an XAI method to enhance transparency and enable users to assess the quality of the generated explanations. Specifically, we propose a novel extension to a widely used local and model-agnostic explainer, which encodes explicit causal relationships within the data surrounding the instance being explained. Extensive experiments show that our approach overcomes the original method in terms of faithfully replicating the black-box model's mechanism and the consistency and reliability of the generated explanations.

4/16/2024

cs.AI cs.LG


Counterfactual Explanations for Deep Learning-Based Traffic Forecasting

Rushan Wang, Yanan Xin, Yatao Zhang, Fernando Perez-Cruz, Martin Raubal

Deep learning models are widely used in traffic forecasting and have achieved state-of-the-art prediction accuracy. However, the black-box nature of those models makes the results difficult to interpret by users. This study aims to leverage an Explainable AI approach, counterfactual explanations, to enhance the explainability and usability of deep learning-based traffic forecasting models. Specifically, the goal is to elucidate relationships between various input contextual features and their corresponding predictions. We present a comprehensive framework that generates counterfactual explanations for traffic forecasting and provides usable insights through the proposed scenario-driven counterfactual explanations. The study first implements a deep learning model to predict traffic speed based on historical traffic data and contextual variables. Counterfactual explanations are then used to illuminate how alterations in these input variables affect predicted outcomes, thereby enhancing the transparency of the deep learning model. We investigated the impact of contextual features on traffic speed prediction under varying spatial and temporal conditions. The scenario-driven counterfactual explanations integrate two types of user-defined constraints, directional and weighting constraints, to tailor the search for counterfactual explanations to specific use cases. These tailored explanations benefit machine learning practitioners who aim to understand the model's learning mechanisms and domain experts who seek insights for real-world applications. The results showcase the effectiveness of counterfactual explanations in revealing traffic patterns learned by deep learning models, showing its potential for interpreting black-box deep learning models used for spatiotemporal predictions in general.

5/2/2024

cs.LG cs.AI

Privacy Implications of Explainable AI in Data-Driven Systems

Fatima Ezzeddine

Machine learning (ML) models, demonstrably powerful, suffer from a lack of interpretability. The absence of transparency, often referred to as the black box nature of ML models, undermines trust and urges the need for efforts to enhance their explainability. Explainable AI (XAI) techniques address this challenge by providing frameworks and methods to explain the internal decision-making processes of these complex models. Techniques like Counterfactual Explanations (CF) and Feature Importance play a crucial role in achieving this goal. Furthermore, high-quality and diverse data remains the foundational element for robust and trustworthy ML applications. In many applications, the data used to train ML and XAI explainers contain sensitive information. In this context, numerous privacy-preserving techniques can be employed to safeguard sensitive information in the data, such as differential privacy. Subsequently, a conflict between XAI and privacy solutions emerges due to their opposing goals. Since XAI techniques provide reasoning for the model behavior, they reveal information relative to ML models, such as their decision boundaries, the values of features, or the gradients of deep learning models when explanations are exposed to a third entity. Attackers can initiate privacy breaching attacks using these explanations, to perform model extraction, inference, and membership attacks. This dilemma underscores the challenge of finding the right equilibrium between understanding ML decision-making and safeguarding privacy.

6/26/2024

cs.LG cs.AI cs.CR