A Framework for Feasible Counterfactual Exploration incorporating Causality, Sparsity and Density

2404.13476

Published 4/23/2024 by Kleopatra Markou, Dimitrios Tomaras, Vana Kalogeraki, Dimitrios Gunopulos

Abstract

The imminent need to interpret the output of a Machine Learning model with counterfactual (CF) explanations - via small perturbations to the input - has been notable in the research community. Although the variety of CF examples is important, the aspect of them being feasible at the same time, does not necessarily apply in their entirety. This work uses different benchmark datasets to examine through the preservation of the logical causal relations of their attributes, whether CF examples can be generated after a small amount of changes to the original input, be feasible and actually useful to the end-user in a real-world case. To achieve this, we used a black box model as a classifier, to distinguish the desired from the input class and a Variational Autoencoder (VAE) to generate feasible CF examples. As an extension, we also extracted two-dimensional manifolds (one for each dataset) that located the majority of the feasible examples, a representation that adequately distinguished them from infeasible ones. For our experimentation we used three commonly used datasets and we managed to generate feasible and at the same time sparse, CF examples that satisfy all possible predefined causal constraints, by confirming their importance with the attributes in a dataset.

Overview

This research paper proposes a framework for generating feasible counterfactual explanations that incorporate causality, sparsity, and density constraints.
Counterfactual explanations are a type of explainable AI that show how changes to input features can lead to a different model output.
The proposed framework aims to make these counterfactual explanations more realistic and actionable by considering the underlying causal relationships, the sparsity of feature changes, and the density of the data distribution.

Plain English Explanation

Imagine you want to understand why an AI system made a particular decision. Counterfactual explanations can help by showing you how the decision would change if you modified certain input features. For example, an AI system might predict that a loan applicant will be denied, but a counterfactual explanation could show that if the applicant's income was just $500 higher, they would be approved.
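The loan example can be made concrete with a minimal Python sketch. Everything in it is an assumption for illustration: the `approve` rule is a hypothetical stand-in for a black-box model, and the numbers are invented rather than taken from the paper.

```python
def approve(income, debt):
    """Hypothetical stand-in for a black-box loan model (not from the paper)."""
    return income - 0.5 * debt >= 3000


def minimal_income_increase(income, debt, step=100, max_steps=100):
    """Smallest income increase, in multiples of `step`, that flips a denial."""
    for n in range(max_steps + 1):
        if approve(income + n * step, debt):
            return n * step
    return None  # no counterfactual found within the search range


# The applicant is denied at an income of 3500; the search reports how much
# more income would be enough for approval under the assumed rule.
print(minimal_income_increase(income=3500, debt=2000))  # prints 500
```

The search changes only one feature, which is the simplest form of a counterfactual. It says nothing yet about whether the change is realistic.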

However, the counterfactual changes suggested by these explanations don't always make sense in the real world. The changes might be unrealistic, like increasing someone's income by an enormous amount. Or they might not consider the underlying causal relationships between the features, like how income and other factors are connected.

This research paper introduces a new framework that tries to make counterfactual explanations more feasible and realistic. It does this by:

Incorporating information about the causal relationships between the features, so the suggested changes are more grounded in how the world works.
Encouraging the counterfactual changes to be as sparse as possible, meaning only a few features are changed at a time.
Ensuring the counterfactual examples are dense within the normal data distribution, so the changes are more realistic.

By considering these factors, the framework can generate counterfactual explanations that are more plausible and actionable for the person receiving the explanation.

Technical Explanation

The core of the proposed framework is an optimization problem that finds counterfactual examples subject to constraints on causality, sparsity, and density. Specifically:

Causality: The framework incorporates a causal graph that encodes the relationships between the input features. This ensures the counterfactual changes respect these causal dependencies.
Sparsity: The optimization encourages the counterfactual to only modify a small number of input features, making the changes more feasible.
Density: The counterfactual examples are constrained to lie within high-density regions of the data distribution, again improving plausibility.
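The three criteria above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names, the causal rules (age and education cannot decrease, and raising education requires age to increase), the k-nearest-neighbour distance used as a density proxy, and both thresholds are assumptions chosen for illustration.

```python
import math

# Hypothetical feature order, for illustration only.
FEATURES = ["age", "education_level", "hours_per_week"]


def sparsity(x, cf, tol=1e-9):
    """Number of features that differ between the input and the counterfactual."""
    return sum(1 for a, b in zip(x, cf) if abs(a - b) > tol)


def satisfies_causal_rules(x, cf):
    """Assumed causal constraints: age cannot decrease, education cannot
    decrease, and an increase in education requires an increase in age."""
    age0, edu0, _ = x
    age1, edu1, _ = cf
    if age1 < age0 or edu1 < edu0:
        return False
    if edu1 > edu0 and not age1 > age0:
        return False
    return True


def knn_distance(cf, data, k=3):
    """Mean Euclidean distance to the k nearest training points.
    Smaller values mean the counterfactual sits in a denser region."""
    dists = sorted(math.dist(cf, row) for row in data)
    return sum(dists[:k]) / k


def is_feasible(x, cf, data, max_changes=2, max_knn_dist=5.0):
    """Combine the three checks; both thresholds are arbitrary assumptions."""
    return (
        satisfies_causal_rules(x, cf)
        and sparsity(x, cf) <= max_changes
        and knn_distance(cf, data) <= max_knn_dist
    )


data = [(30, 2, 40), (31, 3, 42), (29, 2, 38), (33, 3, 45), (35, 3, 40)]
x = (30, 2, 40)
good_cf = (32, 3, 40)  # older and more educated: two changes, rules respected
bad_cf = (30, 3, 40)   # more education without ageing: violates the assumed rule

print(is_feasible(x, good_cf, data))  # prints True
print(is_feasible(x, bad_cf, data))   # prints False
```

In this sketch the checks act as a filter on candidates that some generator has already produced. In practice the features would also need to be scaled before a distance-based density proxy is meaningful.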

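According to the abstract, the authors evaluate the framework on three commonly used benchmark datasets. A black-box classifier distinguishes the desired class from the input class, and a Variational Autoencoder (VAE) generates the candidate counterfactuals. They report feasible, sparse counterfactual examples that satisfy all predefined causal constraints, and they extract a two-dimensional manifold for each dataset that separates most feasible examples from infeasible ones.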

Critical Analysis

The key strengths of this research are the incorporation of causal knowledge and the explicit constraints on sparsity and density. These elements make the counterfactual explanations more grounded in reality and useful for end-users.

However, the reliance on a pre-specified causal graph is a potential limitation. In many real-world applications, the causal relationships may not be fully known or agreed upon. The framework could be extended to learn the causal structure from data, or to be robust to uncertainty in the causal model.

Additionally, the paper does not deeply explore the computational complexity of the optimization problem. As the number of input features grows, the search for sparse, dense counterfactuals may become increasingly challenging. Scalability to high-dimensional problems could be an area for further research.
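A rough way to see the scaling concern is to count how many feature subsets a brute-force search would have to consider. The sketch below assumes a naive search over every subset of at most k features; this is an illustrative assumption, not a description of the paper's VAE-based generator, which does not necessarily enumerate subsets.

```python
from math import comb


def num_sparse_subsets(d, k):
    """Number of non-empty feature subsets of size at most k out of d features."""
    return sum(comb(d, i) for i in range(1, k + 1))


# Subsets of at most 3 features for 10, 50 and 100 input features.
for d in (10, 50, 100):
    print(d, num_sparse_subsets(d, 3))
# prints:
# 10 175
# 50 20875
# 100 166750
```

Even with the sparsity budget fixed at three features, the candidate count grows roughly with the cube of the number of features, before any density or causal check is applied.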

Finally, user studies with the people who receive these explanations would help validate how useful the generated counterfactuals are in real-world applications.

Conclusion

This research presents an innovative framework for generating counterfactual explanations that are more feasible and actionable than previous approaches. By incorporating causal knowledge, sparsity constraints, and density constraints, the framework produces counterfactual examples that are grounded in reality and can provide meaningful insights to end-users. While there are some limitations to address, this work represents an important step forward in making explainable AI systems more transparent and useful.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

CF-OPT: Counterfactual Explanations for Structured Prediction

Germain Vivier-Ardisson, Alexandre Forel, Axel Parmentier, Thibaut Vidal

Optimization layers in deep neural networks have enjoyed a growing popularity in structured learning, improving the state of the art on a variety of applications. Yet, these pipelines lack interpretability since they are made of two opaque layers: a highly non-linear prediction model, such as a deep neural network, and an optimization layer, which is typically a complex black-box solver. Our goal is to improve the transparency of such methods by providing counterfactual explanations. We build upon variational autoencoders a principled way of obtaining counterfactuals: working in the latent space leads to a natural notion of plausibility of explanations. We finally introduce a variant of the classic loss for VAE training that improves their performance in our specific structured context. These provide the foundations of CF-OPT, a first-order optimization algorithm that can find counterfactual explanations for a broad class of structured learning architectures. Our numerical results show that both close and plausible explanations can be obtained for problems from the recent literature.

6/4/2024

cs.LG

Generating Counterfactual Explanations Using Cardinality Constraints

Rubén Ruiz-Torrubiano

Providing explanations about how machine learning algorithms work and/or make particular predictions is one of the main tools that can be used to improve their trustworthiness, fairness and robustness. Among the most intuitive type of explanations are counterfactuals, which are examples that differ from a given point only in the prediction target and some set of features, presenting which features need to be changed in the original example to flip the prediction for that example. However, such counterfactuals can have many different features than the original example, making their interpretation difficult. In this paper, we propose to explicitly add a cardinality constraint to counterfactual generation limiting how many features can be different from the original example, thus providing more interpretable and easily understandable counterfactuals.

4/12/2024

cs.LG cs.AI

Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Bo

Catarina Moreira, Yu-Liang Chou, Chihcheng Hsieh, Chun Ouyang, Joaquim Jorge, João Madeiras Pereira

This study investigates the impact of machine learning models on the generation of counterfactual explanations by conducting a benchmark evaluation over three different types of models: a decision tree (fully transparent, interpretable, white-box model), a random forest (semi-interpretable, grey-box model), and a neural network (fully opaque, black-box model). We tested the counterfactual generation process using four algorithms (DiCE, WatcherCF, prototype, and GrowingSpheresCF) in the literature in 25 different datasets. Our findings indicate that: (1) Different machine learning models have little impact on the generation of counterfactual explanations; (2) Counterfactual algorithms based uniquely on proximity loss functions are not actionable and will not provide meaningful explanations; (3) One cannot have meaningful evaluation results without guaranteeing plausibility in the counterfactual generation. Algorithms that do not consider plausibility in their internal mechanisms will lead to biased and unreliable conclusions if evaluated with the current state-of-the-art metrics; (4) A counterfactual inspection analysis is strongly recommended to ensure a robust examination of counterfactual explanations and the potential identification of biases.

6/12/2024

cs.LG cs.AI

CFGs: Causality Constrained Counterfactual Explanations using goal-directed ASP

Sopam Dasgupta, Joaquín Arias, Elmer Salazar, Gopal Gupta

Machine learning models that automate decision-making are increasingly used in consequential areas such as loan approvals, pretrial bail approval, and hiring. Unfortunately, most of these models are black boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might also desire explanations to understand why a decision was made. Ethical and legal considerations require informing the individual of changes in the input attribute(s) that could be made to produce a desirable outcome. Our work focuses on the latter problem of generating counterfactual explanations by considering the causal dependencies between features. In this paper, we present the framework CFGs, CounterFactual Generation with s(CASP), which utilizes the goal-directed Answer Set Programming (ASP) system s(CASP) to automatically generate counterfactual explanations from models generated by rule-based machine learning algorithms in particular. We benchmark CFGs with the FOLD-SE model. Reaching the counterfactual state from the initial state is planned and achieved using a series of interventions. To validate our proposal, we show how counterfactual explanations are computed and justified by imagining worlds where some or all factual assumptions are altered/changed. More importantly, we show how CFGs navigates between these worlds, namely, go from our initial state where we obtain an undesired outcome to the imagined goal state where we obtain the desired decision, taking into account the causal relationships among features.

5/28/2024

cs.AI cs.LG cs.LO