Accurate estimation of feature importance faithfulness for tree models

2404.03426

Published 4/5/2024 by Mateusz Gajewski, Adam Karczmarz, Mateusz Rapicki, Piotr Sankowski

Accurate estimation of feature importance faithfulness for tree models

Abstract

In this paper, we consider a perturbation-based metric of predictive faithfulness of feature rankings (or attributions) that we call PGI squared. When applied to decision tree-based regression models, the metric can be computed accurately and efficiently for arbitrary independent feature perturbation distributions. In particular, the computation does not involve Monte Carlo sampling that has been typically used for computing similar metrics and which is inherently prone to inaccuracies. Moreover, we propose a method of ranking features by their importance for the tree model's predictions based on PGI squared. Our experiments indicate that in some respects, the method may identify the globally important features better than the state-of-the-art SHAP explainer

Create account to get full access

Overview

This paper proposes a method for accurately estimating the feature importance faithfulness of tree-based machine learning models.
The authors introduce an exact computation approach that can efficiently calculate the feature importance scores for tree models.
They demonstrate the effectiveness of their method on various datasets and compare it to existing approximation techniques.

Plain English Explanation

Tree-based machine learning models, such as decision trees and random forests, are popular for their ability to capture complex relationships in data. However, understanding how these models make predictions can be challenging, as the importance of different input features (e.g., age, income, etc.) in the decision-making process is not always clear.

The authors of this paper have developed a new method to accurately measure the

faithfulness

of the feature importance scores reported by tree models. Faithfulness refers to how well the reported feature importance scores reflect the true influence of each input feature on the model's predictions.

The researchers' approach involves an exact computation of the feature importance scores, which is more precise than the approximate methods commonly used. This allows them to better understand the inner workings of tree models and identify the most influential features more reliably.

By applying their method to various datasets, the authors demonstrate that their exact computation approach outperforms existing approximation techniques in terms of accuracy. This is an important contribution, as it helps researchers and practitioners gain a deeper and more reliable understanding of how tree-based models make decisions, which can be crucial for applications such as explainable AI, medical diagnosis, and predictive analytics.

Technical Explanation

The paper introduces an exact computation method for estimating the feature importance faithfulness of tree-based models. The authors start by defining a formal measure of feature importance faithfulness, which quantifies how well the reported feature importance scores reflect the true influence of each input feature on the model's predictions.

To compute the feature importance faithfulness exactly, the researchers develop an efficient algorithm that leverages the tree structure of the models. This algorithm can exactly compute the feature importance scores without relying on approximate methods, which are commonly used but can introduce errors.

The authors evaluate their exact computation approach on several datasets and compare it to existing approximation techniques, such as SHAP and decision predicate graphs. The results show that the exact computation method outperforms the approximation techniques in terms of faithfulness, providing a more accurate assessment of feature importance.

Critical Analysis

The paper presents a rigorous and well-designed study, with a clear and formal definition of feature importance faithfulness. The proposed exact computation method is a significant advancement over the approximate techniques commonly used, as it can provide a more reliable understanding of the decision-making process in tree-based models.

However, the authors acknowledge that their method may be computationally expensive for large or complex tree models. They suggest that future work could explore ways to balance the trade-off between faithfulness and computational efficiency, potentially through the development of faster approximation algorithms.

Additionally, the paper focuses solely on tree-based models, and it would be interesting to see if the exact computation approach could be extended to other types of machine learning models, such as neural networks or support vector machines. This could further broaden the applicability of the research.

Conclusion

This paper presents an important contribution to the field of interpretable machine learning. By developing an exact computation method for estimating feature importance faithfulness, the authors have provided a valuable tool for researchers and practitioners to better understand the decision-making processes of tree-based models.

The improved accuracy and reliability of the feature importance scores can have significant implications for applications where model interpretability is crucial, such as medical diagnosis, predictive analytics, and explainable AI. By shedding light on the inner workings of these models, the proposed method can help build trust and confidence in the use of machine learning technologies.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

✨

Confident Feature Ranking

Bitya Neuhof, Yuval Benjamini

Machine learning models are widely applied in various fields. Stakeholders often use post-hoc feature importance methods to better understand the input features' contribution to the models' predictions. The interpretation of the importance values provided by these methods is frequently based on the relative order of the features (their ranking) rather than the importance values themselves. Since the order may be unstable, we present a framework for quantifying the uncertainty in global importance values. We propose a novel method for the post-hoc interpretation of feature importance values that is based on the framework and pairwise comparisons of the feature importance values. This method produces simultaneous confidence intervals for the features' ranks, which include the ``true'' (infinite sample) ranks with high probability, and enables the selection of the set of the top-k important features.

4/19/2024

stat.ML cs.AI cs.LG

✨

From SHAP Scores to Feature Importance Scores

Olivier Letoffe, Xuanxiang Huang, Nicholas Asher, Joao Marques-Silva

A central goal of eXplainable Artificial Intelligence (XAI) is to assign relative importance to the features of a Machine Learning (ML) model given some prediction. The importance of this task of explainability by feature attribution is illustrated by the ubiquitous recent use of tools such as SHAP and LIME. Unfortunately, the exact computation of feature attributions, using the game-theoretical foundation underlying SHAP and LIME, can yield manifestly unsatisfactory results, that tantamount to reporting misleading relative feature importance. Recent work targeted rigorous feature attribution, by studying axiomatic aggregations of features based on logic-based definitions of explanations by feature selection. This paper shows that there is an essential relationship between feature attribution and a priori voting power, and that those recently proposed axiomatic aggregations represent a few instantiations of the range of power indices studied in the past. Furthermore, it remains unclear how some of the most widely used power indices might be exploited as feature importance scores (FISs), i.e. the use of power indices in XAI, and which of these indices would be the best suited for the purposes of XAI by feature attribution, namely in terms of not producing results that could be deemed as unsatisfactory. This paper proposes novel desirable properties that FISs should exhibit. In addition, the paper also proposes novel FISs exhibiting the proposed properties. Finally, the paper conducts a rigorous analysis of the best-known power indices in terms of the proposed properties.

5/21/2024

cs.AI cs.LG

✨

On marginal feature attributions of tree-based models

Khashayar Filom, Alexey Miroshnikov, Konstandinos Kotsiopoulos, Arjun Ravi Kannan

Due to their power and ease of use, tree-based machine learning models, such as random forests and gradient-boosted tree ensembles, have become very popular. To interpret them, local feature attributions based on marginal expectations, e.g. marginal (interventional) Shapley, Owen or Banzhaf values, may be employed. Such methods are true to the model and implementation invariant, i.e. dependent only on the input-output function of the model. We contrast this with the popular TreeSHAP algorithm by presenting two (statistically similar) decision trees that compute the exact same function for which the path-dependent TreeSHAP yields different rankings of features, whereas the marginal Shapley values coincide. Furthermore, we discuss how the internal structure of tree-based models may be leveraged to help with computing their marginal feature attributions according to a linear game value. One important observation is that these are simple (piecewise-constant) functions with respect to a certain grid partition of the input space determined by the trained model. Another crucial observation, showcased by experiments with XGBoost, LightGBM and CatBoost libraries, is that only a portion of all features appears in a tree from the ensemble. Thus, the complexity of computing marginal Shapley (or Owen or Banzhaf) feature attributions may be reduced. This remains valid for a broader class of game values which we shall axiomatically characterize. A prime example is the case of CatBoost models where the trees are oblivious (symmetric) and the number of features in each of them is no larger than the depth. We exploit the symmetry to derive an explicit formula, with improved complexity and only in terms of the internal model parameters, for marginal Shapley (and Banzhaf and Owen) values of CatBoost models. This results in a fast, accurate algorithm for estimating these feature attributions.

5/7/2024

cs.LG cs.GT

✨

Provably Stable Feature Rankings with SHAP and LIME

Jeremy Goldwasser, Giles Hooker

Feature attributions are ubiquitous tools for understanding the predictions of machine learning models. However, the calculation of popular methods for scoring input variables such as SHAP and LIME suffers from high instability due to random sampling. Leveraging ideas from multiple hypothesis testing, we devise attribution methods that ensure the most important features are ranked correctly with high probability. Given SHAP estimates from KernelSHAP or Shapley Sampling, we demonstrate how to retrospectively verify the number of stable rankings. Further, we introduce efficient sampling algorithms for SHAP and LIME that guarantee the $K$ highest-ranked features have the proper ordering. Finally, we show how to adapt these local feature attribution methods for the global importance setting.

6/4/2024

stat.ML cs.LG