Provably Stable Feature Rankings with SHAP and LIME

Read original: arXiv:2401.15800 - Published 6/4/2024 by Jeremy Goldwasser, Giles Hooker

Overview

Feature attributions, such as SHAP and LIME, are widely used to understand the predictions of machine learning models.
However, these methods can suffer from high instability due to random sampling.
The paper proposes new attribution methods that ensure the most important features are ranked correctly with high probability.

Plain English Explanation

Feature attributions are tools used to understand how machine learning models make their predictions. They show which input variables (or "features") are most important for a model's output. Popular methods like SHAP and LIME calculate these feature importance scores, but they can be unstable because they rely on random sampling.

The researchers in this paper developed new attribution methods that can reliably identify the most crucial features. By drawing on ideas from statistics, they devised ways to verify the stability of feature rankings from SHAP or similar tools. They also introduced new sampling algorithms for SHAP and LIME that guarantee the top features are ranked correctly. These techniques work both for understanding individual predictions (local importance) and for assessing overall feature importance for a model (global importance).

The key is ensuring the most relevant features are identified with high confidence, even though the attribution estimates themselves are noisy because they are computed from random samples. This helps machine learning practitioners better interpret their models and make more informed decisions.
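
To make the instability concrete, here is a minimal Python sketch that estimates Shapley values by permutation sampling for a hypothetical three-feature toy model. The model, the all-zeros baseline, and the function names are illustrative assumptions, not taken from the paper. The exact Shapley values for this toy model are 2.5, 2.6, and 0.7, so feature 1 should rank first, yet with only 20 sampled permutations per run the top-ranked feature can change from seed to seed.

```python
import random


def model(x):
    # Hypothetical toy model (not from the paper): main effects plus one interaction.
    return 2.0 * x[0] + 2.6 * x[1] + 0.2 * x[2] + 1.0 * x[0] * x[2]


def shapley_sampling(f, x, baseline, n_perms, rng):
    """Return one row of marginal contributions per random feature permutation."""
    d = len(x)
    rows = []
    for _ in range(n_perms):
        order = list(range(d))
        rng.shuffle(order)
        z = list(baseline)
        prev = f(z)
        row = [0.0] * d
        for j in order:
            z[j] = x[j]          # switch feature j from baseline to its observed value
            cur = f(z)
            row[j] = cur - prev  # marginal contribution of feature j in this ordering
            prev = cur
        rows.append(row)
    return rows


def estimate(rows):
    """Average the sampled contributions to get one Shapley estimate per feature."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


x, baseline = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
top_features = []
for seed in range(200):
    rows = shapley_sampling(model, x, baseline, n_perms=20, rng=random.Random(seed))
    phi = estimate(rows)
    top_features.append(max(range(3), key=lambda j: phi[j]))

# Feature 1 has the largest exact Shapley value, but some runs put feature 0 on top.
print("runs ranking feature 0 first:", top_features.count(0))
print("runs ranking feature 1 first:", top_features.count(1))
```

Increasing `n_perms` shrinks the sampling error, which is why the number of samples needed depends on how close the competing attributions are.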

Technical Explanation

The paper proposes several advancements to feature attribution methods like SHAP and LIME:

Retrospective Stability Verification: Given SHAP estimates from KernelSHAP or Shapley Sampling, the authors show how to verify after the fact how many of the top feature rankings are stable with high probability. This tells users how many of the top-ranked features can be trusted.
Stable Sampling Algorithms: The paper introduces new sampling algorithms for SHAP and LIME that guarantee the K highest-ranked features have the proper ordering. This ensures the most important features are identified accurately.
Global Importance Adaptation: The local feature attribution methods are adapted to the global importance setting, allowing the techniques to assess overall feature importance for a model.
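
The idea of retrospective verification can be sketched as follows. This is a simplified stand-in, not the paper's procedure: it ranks features by their raw mean contribution, compares each rank with the next one using a paired one-sided z-test, applies a Bonferroni correction over all adjacent comparisons, and counts the leading ranks that pass. The function names, the choice of test, and the hand-built synthetic contributions are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def stable_top_ranks(rows, alpha=0.05):
    """Count the leading ranks that beat the next rank in a paired z-test.

    rows: per-sample contribution vectors, one row per sampled permutation.
    Assumption: Bonferroni over all d - 1 adjacent comparisons (illustrative only).
    """
    n, d = len(rows), len(rows[0])
    phi = [mean(r[j] for r in rows) for j in range(d)]
    order = sorted(range(d), key=lambda j: phi[j], reverse=True)
    threshold = alpha / (d - 1)
    stable = 0
    for a, b in zip(order, order[1:]):
        diffs = [r[a] - r[b] for r in rows]
        gap, se = mean(diffs), stdev(diffs) / math.sqrt(n)
        if se == 0.0:
            p_value = 0.0 if gap > 0 else 1.0
        else:
            p_value = 1.0 - NormalDist().cdf(gap / se)
        if p_value > threshold:
            break  # this rank and everything below it is not verified
        stable += 1
    return order, stable


def synthetic_rows(n):
    """Hand-built contributions: feature 0 is clearly largest, features 1 and 2 are close."""
    rows = []
    for i in range(n):
        f1 = 2.0 if i % 2 == 0 else 3.0        # mean 2.5
        f2 = 3.4 if i % 4 in (0, 1) else 1.4   # mean 2.4
        rows.append([5.0, f1, f2])
    return rows


order_small, stable_small = stable_top_ranks(synthetic_rows(100))
order_large, stable_large = stable_top_ranks(synthetic_rows(10000))
print(order_small, stable_small)  # [0, 1, 2] 1
print(order_large, stable_large)  # [0, 1, 2] 2
```

With 100 samples only the first rank is verified, because the gap between features 1 and 2 is small relative to the sampling error; with 10,000 samples the second rank is verified as well.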

The key innovations leverage ideas from multiple hypothesis testing to ensure the most salient features are reliably identified despite the sampling noise in the attribution estimates. Experiments demonstrate the advantages of the proposed methods compared to standard SHAP and LIME.
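
A stable sampling loop of this kind might look like the sketch below: keep drawing permutations until each of the top K features is separated from the next-ranked feature by a paired one-sided z-test at level alpha / K, or until a sample budget runs out. This is a naive illustration under stated assumptions, not the authors' algorithm. The toy model, function names, batch size, and budget are hypothetical, and re-testing after every batch inflates the error rate in a way a rigorous procedure would have to account for.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def model(x):
    # Hypothetical toy model; exact Shapley values at x = (1, 1, 1) are 2.5, 2.6, 0.7.
    return 2.0 * x[0] + 2.6 * x[1] + 0.2 * x[2] + 1.0 * x[0] * x[2]


def sample_row(f, x, baseline, rng):
    """Marginal contributions of every feature along one random permutation."""
    order = list(range(len(x)))
    rng.shuffle(order)
    z, row = list(baseline), [0.0] * len(x)
    prev = f(z)
    for j in order:
        z[j] = x[j]
        cur = f(z)
        row[j], prev = cur - prev, cur
    return row


def top_k_separated(rows, k, alpha):
    """Check whether each of the top-k features beats the next-ranked one."""
    n, d = len(rows), len(rows[0])
    phi = [mean(r[j] for r in rows) for j in range(d)]
    order = sorted(range(d), key=lambda j: phi[j], reverse=True)
    for a, b in zip(order[:k], order[1:k + 1]):
        diffs = [r[a] - r[b] for r in rows]
        gap, se = mean(diffs), stdev(diffs) / math.sqrt(n)
        if se == 0.0:
            p_value = 0.0 if gap > 0 else 1.0
        else:
            p_value = 1.0 - NormalDist().cdf(gap / se)
        if p_value > alpha / k:
            return False, order
    return True, order


def sample_until_stable(f, x, baseline, k, alpha=0.05, batch=100, max_rows=5000, seed=0):
    """Draw permutations in batches until the top-k ranking passes every test."""
    rng = random.Random(seed)
    rows, order = [], list(range(len(x)))
    while len(rows) < max_rows:
        rows.extend(sample_row(f, x, baseline, rng) for _ in range(batch))
        ok, order = top_k_separated(rows, k, alpha)
        if ok:
            return order[:k], len(rows), True
    return order[:k], len(rows), False


top, n_used, converged = sample_until_stable(model, [1.0] * 3, [0.0] * 3, k=2)
print("top-2 features:", top, "| permutations used:", n_used, "| converged:", converged)
```

The loop spends samples only until the closest pair among the top K is resolved, so well-separated rankings stop early while near-ties consume most of the budget.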

Critical Analysis

The paper thoroughly addresses the instability issue with current feature attribution methods and provides rigorous statistical techniques to overcome this limitation. However, a few potential concerns are worth considering:

Computational Complexity: The stable sampling algorithms introduced may be more computationally intensive than standard SHAP or LIME, which could limit their practical application for very large models or datasets.
Assumptions and Constraints: The theoretical guarantees provided by the methods rely on certain assumptions, such as feature independence. It's unclear how the techniques would perform when these assumptions are violated in real-world scenarios.
Generalization to Other Attribution Methods: The paper primarily focuses on SHAP and LIME, but it would be valuable to understand how the proposed ideas could be applied to other feature attribution approaches, such as TreeSHAP.
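
As a rough, back-of-the-envelope illustration of the computational concern (a heuristic under a normal approximation, not a bound from the paper): suppose two adjacent features have true attributions that differ by Δ, and the per-sample difference of their contributions has standard deviation σ. The expected statistic of a one-sided z-test at level α reaches its threshold once

```latex
\[
  \frac{\Delta}{\sigma/\sqrt{n}} \ge z_{1-\alpha}
  \quad\Longleftrightarrow\quad
  n \ge \left(\frac{z_{1-\alpha}\,\sigma}{\Delta}\right)^{2}
\]
```

so halving the gap Δ roughly quadruples the number of samples needed. For example, with σ = 0.5, Δ = 0.1, and α = 0.05 (z ≈ 1.645), this gives n ≥ (1.645 × 0.5 / 0.1)² ≈ 68 samples for a single comparison; near-ties among the top features are what drive the cost.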

Overall, the research represents a significant advancement in ensuring the reliability of feature importance estimates, which is critical for building trust in machine learning models. Further exploration of the practical implications and limitations would be a valuable next step.

Conclusion

This paper tackles the important challenge of instability in popular feature attribution methods like SHAP and LIME. By drawing on statistical principles, the researchers developed new techniques that can reliably identify the most crucial input features for a model's predictions, both locally and globally.

The proposed methods provide a more robust and trustworthy way to interpret machine learning models, which is essential for responsible AI development and deployment. While some practical considerations remain, this work represents a significant step forward in making feature attributions a reliable tool for understanding and improving complex predictive models.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Provably Stable Feature Rankings with SHAP and LIME

Jeremy Goldwasser, Giles Hooker

Feature attributions are ubiquitous tools for understanding the predictions of machine learning models. However, the calculation of popular methods for scoring input variables such as SHAP and LIME suffers from high instability due to random sampling. Leveraging ideas from multiple hypothesis testing, we devise attribution methods that ensure the most important features are ranked correctly with high probability. Given SHAP estimates from KernelSHAP or Shapley Sampling, we demonstrate how to retrospectively verify the number of stable rankings. Further, we introduce efficient sampling algorithms for SHAP and LIME that guarantee the $K$ highest-ranked features have the proper ordering. Finally, we show how to adapt these local feature attribution methods for the global importance setting.

6/4/2024

RankSHAP: a Gold Standard Feature Attribution Method for the Ranking Task

Tanya Chowdhury, Yair Zick, James Allan

Several works propose various post-hoc, model-agnostic explanations for the task of ranking, i.e. the task of ordering a set of documents, via feature attribution methods. However, these attributions are seen to weakly correlate and sometimes contradict each other. In classification/regression, several works focus on axiomatic characterization of feature attribution methods, showing that a certain method uniquely satisfies a set of desirable properties. However, no such efforts have been taken in the space of feature attributions for the task of ranking. We take an axiomatic game-theoretic approach, popular in the feature attribution community, to identify candidate attribution methods for ranking tasks. We first define desirable axioms: Rank-Efficiency, Rank-Missingness, Rank-Symmetry and Rank-Monotonicity, all variants of the classical Shapley axioms. Next, we introduce Rank-SHAP, a feature attribution algorithm for the general ranking task, which is an extension to classical Shapley values. We identify a polynomial-time algorithm for computing approximate Rank-SHAP values and evaluate the computational efficiency and accuracy of our algorithm under various scenarios. We also evaluate its alignment with human intuition with a user study. Lastly, we theoretically examine popular rank attribution algorithms, EXS and Rank-LIME, and evaluate their capacity to satisfy the classical Shapley axioms.

5/6/2024

From SHAP Scores to Feature Importance Scores

Olivier Letoffe, Xuanxiang Huang, Nicholas Asher, Joao Marques-Silva

A central goal of eXplainable Artificial Intelligence (XAI) is to assign relative importance to the features of a Machine Learning (ML) model given some prediction. The importance of this task of explainability by feature attribution is illustrated by the ubiquitous recent use of tools such as SHAP and LIME. Unfortunately, the exact computation of feature attributions, using the game-theoretical foundation underlying SHAP and LIME, can yield manifestly unsatisfactory results that are tantamount to reporting misleading relative feature importance. Recent work targeted rigorous feature attribution, by studying axiomatic aggregations of features based on logic-based definitions of explanations by feature selection. This paper shows that there is an essential relationship between feature attribution and a priori voting power, and that those recently proposed axiomatic aggregations represent a few instantiations of the range of power indices studied in the past. Furthermore, it remains unclear how some of the most widely used power indices might be exploited as feature importance scores (FISs), i.e. the use of power indices in XAI, and which of these indices would be the best suited for the purposes of XAI by feature attribution, namely in terms of not producing results that could be deemed as unsatisfactory. This paper proposes novel desirable properties that FISs should exhibit. In addition, the paper also proposes novel FISs exhibiting the proposed properties. Finally, the paper conducts a rigorous analysis of the best-known power indices in terms of the proposed properties.

5/21/2024

Confident Feature Ranking

Bitya Neuhof, Yuval Benjamini

Machine learning models are widely applied in various fields. Stakeholders often use post-hoc feature importance methods to better understand the input features' contribution to the models' predictions. The interpretation of the importance values provided by these methods is frequently based on the relative order of the features (their ranking) rather than the importance values themselves. Since the order may be unstable, we present a framework for quantifying the uncertainty in global importance values. We propose a novel method for the post-hoc interpretation of feature importance values that is based on the framework and pairwise comparisons of the feature importance values. This method produces simultaneous confidence intervals for the features' ranks, which include the "true" (infinite sample) ranks with high probability, and enables the selection of the set of the top-k important features.

4/19/2024