The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance

Read original: arXiv:2309.13775 - Published 4/3/2024 by Jon Donnelly, Srikar Katta, Cynthia Rudin, Edward P. Browne

Overview

Quantifying the importance of variables is crucial in fields like genetics, public policy, and medicine.
Existing methods calculate variable importance for a single model trained on a given dataset.
However, there may be many equally valid models that explain the data, leading to conflicting conclusions.
Even insights drawn from every good model may not generalize, because not all good explanations are stable under reasonable perturbations of the data.
The paper proposes a new framework that quantifies variable importance across the set of all good models and is stable across the data distribution.

Plain English Explanation

Imagine you're a doctor trying to understand which factors most influence a patient's health. You might build a model that uses various patient characteristics, like age, weight, and blood pressure, to predict their risk of a certain illness. The importance of each factor in your model would tell you which ones are the most influential.

However, there could be many different models that all fit the data equally well. If different researchers build these models, they might reach very different conclusions about which factors are most important. This can be problematic, especially for high-stakes decisions in fields like medicine or policy.

The researchers in this paper propose a new way to address this issue. Instead of just looking at a single model, their framework considers all the good models that could explain the data. It then quantifies the importance of each variable by looking at how important it is across this whole set of models. This makes the variable importance more robust and less dependent on the specific choices made by the researcher.

The framework is also designed to be stable - the variable importance insights should hold up even if the data is slightly changed. This is important because the real-world data we use to train models is often messy and can vary over time.

By taking this broader perspective and ensuring stability, the researchers hope their framework can provide more reliable and actionable insights for high-impact applications.

Technical Explanation

The key innovation in this paper is a variable importance framework, the Rashomon Importance Distribution (RID), that accounts for all the good models that could explain a given dataset rather than just a single model. This set of near-optimal models is commonly called the "Rashomon set".
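For concreteness, a Rashomon set is usually written along the following lines. The notation is an assumption made for this sketch and is not taken from the summary: \mathcal{F} is the model class, \ell is a (possibly regularized) loss, \mathcal{D} is the dataset, and \varepsilon \ge 0 is the tolerance that decides how close to the best loss a model must be to count as "good".

```latex
\mathcal{R}(\varepsilon)
  = \left\{ f \in \mathcal{F} \;:\;
      \ell(f, \mathcal{D}) \le \min_{f' \in \mathcal{F}} \ell(f', \mathcal{D}) + \varepsilon
    \right\}
```

Setting \varepsilon = 0 recovers only the loss minimizers, while larger values admit more models and, typically, more disagreement about which variables matter.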

To quantify variable importance over the Rashomon set, the framework takes a distribution-based approach. Rather than reporting one score per variable, it produces a distribution of importance scores that reflects how important each variable is across the whole set of good models. The distribution is designed to be stable: it should remain consistent under reasonable perturbations of the dataset.
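The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation (their code is linked in the abstract below). The model class (one decision stump per feature), the synthetic data, the all-pairs switch importance, the use of bootstrap resamples as the data perturbation, and every function name here are hypothetical choices made for the example.

```python
import random

# Toy data (an assumption for illustration): y equals x0, x1 is a noisy copy
# of x0 (flipped in 10% of rows), and x2 is unrelated to y.
def make_data(n=60):
    rows = []
    for i in range(n):
        x0 = i % 2
        x1 = 1 - x0 if i % 10 == 0 else x0
        x2 = (i // 2) % 2
        rows.append(((x0, x1, x2), x0))
    return rows

# Hypothetical model class: one decision stump per feature, f_j(x) = x[j].
MODELS = [lambda x, j=j: x[j] for j in range(3)]

def loss(model, data):
    """0-1 loss of `model` on `data`."""
    return sum(model(x) != y for x, y in data) / len(data)

def importance(model, data, j):
    """Loss increase when feature j is swapped in from every other row."""
    n = len(data)
    switched = 0
    for i, (x, y) in enumerate(data):
        for k, (x_other, _) in enumerate(data):
            if i != k:
                x_new = x[:j] + (x_other[j],) + x[j + 1:]
                switched += model(x_new) != y
    return switched / (n * (n - 1)) - loss(model, data)

def importance_distribution(data, eps=0.1, n_boot=25, seed=0):
    """Weighted importance samples per feature, pooled over bootstraps."""
    rng = random.Random(seed)
    samples = {j: [] for j in range(3)}
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]  # resample rows with replacement
        losses = [loss(m, boot) for m in MODELS]
        best = min(losses)
        # Rashomon set for this resample: models within eps of the best loss.
        rashomon = [m for m, m_loss in zip(MODELS, losses) if m_loss <= best + eps]
        weight = 1.0 / (n_boot * len(rashomon))
        for m in rashomon:
            for j in samples:
                samples[j].append((importance(m, boot, j), weight))
    return samples

def cdf(samples_j, k):
    """Share of (resample, model) mass whose importance is at most k."""
    return sum(w for v, w in samples_j if v <= k)

dist = importance_distribution(make_data())
for j in range(3):
    print(j, round(cdf(dist[j], 0.0), 2), round(cdf(dist[j], 1.0), 2))
```

Each resample contributes equal total weight, split evenly among the models in its Rashomon set, so `cdf` reports an averaged proportion of good models rather than a single model's score. In this toy setup the correlated copy `x1` enters the Rashomon set in some resamples and not others, which is the kind of instability a single-model, single-dataset analysis would hide.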

The authors provide theoretical guarantees on the consistency and finite-sample error rates of their estimator. They also show through experiments that the framework recovers variable importance rankings in complex simulation setups where other methods fail, and that it accurately estimates the true importance of a variable for the underlying data distribution.

Finally, the researchers apply their framework to a real-world case study exploring which genes are important for predicting HIV load in persons with HIV. Their analysis highlights an important gene that has not previously been studied in connection with HIV, illustrating the potential of the approach to surface novel findings.

Critical Analysis

A key strength of this framework is its flexibility: it can be integrated with most existing model classes and global variable importance metrics. This makes it applicable across many research domains.

However, the framework does rely on being able to identify the full set of good models for a given dataset. In practice, this set may be very large or difficult to enumerate, limiting the feasibility of the approach. The authors acknowledge this as an area for future research.

Additionally, while the framework is designed to be stable across data perturbations, it's unclear how it would perform in the face of more substantial dataset shifts or changes in the underlying data distribution. Further testing in more realistic, dynamic environments would help assess the true robustness of the approach.

Overall, this paper presents a promising new direction for quantifying variable importance in a more comprehensive and reliable way. With further development and real-world validation, the framework could become a valuable tool for high-stakes applications where confidence in variable importance is crucial.

Conclusion

This research tackles the challenge of quantifying variable importance in a way that accounts for the full set of good explanatory models, rather than just a single model. By taking this broader perspective and requiring stability across the data distribution, the proposed framework aims to provide more robust and trustworthy insights for high-impact fields.

The theoretical guarantees and experimental results are encouraging, suggesting this approach could lead to important discoveries and better-informed decisions in areas like genetics, public policy, and medicine. While there are still some practical limitations to address, this work represents an important step forward in making variable importance analysis more reliable and actionable.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance

Jon Donnelly, Srikar Katta, Cynthia Rudin, Edward P. Browne

Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a given dataset, there may be many models that explain the target outcome equally well; without accounting for all possible explanations, different researchers may arrive at many conflicting yet equally valid conclusions given the same data. Additionally, even when accounting for all possible explanations for a given dataset, these insights may not generalize because not all good explanations are stable across reasonable data perturbations. We propose a new variable importance framework that quantifies the importance of a variable across the set of all good models and is stable across the data distribution. Our framework is extremely flexible and can be integrated with most existing model classes and global variable importance metrics. We demonstrate through experiments that our framework recovers variable importance rankings for complex simulation setups where other methods fail. Further, we show that our framework accurately estimates the true importance of a variable for the underlying data distribution. We provide theoretical guarantees on the consistency and finite sample error rates for our estimator. Finally, we demonstrate its utility with a real-world case study exploring which genes are important for predicting HIV load in persons with HIV, highlighting an important gene that has not previously been studied in connection with HIV. Code is available at https://github.com/jdonnelly36/Rashomon_Importance_Distribution.

4/3/2024

Model-agnostic variable importance for predictive uncertainty: an entropy-based approach

Danny Wood, Theodore Papamarkou, Matt Benatan, Richard Allmendinger

In order to trust the predictions of a machine learning algorithm, it is necessary to understand the factors that contribute to those predictions. In the case of probabilistic and uncertainty-aware models, it is necessary to understand not only the reasons for the predictions themselves, but also the reasons for the model's level of confidence in those predictions. In this paper, we show how existing methods in explainability can be extended to uncertainty-aware models and how such extensions can be used to understand the sources of uncertainty in a model's predictive distribution. In particular, by adapting permutation feature importance, partial dependence plots, and individual conditional expectation plots, we demonstrate that novel insights into model behaviour may be obtained and that these methods can be used to measure the impact of features on both the entropy of the predictive distribution and the log-likelihood of the ground truth labels under that distribution. With experiments using both synthetic and real-world data, we demonstrate the utility of these approaches to understand both the sources of uncertainty and their impact on model performance.

8/19/2024

Amazing Things Come From Having Many Good Models

Cynthia Rudin, Chudi Zhong, Lesia Semenova, Margo Seltzer, Ronald Parr, Jiachang Liu, Srikar Katta, Jon Donnelly, Harry Chen, Zachery Boner

The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon happens for many real datasets and when it does, it sparks both magic and consternation, but mostly magic. In light of the Rashomon Effect, this perspective piece proposes reshaping the way we think about machine learning, particularly for tabular data problems in the nondeterministic (noisy) setting. We address how the Rashomon Effect impacts (1) the existence of simple-yet-accurate models, (2) flexibility to address user preferences, such as fairness and monotonicity, without losing performance, (3) uncertainty in predictions, fairness, and explanations, (4) reliable variable importance, (5) algorithm choice, specifically, providing advanced knowledge of which algorithms might be suitable for a given problem, and (6) public policy. We also discuss a theory of when the Rashomon Effect occurs and why. Our goal is to illustrate how the Rashomon Effect can have a massive impact on the use of machine learning for complex problems in society.

7/11/2024

Model-independent variable selection via the rule-based variable priorit

Min Lu, Hemant Ishwaran

While achieving high prediction accuracy is a fundamental goal in machine learning, an equally important task is finding a small number of features with high explanatory power. One popular selection technique is permutation importance, which assesses a variable's impact by measuring the change in prediction error after permuting the variable. However, this can be problematic due to the need to create artificial data, a problem shared by other methods as well. Another problem is that variable selection methods can be limited by being model-specific. We introduce a new model-independent approach, Variable Priority (VarPro), which works by utilizing rules without the need to generate artificial data or evaluate prediction error. The method is relatively easy to use, requiring only the calculation of sample averages of simple statistics, and can be applied to many data settings, including regression, classification, and survival. We investigate the asymptotic properties of VarPro and show, among other things, that VarPro has a consistent filtering property for noise variables. Empirical studies using synthetic and real-world data show the method achieves a balanced performance and compares favorably to many state-of-the-art procedures currently used for variable selection.

9/17/2024