Interpretable Prediction and Feature Selection for Survival Analysis

2404.14689

Published 4/24/2024 by Mike Van Ness, Madeleine Udell

Abstract

Survival analysis is widely used as a technique to model time-to-event data when some data is censored, particularly in healthcare for predicting future patient risk. In such settings, survival models must be both accurate and interpretable so that users (such as doctors) can trust the model and understand model predictions. While most literature focuses on discrimination, interpretability is equally important. A successful interpretable model should be able to describe how changing each feature impacts the outcome, and should only use a small number of features. In this paper, we present DyS (pronounced "dice"), a new survival analysis model that achieves both strong discrimination and interpretability. DyS is a feature-sparse Generalized Additive Model, combining feature selection and interpretable prediction into one model. While DyS works well for all survival analysis problems, it is particularly useful for large (in $n$ and $p$) survival datasets such as those commonly found in observational healthcare studies. Empirical studies show that DyS competes with other state-of-the-art machine learning models for survival analysis, while being highly interpretable.

Overview

Survival analysis is a statistical method used to model time-to-event data, particularly in healthcare settings where some data may be censored (i.e., the event of interest has not occurred for certain individuals).
Interpretability is crucial for survival models, as users (e.g., doctors) need to understand how the model makes predictions and how changes in features affect the outcome.
This paper presents DyS, a new survival analysis model that aims to achieve both strong discrimination (the ability to rank individuals correctly by risk) and interpretability.

Plain English Explanation

Survival analysis is a way of studying how long it takes for an event to happen, such as a patient experiencing a particular health outcome. In healthcare, this is useful for predicting a patient's future risk. However, the models used for survival analysis need to be both accurate and easy to understand, so that doctors and other users can trust the predictions and see how different factors affect the outcome.

DyS is a new survival analysis model that tries to be both accurate and easy to understand. It's a type of Generalized Additive Model that selects only a few important features to use in its predictions. This means the model is simple and straightforward, so users can see how changes in things like a patient's age or symptoms affect their predicted health outcome.

DyS is particularly useful for large healthcare datasets, where there are a lot of different factors to consider. By focusing on the most important features, it can make accurate predictions without being overly complex.

Technical Explanation

DyS is a Generalized Additive Model (GAM) that combines feature selection and interpretable prediction into a single survival analysis model. GAMs are a type of machine learning model that can capture nonlinear relationships between features and the outcome, while still maintaining interpretability.
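
To make the additive structure concrete, here is a minimal sketch of how a GAM-style risk score decomposes into per-feature contributions. The feature names (`age`, `bmi`) and the shape functions are hypothetical and chosen only for illustration; they are not taken from the paper, and DyS learns its shape functions from data rather than hard-coding them.

```python
# Hypothetical shape functions, one per feature (illustrative only,
# not the functions learned by DyS).
def f_age(age):
    # Contribution grows nonlinearly once age exceeds 50.
    return 0.002 * max(age - 50.0, 0.0) ** 2

def f_bmi(bmi):
    # U-shaped contribution centered on a BMI of 25.
    return 0.01 * (bmi - 25.0) ** 2

SHAPE_FUNCTIONS = {"age": f_age, "bmi": f_bmi}

def contributions(row):
    """Return each feature's separate contribution to the risk score."""
    return {name: f(row[name]) for name, f in SHAPE_FUNCTIONS.items()}

def risk_score(row):
    """Additive risk score: the sum of the per-feature contributions."""
    return sum(contributions(row).values())

patient = {"age": 70.0, "bmi": 30.0}
print(contributions(patient))  # one number per feature
print(risk_score(patient))     # their sum
```

Because the score is a plain sum, each shape function can be plotted or inspected on its own, which is the sense in which a GAM stays interpretable while still capturing nonlinear effects.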

The key innovation of DyS is its feature-sparse formulation, which means it only uses a small number of features to make predictions. This is achieved through a custom feature selection process that identifies the most important predictors. As a result, DyS is able to provide detailed explanations of how changes in each feature impact the predicted survival outcome.
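
This summary does not describe how DyS performs its feature selection, so the sketch below substitutes a generic technique, group soft-thresholding (the proximal step used in group-lasso-style methods), purely to illustrate what "feature-sparse" means for an additive model. Each feature owns a group of weights (for example, the bin heights of its shape function); a feature is dropped when its whole group is shrunk to zero. The weights, feature names, and threshold `lam` are all hypothetical.

```python
import math

def group_soft_threshold(weights, lam):
    """Shrink a weight group toward zero; zero it entirely if its norm <= lam."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= lam:
        return [0.0] * len(weights)
    scale = 1.0 - lam / norm
    return [scale * w for w in weights]

# Hypothetical per-feature weight groups (not values from the paper).
weights = {
    "age": [0.9, -0.4, 1.2],
    "bmi": [0.05, -0.02, 0.03],
    "heart_rate": [0.5, 0.1, -0.6],
}

lam = 0.2  # assumed sparsity threshold
shrunk = {name: group_soft_threshold(w, lam) for name, w in weights.items()}

# A feature is "selected" only if some weight in its group survived.
selected = [name for name, w in shrunk.items() if any(v != 0.0 for v in w)]
print(selected)
```

In this toy setup the `bmi` group has a norm well below the threshold, so its shape function becomes identically zero and the feature disappears from the model, while the other two features are kept with slightly shrunken weights.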

Empirical studies show that DyS performs competitively with other state-of-the-art machine learning models for survival analysis, while being highly interpretable. This makes it particularly useful for large, complex healthcare datasets, where interpretability is crucial for building trust and enabling users to understand the model's predictions.

Critical Analysis

The paper provides a thorough evaluation of DyS, comparing its performance to other popular survival analysis models across a range of datasets. The results demonstrate that DyS can achieve strong discrimination (correct ranking of individuals by risk) while maintaining a high level of interpretability.
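
Discrimination in survival analysis is commonly measured with a concordance index. The summary does not say which metric the paper reports, so the following is only a sketch of Harrell's C-index, a standard choice, on made-up data. The function name and the toy values are assumptions for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] == 1) at a time strictly earlier than times[j].
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: the second subject is censored (event flag 0).
times = [2.0, 5.0, 3.0, 8.0]
events = [1, 0, 1, 1]
risks = [0.9, 0.5, 0.4, 0.1]
print(concordance_index(times, events, risks))
```

A value of 1.0 means every comparable pair is ordered correctly, and 0.5 corresponds to random ranking. Censored subjects still contribute as the later member of a pair, which is how the metric makes use of partially observed data.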

However, the paper does not address some potential limitations of the approach. For example, the feature selection process used by DyS may not be robust to highly correlated features, which are common in healthcare data. Additionally, the paper does not explore how DyS might perform on datasets with a larger number of features or more complex relationships between features and the outcome.

Further research could investigate ways to enhance the feature selection process, such as by incorporating additional regularization techniques or exploring alternative model architectures. It would also be valuable to test DyS on a wider range of healthcare datasets, including those with more challenging characteristics, to better understand its strengths and limitations.

Overall, DyS represents an interesting and promising approach to achieving both accuracy and interpretability in survival analysis, particularly for complex healthcare applications. However, as with any model, it's important to carefully consider its assumptions and limitations when applying it in real-world settings.

Conclusion

The DyS model presents a novel solution to the challenge of building accurate and interpretable survival analysis models, especially for large healthcare datasets. By combining feature selection and interpretable prediction into a single Generalized Additive Model, DyS is able to provide accurate predictions while also enabling users to understand how changes in different factors affect the predicted outcomes.

This type of interpretable model is crucial for building trust and enabling healthcare professionals to make informed decisions based on the model's predictions. The promising results demonstrated in this paper suggest that DyS could have significant practical applications in a wide range of healthcare settings, helping to improve patient outcomes and support clinical decision-making.

As the field of explainable AI continues to evolve, models like DyS that prioritize both accuracy and interpretability will likely become increasingly important, not just in healthcare but across many other domains as well. By advancing the state of the art in this area, this research contributes to the ongoing efforts to develop more transparent and trustworthy machine learning models that can be reliably deployed in high-stakes, real-world applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Deep Clustering Survival Machines with Interpretable Expert Distributions

Bojian Hou, Hongming Li, Zhicheng Jiao, Zhen Zhou, Hao Zheng, Yong Fan

Conventional survival analysis methods are typically ineffective at characterizing heterogeneity in the population, even though such information can be used to assist predictive modeling. In this study, we propose a hybrid survival analysis method, referred to as deep clustering survival machines, that combines the discriminative and generative mechanisms. Similar to the mixture models, we assume that the timing information of survival data is generatively described by a mixture of certain numbers of parametric distributions, i.e., expert distributions. We learn weights of the expert distributions for individual instances according to their features discriminatively such that each instance's survival information can be characterized by a weighted combination of the learned constant expert distributions. This method also facilitates interpretable subgrouping/clustering of all instances according to their associated expert distributions. Extensive experiments on both real and synthetic datasets have demonstrated that the method is capable of obtaining promising clustering results and competitive time-to-event predicting performance.

4/9/2024

cs.LG cs.AI

Optimal Sparse Survival Trees

Rui Zhang, Rui Xin, Margo Seltzer, Cynthia Rudin

Interpretability is crucial for doctors, hospitals, pharmaceutical companies and biotechnology corporations to analyze and make decisions for high stakes problems that involve human health. Tree-based methods have been widely adopted for survival analysis due to their appealing interpretability and their ability to capture complex relationships. However, most existing methods to produce survival trees rely on heuristic (or greedy) algorithms, which risk producing sub-optimal models. We present a dynamic-programming-with-bounds approach that finds provably-optimal sparse survival tree models, frequently in only a few seconds.

5/24/2024

cs.LG

Teaching Models To Survive: Proper Scoring Rule and Stochastic Optimization with Competing Risks

Julie Alberge (SODA), Vincent Maladière (SODA), Olivier Grisel (SODA), Judith Abécassis (SODA), Gael Varoquaux (SODA)

When data are right-censored, i.e. some outcomes are missing due to a limited period of observation, survival analysis can compute the time to event. Multiple classes of outcomes lead to a classification variant: predicting the most likely event, known as competing risks, which has been less studied. To build a loss that estimates outcome probabilities for such settings, we introduce a strictly proper censoring-adjusted separable scoring rule that can be optimized on a subpart of the data because the evaluation is made independently of observations. It enables stochastic optimization for competing risks which we use to train gradient boosting trees. Compared to 11 state-of-the-art models, this model, MultiIncidence, performs best in estimating the probability of outcomes in survival and competing risks. It can predict at any time horizon and is much faster than existing alternatives.

6/21/2024

cs.AI

A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data

Lukas Burk, John Zobolas, Bernd Bischl, Andreas Bender, Marvin N. Wright, Raphael Sonabend

This work presents the first large-scale neutral benchmark experiment focused on single-event, right-censored, low-dimensional survival data. Benchmark experiments are essential in methodological research to scientifically compare new and existing model classes through proper empirical evaluation. Existing benchmarks in the survival literature are often narrow in scope, focusing, for example, on high-dimensional data. Additionally, they may lack appropriate tuning or evaluation procedures, or are qualitative reviews, rather than quantitative comparisons. This comprehensive study aims to fill the gap by neutrally evaluating a broad range of methods and providing generalizable conclusions. We benchmark 18 models, ranging from classical statistical approaches to many common machine learning methods, on 32 publicly available datasets. The benchmark tunes for both a discrimination measure and a proper scoring rule to assess performance in different settings. Evaluating on 8 survival metrics, we assess discrimination, calibration, and overall predictive performance of the tested models. Using discrimination measures, we find that no method significantly outperforms the Cox model. However, (tuned) Accelerated Failure Time models were able to achieve significantly better results with respect to overall predictive performance as measured by the right-censored log-likelihood. Machine learning methods that performed comparably well include Oblique Random Survival Forests under discrimination, and Cox-based likelihood-boosting under overall predictive performance. We conclude that for predictive purposes in the standard survival analysis setting of low-dimensional, right-censored data, the Cox Proportional Hazards model remains a simple and robust method, sufficient for practitioners.

6/7/2024

stat.ML cs.LG