Feature importance to explain multimodal prediction models. A clinical use case

Read original: arXiv:2404.18631 - Published 4/30/2024 by Jorn-Jan van de Beld, Shreyasi Pathak, Jeroen Geerdink, Johannes H. Hegeman, Christin Seifert

Overview

This paper presents a multimodal deep learning model for predicting post-operative mortality in elderly hip fracture patients.
The model uses pre-operative (patient data, hip and chest images) and per-operative (vital signals, medications) data to predict the risk of mortality.
The authors use Shapley values to explain the model's predictions, allowing for interpretable and clinically applicable outcomes.

Plain English Explanation

Hip fractures are a serious health issue, especially for elderly patients. Surgery to treat these fractures can sometimes lead to complications that increase the risk of early death. An early warning system that identifies high-risk patients could help doctors monitor these patients more closely and address potential issues quickly, or inform the patient about their risks.

The researchers in this study developed a machine learning model that can predict the risk of a patient dying after hip fracture surgery. Their model uses different types of data, including the patient's medical history, images of the patient's hip and chest before surgery, and measurements of the patient's vital signs and medications during the surgery. The model combines these data sources in a single deep learning model.

Importantly, the researchers also developed a way to explain the model's predictions, using a technique called Shapley values. This allows doctors to understand why the model is making particular predictions, which is crucial for using the model in a real clinical setting. The researchers found that Shapley values can be used to estimate the relative importance of each data source in the model's predictions, both globally and for specific patients.

Technical Explanation

The researchers developed a multimodal deep learning model to predict post-operative mortality in elderly hip fracture patients. The model takes in both pre-operative data (static patient information, hip and chest X-ray images) and per-operative data (vital signs, medications) to make its predictions.

For the image data, the researchers used a pre-trained ResNet model to extract relevant features. The vital sign data was processed using an LSTM neural network. These different data streams were then combined using a multimodal fusion approach.
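As a rough illustration of the fusion step, the sketch below concatenates per-modality feature vectors and passes them through a logistic output layer. The summary does not say which fusion strategy the authors used, so concatenation, the function names `fuse` and `predict_risk`, the feature sizes, and the weights are all assumptions made for illustration. In the real model the feature vectors would come from trained encoders (ResNet for images, LSTM for vital signs) and the weights would be learned.

```python
import math

def fuse(features_by_modality):
    """Concatenate per-modality feature vectors in a fixed (sorted) order."""
    fused = []
    for name in sorted(features_by_modality):
        fused.extend(features_by_modality[name])
    return fused

def predict_risk(fused, weights, bias):
    """Logistic output layer: maps the fused vector to a probability."""
    z = bias + sum(w * x for w, x in zip(weights, fused))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical encoder outputs for one patient (values are made up).
features = {
    "static": [0.2, 1.0],       # tabular patient data
    "hip_image": [0.5, -0.3],   # would come from a ResNet encoder
    "chest_image": [0.1, 0.4],  # would come from a ResNet encoder
    "vitals": [0.7],            # would come from an LSTM encoder
    "medication": [0.0],
}

fused = fuse(features)
weights = [0.3] * len(fused)    # placeholder weights, not learned
risk = predict_risk(fused, weights, bias=-1.0)
```

Sorting the modality names keeps the position of each feature stable across patients, which matters later when attributions are mapped back to modalities.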

To make the model's predictions interpretable, the researchers computed Shapley values for each input feature. Shapley values estimate the relative contribution of each feature to the model's overall prediction. The researchers found that Shapley values could be used to explain the model's predictions both globally (across all patients) and locally (for individual patients).
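To make the idea concrete, here is a minimal sketch that computes exact Shapley values per modality by enumerating every coalition of modalities. It is not the authors' implementation: `toy_model`, the three modality names, the zero baseline, and the patient values are hypothetical, and the summary does not say which Shapley estimator the paper used. Exact enumeration is only practical for a handful of players. Taking the mean absolute Shapley value across patients as the global importance is one common convention, assumed here.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, players):
    """Exact Shapley values by enumerating every coalition of players."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes player p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {p}) - value_fn(s))
        phi[p] = total
    return phi

def toy_model(inp):
    # Hypothetical stand-in for the trained multimodal model.
    return (0.1 + 0.4 * inp["static"] + 0.2 * inp["vitals"]
            + 0.3 * inp["hip_image"] * inp["vitals"])

def explain(patient, baseline):
    """Local explanation: one Shapley value per modality for one patient."""
    def value_fn(coalition):
        # Modalities outside the coalition are replaced by their baseline.
        inp = {m: (patient[m] if m in coalition else baseline[m])
               for m in patient}
        return toy_model(inp)
    return shapley_values(value_fn, list(patient))

baseline = {"static": 0.0, "hip_image": 0.0, "vitals": 0.0}
patients = [
    {"static": 1.0, "hip_image": 1.0, "vitals": 1.0},
    {"static": 0.0, "hip_image": 1.0, "vitals": 1.0},
]

# Local: per-patient values. Global: mean absolute value across patients.
local = [explain(p, baseline) for p in patients]
global_importance = {
    m: sum(abs(phi[m]) for phi in local) / len(local) for m in baseline
}
```

For each patient the values sum to the difference between the model output for that patient and the output at the baseline, which is the efficiency property that makes Shapley values readable as a breakdown of a single prediction.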

Furthermore, the researchers developed a modified version of the chain rule to propagate Shapley values through the sequence of models in their multimodal architecture. This allowed them to provide interpretable explanations for the model's predictions at the local level.
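The summary does not spell out the authors' modification, so the sketch below shows only a generic proportional redistribution rule for a two-stage pipeline, not necessarily the paper's formulation. It assumes an upstream encoder that turns raw inputs into intermediate features, and a downstream model that turns those features into the prediction. Each intermediate feature's Shapley value for the prediction is split among the raw inputs in proportion to their Shapley values for that feature. The function name `propagate`, the feature names, and all numbers are hypothetical.

```python
def propagate(phi_downstream, psi_upstream, eps=1e-12):
    """Redistribute downstream attributions to raw inputs.

    phi_downstream: {feature j: Shapley value of j for the final prediction}
    psi_upstream:   {feature j: {raw input i: Shapley value of i for feature j}}
    """
    attributions = {}
    for j, phi_j in phi_downstream.items():
        shares = psi_upstream[j]
        denom = sum(shares.values())
        if abs(denom) < eps:
            # Upstream values cancel out, so there is no stable ratio to
            # split by; this sketch skips the feature (its mass is dropped).
            continue
        for i, psi_ij in shares.items():
            attributions[i] = attributions.get(i, 0.0) + phi_j * psi_ij / denom
    return attributions

# Hypothetical example: two encoder features computed from two vital signs.
phi = {"vitals_feat_0": 0.30, "vitals_feat_1": -0.10}
psi = {
    "vitals_feat_0": {"heart_rate": 0.6, "blood_pressure": 0.2},
    "vitals_feat_1": {"heart_rate": 0.1, "blood_pressure": 0.3},
}

raw_attr = propagate(phi, psi)
```

Whenever no feature is skipped, the raw-input attributions sum to the same total as the downstream Shapley values, so the explanation of the final prediction stays complete after it is pushed back through the encoder.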

Critical Analysis

The researchers acknowledge several limitations in their work. First, the study was conducted on a single-center dataset, so the generalizability of the results to other populations is unclear. Second, the model was trained on historical data, which may not reflect current clinical practices and outcomes.

Additionally, while the Shapley value approach provides interpretable explanations, it does not necessarily reveal the underlying causal relationships between the input features and the predicted outcome. Further research is needed to better understand the mechanisms driving the model's predictions.

Another potential concern is the ethical implications of using an AI system to predict mortality risk. While the model could help identify high-risk patients, there are concerns about the potential for bias, discrimination, and the psychological impact on patients. Careful consideration of these issues is crucial before deploying such a system in a clinical setting.

Conclusion

This study presents a promising approach for using multimodal deep learning to predict post-operative mortality risk in elderly hip fracture patients. The researchers' use of Shapley values to explain the model's predictions is a particularly notable contribution, as it enhances the model's clinical applicability and transparency.

However, further research is needed to validate the model's performance on more diverse datasets and to better understand the causal relationships underlying the predictions. Additionally, careful consideration of the ethical implications of such a system is crucial before it can be deployed in real-world clinical settings.

Overall, this work represents an important step towards developing explainable and clinically useful AI systems for healthcare applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Feature importance to explain multimodal prediction models. A clinical use case

Jorn-Jan van de Beld, Shreyasi Pathak, Jeroen Geerdink, Johannes H. Hegeman, Christin Seifert

Surgery to treat elderly hip fracture patients may cause complications that can lead to early mortality. An early warning system for complications could provoke clinicians to monitor high-risk patients more carefully and address potential complications early, or inform the patient. In this work, we develop a multimodal deep-learning model for post-operative mortality prediction using pre-operative and per-operative data from elderly hip fracture patients. Specifically, we include static patient data, hip and chest images before surgery in pre-operative data, vital signals, and medications administered during surgery in per-operative data. We extract features from image modalities using ResNet and from vital signals using LSTM. Explainable model outcomes are essential for clinical applicability, therefore we compute Shapley values to explain the predictions of our multimodal black box model. We find that i) Shapley values can be used to estimate the relative contribution of each modality both locally and globally, and ii) a modified version of the chain rule can be used to propagate Shapley values through a sequence of models supporting interpretable local explanations. Our findings imply that a multimodal combination of black box models can be explained by propagating Shapley values through the model sequence.

4/30/2024

FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival

Liangrui Pan, Yijun Peng, Yan Li, Yiyi Liang, Liwen Xu, Qingchun Liang, Shaoliang Peng

Integrating the different data modalities of cancer patients can significantly improve the predictive performance of patient survival. However, most existing methods ignore the simultaneous utilization of rich semantic features at different scales in pathology images. When collecting multimodal data and extracting features, there is a likelihood of encountering intra-modality missing data, introducing noise into the multimodal data. To address these challenges, this paper proposes a new end-to-end framework, FORESEE, for robustly predicting patient survival by mining multimodal information. Specifically, the cross-fusion transformer effectively utilizes features at the cellular level, tissue level, and tumor heterogeneity level to correlate prognosis through a cross-scale feature cross-fusion method. This enhances the ability of pathological image feature representation. Secondly, the hybrid attention encoder (HAE) uses the denoising contextual attention module to obtain the contextual relationship features and local detail features of the molecular data. HAE's channel attention module obtains global features of molecular data. Furthermore, to address the issue of missing information within modalities, we propose an asymmetrically masked triplet masked autoencoder to reconstruct lost information within modalities. Extensive experiments demonstrate the superiority of our method over state-of-the-art methods on four benchmark datasets in both complete and missing settings.

5/14/2024

Multimodal Explainability via Latent Shift applied to COVID-19 stratification

Valerio Guarrasi, Lorenzo Tronchin, Domenico Albano, Eliodoro Faiella, Deborah Fazzini, Domiziana Santucci, Paolo Soda

We are witnessing a widespread adoption of artificial intelligence in healthcare. However, most of the advancements in deep learning in this area consider only unimodal data, neglecting other modalities. Their multimodal interpretation is necessary for supporting diagnosis, prognosis and treatment decisions. In this work we present a deep architecture, which jointly learns modality reconstructions and sample classifications using tabular and imaging data. The explanation of the decision taken is computed by applying a latent shift that simulates a counterfactual prediction, revealing the features of each modality that contribute the most to the decision and a quantitative score indicating the modality importance. We validate our approach in the context of the COVID-19 pandemic using the AIforCOVID dataset, which contains multimodal data for the early identification of patients at risk of severe outcome. The results show that the proposed method provides meaningful explanations without degrading the classification performance.

7/23/2024

Automated Ensemble Multimodal Machine Learning for Healthcare

Fergus Imrie, Stefan Denner, Lucas S. Brunschwig, Klaus Maier-Hein, Mihaela van der Schaar

The application of machine learning in medicine and healthcare has led to the creation of numerous diagnostic and prognostic models. However, despite their success, current approaches generally issue predictions using data from a single modality. This stands in stark contrast with clinician decision-making which employs diverse information from multiple sources. While several multimodal machine learning approaches exist, significant challenges in developing multimodal systems remain that are hindering clinical adoption. In this paper, we introduce a multimodal framework, AutoPrognosis-M, that enables the integration of structured clinical (tabular) data and medical imaging using automated machine learning. AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies. In an illustrative application using a multimodal skin lesion dataset, we highlight the importance of multimodal machine learning and the power of combining multiple fusion strategies using ensemble learning. We have open-sourced our framework as a tool for the community and hope it will accelerate the uptake of multimodal machine learning in healthcare and spur further innovation.

7/26/2024