An interpretable imbalanced semi-supervised deep learning framework for improving differential diagnosis of skin diseases

Read original: arXiv:2211.10858 - Published 6/11/2024 by Futian Weng, Yuanting Ma, Jinghan Sun, Shijun Shan, Qiyuan Li, Jianping Zhu, Yang Wang, Yan Xu

Overview

This paper studies interpretability and imbalanced semi-supervised learning in a multiclass intelligent skin diagnosis framework (ISDL), using 58,457 skin images of which 10,857 are unlabeled.
The framework aims to address the challenges of dermatological disease classification, including class imbalance and the need for interpretable AI models.
The researchers combine pseudo-labeling techniques and a Shapley Additive Explanation (SHAP) method to improve the performance and interpretability of their ISDL model.

Plain English Explanation

Skin diseases are very common worldwide. This study looked at a new AI-based skin diagnosis framework called ISDL that can classify different types of skin diseases. The researchers used a large dataset of over 58,000 skin images, including 10,857 unlabeled samples.

To address the problem of imbalanced data (where some skin disease classes have many more samples than others), the ISDL framework uses a technique called pseudo-labeling. The model makes educated guesses about the labels of the unlabeled samples and then favors guesses that fall into the rarer disease classes when adding them to the training data, which helps even out the representation of different skin disease classes.

The researchers found that the ISDL framework performed well, reaching an accuracy of 0.979, a sensitivity of 0.975, and a specificity of 0.973 in classifying skin diseases. They also used a method called SHAP to explain how the AI model makes its predictions, and the resulting explanations were consistent with how doctors diagnose skin conditions.

Additionally, the researchers proposed a strategy called ISDLplus that optimizes how pseudo-labeled samples are selected. This could help make the skin disease classification more reliable and fair.

Overall, this research shows promise in using AI to assist with skin disease diagnosis, which could help relieve the burden on doctors, especially in areas with a shortage of dermatological expertise.

Technical Explanation

The researchers presented the intelligent skin diagnosis framework (ISDL), a deep learning model for multiclass skin disease classification. They trained and evaluated it on a dataset of 58,457 skin images, 10,857 of which were unlabeled.

To address the class imbalance problem, the ISDL framework takes a semi-supervised approach based on pseudo-labeling: the model predicts labels for the unlabeled samples and feeds pseudo-labeled samples back into training. During class-rebalancing self-training, pseudo-labeled samples from minority classes have a higher probability of being selected at each iteration, so the unlabeled data is used preferentially to strengthen the underrepresented skin disease classes.
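The selection step of class-rebalancing self-training can be sketched as follows. This is a minimal illustration, not the paper's implementation: the keep-rate rule follows the general CReST-style idea of mirroring class frequencies, and the confidence threshold, the `alpha` exponent, the function names, and the class names are all assumptions made for the example.

```python
import random

def rebalanced_keep_rates(labeled_counts, alpha=1.0):
    """Per-class keep rate for pseudo-labels: rarer classes get higher rates.

    Assumed rule (CReST-style, not confirmed as the paper's formula):
    rank classes by labeled frequency; the class at rank r gets the rate
    (count of the class at the mirrored rank / largest count) ** alpha.
    """
    ranked = sorted(labeled_counts, key=labeled_counts.get, reverse=True)
    largest = labeled_counts[ranked[0]]
    rates = {}
    for rank, cls in enumerate(ranked):
        mirrored = ranked[len(ranked) - 1 - rank]
        rates[cls] = (labeled_counts[mirrored] / largest) ** alpha
    return rates

def select_pseudo_labels(predictions, labeled_counts,
                         threshold=0.9, alpha=1.0, seed=0):
    """Pick pseudo-labeled samples to add to the training set.

    predictions: list of (sample_id, predicted_class, confidence).
    A sample is kept if it passes the (assumed) confidence threshold and
    then survives a random draw against its class's keep rate.
    """
    rng = random.Random(seed)
    rates = rebalanced_keep_rates(labeled_counts, alpha)
    selected = []
    for sample_id, cls, confidence in predictions:
        if confidence >= threshold and rng.random() < rates[cls]:
            selected.append((sample_id, cls))
    return selected

# Hypothetical labeled-set sizes, for illustration only.
counts = {"eczema": 1000, "psoriasis": 100, "melanoma": 10}
print(rebalanced_keep_rates(counts))
```

With these illustrative counts, the rarest class is always kept when its prediction passes the threshold, while the most frequent class is kept only about 1% of the time, which is what shifts the pseudo-labeled pool toward minority classes.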

The researchers evaluated the ISDL model's performance using various metrics, including accuracy, sensitivity, specificity, macro-F1 score, and area under the receiver operating characteristic curve (AUC). The ISDL model achieved impressive results, with an accuracy of 0.979, sensitivity of 0.975, specificity of 0.973, macro-F1 score of 0.974, and AUC of 0.999 for multi-label skin disease classification.
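For readers unfamiliar with these metrics, the sketch below computes accuracy, sensitivity, specificity, and macro-F1 from a multiclass confusion matrix. It assumes sensitivity and specificity are macro-averaged over classes in a one-vs-rest fashion; the source does not state which averaging the authors used, and the function name is hypothetical.

```python
def macro_metrics(confusion):
    """Compute summary metrics from a square confusion matrix.

    confusion[i][j] = number of samples with true class i predicted as j.
    Sensitivity, specificity, and F1 are computed one-vs-rest per class
    and then averaged with equal class weight (macro averaging, assumed).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    sens, spec, f1 = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        sens.append(recall)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        denom = precision + recall
        f1.append(2 * precision * recall / denom if denom else 0.0)
    return {
        "accuracy": correct / total,
        "sensitivity": sum(sens) / n,
        "specificity": sum(spec) / n,
        "macro_f1": sum(f1) / n,
    }

# Toy two-class confusion matrix, for illustration only.
print(macro_metrics([[8, 2], [1, 9]]))
```

Macro averaging gives every class equal weight regardless of size, which is why macro-F1 is a common choice for reporting results on imbalanced datasets.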

To improve the interpretability of the ISDL model, the researchers combined it with the Shapley Additive Explanation (SHAP) method. SHAP helps explain how the deep learning model makes its predictions, and the researchers found that the SHAP-based explanations are consistent with the clinical diagnosis of skin diseases.
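To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a tiny toy model by enumerating every feature subset. This is only an illustration of the quantity that SHAP estimates: it is not the paper's pipeline and does not use the SHAP library, and the toy scoring function and its three features are hypothetical. For image models, the "features" would typically be pixels or image regions, and practical tools approximate the values rather than enumerate subsets.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley value of each feature for a coalition value function.

    value_fn takes a frozenset of feature indices and returns the model
    output when only those features are present. The cost is exponential
    in n_features, so this is only practical for toy examples.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

def toy_score(present):
    """Hypothetical model: two additive features plus one interaction."""
    score = 0.0
    if 0 in present:
        score += 2.0
    if 1 in present:
        score += 1.0
    if 0 in present and 2 in present:
        score += 3.0
    return score

print(shapley_values(toy_score, 3))
```

The attributions sum to the difference between the full model output and the empty-coalition output, and the interaction term is shared equally between the two features that produce it. That additivity is what lets a per-feature attribution map be read as a decomposition of a single prediction.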

Furthermore, the researchers proposed a sampling distribution optimization strategy called ISDLplus to select the pseudo-labeled samples in a more effective manner, potentially improving the overall reliability and fairness of the skin disease classification.

Critical Analysis

The study presents a comprehensive approach to addressing the challenges of dermatological disease classification, such as class imbalance and the need for interpretable AI models. The researchers' use of pseudo-labeling techniques and the SHAP method to improve the performance and interpretability of the ISDL model is a promising step forward.

However, the paper does not discuss the potential limitations or caveats of the ISDL framework. For example, it would be helpful to know how the model performs on rare or unusual skin conditions, or how it handles the diversity of skin types and ethnicities in the dataset. Additionally, the researchers could have explored the potential biases in the dataset and how they might impact the model's performance.

While the proposed ISDLplus strategy for optimizing the pseudo-labeling process is interesting, the paper does not provide a detailed evaluation of its effectiveness compared to the original ISDL framework. It would be valuable to see a more thorough comparison and analysis of the trade-offs between the two approaches.

Overall, the study presents a solid foundation for using AI in dermatological disease diagnosis, but there is still room for further research and refinement to ensure the reliability, fairness, and real-world applicability of such systems.

Conclusion

This paper introduces the intelligent skin diagnosis framework (ISDL), a deep learning model for multiclass skin disease classification. The researchers addressed the challenges of class imbalance and interpretability by using pseudo-labeling techniques and the Shapley Additive Explanation (SHAP) method to improve the model's performance and transparency.

The ISDL framework demonstrated impressive results, achieving high accuracy, sensitivity, and specificity in classifying a diverse set of skin diseases. The researchers also proposed the ISDLplus strategy to further optimize the pseudo-labeling process, which could enhance the reliability and fairness of the skin disease classification.

This research has the potential to significantly impact the field of dermatology, as the ISDL framework could help relieve the burden on medical professionals and improve access to skin disease diagnosis, especially in areas with a shortage of dermatological expertise. However, further research is needed to address the potential limitations and explore the real-world applicability of such AI-based skin disease diagnosis systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

An interpretable imbalanced semi-supervised deep learning framework for improving differential diagnosis of skin diseases

Futian Weng, Yuanting Ma, Jinghan Sun, Shijun Shan, Qiyuan Li, Jianping Zhu, Yang Wang, Yan Xu

Dermatological diseases are among the most common disorders worldwide. This paper presents the first study of the interpretability and imbalanced semi-supervised learning of the multiclass intelligent skin diagnosis framework (ISDL) using 58,457 skin images with 10,857 unlabeled samples. Pseudo-labelled samples from minority classes have a higher probability at each iteration of class-rebalancing self-training, thereby promoting the utilization of unlabeled samples to solve the class imbalance problem. Our ISDL achieved a promising performance with an accuracy of 0.979, sensitivity of 0.975, specificity of 0.973, macro-F1 score of 0.974 and area under the receiver operating characteristic curve (AUC) of 0.999 for multi-label skin disease classification. The Shapley Additive explanation (SHAP) method is combined with our ISDL to explain how the deep learning model makes predictions. This finding is consistent with the clinical diagnosis. We also proposed a sampling distribution optimisation strategy to select pseudo-labelled samples in a more effective manner using ISDLplus. Furthermore, it has the potential to relieve the pressure placed on professional doctors, as well as help with practical issues associated with a shortage of such doctors in rural areas.

6/11/2024

Enhancing Skin Disease Classification Leveraging Transformer-based Deep Learning Architectures and Explainable AI

Jayanth Mohan, Arrun Sivasubramanian, V Sowmya, Ravi Vinayakumar

Skin diseases affect over a third of the global population, yet their impact is often underestimated. Automating skin disease classification to assist doctors with their prognosis might be difficult. Nevertheless, due to efficient feature extraction pipelines, deep learning techniques have shown much promise for various tasks, including dermatological disease identification. This study uses a skin disease dataset with 31 classes and compares it with all versions of Vision Transformers, Swin Transformers and DinoV2. The analysis is also extended to compare with benchmark convolution-based architecture presented in the literature. Transfer learning with ImageNet1k weights on the skin disease dataset contributes to a high test accuracy of 96.48% and an F1-Score of 0.9727 using DinoV2, which is almost a 10% improvement over this data's current benchmark results. The performance of DinoV2 was also compared for the HAM10000 and Dermnet datasets to test the model's robustness, and the trained model overcomes the benchmark results by a slight margin in test accuracy and in F1-Score on the 23 and 7 class datasets. The results are substantiated using explainable AI frameworks like GradCAM and SHAP, which provide precise image locations to map the disease, assisting dermatologists in early detection, prompt prognosis, and treatment.

7/23/2024

Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM Empowerment

Xin Hu, Janet Wang, Jihun Hamm, Rie R Yotsu, Zhengming Ding

Current AI-assisted skin image diagnosis has achieved dermatologist-level performance in classifying skin cancer, driven by rapid advancements in deep learning architectures. However, unlike traditional vision tasks, skin images in general present unique challenges due to the limited availability of well-annotated datasets, complex variations in conditions, and the necessity for detailed interpretations to ensure patient safety. Previous segmentation methods have sought to reduce image noise and enhance diagnostic performance, but these techniques require fine-grained, pixel-level ground truth masks for training. In contrast, with the rise of foundation models, the Segment Anything Model (SAM) has been introduced to facilitate promptable segmentation, enabling the automation of the segmentation process with simple yet effective prompts. Efforts applying SAM predominantly focus on dermatoscopy images, which present more easily identifiable lesion boundaries than clinical photos taken with smartphones. This limitation constrains the practicality of these approaches to real-world applications. To overcome the challenges posed by noisy clinical photos acquired via non-standardized protocols and to improve diagnostic accessibility, we propose a novel Cross-Attentive Fusion framework for interpretable skin lesion diagnosis. Our method leverages SAM to generate visual concepts for skin diseases using prompts, integrating local visual concepts with global image features to enhance model performance. Extensive evaluation on two skin disease datasets demonstrates our proposed method's effectiveness on lesion diagnosis and interpretability.

9/17/2024

Semi-Supervised Disease Classification based on Limited Medical Image Data

Yan Zhang, Chun Li, Zhaoxia Liu, Ming Li

In recent years, significant progress has been made in the field of learning from positive and unlabeled examples (PU learning), particularly in the context of advancing image and text classification tasks. However, applying PU learning to semi-supervised disease classification remains a formidable challenge, primarily due to the limited availability of labeled medical images. In the realm of medical image-aided diagnosis algorithms, numerous theoretical and practical obstacles persist. The research on PU learning for medical image-assisted diagnosis holds substantial importance, as it aims to reduce the time spent by professional experts in classifying images. Unlike natural images, medical images are typically accompanied by a scarcity of annotated data, while an abundance of unlabeled cases exists. Addressing these challenges, this paper introduces a novel generative model inspired by Holder divergence, specifically designed for semi-supervised disease classification using positive and unlabeled medical image data. In this paper, we present a comprehensive formulation of the problem and establish its theoretical feasibility through rigorous mathematical analysis. To evaluate the effectiveness of our proposed approach, we conduct extensive experiments on five benchmark datasets commonly used in PU medical learning: BreastMNIST, PneumoniaMNIST, BloodMNIST, OCTMNIST, and AMD. The experimental results clearly demonstrate the superiority of our method over existing approaches based on KL divergence. Notably, our approach achieves state-of-the-art performance on all five disease classification benchmarks. By addressing the limitations imposed by limited labeled data and harnessing the untapped potential of unlabeled medical images, our novel generative model presents a promising direction for enhancing semi-supervised disease classification in the field of medical image analysis.

5/8/2024