Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus Images

Read original: arXiv:2405.07338 - Published 5/14/2024 by Fatema Tuj Johora Faria, Mukaffi Bin Moin, Pronay Debnath, Asif Iftekher Fahim, Faisal Muhammad Shah

🧠

Overview

This research focuses on using deep learning models to analyze retinal blood vessels in fundus images for early disease diagnosis.
The study explores the use of various pre-trained convolutional neural network (CNN) models and Explainable AI techniques to improve the interpretability and accuracy of fundus image analysis.
The researchers investigate the performance of 10 different models, including TransUNet, Attention U-Net, and [Swin-UNET], for both classification and segmentation tasks.
The results show that the ResNet101 model achieved the highest accuracy for fundus image classification, while Swin-UNET demonstrated the best performance in segmentation tasks.

Plain English Explanation

The researchers are working on a way to help doctors diagnose diseases early by looking at the blood vessels in the back of the eye, called the retina. They are using deep learning, which is a type of artificial intelligence, to analyze images of the retina and identify patterns that could indicate the presence of a disease.

The researchers tried out different deep learning models, including some that were pre-trained on other types of images, to see which ones work best for analyzing retinal images. They also used special techniques to help explain how the models are making their decisions, which can make the results more trustworthy.

The researchers found that the ResNet101 model was the most accurate at classifying retinal images, correctly identifying the presence or absence of disease 94% of the time. Another model, called Swin-UNET, was particularly good at accurately outlining the different parts of the retina in the images.

By developing these advanced deep learning techniques for analyzing retinal images, the researchers hope to provide doctors with a powerful tool for catching diseases like diabetes or macular degeneration early, when they are often more treatable.

Technical Explanation

The researchers utilized eight pre-trained CNN models, including ResNet50V2, ResNet101V2, ResNet152V2, and DenseNet121, for fundus image classification. To enhance the interpretability of the models' decisions, the researchers utilized Explainable AI techniques such as Grad-CAM, Grad-CAM++, Score-CAM, Faster Score-CAM, and Layer CAM.

Expanding their exploration, the researchers investigated 10 models in total, including TransUNet with ResNet backbones, Attention U-Net with DenseNet and ResNet backbones, and Swin-UNET. This comprehensive study aimed to gain deeper insights into the effectiveness of attention mechanisms for fundus image analysis.

The results showed that the ResNet101 model achieved the highest accuracy of 94.17% for fundus image classification, while EfficientNetB0 exhibited the lowest accuracy of 88.33%. For fundus image segmentation, Swin-UNET demonstrated a Mean Pixel Accuracy of 86.19%, outperforming the Attention U-Net with DenseNet201 backbone, which achieved a score of 75.87%.

Critical Analysis

The research provides valuable insights into the application of deep learning models for early disease diagnosis using fundus images. The use of Explainable AI techniques, such as Grad-CAM and Score-CAM, to enhance the interpretability of the models' decisions is a commendable approach that can foster trust in the predictions.

However, the paper does not address the potential limitations of the dataset used for training and evaluating the models. The diversity and representativeness of the fundus images, as well as the inclusion of diverse disease conditions, could impact the generalizability of the findings.

Additionally, the paper does not compare the performance of the deep learning models with the diagnostic accuracy of human experts, which would provide valuable context for assessing the clinical relevance of the proposed approach.

Further research could explore the integration of the developed models into real-world clinical workflows, addressing practical challenges such as data privacy, regulatory requirements, and user-friendly deployment.

Conclusion

This research demonstrates the potential of deep learning models, combined with Explainable AI techniques, for the early diagnosis of diseases through the analysis of retinal blood vessels in fundus images. The study's findings highlight the ResNet101 model as the most accurate for fundus image classification and the Swin-UNET model for effective segmentation of retinal structures.

By advancing the field of fundus image analysis, this research could contribute to the development of more accurate and interpretable diagnostic tools, ultimately leading to earlier detection and improved patient outcomes for conditions affecting the retina and the broader cardiovascular system.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

🧠

Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus Images

Fatema Tuj Johora Faria, Mukaffi Bin Moin, Pronay Debnath, Asif Iftekher Fahim, Faisal Muhammad Shah

Our research focuses on the critical field of early diagnosis of disease by examining retinal blood vessels in fundus images. While automatic segmentation of retinal blood vessels holds promise for early detection, accurate analysis remains challenging due to the limitations of existing methods, which often lack discrimination power and are susceptible to influences from pathological regions. Our research in fundus image analysis advances deep learning-based classification using eight pre-trained CNN models. To enhance interpretability, we utilize Explainable AI techniques such as Grad-CAM, Grad-CAM++, Score-CAM, Faster Score-CAM, and Layer CAM. These techniques illuminate the decision-making processes of the models, fostering transparency and trust in their predictions. Expanding our exploration, we investigate ten models, including TransUNet with ResNet backbones, Attention U-Net with DenseNet and ResNet backbones, and Swin-UNET. Incorporating diverse architectures such as ResNet50V2, ResNet101V2, ResNet152V2, and DenseNet121 among others, this comprehensive study deepens our insights into attention mechanisms for enhanced fundus image analysis. Among the evaluated models for fundus image classification, ResNet101 emerged with the highest accuracy, achieving an impressive 94.17%. On the other end of the spectrum, EfficientNetB0 exhibited the lowest accuracy among the models, achieving a score of 88.33%. Furthermore, in the domain of fundus image segmentation, Swin-Unet demonstrated a Mean Pixel Accuracy of 86.19%, showcasing its effectiveness in accurately delineating regions of interest within fundus images. Conversely, Attention U-Net with DenseNet201 backbone exhibited the lowest Mean Pixel Accuracy among the evaluated models, achieving a score of 75.87%.

5/14/2024

Benchmarking Retinal Blood Vessel Segmentation Models for Cross-Dataset and Cross-Disease Generalization

Jeremiah Fadugba, Patrick Kohler, Lisa Koch, Petru Manescu, Philipp Berens

Retinal blood vessel segmentation can extract clinically relevant information from fundus images. As manual tracing is cumbersome, algorithms based on Convolution Neural Networks have been developed. Such studies have used small publicly available datasets for training and measuring performance, running the risk of overfitting. Here, we provide a rigorous benchmark for various architectural and training choices commonly used in the literature on the largest dataset published to date. We train and evaluate five published models on the publicly available FIVES fundus image dataset, which exceeds previous ones in size and quality and which contains also images from common ophthalmological conditions (diabetic retinopathy, age-related macular degeneration, glaucoma). We compare the performance of different model architectures across different loss functions, levels of image qualitiy and ophthalmological conditions and assess their ability to perform well in the face of disease-induced domain shifts. Given sufficient training data, basic architectures such as U-Net perform just as well as more advanced ones, and transfer across disease-induced domain shifts typically works well for most architectures. However, we find that image quality is a key factor determining segmentation outcomes. When optimizing for segmentation performance, investing into a well curated dataset to train a standard architecture yields better results than tuning a sophisticated architecture on a smaller dataset or one with lower image quality. We distilled the utility of architectural advances in terms of their clinical relevance therefore providing practical guidance for model choices depending on the circumstances of the clinical setting

6/24/2024

🤿

A better approach to diagnose retinal diseases: Combining our Segmentation-based Vascular Enhancement with deep learning features

Yuzhuo Chen, Zetong Chen, Yuanyuan Liu

Abnormalities in retinal fundus images may indicate certain pathologies such as diabetic retinopathy, hypertension, stroke, glaucoma, retinal macular edema, venous occlusion, and atherosclerosis, making the study and analysis of retinal images of great significance. In conventional medicine, the diagnosis of retina-related diseases relies on a physician's subjective assessment of the retinal fundus images, which is a time-consuming process and the accuracy is highly dependent on the physician's subjective experience. To this end, this paper proposes a fast, objective, and accurate method for the diagnosis of diseases related to retinal fundus images. This method is a multiclassification study of normal samples and 13 categories of disease samples on the STARE database, with a test set accuracy of 99.96%. Compared with other studies, our method achieved the highest accuracy. This study innovatively propose Segmentation-based Vascular Enhancement(SVE). After comparing the classification performances of the deep learning models of SVE images, original images and Smooth Grad-CAM ++ images, we extracted the deep learning features and traditional features of the SVE images and input them into nine meta learners for classification. The results shows that our proposed UNet-SVE-VGG-MLP model has the optimal performance for classifying diseases related to retinal fundus images on the STARE database, with a overall accuracy of 99.96% and a weighted AUC of 99.98% for the 14 categories on test dataset. This method can be used to realize rapid, objective, and accurate classification and diagnosis of retinal fundus image related diseases.

5/28/2024

Perception and Localization of Macular Degeneration Applying Convolutional Neural Network, ResNet and Grad-CAM

Tahmim Hossain, Sagor Chandro Bakchy

A well-known retinal disease that sends blurry visions to the affected patients is Macular Degeneration. This research is based on classifying the healthy and macular degeneration fundus by localizing the affected region of the fundus. A CNN architecture and CNN with ResNet architecture (ResNet50, ResNet50v2, ResNet101, ResNet101v2, ResNet152, ResNet152v2) as the backbone are used to classify the two types of fundus. The data are split into three categories including (a) Training set is 90% and Testing set is 10% (b) Training set is 80% and Testing set is 20%, (c) Training set is 50% and Testing set is 50%. After the training, the best model has been selected from the evaluation metrics. Among the models, CNN with a backbone of ResNet50 performs best which gives the training accuracy of 98.7% for 90% train and 10% test data split. With this model, we have performed the Grad-CAM visualization to get the region of the affected area of the fundus.

5/3/2024