Classification of Diabetic Retinopathy using Pre-Trained Deep Learning Models

arXiv:2403.19905, published 4/1/2024 by Inas Al-Kamachy (Karlstad University, Sweden), Prof. Dr. Reza Hassanpour (Rotterdam University, Netherlands), Prof. Roya Choupani (Angelo State University, USA)

Introduction

Diabetic Retinopathy (DR) is a leading cause of blindness worldwide. It results from clogging of the eye's blood vessels, causing two main reactions. The first reaction is the creation of new blood vessels in the vitreous area, obstructing the path of light that needs to reach the retina for vision. The second reaction involves blood leakage from vessels which damages the macula, the part of the retina responsible for detailed central vision.

The severity of diabetic retinopathy is classified into five levels as shown in Figure 1. The paper provides background on how DR affects the eye's anatomy and visual function. Proper blood flow to the retina is critical for converting light signals into neural impulses that are transmitted to the brain for image processing and sight.

Figure 1: Diabetic Retinopathy Stages

The study developed a machine learning application for multi-class classification of diabetic retinopathy (DR) using a convolutional neural network (CNN) and four pre-trained deep learning models: VGG16, MobileNet, InceptionV3, and InceptionResNetV2. The researchers used 1000 color fundus images from the Kaggle diabetic retinopathy dataset, resized to 350x350 and 224x224 pixels. Several image augmentation techniques were applied before the images were fed to the pre-trained models.

Previous Work

The paper summarizes several research works on classifying diabetic retinopathy (DR) stages from color fundus images using deep learning techniques. The key points are:

Shorav Suriyal et al. classified images into two classes (with/without DR) using a MobileNet-inspired network with 25 convolutional layers. After preprocessing the 16,798 Kaggle images, they achieved 73.3% accuracy.

Arkadiusz Kwasigroch et al. classified 88,000 images into five DR stages using a VGG-D based transfer learning model. After preprocessing steps like resizing and normalization, their model with convolutional, pooling, dropout and fully connected layers achieved 81.7% accuracy using a specialized labeling technique.

Xiaoliang Wang et al. classified 166 Kaggle images into five DR stages using transfer learning on AlexNet, VGG-16 and Inception-V3 architectures after resizing images. Inception-V3 with fine-tuned optimization achieved the highest accuracy of 63.23%.

Safaraz Masood et al. classified the EyePACS Kaggle dataset into five classes using Inception-V3 transfer learning. After preprocessing steps such as resizing and emphasizing the blue channel, their best model achieved 48.2% test accuracy.

Ardianto et al. proposed Deep-DR-Net to classify 237 images into three stages using convolutional feature extraction, coding, and classification stages inspired by ResNet and InceptionNet. With the FINdERS dataset, they achieved 60.82% accuracy.

The Data Set

The researchers used a dataset from the Kaggle competition comprising 35,126 color fundus images of left and right eyes. Each image is labeled on a scale from 0 to 4, corresponding to the five stages of diabetic retinopathy (DR): Normal, Mild, Moderate, Severe, and Proliferative Diabetic Retinopathy (PDR).

Proposed Work and Methodology

The paper describes a method for diabetic retinopathy classification using convolutional neural networks (CNNs) on 1000 color fundus images from the Kaggle dataset. The key points are:

Image Preprocessing:

Images were resized to 350x350x3 and 224x224x3 resolutions
Data was split into 80% training, 10% validation, and 10% testing sets
Pixel values were normalized to the [0, 1] range
Five disease classes were defined with one-hot encoded vectors
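
The split, normalization, and encoding steps above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the fixed random seed, and the assumption of 8-bit pixel values (0 to 255) are all hypothetical, and the resizing step is left out because it depends on the image library used.

```python
import random

# Class names follow the five stages listed in the text, in label order 0..4.
STAGES = ["Normal", "Mild", "Moderate", "Severe", "PDR"]

def split_indices(n, train=0.8, val=0.1, seed=0):
    """Shuffle 0..n-1 and split into train/validation/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an assumption, for repeatability
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def normalize(pixels):
    """Map 8-bit pixel values (assumed 0..255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]

def one_hot(label, num_classes=len(STAGES)):
    """Encode an integer label 0..num_classes-1 as a one-hot vector."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
print(STAGES[3], one_hot(3))                        # Severe [0, 0, 0, 1, 0]
```

With 1000 images, the 80/10/10 split yields 800 training, 100 validation, and 100 test images.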

Data Augmentation:

Various geometric transforms were applied to increase the dataset size to 2000 images and reduce overfitting
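
The summary does not list which geometric transforms were used, so the following is only a sketch with three common ones (horizontal flip, vertical flip, 90-degree rotation), chosen as assumptions. Images are represented as nested lists of pixel rows to keep the example dependency-free. Adding one transformed copy per image doubles the dataset, which matches the 1000 to 2000 growth described above.

```python
def hflip(img):
    """Mirror an image left-to-right (img is a list of pixel rows)."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror an image top-to-bottom."""
    return img[::-1]

def rot90(img):
    """Rotate an image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def augment(images, labels, transform):
    """Append one transformed copy per image; each copy keeps its label."""
    aug_images = images + [transform(im) for im in images]
    aug_labels = labels + labels
    return aug_images, aug_labels

tiny = [[1, 2],
        [3, 4]]
images, labels = augment([tiny], [2], hflip)
print(len(images), images[1])  # 2 [[2, 1], [4, 3]]
print(rot90(tiny))             # [[3, 1], [4, 2]]
```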

CNN Architecture:

A custom CNN was built with convolutional, batch normalization, max pooling, dropout, flatten, and dense layers
The final dense layer had 5 neurons for classification into the 5 disease classes

Transfer Learning:

Pre-trained models (VGG16, MobileNet, InceptionV3, InceptionResNetV2) were fine-tuned by freezing lower layers and adding custom layers
This leveraged features learned on ImageNet while fine-tuning for the diabetic retinopathy task

Web Application:

The best performing model was deployed using Flask to build a web application for image classification

Evaluation Metrics:

Area under the ROC curve (AUC) was used as the key metric, suitable for the imbalanced dataset
The abstract reports AUC values of 0.50 (CNN), 0.70 (MobileNet), 0.53 (VGG-16), 0.63 (InceptionV3), and 0.69 (InceptionResNetV2), so MobileNet and InceptionResNetV2 score highest
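
To make the metric concrete, here is a hedged sketch of how AUC can be computed. For a binary problem, AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The summary does not say how the five-class AUC was aggregated, so the one-vs-rest macro average below is an assumption, and the function names are hypothetical.

```python
def binary_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(labels, probs, num_classes=5):
    """Average one-vs-rest AUC over classes (aggregation scheme is an assumption)."""
    aucs = []
    for c in range(num_classes):
        y_true = [1 if y == c else 0 for y in labels]
        if 0 < sum(y_true) < len(y_true):  # skip classes where AUC is undefined
            aucs.append(binary_auc(y_true, [p[c] for p in probs]))
    return sum(aucs) / len(aucs)

print(binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(binary_auc([0, 1], [0.5, 0.5]))                   # 0.5
```

An AUC of 0.5 corresponds to chance-level ranking, which puts the scratch-built CNN's reported 0.50 in context.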

Conclusions and Future Work

The paper evaluates the performance of different pre-trained deep learning models for classifying diabetic retinopathy disease using a Kaggle dataset of retinal images. The models explored include CNNs built from scratch, MobileNet, VGG16, InceptionV3, and InceptionResNetV2.

CNNs built from scratch performed poorly because of limited training data and shallow network depth. MobileNet, with its depth-wise separable convolutions, performed better than VGG16, which has fewer layers.
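
As background on why depth-wise separable convolutions keep MobileNet compact, the sketch below compares weight counts for a standard convolution and its separable counterpart. The layer sizes (3x3 kernel, 32 input channels, 64 output channels) are illustrative assumptions and are not taken from the paper; bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias terms ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 32, 64)
sep = separable_conv_params(3, 32, 64)
print(std, sep, round(sep / std, 3))  # 18432 2336 0.127
```

For this example layer the separable form needs roughly one eighth of the weights, which lets a network stack more layers within the same parameter budget.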

InceptionV3 and InceptionResNetV2, with their deeper architectures and ability to handle varying resolutions, performed well on this medical image classification task relative to the scratch-built CNN and VGG16. Fine-tuning the pre-trained InceptionV3 model with techniques like image augmentation, layer-wise fine-tuning, and adding custom layers improved the AUC to 0.63, compared with 0.59 reported in previous work.

InceptionResNetV2 is reported as the best overall performer, which the paper attributes to its very deep architecture of 572 layers. Future work includes using cloud environments to train on larger datasets and higher resolutions, developing a mobile app, and integrating with a camera for real-time diabetic retinopathy screening.

The key advantages of transfer learning with deep pre-trained models like Inception were their ability to leverage large initial datasets like ImageNet, handle varying resolutions, and extract relevant features from the medical images despite domain shift.

Related Papers

Classification of Diabetic Retinopathy using Pre-Trained Deep Learning Models

Inas Al-Kamachy (Karlstad University, Sweden), Prof. Dr. Reza Hassanpour (Rotterdam University, Netherlands), Prof. Roya Choupani (Angelo State University, USA)

Diabetic Retinopathy (DR) stands as the leading cause of blindness globally, particularly affecting individuals between the ages of 20 and 70. This paper presents a Computer-Aided Diagnosis (CAD) system designed for the automatic classification of retinal images into five distinct classes: Normal, Mild, Moderate, Severe, and Proliferative Diabetic Retinopathy (PDR). The proposed system leverages Convolutional Neural Networks (CNNs) employing pre-trained deep learning models. Through the application of fine-tuning techniques, our model is trained on fundus images of diabetic retinopathy with resolutions of 350x350x3 and 224x224x3. Experimental results obtained on the Kaggle platform, utilizing resources comprising 4 CPUs, 17 GB RAM, and 1 GB Disk, demonstrate the efficacy of our approach. The achieved Area Under the Curve (AUC) values for CNN, MobileNet, VGG-16, InceptionV3, and InceptionResNetV2 models are 0.50, 0.70, 0.53, 0.63, and 0.69, respectively.

4/1/2024

Detecting Severity of Diabetic Retinopathy from Fundus Images: A Transformer Network-based Review

Tejas Karkera, Chandranath Adak, Soumi Chattopadhyay, Muhammad Saqib

Diabetic Retinopathy (DR) is considered one of the significant concerns worldwide, primarily due to its impact on causing vision loss among most people with diabetes. The severity of DR is typically comprehended manually by ophthalmologists from fundus photography-based retina images. This paper deals with an automated understanding of the severity stages of DR. In the literature, researchers have focused on this automation using traditional machine learning-based algorithms and convolutional architectures. However, the past works hardly focused on essential parts of the retinal image to improve the model performance. In this study, we adopt and fine-tune transformer-based learning models to capture the crucial features of retinal images for a more nuanced understanding of DR severity. Additionally, we explore the effectiveness of image transformers to infer the degree of DR severity from fundus photographs. For experiments, we utilized the publicly available APTOS-2019 blindness detection dataset, where the performances of the transformer-based models were quite encouraging.

6/11/2024

Lesion-aware network for diabetic retinopathy diagnosis

Xue Xia, Kun Zhan, Yuming Fang, Wenhui Jiang, Fei Shen

Deep learning brought boosts to auto diabetic retinopathy (DR) diagnosis, thus, greatly helping ophthalmologists for early disease detection, which contributes to preventing disease deterioration that may eventually lead to blindness. It has been proved that convolutional neural network (CNN)-aided lesion identifying or segmentation benefits auto DR screening. The key to fine-grained lesion tasks mainly lies in: (1) extracting features being both sensitive to tiny lesions and robust against DR-irrelevant interference, and (2) exploiting and re-using encoded information to restore lesion locations under extremely imbalanced data distribution. To this end, we propose a CNN-based DR diagnosis network with attention mechanism involved, termed lesion-aware network, to better capture lesion information from imbalanced data. Specifically, we design the lesion-aware module (LAM) to capture noise-like lesion areas across deeper layers, and the feature-preserve module (FPM) to assist shallow-to-deep feature fusion. Afterward, the proposed lesion-aware network (LANet) is constructed by embedding the LAM and FPM into the CNN decoders for DR-related information utilization. The proposed LANet is then further extended to a DR screening network by adding a classification layer. Through experiments on three public fundus datasets with pixel-level annotations, our method outperforms the mainstream methods with an area under curve of 0.967 in DR screening, and increases the overall average precision by 7.6%, 2.1%, and 1.2% in lesion segmentation on three datasets. Besides, the ablation study validates the effectiveness of the proposed sub-modules.

8/15/2024

Enhancing Eye Disease Diagnosis with Deep Learning and Synthetic Data Augmentation

Saideep Kilaru, Kothamasu Jayachandra, Tanishka Yagneshwar, Suchi Kumari

In recent years, the focus is on improving the diagnosis of diabetic retinopathy (DR) using machine learning and deep learning technologies. Researchers have explored various approaches, including the use of high-definition medical imaging, AI-driven algorithms such as convolutional neural networks (CNNs) and generative adversarial networks (GANs). Among all the available tools, CNNs have emerged as a preferred tool due to their superior classification accuracy and efficiency. Although the accuracy of CNNs is comparatively better but it can be improved by introducing some hybrid models by combining various machine learning and deep learning models. Therefore, in this paper, an ensemble learning technique is proposed for early detection and management of DR with higher accuracy. The proposed model is tested on the APTOS dataset and it is showing supremacy on the validation accuracy (99%) in comparison to the previous models. Hence, the model can be helpful for early detection and treatment of the DR, thereby enhancing the overall quality of care for affected individuals.

7/26/2024