JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA

Read original: arXiv:2404.11525 - Published 7/30/2024 by Zeyu Zhang, Xuyin Qi, Mingxi Chen, Guangxi Li, Ryan Pham, Ayub Qassim, Ella Berry, Zhibin Liao, Owen Siggs, Robert Mclaughlin, Jamie Craig, Minh-Son To

Overview

This paper proposes JointViT, a model based on the Vision Transformer (ViT) architecture that predicts oxygen saturation (SaO2) levels from Optical Coherence Tomography Angiography (OCTA) images.
The model uses a joint supervision approach to address the challenge of long-tailed distributions in OCTA data, which can lead to poor performance on rare disease states.
The researchers evaluate their model on a large-scale OCTA dataset and report improvements of up to 12.28% in overall accuracy over other state-of-the-art methods.

Plain English Explanation

The paper explores a new deep learning model called JointViT that can predict a person's oxygen saturation levels from OCTA images. OCTA is a medical imaging technique that allows doctors to see the tiny blood vessels in the back of the eye.

One of the challenges with OCTA data is that certain disease states are much rarer than others, creating a "long-tailed" distribution. This means the model may not perform as well on these less common conditions. To address this, the JointViT model uses a "joint supervision" approach, where it is trained not just to predict oxygen levels, but also to classify the underlying disease state.

By incorporating this additional disease classification task, the model is able to better handle the long-tailed distribution and improve its overall performance on predicting oxygen saturation levels. The researchers tested their model on a large dataset of OCTA images and found it outperformed existing methods.

This research is significant because blood oxygen saturation (SaO2) is an important indicator of health, particularly in relation to sleep-related breathing disorders, yet monitoring it continuously is time-consuming and varies with each patient's condition. If SaO2 can be estimated from a quick OCTA scan, a model like JointViT could help doctors screen for such disorders more easily and reliably.

Technical Explanation

The key innovation of the JointViT model is its use of joint supervision to address the challenge of long-tailed distributions in OCTA data. Specifically, the model is trained not only to predict oxygen saturation levels, but also to classify the underlying disease state present in the OCTA image.
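To make the long-tail problem concrete, the sketch below measures class imbalance in a hypothetical label set and rebalances it by random oversampling. The paper's abstract mentions a balancing augmentation step during data preprocessing, but this summary does not describe its details, so the oversampling strategy, the class names, and the counts are all assumptions for illustration and not the authors' method.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the most common class count to the least common class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def oversample_to_balance(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical long-tailed label set (not the paper's data).
labels = ["normal"] * 90 + ["mild"] * 8 + ["severe"] * 2
samples = [f"scan_{i}" for i in range(len(labels))]

balanced_samples, balanced_labels = oversample_to_balance(samples, labels)
print(imbalance_ratio(labels))           # 45.0
print(imbalance_ratio(balanced_labels))  # 1.0
```

In an image pipeline, the duplicated minority samples would typically be passed through random transformations so that the copies are not identical; that step is omitted here.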

The model architecture is based on the Vision Transformer (ViT), which splits an image into fixed-size patches and processes them as a sequence of tokens using self-attention. The researchers modify the ViT to include two output heads: one for predicting oxygen saturation and one for disease classification.
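A minimal sketch of a two-head design of this kind, written in plain Python for readability. The input feature vector stands in for the output of the ViT encoder; the class name `TwoHeadModel`, the feature dimension, the number of classes, and the random initialization are hypothetical and not taken from the paper.

```python
import math
import random


def linear(weights, bias, features):
    """Dense layer: one output value per weight row."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TwoHeadModel:
    """Shared feature vector feeding a regression head and a classification head."""

    def __init__(self, feature_dim, num_classes, seed=0):
        rng = random.Random(seed)

        def init_row():
            return [rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]

        # Regression head: a single output (the predicted saturation value).
        self.reg_w, self.reg_b = [init_row()], [0.0]
        # Classification head: one logit per class.
        self.cls_w = [init_row() for _ in range(num_classes)]
        self.cls_b = [0.0] * num_classes

    def forward(self, features):
        value = linear(self.reg_w, self.reg_b, features)[0]
        probs = softmax(linear(self.cls_w, self.cls_b, features))
        return value, probs


# Hypothetical 8-dimensional encoder output and 3 classes.
model = TwoHeadModel(feature_dim=8, num_classes=3)
value, probs = model.forward([0.5] * 8)
print(value, probs)
```

Both heads read the same features, so gradients from either task would update the shared encoder in a trained version of this model.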

During training, the model is optimized to minimize the combined loss from both the regression task (oxygen saturation prediction) and the classification task (disease state identification). This joint supervision approach helps the model better represent the complex relationship between OCTA image features and oxygen levels, especially for rare disease states that are underrepresented in the training data.
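The combined objective can be sketched as a weighted sum of a regression loss and a classification loss. The choice of squared error and cross-entropy, the `weight` parameter, and the example numbers are assumptions for illustration; the paper's exact loss terms and weighting may differ.

```python
import math


def squared_error(pred_value, target_value):
    """Regression term: squared difference between prediction and target."""
    return (pred_value - target_value) ** 2


def cross_entropy(probs, target_class):
    """Classification term: negative log-probability of the true class."""
    return -math.log(max(probs[target_class], 1e-12))


def joint_loss(pred_value, target_value, probs, target_class, weight=1.0):
    """Regression loss plus a weighted classification loss (hypothetical weighting)."""
    return squared_error(pred_value, target_value) + weight * cross_entropy(
        probs, target_class
    )


# Hypothetical sample: saturation expressed as a fraction, three classes.
loss = joint_loss(
    pred_value=0.93,
    target_value=0.95,
    probs=[0.7, 0.2, 0.1],
    target_class=0,
)
print(loss)
```

Setting `weight=0.0` reduces the objective to pure regression, which is a simple way to compare joint supervision against a single-task baseline.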

The researchers evaluate their JointViT model on a large-scale OCTA dataset and report improvements of up to 12.28% in overall accuracy over other state-of-the-art methods. According to the abstract, they also apply a balancing augmentation technique during data preprocessing to improve performance on the long tail of the dataset. Ablation studies examine the contributions of different model components and training strategies.
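On long-tailed data, overall accuracy alone can hide failures on rare classes, so it helps to look at per-class results as well. The sketch below illustrates this with hypothetical labels and a trivial predictor that always outputs the majority class; it is not the paper's evaluation protocol.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical long-tailed test set and a majority-class predictor.
y_true = ["normal"] * 90 + ["mild"] * 8 + ["severe"] * 2
y_pred = ["normal"] * 100

recalls = per_class_recall(y_true, y_pred)
balanced_accuracy = sum(recalls.values()) / len(recalls)

print(overall_accuracy(y_true, y_pred))  # 0.9
print(recalls)
print(balanced_accuracy)
```

The predictor scores 90% overall while never identifying a single rare case, which is why the mean of per-class recalls is a useful companion metric.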

Critical Analysis

The paper provides a well-designed and thorough evaluation of the JointViT model, including comparisons to state-of-the-art approaches and detailed ablation studies. However, there are a few potential limitations and areas for further research:

The dataset used in the study, while large, may not fully capture the diversity of OCTA images seen in real-world clinical settings. Validation on additional datasets from different healthcare systems or regions could help assess the model's generalizability.
The paper does not discuss the computational complexity and inference time of the JointViT model, which could be important considerations for its practical deployment in clinical workflows.
While the joint supervision approach helps improve performance on rare disease states, the paper does not explore the model's ability to provide interpretable insights into the relationship between OCTA image features and oxygen saturation levels. Incorporating explainable AI techniques could enhance the model's clinical utility.
The long-term impact of using the JointViT model for oxygen saturation monitoring and early disease detection is not fully explored. Further research is needed to understand the potential benefits and drawbacks of deploying such AI-powered tools in real-world healthcare settings.
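On the point about computational complexity, a rough back-of-the-envelope estimate shows why input resolution matters for a ViT: the number of tokens grows with the number of patches, and self-attention compares every token with every other token. The image sizes, patch size, and head count below are hypothetical and not taken from the paper.

```python
def vit_token_count(image_size, patch_size, class_token=True):
    """Number of tokens for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + (1 if class_token else 0)


def attention_score_entries(num_tokens, num_heads):
    """Entries in the attention score matrices of one layer (tokens squared per head)."""
    return num_heads * num_tokens ** 2


# Hypothetical configurations: 16x16 patches, 12 attention heads.
small = vit_token_count(224, 16)   # 14 * 14 patches + 1 class token
large = vit_token_count(384, 16)   # 24 * 24 patches + 1 class token

print(small, attention_score_entries(small, 12))  # 197 465708
print(large, attention_score_entries(large, 12))  # 577 3995148
```

Raising the resolution from 224 to 384 pixels roughly triples the token count but increases the attention score entries by more than eight times, which is the kind of trade-off a deployment study would need to measure.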

Overall, the JointViT model represents an important step forward in leveraging deep learning for improved oxygen saturation assessment from OCTA images. The joint supervision approach is a promising direction for addressing long-tailed distributions in medical imaging data, and the model's performance gains are noteworthy. However, continued research and careful consideration of practical deployment challenges will be crucial for realizing the full potential of this technology.

Conclusion

The JointViT model proposed in this paper demonstrates a novel approach to predicting oxygen saturation levels from OCTA images, addressing the challenge of long-tailed distributions in the data through joint supervision. The model's strong performance improvements over existing methods highlight the potential of this technique for enhancing oxygen monitoring and early disease detection capabilities in clinical settings.

As AI-powered tools continue to advance in medical imaging, JointViT and its joint supervision approach show how deep learning can extract useful information from complex, imbalanced datasets. With further research and careful consideration of real-world deployment factors, this work could contribute to the development of more reliable and accessible healthcare technologies.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA

Zeyu Zhang, Xuyin Qi, Mingxi Chen, Guangxi Li, Ryan Pham, Ayub Qassim, Ella Berry, Zhibin Liao, Owen Siggs, Robert Mclaughlin, Jamie Craig, Minh-Son To

The oxygen saturation level in the blood (SaO2) is crucial for health, particularly in relation to sleep-related breathing disorders. However, continuous monitoring of SaO2 is time-consuming and highly variable depending on patients' conditions. Recently, optical coherence tomography angiography (OCTA) has shown promising development in rapidly and effectively screening eye-related lesions, offering the potential for diagnosing sleep-related disorders. To bridge this gap, our paper presents three key contributions. Firstly, we propose JointViT, a novel model based on the Vision Transformer architecture, incorporating a joint loss function for supervision. Secondly, we introduce a balancing augmentation technique during data preprocessing to improve the model's performance, particularly on the long-tail distribution within the OCTA dataset. Lastly, through comprehensive experiments on the OCTA dataset, our proposed method significantly outperforms other state-of-the-art methods, achieving improvements of up to 12.28% in overall accuracy. This advancement lays the groundwork for the future utilization of OCTA in diagnosing sleep-related disorders. See project website https://steve-zeyu-zhang.github.io/JointViT

7/30/2024

Enhancing Retinal Disease Classification from OCTA Images via Active Learning Techniques

Jacob Thrasher, Annahita Amireskandari, Prashnna Gyawali

Eye diseases are common in older Americans and can lead to decreased vision and blindness. Recent advancements in imaging technologies allow clinicians to capture high-quality images of the retinal blood vessels via Optical Coherence Tomography Angiography (OCTA), which contain vital information for diagnosing these diseases and expediting preventative measures. OCTA provides detailed vascular imaging as compared to the solely structural information obtained by common OCT imaging. Although there have been considerable studies on OCT imaging, there have been limited to no studies exploring the role of artificial intelligence (AI) and machine learning (ML) approaches for predictive modeling with OCTA images. In this paper, we explore the use of deep learning to identify eye disease in OCTA images. However, due to the lack of labeled data, the straightforward application of deep learning doesn't necessarily yield good generalization. To this end, we utilize active learning to select the most valuable subset of data to train our model. We demonstrate that active learning subset selection greatly outperforms other strategies, such as inverse frequency class weighting, random undersampling, and oversampling, by up to 49% in F1 evaluation.

7/23/2024

Beyond the Eye: A Relational Model for Early Dementia Detection Using Retinal OCTA Images

Shouyue Liu, Jinkui Hao, Yonghuai Liu, Huazhu Fu, Xinyu Guo, Shuting Zhang, Yitian Zhao

Early detection of dementia, such as Alzheimer's disease (AD) or mild cognitive impairment (MCI), is essential to enable timely intervention and potential treatment. Accurate detection of AD/MCI is challenging due to the high complexity, cost, and often invasive nature of current diagnostic techniques, which limit their suitability for large-scale population screening. Given the shared embryological origins and physiological characteristics of the retina and brain, retinal imaging is emerging as a potentially rapid and cost-effective alternative for the identification of individuals with or at high risk of AD. In this paper, we present a novel PolarNet+ that uses retinal optical coherence tomography angiography (OCTA) to discriminate early-onset AD (EOAD) and MCI subjects from controls. Our method first maps OCTA images from Cartesian coordinates to polar coordinates, allowing approximate sub-region calculation to implement the clinician-friendly early treatment of diabetic retinopathy study (ETDRS) grid analysis. We then introduce a multi-view module to serialize and analyze the images along three dimensions for comprehensive, clinically useful information extraction. Finally, we abstract the sequence embedding into a graph, transforming the detection task into a general graph classification problem. A regional relationship module is applied after the multi-view module to excavate the relationship between the sub-regions. Such regional relationship analyses validate known eye-brain links and reveal new discriminative patterns.

8/12/2024

Vessel-Promoted OCT to OCTA Image Translation by Heuristic Contextual Constraints

Shuhan Li, Dong Zhang, Xiaomeng Li, Chubin Ou, Lin An, Yanwu Xu, Kwang-Ting Cheng

Optical Coherence Tomography Angiography (OCTA) is a crucial tool in the clinical screening of retinal diseases, allowing for accurate 3D imaging of blood vessels through non-invasive scanning. However, the hardware-based approach for acquiring OCTA images presents challenges due to the need for specialized sensors and expensive devices. In this paper, we introduce a novel method called TransPro, which can translate the readily available 3D Optical Coherence Tomography (OCT) images into 3D OCTA images without requiring any additional hardware modifications. Our TransPro method is primarily driven by two novel ideas that have been overlooked by prior work. The first idea is derived from a critical observation that the OCTA projection map is generated by averaging pixel values from its corresponding B-scans along the Z-axis. Hence, we introduce a hybrid architecture incorporating a 3D adversarial generative network and a novel Heuristic Contextual Guidance (HCG) module, which effectively maintains the consistency of the generated OCTA images between 3D volumes and projection maps. The second idea is to improve the vessel quality in the translated OCTA projection maps. As a result, we propose a novel Vessel Promoted Guidance (VPG) module to enhance the attention of network on retinal vessels. Experimental results on two datasets demonstrate that our TransPro outperforms state-of-the-art approaches, with relative improvements around 11.4% in MAE, 2.7% in PSNR, 2% in SSIM, 40% in VDE, and 9.1% in VDC compared to the baseline method. The code is available at: https://github.com/ustlsh/TransPro.

8/22/2024