Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers

Read original: arXiv:2408.15667 - Published 9/4/2024 by Qian Wang, Zhaoyang Bu, Jiaxuan Mao, Wenyu Zhu, Jingya Zhao, Wei Du, Guochao Shi, Min Zhou, Si Chen, Jieming Qu

Overview

This paper explores using cough sounds and vision transformers for reliable respiratory disease diagnosis.
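The researchers built a unified framework that compares deep models, from lightweight convolutional neural networks such as ResNet18 to modern vision transformers, on classifying respiratory diseases such as COVID-19 and COPD from cough audio.
Their proposed approach, which combines self-supervised and supervised learning on a large-scale cough data set, outperforms prior methods on two benchmark datasets for COVID-19 diagnosis and on a proprietary dataset for COPD/non-COPD classification, with a reported AUROC of 92.5%.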

Plain English Explanation

The researchers in this study wanted to create a way to diagnose respiratory diseases like COVID-19 and chronic obstructive pulmonary disease (COPD) from the sound of a person's cough.

They used deep learning, a type of artificial intelligence that learns patterns from data, to analyze cough recordings. Rather than testing a single model, they compared a range of models, from small convolutional networks to vision transformers, a family of models originally designed for images. Their own approach combines learning from cough recordings without diagnostic labels (self-supervised learning) with learning from labeled examples (supervised learning) on a large-scale cough data set.

The researchers found that their approach detected COVID-19 and COPD from cough sounds more accurately than earlier methods. This suggests the model could be a useful tool for quickly screening for these respiratory illnesses, potentially allowing for earlier treatment and better outcomes for patients.

Technical Explanation

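The core of the researchers' approach is a unified framework for evaluating deep models on cough-based respiratory disease classification. It is motivated by two gaps they identify in prior work: reliance on traditional machine learning or moderately sized deep models, and training and evaluation on small-scale data, since clinical data is hard to curate and annotate at scale.

Within this framework they compare architectures ranging from lightweight Convolutional Neural Networks (e.g., ResNet18) to modern vision transformers. The abstract does not spell out the input representation; applying vision models to audio typically means treating a time-frequency representation of the recording, such as a spectrogram, as an image.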
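As a rough illustration of how a cough recording could be turned into input for a vision model, the sketch below computes a log-magnitude spectrogram with NumPy and cuts it into square patches, the kind of token sequence a vision transformer consumes. The frame length, hop size, patch size, function names, and the synthetic noise signal are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)
    return np.log1p(magnitude).T                     # (freq_bins, n_frames)

def to_patches(image, patch=16):
    """Split a 2-D array into flattened, non-overlapping patch x patch tiles."""
    # Crop so both dimensions are multiples of the patch size
    h = (image.shape[0] // patch) * patch
    w = (image.shape[1] // patch) * patch
    tiles = image[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

# Hypothetical input: one second of noise at 16 kHz stands in for a cough clip
rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)
spec = log_spectrogram(audio)
tokens = to_patches(spec)
print(spec.shape, tokens.shape)
```

In a real pipeline each flattened patch would be linearly projected and given a position embedding before entering the transformer; a mel-scaled spectrogram is also a common choice in place of the plain one used here.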

Based on the observations from this empirical study, the researchers propose a new approach to cough-based disease classification that uses both self-supervised and supervised learning on a large-scale cough data set. They evaluate it on two benchmark datasets for COVID-19 diagnosis and on a proprietary dataset for COPD/non-COPD classification.
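The paper's abstract mentions self-supervised learning on a large cough data set, but the pretext task is not described here. One common option for transformers operating on spectrogram patches is masked reconstruction, where most patches are hidden and the model learns to predict them. The sketch below shows only the masking step; the 75% mask ratio and the `random_mask` helper are assumptions for illustration, not the authors' confirmed method.

```python
import numpy as np

def random_mask(tokens, mask_ratio=0.75, seed=0):
    """Randomly split patch tokens into a visible set and a masked set."""
    rng = np.random.default_rng(seed)
    n = tokens.shape[0]
    n_masked = int(round(n * mask_ratio))
    order = rng.permutation(n)
    masked_idx = np.sort(order[:n_masked])   # patches the model must reconstruct
    visible_idx = np.sort(order[n_masked:])  # patches the encoder actually sees
    return tokens[visible_idx], visible_idx, masked_idx

# Hypothetical input: 56 patch tokens with 4 features each
tokens = np.arange(56 * 4, dtype=float).reshape(56, 4)
visible, visible_idx, masked_idx = random_mask(tokens)
print(visible.shape, len(masked_idx))
```

Because no diagnostic labels are needed for this step, a model can be pretrained on many unlabeled recordings and then fine-tuned with supervision on the smaller labeled clinical sets.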

According to the abstract, the proposed approach consistently outperforms prior methods on the two COVID-19 benchmarks and the COPD/non-COPD dataset, with a reported AUROC of 92.5%. This suggests that larger models, pretrained on large amounts of cough data, are valuable for reliable respiratory disease diagnosis.
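The paper's abstract reports its headline result as an AUROC (area under the receiver operating characteristic curve). As a minimal sketch of what that number measures, the function below computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as half. The labels and scores are made-up illustrative values, not data from the paper.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative example")
    # Compare every positive score with every negative score
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: 1 = disease, 0 = healthy
y_true = [1, 0, 1, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
value = auroc(y_true, y_score)
print(round(value, 3))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the metric does not depend on picking a particular decision threshold. The pairwise comparison is quadratic in the number of examples; for large test sets a rank-based formulation or a library routine would be preferable.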

Critical Analysis

One key limitation of this study is that the COPD results rely on a proprietary dataset. This makes them hard to reproduce independently and raises questions about the generalizability of the model's performance to more diverse patient populations and recording conditions.

Additionally, the tasks described are COVID-19 diagnosis and COPD/non-COPD classification, so it remains unclear how well the model differentiates between conditions that produce similar coughs, such as distinguishing COVID-19 from influenza or pneumonia. Further research would be needed to evaluate the model's differential diagnostic capabilities.

Another area for further exploration is the interpretability of the model's predictions. Attention maps from a vision transformer can highlight which parts of the input influenced a decision, but it is unclear which acoustic characteristics of a cough drive the final diagnosis. Enhancing the model's explainability could increase trust and adoption in clinical settings.

Conclusion

This study demonstrates the potential of applying vision transformers and other deep models to cough sounds for reliable respiratory disease diagnosis. The results reported for the researchers' combined self-supervised and supervised approach suggest it could be a valuable tool for screening and early detection of conditions like COVID-19 and COPD.

Further research is needed to explore the model's generalizability, differential diagnostic capabilities, and interpretability. However, this work represents an important step forward in leveraging large-scale cough data and advanced AI techniques for improved respiratory disease management.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers

Qian Wang, Zhaoyang Bu, Jiaxuan Mao, Wenyu Zhu, Jingya Zhao, Wei Du, Guochao Shi, Min Zhou, Si Chen, Jieming Qu

Recent advancements in deep learning techniques have sparked performance boosts in various real-world applications including disease diagnosis based on multi-modal medical data. Cough sound data-based respiratory disease (e.g., COVID-19 and Chronic Obstructive Pulmonary Disease) diagnosis has also attracted much attention. However, existing works usually utilise traditional machine learning or deep models of moderate scales. On the other hand, the developed approaches are trained and evaluated on small-scale data due to the difficulty of curating and annotating clinical data on scale. To address these issues in prior works, we create a unified framework to evaluate various deep models from lightweight Convolutional Neural Networks (e.g., ResNet18) to modern vision transformers and compare their performance in respiratory disease classification. Based on the observations from such an extensive empirical study, we propose a novel approach to cough-based disease classification based on both self-supervised and supervised learning on a large-scale cough data set. Experimental results demonstrate our proposed approach outperforms prior arts consistently on two benchmark datasets for COVID-19 diagnosis and a proprietary dataset for COPD/non-COPD classification with an AUROC of 92.5%.

9/4/2024

CoVid-19 Detection leveraging Vision Transformers and Explainable AI

Pangoth Santhosh Kumar, Kundrapu Supriya, Mallikharjuna Rao K, Taraka Satya Krishna Teja Malisetti

Lung disease is a common health problem in many parts of the world. It is a significant risk to people's health and quality of life all across the globe since it is responsible for five of the top thirty leading causes of death. Among them are COVID 19, pneumonia, and tuberculosis, to name just a few. It is critical to diagnose lung diseases in their early stages. Several different models including machine learning and image processing have been developed for this purpose. The earlier a condition is diagnosed, the better the patient's chances of making a full recovery and surviving into the long term. Thanks to deep learning algorithms, there is significant promise for the autonomous, rapid, and accurate identification of lung diseases based on medical imaging. Several different deep learning strategies, including convolutional neural networks (CNN), vanilla neural networks, visual geometry group based networks (VGG), and capsule networks, are used for the goal of making lung disease forecasts. The standard CNN has a poor performance when dealing with rotated, tilted, or other aberrant picture orientations. As a result of this, within the scope of this study, we have suggested a vision transformer based end to end framework for the diagnosis of lung disorders. In the architecture, data augmentation, training of the suggested models, and evaluation of the models are all included. For the purpose of detecting lung diseases such as pneumonia, Covid 19, lung opacity, and others, a specialised Compact Convolution Transformers (CCT) model has been tested and evaluated on datasets such as the Covid 19 Radiography Database. The model has achieved a better accuracy for both its training and validation purposes on the Covid 19 Radiography Database.

5/7/2024

COVID-19 Detection System: A Comparative Analysis of System Performance Based on Acoustic Features of Cough Audio Signals

Asmaa Shati, Ghulam Mubashar Hassan, Amitava Datta

A wide range of respiratory diseases, such as cold and flu, asthma, and COVID-19, affect people's daily lives worldwide. In medical practice, respiratory sounds are widely used in medical services to diagnose various respiratory illnesses and lung disorders. The traditional diagnosis of such sounds requires specialized knowledge, which can be costly and reliant on human expertise. Despite this, recent advancements, such as cough audio recordings, have emerged as a means to automate the detection of respiratory conditions. Therefore, this research aims to explore various acoustic features that enhance the performance of machine learning (ML) models in detecting COVID-19 from cough signals. It investigates the efficacy of three feature extraction techniques, including Mel Frequency Cepstral Coefficients (MFCC), Chroma, and Spectral Contrast features, when applied to two machine learning algorithms, Support Vector Machine (SVM) and Multilayer Perceptron (MLP), and therefore proposes an efficient CovCepNet detection system. The proposed system provides a practical solution and demonstrates state-of-the-art classification performance, with an AUC of 0.843 on the COUGHVID dataset and 0.953 on the Virufy dataset for COVID-19 detection from cough audio signals.

6/21/2024

Multi-Task Learning for Lung sound & Lung disease classification

Suma K V, Deepali Koppad, Preethi Kumar, Neha A Kantikar, Surabhi Ramesh

In recent years, advancements in deep learning techniques have considerably enhanced the efficiency and accuracy of medical diagnostics. In this work, a novel approach using multi-task learning (MTL) for the simultaneous classification of lung sounds and lung diseases is proposed. Our proposed model leverages MTL with four different deep learning models such as 2D CNN, ResNet50, MobileNet and Densenet to extract relevant features from the lung sound recordings. The ICBHI 2017 Respiratory Sound Database was employed in the current study. The MTL for MobileNet model performed better than the other models considered, with an accuracy of 74% for lung sound analysis and 91% for lung diseases classification. Results of the experimentation demonstrate the efficacy of our approach in classifying both lung sounds and lung diseases concurrently. In this study, using the demographic data of the patients from the database, risk level computation for Chronic Obstructive Pulmonary Disease is also carried out. For this computation, three machine learning algorithms namely Logistic Regression, SVM and Random Forest classifiers were employed. Among these ML algorithms, the Random Forest classifier had the highest accuracy of 92%. This work helps in considerably reducing the physician's burden of not just diagnosing the pathology but also effectively communicating to the patient about the possible causes or outcomes.

4/8/2024