Enhanced Classification of Heart Sounds Using Mel Frequency Cepstral Coefficients: A Comparative Study of Single and Ensemble Classifier Strategies

Read original: arXiv:2406.00702 - Published 7/2/2024 by Amir Masoud Rahmani, Amir Haider, Mohammad Adeli, Olfa Mzoughi, Entesar Gemeay, Mokhtar Mohammadi, Hamid Alinejad-Rokny, Parisa Khoshvaght, Mehdi Hosseinzadeh


Overview

This paper explores the use of Mel Frequency Cepstral Coefficients (MFCCs) to detect abnormal phonocardiograms (heart sounds)
Two classification strategies were tested: a single-classifier approach and an ensemble-classifier approach
Phonocardiograms were segmented into key intervals (S1, systole, S2, diastole) and 13 MFCCs were extracted from each segment
The single-classifier approach classified the average of the MFCCs from 9 consecutive beats, while the ensemble-classifier approach used 9 individual classifiers, one per beat, combined by a majority vote

Plain English Explanation

Phonocardiograms are recordings of the sounds made by the heart. This paper looked at using a specific type of audio feature called Mel Frequency Cepstral Coefficients (MFCCs) to detect abnormal heart sounds.

The researchers divided the phonocardiogram recordings into four key parts: the first heart sound (S1), the contraction phase (systole), the second heart sound (S2), and the relaxation phase (diastole). They calculated 13 MFCC values for each of these four parts, giving a total of 52 MFCC features per heartbeat.

They tested two different ways of using these MFCC features to classify the heart sounds as normal or abnormal. In the first approach, they averaged the MFCC features from 9 consecutive heartbeats and used that to make the classification. In the second approach, they used 9 separate classifiers, each looking at one heartbeat, and then took the majority vote of those 9 classifiers.

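The results showed that the second, ensemble-classifier approach improved accuracy for two of the three classifier types tested (the support vector machine and the decision tree), while the k-nearest-neighbors classifier was essentially unchanged. This suggests that MFCCs are effective features for detecting abnormal heart sounds and that looking at multiple heartbeats individually can improve accuracy compared to averaging the features, although the gain depends on the classifier.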

Technical Explanation

The researchers in this study explored the use of Mel Frequency Cepstral Coefficients (MFCCs) for the task of classifying phonocardiograms as normal or abnormal. They compared two different classification strategies: a single-classifier approach and an ensemble-classifier approach.

For the phonocardiogram data, the researchers segmented each recording into four key intervals: the first heart sound (S1), the systolic contraction phase, the second heart sound (S2), and the diastolic relaxation phase. They then calculated 13 MFCC features from each of these four segments, resulting in a total of 52 MFCC features per heartbeat.
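The per-beat feature vector described above can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' pipeline: the 2000 Hz sampling rate, the Hamming window, the 26 mel filters, the single analysis frame per segment, and the function names `segment_mfcc` and `beat_features` are all assumptions, since the summary does not state these details. The segments here are synthetic noise standing in for already-segmented S1, systole, S2, and diastole intervals.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def segment_mfcc(segment, sr, n_mfcc=13, n_filters=26):
    """MFCCs of one segment, treated as a single frame (an assumption)."""
    x = np.asarray(segment, dtype=float)
    x = x * np.hamming(x.size)
    # Zero-pad to a power of two that is at least as long as the segment.
    n_fft = max(256, 1 << (x.size - 1).bit_length())
    power = np.abs(np.fft.rfft(x, n=n_fft)) ** 2
    log_energy = np.log(mel_filterbank(n_filters, n_fft, sr) @ power + 1e-10)
    # Unnormalised DCT-II of the log mel energies; keep the first n_mfcc terms.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return dct @ log_energy

def beat_features(s1, systole, s2, diastole, sr):
    """Concatenate 13 MFCCs from each of the four intervals: 4 x 13 = 52 values."""
    return np.concatenate([segment_mfcc(seg, sr) for seg in (s1, systole, s2, diastole)])

# Synthetic stand-in for one segmented beat (segment lengths are hypothetical).
sr = 2000
rng = np.random.default_rng(0)
s1, systole, s2, diastole = (rng.standard_normal(n) for n in (200, 400, 200, 800))
features = beat_features(s1, systole, s2, diastole, sr)
print(features.shape)
```

In a real system the four intervals would come from a heart sound segmentation step, which this sketch does not attempt.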

In the single-classifier approach, the researchers averaged the MFCC features across 9 consecutive heartbeats and used the resulting 52-value vector as the input to a single classifier (a support vector machine, a k-nearest-neighbors classifier, or a decision tree) to predict whether the phonocardiogram was normal or abnormal.
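As a rough illustration of this strategy, the sketch below averages nine per-beat vectors element-wise and passes the result to one classifier. The toy k-nearest-neighbors function, the 4-value vectors (standing in for 52 MFCCs), and the training data are all hypothetical and chosen only to keep the example self-contained.

```python
import math

def average_beats(beats):
    """Element-wise mean of equal-length per-beat feature vectors."""
    n = len(beats)
    return [sum(values) / n for values in zip(*beats)]

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority among its k nearest training vectors (Euclidean)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest = [label for _, label in ranked[:k]]
    return max(set(nearest), key=nearest.count)

# Toy training set: 4 features per vector instead of 52 (hypothetical values).
train_x = [[0.0] * 4, [0.1] * 4, [0.2] * 4, [1.0] * 4, [0.9] * 4, [1.1] * 4]
train_y = ["normal"] * 3 + ["abnormal"] * 3

# Nine consecutive beats from one recording, averaged into a single vector.
nine_beats = [[0.8 + 0.05 * (i % 3)] * 4 for i in range(9)]
recording_vector = average_beats(nine_beats)
prediction = knn_predict(train_x, train_y, recording_vector)
print(prediction)
```

Averaging reduces nine beats to one input, so a single noisy beat is smoothed out, but beat-to-beat variation is also lost before the classifier sees it.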

In contrast, the ensemble-classifier approach employed 9 separate classifiers, each making a prediction based on the MFCC features of a single heartbeat. The final classification was then determined by a majority vote across the 9 individual classifiers.
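The voting step can be sketched as follows. The threshold-based stand-in classifiers and the 2-value beat vectors are hypothetical. In the paper, the nine voters are SVMs, kNN classifiers, or decision trees operating on 52 MFCCs per beat.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label; nine binary votes cannot tie."""
    return Counter(labels).most_common(1)[0][0]

def ensemble_predict(classifiers, beats):
    """Classifier i labels beat i; the recording gets the majority label."""
    if len(classifiers) != len(beats):
        raise ValueError("need exactly one classifier per beat")
    votes = [clf(beat) for clf, beat in zip(classifiers, beats)]
    return majority_vote(votes), votes

def make_threshold_classifier(threshold):
    """Hypothetical stand-in for a trained per-beat classifier."""
    def classify(beat):
        return "abnormal" if sum(beat) / len(beat) > threshold else "normal"
    return classify

classifiers = [make_threshold_classifier(0.5) for _ in range(9)]
beats = [[0.9, 0.8]] * 5 + [[0.1, 0.2]] * 4  # five abnormal-looking beats, four normal-looking
label, votes = ensemble_predict(classifiers, beats)
print(label, votes.count("abnormal"))
```

Using an odd number of voters for a two-class problem guarantees a strict majority, which is one practical reason nine beats is a convenient choice.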

The researchers tested both approaches on a publicly available phonocardiogram database. The ensemble-classifier strategy achieved higher accuracy for the SVM (93.59% versus 91.95%) and the decision tree (92.22% versus 87.33%), while the kNN was essentially unchanged (91.84% versus 91.9%). Separately, the authors conclude that MFCCs are more effective features for heart sound analysis than the time, time-frequency, and statistical features evaluated in similar studies.
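For reference, the accuracy differences between the two strategies can be computed directly from the figures reported in the paper's abstract. The snippet below only does that arithmetic; the dictionary names are illustrative.

```python
# Accuracies (%) as reported in the paper's abstract.
single = {"SVM": 91.95, "kNN": 91.90, "DT": 87.33}
ensemble = {"SVM": 93.59, "kNN": 91.84, "DT": 92.22}

# Difference in percentage points, ensemble minus single.
gains = {name: round(ensemble[name] - single[name], 2) for name in single}
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f} percentage points")
```

This gives +1.64 points for the SVM and +4.89 for the decision tree, matching the improvements quoted in the abstract, and a change of -0.06 for the kNN.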

Critical Analysis

The researchers in this study provide a thorough evaluation of using MFCCs for phonocardiogram classification, but there are a few potential areas for further exploration.

One limitation is that the study only tested the approaches on a single public dataset. Evaluating the methods on a wider range of phonocardiogram datasets, including those with more diverse patient populations, would help strengthen the generalizability of the findings.

Additionally, the paper does not provide much insight into the specific types of abnormalities that the MFCC-based classifiers were able to detect. Understanding the detection capabilities for different heart sound pathologies could be valuable for clinical applications.

It would also be interesting to see how the MFCC-based approaches compare to more recent deep learning methods for phonocardiogram analysis. Combining MFCCs with deep learning architectures may further enhance the performance and interpretability of automated heart sound classification systems.

Overall, this study demonstrates the potential of MFCCs as effective features for detecting abnormal heart sounds. Further research expanding on these findings could lead to improved cardiac auscultation tools and better support for clinical diagnosis of cardiovascular conditions.

Conclusion

This paper explored the use of Mel Frequency Cepstral Coefficients (MFCCs) for the classification of normal and abnormal phonocardiograms. The researchers tested two different classification strategies: a single-classifier approach that averaged MFCC features across multiple heartbeats, and an ensemble-classifier approach that used multiple individual classifiers to vote on the final prediction.

The results showed that the ensemble-classifier strategy outperformed the single-classifier approach for the SVM and decision tree classifiers, with little change for the kNN, and support the effectiveness of MFCCs as features for detecting abnormal heart sounds. This work contributes to the growing body of research on automated heart sound analysis and could lead to improved clinical tools for cardiovascular disease diagnosis and monitoring.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Enhanced Classification of Heart Sounds Using Mel Frequency Cepstral Coefficients: A Comparative Study of Single and Ensemble Classifier Strategies

Amir Masoud Rahmani, Amir Haider, Mohammad Adeli, Olfa Mzoughi, Entesar Gemeay, Mokhtar Mohammadi, Hamid Alinejad-Rokny, Parisa Khoshvaght, Mehdi Hosseinzadeh

This paper explores the efficacy of Mel Frequency Cepstral Coefficients (MFCCs) in detecting abnormal heart sounds using two classification strategies: a single classifier and an ensemble classifier approach. Heart sounds were first pre-processed to remove noise and then segmented into S1, systole, S2, and diastole intervals, with thirteen MFCCs estimated from each segment, yielding 52 MFCCs per beat. Finally, MFCCs were used for heart sound classification. For that purpose, in the single classifier strategy, the MFCCs from nine consecutive beats were averaged to classify heart sounds by a single classifier (either a support vector machine (SVM), the k nearest neighbors (kNN), or a decision tree (DT)). Conversely, the ensemble classifier strategy employed nine classifiers (either nine SVMs, nine kNN classifiers, or nine DTs) to individually assess beats as normal or abnormal, with the overall classification based on the majority vote. Both methods were tested on a publicly available phonocardiogram database. The heart sound classification accuracy was 91.95% for the SVM, 91.9% for the kNN, and 87.33% for the DT in the single classifier strategy. Also, the accuracy was 93.59% for the SVM, 91.84% for the kNN, and 92.22% for the DT in the ensemble classifier strategy. Overall, the results demonstrated that the ensemble classifier strategy improved the accuracies of the DT and the SVM by 4.89% and 1.64%, establishing MFCCs as more effective than other features, including time, time-frequency, and statistical features, evaluated in similar studies.

7/2/2024

Optimising MFCC parameters for the automatic detection of respiratory diseases

Yuyang Yan, Sami O. Simons, Loes van Bemmel, Lauren Reinders, Frits M. E. Franssen, Visara Urovi

Voice signals originating from the respiratory tract are utilized as valuable acoustic biomarkers for the diagnosis and assessment of respiratory diseases. Among the employed acoustic features, Mel Frequency Cepstral Coefficients (MFCC) is widely used for automatic analysis, with MFCC extraction commonly relying on default parameters. However, no comprehensive study has systematically investigated the impact of MFCC extraction parameters on respiratory disease diagnosis. In this study, we address this gap by examining the effects of key parameters, namely the number of coefficients, frame length, and hop length between frames, on respiratory condition examination. Our investigation uses four datasets: the Cambridge COVID-19 Sound database, the Coswara dataset, the Saarbrucken Voice Disorders (SVD) database, and a TACTICAS dataset. The Support Vector Machine (SVM) is employed as the classifier, given its widespread adoption and efficacy. Our findings indicate that the accuracy of MFCC decreases as hop length increases, and the optimal number of coefficients is observed to be approximately 30. The performance of MFCC varies with frame length across the datasets: for the COVID-19 datasets (Cambridge COVID-19 Sound database and Coswara dataset), performance declines with longer frame lengths, while for the SVD dataset, performance improves with increasing frame length (from 50 ms to 500 ms). Furthermore, we investigate the optimized combination of these parameters and observe substantial enhancements in accuracy. Compared to the worst combination, the SVM model achieves an accuracy of 81.1%, 80.6%, and 71.7%, with improvements of 19.6%, 16.10%, and 14.90% for the Cambridge COVID-19 Sound database, the Coswara dataset, and the SVD dataset respectively.

8/15/2024

Model-driven Heart Rate Estimation and Heart Murmur Detection based on Phonocardiogram

Jingping Nie, Ran Liu, Behrooz Mahasseni, Erdrin Azemi, Vikramjit Mitra

Acoustic signals are crucial for health monitoring, particularly heart sounds which provide essential data like heart rate and detect cardiac anomalies such as murmurs. This study utilizes a publicly available phonocardiogram (PCG) dataset to estimate heart rate using model-driven methods and extends the best-performing model to a multi-task learning (MTL) framework for simultaneous heart rate estimation and murmur detection. Heart rate estimates are derived using a sliding window technique on heart sound snippets, analyzed with a combination of acoustic features (Mel spectrogram, cepstral coefficients, power spectral density, root mean square energy). Our findings indicate that a 2D convolutional neural network (2dCNN) is most effective for heart rate estimation, achieving a mean absolute error (MAE) of 1.312 bpm. We systematically investigate the impact of different feature combinations and find that utilizing all four features yields the best results. The MTL model (2dCNN-MTL) achieves accuracy over 95% in murmur detection, surpassing existing models, while maintaining an MAE of 1.636 bpm in heart rate estimation, satisfying the requirements stated by Association for the Advancement of Medical Instrumentation (AAMI).

7/29/2024

Heart Sound Segmentation Using Deep Learning Techniques

Manas Madine

Heart disease remains a leading cause of mortality worldwide. Auscultation, the process of listening to heart sounds, can be enhanced through computer-aided analysis using Phonocardiogram (PCG) signals. This paper presents a novel approach for heart sound segmentation and classification into S1 (LUB) and S2 (DUB) sounds. We employ FFT-based filtering, dynamic programming for event detection, and a Siamese network for robust classification. Our method demonstrates superior performance on the PASCAL heart sound dataset compared to existing approaches.

6/11/2024