A pilot protocol and cohort for the investigation of non-pathological variability in speech

Read original: arXiv:2406.07497 - Published 6/12/2024 by Nicholas Cummins, Lauren L. White, Zahia Rahman, Catriona Lucas, Tian Pan, Ewan Carr, Faith Matcham, Johnny Downs, Richard J. Dobson, Judith Dineley

Overview

  • Potential for speech-based biomarkers to remotely and objectively assess symptom severity
  • Complex nature of speech and subtle health changes make findings highly dependent on study methods
  • Need for standardized protocols and better reporting to advance clinical research and practice

Plain English Explanation

Speech patterns can provide valuable insights into a person's health conditions. Researchers are exploring the use of speech-based biomarkers - measurable characteristics of speech that could be used to regularly and objectively assess symptom severity, both remotely and in-clinic.

However, speech is a complex behavior, and the changes associated with health conditions are often subtle. This means that the findings from studies investigating speech-based health assessment can be highly dependent on the specific methods used, such as the choice of recording device, the type of speech task, and the demographic characteristics of the participants. Unfortunately, these important methodological details are often not reported clearly in the research.

To address this issue, the researchers in this study developed a standardized protocol for collecting speech data from healthy individuals. They recorded participants speaking in different ways (e.g., reading, describing a picture, conversing) using various microphone types, and collected detailed information about the participants' characteristics. This pilot dataset and protocol can serve as an example for future studies, helping to improve the consistency and quality of speech-based health research.

Technical Explanation

The researchers conducted a thematic literature review to inform the development of their speech data collection protocol and the choice of speech features to analyze. Their protocol included the elicitation of three different types of speech (read, picture description, and conversation) from participants, recorded using three different microphone types. This approach was chosen to assess the impact of methodological factors, such as device choice and speech task, on the extracted speech features.
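The crossed design described above (three speech tasks, each captured with three microphone types) can be sketched as a small condition grid. This is only an illustration: the summary does not name the microphone types, so `mic_a`, `mic_b` and `mic_c` are placeholder labels, and `RecordingCondition` is a hypothetical structure rather than anything taken from the authors' protocol.

```python
from dataclasses import dataclass
from itertools import product

# The three elicitation tasks named in the summary.
SPEECH_TASKS = ("read", "picture_description", "conversation")
# Placeholder labels: the summary does not say which microphone types were used.
MICROPHONES = ("mic_a", "mic_b", "mic_c")


@dataclass(frozen=True)
class RecordingCondition:
    """One task/microphone pairing (hypothetical structure for illustration)."""
    task: str
    microphone: str


def recording_conditions():
    """Return every task/microphone pairing in the crossed design."""
    return [RecordingCondition(t, m) for t, m in product(SPEECH_TASKS, MICROPHONES)]


conditions = recording_conditions()
print(len(conditions))  # 3 tasks x 3 microphones = 9
```

Tagging every recording with its condition in this way is what makes it possible to compare a feature across devices or tasks later on.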

The researchers collected speech data from 28 healthy individuals three times in one day, with the sessions repeated at the same times of day 8-11 weeks later, and from 25 additional healthy participants recorded three times within one week. Demographic information, such as sex, age, native language status, and voice use habits, was also collected for each participant.
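Repeated recordings of the same healthy speaker make it possible to estimate how much a feature varies without any change in health. One simple way to express this, shown here as a sketch rather than the authors' analysis, is a within-speaker coefficient of variation. The function name, speaker IDs and numbers below are all made up for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def within_speaker_cv(values):
    """Coefficient of variation (sample SD / |mean|) of one speaker's repeated measurements."""
    if len(values) < 2:
        raise ValueError("need at least two repeated measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / abs(m)


# Made-up illustrative values (NOT study data): one feature measured in
# three sessions for two hypothetical speakers.
sessions = {
    "speaker_01": [4.0, 4.2, 3.8],
    "speaker_02": [5.0, 5.0, 5.0],
}
cv_by_speaker = {spk: within_speaker_cv(vals) for spk, vals in sessions.items()}
for spk, cv in cv_by_speaker.items():
    print(f"{spk}: CV = {cv:.3f}")
```

A feature whose within-speaker variation in healthy people is large relative to the change expected from a health condition would be a weak candidate biomarker, which is why normative repeated-measures data of this kind are useful.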

The researchers then developed a processing pipeline to extract 14 exemplar speech features covering various aspects of speech, including timing, prosody, voice quality, articulation, and spectral moment characteristics. This pilot dataset of normative speech data and extracted features can serve as a resource for future research investigating speech-based health assessment.
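The summary mentions spectral moment characteristics without saying exactly which ones were computed or how. As a rough sketch, assuming the common textbook definition in which the normalised power spectrum is treated as a probability distribution over frequency, the first four moments can be computed with NumPy as follows. The function `spectral_moments` is a hypothetical helper, not part of the authors' pipeline.

```python
import numpy as np


def spectral_moments(signal, sample_rate):
    """Return centroid (Hz), spread (Hz), skewness and kurtosis of the power spectrum."""
    signal = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = power.sum()
    if total == 0:
        raise ValueError("signal has no energy")
    weights = power / total  # normalised spectrum, sums to 1
    centroid = np.sum(freqs * weights)
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * weights))
    if np.isclose(spread, 0.0):
        # Higher moments are undefined when all energy sits at one frequency.
        return centroid, spread, float("nan"), float("nan")
    skewness = np.sum((freqs - centroid) ** 3 * weights) / spread ** 3
    kurtosis = np.sum((freqs - centroid) ** 4 * weights) / spread ** 4
    return centroid, spread, skewness, kurtosis


# Synthetic test tone (not speech): equal-amplitude 200 Hz and 400 Hz sinusoids.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate  # one second of samples
tone = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 400 * t)
centroid, spread, skewness, kurtosis = spectral_moments(tone, sample_rate)
print(f"centroid={centroid:.1f} Hz, spread={spread:.1f} Hz")
```

For this symmetric two-tone signal the centroid falls midway between the tones, at about 300 Hz, with a spread of about 100 Hz. In practice such moments are usually computed per short frame of speech rather than over a whole recording.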

Critical Analysis

The researchers acknowledge that their study is limited in scope, as it only involved healthy participants and did not assess speech patterns in individuals with known health conditions. However, this pilot dataset and protocol provide a valuable foundation for future research in this area.

One potential concern is the relatively small sample size, which may limit the generalizability of the normative speech feature values reported in the study. Additionally, the researchers did not provide a detailed analysis of the impact of the various methodological factors (e.g., device choice, speech task) on the extracted speech features.

Further research is needed to quantify the effect of speech pathology on the speech features and to investigate how these features may change with different health conditions. Larger-scale studies with more diverse participant populations would also help to establish more robust normative data and guidelines for the use of speech-based biomarkers in clinical practice.

Conclusion

This study provides an important step towards the development of standardized protocols and best practices for the collection and analysis of speech data in the context of health assessment. The pilot dataset and exemplar speech features can serve as a valuable resource for researchers and clinicians working to advance the use of speech-based biomarkers in remote monitoring and clinical evaluation of symptom severity. Continued efforts to harmonize methodological approaches and reporting standards will be crucial for translating this promising research into practical applications that can benefit patients and healthcare providers.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers

A pilot protocol and cohort for the investigation of non-pathological variability in speech

Nicholas Cummins, Lauren L. White, Zahia Rahman, Catriona Lucas, Tian Pan, Ewan Carr, Faith Matcham, Johnny Downs, Richard J. Dobson, Judith Dineley

Background: Speech-based biomarkers have potential as a means for regular, objective assessment of symptom severity, remotely and in-clinic in combination with advanced analytical models. However, the complex nature of speech and the often subtle changes associated with health mean that findings are highly dependent on methodological and cohort choices. These are often not reported adequately in studies investigating speech-based health assessment.

Objective: To develop and apply an exemplar protocol to generate a pilot dataset of healthy speech with detailed metadata for the assessment of factors in the speech recording-analysis pipeline, including device choice, speech elicitation task and non-pathological variability.

Methods: We developed our collection protocol and choice of exemplar speech features based on a thematic literature review. Our protocol includes the elicitation of three different speech types. With a focus towards remote applications, we also choose to collect speech with three different microphone types. We developed a pipeline to extract a set of 14 exemplar speech features.

Results: We collected speech from 28 individuals three times in one day, repeated at the same times 8-11 weeks later, and from 25 healthy individuals three times in one week. Participant characteristics collected included sex, age, native language status and voice use habits of the participant. A preliminary set of 14 speech features covering timing, prosody, voice quality, articulation and spectral moment characteristics were extracted that provide a resource of normative values.

Conclusions: There are multiple methodological factors involved in the collection, processing and analysis of speech recordings. Consistent reporting and greater harmonisation of study protocols are urgently required to aid the translation of speech processing into clinical research and practice.

6/12/2024

A Comprehensive Rubric for Annotating Pathological Speech

Mario Corrales-Astorgano, David Escudero-Mancebo, Lourdes Aguilar, Valle Flores-Lucas, Valentín Cardeñoso-Payo, Carlos Vivaracho-Pascual, César González-Ferreras

Rubrics are a commonly used tool for labeling voice corpora in speech quality assessment, although their application in the context of pathological speech remains relatively limited. In this study, we introduce a comprehensive rubric based on various dimensions of speech quality, including phonetics, fluency, and prosody. The objective is to establish standardized criteria for identifying errors within the speech of individuals with Down syndrome, thereby enabling the development of automated assessment systems. To achieve this objective, we utilized the Prautocal corpus. To assess the quality of annotations using our rubric, two experiments were conducted, focusing on phonetics and fluency. For phonetic evaluation, we employed the Goodness of Pronunciation (GoP) metric, utilizing automatic segmentation systems and correlating the results with evaluations conducted by a specialized speech therapist. While the obtained correlation values were not notably high, a positive trend was observed. In terms of fluency assessment, deep learning models like wav2vec were used to extract audio features, and we employed an SVM classifier trained on a corpus focused on identifying fluency issues to categorize Prautocal corpus samples. The outcomes highlight the complexities of evaluating such phenomena, with variability depending on the specific type of disfluency detected.

4/30/2024

Survey on biomarkers in human vocalizations

Aki Harma, Bert den Brinker, Ulf Grossekathofer, Okke Ouweltjes, Srikanth Nallanthighal, Sidharth Abrol, Vibhu Sharma

Recent years have witnessed an increase in technologies that use speech for sensing the health of the talker. This survey paper proposes a general taxonomy of the technologies and a broad overview of current progress and challenges. Vocal biomarkers are often secondary measures that approximate the signal of another sensor or identify an underlying mental, cognitive, or physiological state. Their measurement involves disturbances and uncertainties that may be considered noise sources, and the biomarkers are coarsely qualified in terms of the various sources of noise involved in their determination. While in some proposed biomarkers the error levels seem high, there are vocal biomarkers where the errors are expected to be low and which are thus more likely to qualify as candidates for adoption in healthcare applications.

8/12/2024

Exploring Speech Pattern Disorders in Autism using Machine Learning

Chuanbo Hu, Jacob Thrasher, Wenqi Li, Mindi Ruan, Xiangxu Yu, Lynn K Paul, Shuo Wang, Xin Li

Diagnosing autism spectrum disorder (ASD) by identifying abnormal speech patterns from examiner-patient dialogues presents significant challenges due to the subtle and diverse manifestations of speech-related symptoms in affected individuals. This study presents a comprehensive approach to identify distinctive speech patterns through the analysis of examiner-patient dialogues. Utilizing a dataset of recorded dialogues, we extracted 40 speech-related features, categorized into frequency, zero-crossing rate, energy, spectral characteristics, Mel Frequency Cepstral Coefficients (MFCCs), and balance. These features encompass various aspects of speech such as intonation, volume, rhythm, and speech rate, reflecting the complex nature of communicative behaviors in ASD. We employed machine learning for both classification and regression tasks to analyze these speech features. The classification model aimed to differentiate between ASD and non-ASD cases, achieving an accuracy of 87.75%. Regression models were developed to predict speech pattern related variables and a composite score from all variables, facilitating a deeper understanding of the speech dynamics associated with ASD. The effectiveness of machine learning in interpreting intricate speech patterns and the high classification accuracy underscore the potential of computational methods in supporting the diagnostic processes for ASD. This approach not only aids in early detection but also contributes to personalized treatment planning by providing insights into the speech and communication profiles of individuals with ASD.

5/9/2024