Enhancing the analysis of murine neonatal ultrasonic vocalizations: Development, evaluation, and application of different mathematical models

Read original: arXiv:2405.12957 - Published 5/29/2024 by Rudolf Herdt, Louisa Kinzel, Johann Georg Maaß, Marvin Walther, Henning Frohlich, Tim Schubert, Peter Maass, Christian Patrick Schaaf

Overview

This paper explores the use of various deep learning models for the classification of ultrasonic vocalizations (USVs) in rodents.
USVs are an important tool for understanding the social communication, affective states, and developmental stages of animals.
The researchers systematically evaluated different neural network architectures, including feedforward networks, convolutional neural networks, residual networks, and a Vision Transformer, for the task of USV classification.
The best-performing model was integrated into a fully automated pipeline that can reliably analyze extensive USV datasets, allowing researchers to specify individual accuracy thresholds for their needs.
The pipeline has proven valuable for identifying key differences in USVs produced by mice with autism-like behaviors, as part of an ongoing phenotyping study.

Plain English Explanation

Rodents, such as mice and rats, communicate using a wide range of high-pitched sounds called ultrasonic vocalizations (USVs). These vocalizations provide valuable insights into the animals' emotional states, social interactions, and developmental stages. Researchers have been exploring the use of deep learning, a type of artificial intelligence, to automatically analyze these USVs.

In this study, the researchers tested different deep learning models, including custom-built networks and well-known architectures like convolutional neural networks and residual networks, to see which one could best classify the different types of USVs. They also refined an entropy-based algorithm that detects the vocalizations in a recording before the model classifies them.

The best-performing model was then integrated into a fully automated pipeline that can analyze large datasets of USVs with high reliability. This pipeline allows researchers to set their own accuracy thresholds, so they can choose to have the model classify only the most confident calls, leaving the rest for manual review.

The researchers have used this pipeline as part of an ongoing study to identify differences in the USVs produced by mice with autism-like behaviors. This automated system for analyzing animal vocalizations has proven to be a valuable tool for understanding the communication and development of these animals.

Technical Explanation

The researchers in this study focused on developing and evaluating various deep learning models for the classification of ultrasonic vocalizations (USVs) in rodents. They assessed a range of feedforward neural network architectures, including a custom-built, fully-connected network and a convolutional neural network (CNN), as well as different residual neural networks (ResNets), an EfficientNet, and a Vision Transformer (ViT).

To prepare the data for the models, the researchers implemented a refined, entropy-based detection algorithm that achieved a recall of 94.9% and a precision of 99.3%. This detection step locates and segments the individual USV calls in a recording so that each call can be passed to the classifier.
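The summary does not describe how the paper's detector works internally, so the following is only a minimal sketch of one common entropy-based idea, not the authors' algorithm. It assumes that a tonal call concentrates spectral power in a few frequency bins (low spectral entropy) while background noise spreads it out (high entropy). The function names, the frame sizes, the 0.6 threshold, and the synthetic 70 kHz test tone are all illustrative assumptions.

```python
import numpy as np

def spectral_entropy(signal, frame_len=256, hop=128):
    """Normalized Shannon entropy of each frame's power spectrum.

    Values near 0 indicate a tonal frame, values near 1 a noise-like frame.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    entropies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        p = power / (power.sum() + 1e-12)  # treat the spectrum as a distribution
        entropies[i] = -(p * np.log(p + 1e-12)).sum() / np.log(len(p))
    return entropies

def detect_calls(entropies, threshold):
    """Return (start_frame, end_frame) pairs, end exclusive, where entropy < threshold."""
    active = entropies < threshold
    padded = np.concatenate(([False], active, [False]))
    # Indices where the active flag flips mark segment starts and ends.
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(changes[0::2].tolist(), changes[1::2].tolist()))

# Synthetic example (hypothetical values): 0.1 s of noise with a 70 kHz tone in the middle.
rng = np.random.default_rng(0)
sample_rate = 250_000
signal = 0.1 * rng.standard_normal(25_000)
t = np.arange(5_000) / sample_rate
signal[10_000:15_000] += np.sin(2 * np.pi * 70_000 * t)

entropies = spectral_entropy(signal)
segments = detect_calls(entropies, threshold=0.6)
print(segments)  # frame indices; multiply by hop / sample_rate to get seconds
```

A real detector would also need steps this sketch omits, such as band-limiting to the ultrasonic range, merging nearby segments, and discarding very short ones. Recall and precision, as reported above, would then be measured by comparing detected segments against manually annotated calls.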

The best-performing architecture, which achieved an accuracy of 86.79%, was then integrated into a fully automated pipeline for the analysis of extensive USV datasets. This pipeline enables researchers to specify a minimum accuracy threshold for the model's classifications, allowing them to selectively classify the high-confidence calls and leave the rest for manual inspection.
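The thresholding idea can be illustrated with a short sketch. The abstract states that the pipeline classifies only calls with high pseudo-probability and leaves the rest for manual inspection; how the authors map a requested minimum accuracy to a pseudo-probability cutoff is not described here, so the validation-set search below is an assumption, and all function names and numbers are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Convert raw network outputs to pseudo-probabilities, row by row."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def split_by_confidence(probs, threshold):
    """Return predicted labels and a mask of calls confident enough to auto-label."""
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    return labels, confidence >= threshold

def pick_threshold(val_probs, val_labels, min_accuracy):
    """Smallest cutoff whose accepted validation calls reach min_accuracy, else None."""
    confidence = val_probs.max(axis=1)
    correct = val_probs.argmax(axis=1) == val_labels
    for cutoff in np.sort(np.unique(confidence)):
        accepted = confidence >= cutoff
        if correct[accepted].mean() >= min_accuracy:
            return float(cutoff)
    return None

# Toy validation set with two call classes (made-up pseudo-probabilities).
val_probs = np.array([[0.90, 0.10],
                      [0.60, 0.40],
                      [0.55, 0.45],
                      [0.20, 0.80],
                      [0.35, 0.65]])
val_labels = np.array([0, 1, 0, 1, 0])

cutoff = pick_threshold(val_probs, val_labels, min_accuracy=0.9)
labels, auto_mask = split_by_confidence(val_probs, cutoff)
coverage = auto_mask.mean()  # fraction of calls labeled without manual review
print(cutoff, labels[auto_mask], coverage)
```

Raising the requested accuracy generally lowers the coverage, so more calls are routed to manual inspection. That is the trade-off the semi-automated setup exposes to the user.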

The researchers have used this semi-automated pipeline as part of an ongoing phenotyping study, where it has proven to be a valuable tool for identifying key differences in the USVs produced by mice with autism-like behaviors.

Critical Analysis

The researchers in this study have made a significant contribution to the field of animal communication research by developing a highly reliable and flexible deep learning-based system for the analysis of ultrasonic vocalizations (USVs) in rodents.

One of the key strengths of this work is the systematic evaluation of a wide range of neural network architectures, which allowed the researchers to identify the best-performing model for this specific task. By integrating this model into a customizable, semi-automated pipeline, the researchers have created a tool that can be tailored to the needs of individual researchers, enabling them to balance the trade-off between automation and manual review.

However, it is important to note that the study focuses exclusively on neonatal USVs, and the researchers acknowledge that further research is needed to generalize the approach to other developmental stages and animal species. Additionally, while the pipeline has proven valuable for the researchers' ongoing phenotyping study, it would be beneficial to see the tool validated on a broader range of research questions and experimental scenarios.

Developing acoustic models for automatic speech recognition in Swedish is an example of deep learning applied to a related audio task, human speech recognition, and the insights from that work could potentially be leveraged to further refine and improve the USV classification pipeline presented in this paper.

Overall, this research represents an important step forward in the automated analysis of animal communication, and the researchers have provided a valuable tool that can significantly enhance the efficiency and reliability of USV research.

Conclusion

This paper presents a systematic evaluation of different deep learning models for the classification of ultrasonic vocalizations (USVs) in rodents. The researchers developed a refined, entropy-based detection algorithm and integrated the best-performing model into a fully automated pipeline that can reliably analyze extensive USV datasets.

The pipeline's semi-automated approach, which allows researchers to specify individual accuracy thresholds, has proven to be a valuable tool for identifying key differences in the USVs produced by mice with autism-like behaviors. This work represents an important advancement in the field of animal communication research, providing researchers with a powerful tool for the efficient and reliable analysis of these important social signals.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Enhancing the analysis of murine neonatal ultrasonic vocalizations: Development, evaluation, and application of different mathematical models

Rudolf Herdt, Louisa Kinzel, Johann Georg Maaß, Marvin Walther, Henning Frohlich, Tim Schubert, Peter Maass, Christian Patrick Schaaf

Rodents employ a broad spectrum of ultrasonic vocalizations (USVs) for social communication. As these vocalizations offer valuable insights into affective states, social interactions, and developmental stages of animals, various deep learning approaches have aimed to automate both the quantitative (detection) and qualitative (classification) analysis of USVs. Here, we present the first systematic evaluation of different types of neural networks for USV classification. We assessed various feedforward networks, including a custom-built, fully-connected network and convolutional neural network, different residual neural networks (ResNets), an EfficientNet, and a Vision Transformer (ViT). Paired with a refined, entropy-based detection algorithm (achieving recall of 94.9% and precision of 99.3%), the best architecture (achieving 86.79% accuracy) was integrated into a fully automated pipeline capable of analyzing extensive USV datasets with high reliability. Additionally, users can specify an individual minimum accuracy threshold based on their research needs. In this semi-automated setup, the pipeline selectively classifies calls with high pseudo-probability, leaving the rest for manual inspection. Our study focuses exclusively on neonatal USVs. As part of an ongoing phenotyping study, our pipeline has proven to be a valuable tool for identifying key differences in USVs produced by mice with autism-like behaviors.

5/29/2024

An automatic analysis of ultrasound vocalisations for the prediction of interaction context in captive Egyptian fruit bats

Andreas Triantafyllopoulos, Alexander Gebhard, Manuel Milling, Simon Rampp, Bjorn Schuller

Prior work in computational bioacoustics has mostly focused on the detection of animal presence in a particular habitat. However, animal sounds contain much richer information than mere presence; among others, they encapsulate the interactions of those animals with other members of their species. Studying these interactions is almost impossible in a naturalistic setting, as the ground truth is often lacking. The use of animals in captivity instead offers a viable alternative pathway. However, most prior works follow a traditional, statistics-based approach to analysing interactions. In the present work, we go beyond this standard framework by attempting to predict the underlying context in interactions between captive Rousettus aegyptiacus using deep neural networks. We reach an unweighted average recall of over 30% -- more than thrice the chance level -- and show error patterns that differ from our statistical analysis. This work thus represents an important step towards the automatic analysis of states in animals from sound.

6/11/2024

Advanced Framework for Animal Sound Classification With Features Optimization

Qiang Yang, Xiuying Chen, Changsheng Ma, Carlos M. Duarte, Xiangliang Zhang

The automatic classification of animal sounds presents an enduring challenge in bioacoustics, owing to the diverse statistical properties of sound signals, variations in recording equipment, and prevalent low Signal-to-Noise Ratio (SNR) conditions. Deep learning models like Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) have excelled in human speech recognition but have not been effectively tailored to the intricate nature of animal sounds, which exhibit substantial diversity even within the same domain. We propose an automated classification framework applicable to general animal sound classification. Our approach first optimizes audio features from Mel-frequency cepstral coefficients (MFCC) including feature rearrangement and feature reduction. It then uses the optimized features for the deep learning model, i.e., an attention-based Bidirectional LSTM (Bi-LSTM), to extract deep semantic features for sound classification. We also contribute an animal sound benchmark dataset encompassing oceanic animals and birds. Extensive experimentation with real-world datasets demonstrates that our approach consistently outperforms baseline methods by over 25% in precision, recall, and accuracy, promising advancements in animal sound classification.

7/8/2024

Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis

Jialu Li, Mark Hasegawa-Johnson, Karrie Karahalios

The assessment of children at risk of autism typically involves a clinician observing, taking notes, and rating children's behaviors. A machine learning model that can label adult and child audio may largely save labor in coding children's behaviors, helping clinicians capture critical events and better communicate with parents. In this study, we leverage Wav2Vec 2.0 (W2V2), pre-trained on 4300 hours of home audio of children under 5 years old, to build a unified system for tasks of clinician-child speaker diarization and vocalization classification (VC). To enhance children's VC, we build a W2V2 phoneme recognition system for children under 4 years old, and we incorporate its phonetically-tuned embeddings as auxiliary features or recognize pseudo phonetic transcripts as an auxiliary task. We test our method on two corpora (Rapid-ABC and BabbleCor) and obtain consistent improvements. Additionally, we outperform the state-of-the-art performance on the reproducible subset of BabbleCor. Code available at https://huggingface.co/lijialudew

6/7/2024