MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder

Read original: arXiv:2408.15077 - Published 8/30/2024 by Pavan Uttej Ravva, Behdokht Kiafar, Pinar Kullu, Jicheng Li, Anjana Bhat, Roghayeh Leila Barmaki

Overview

Autism spectrum disorder (ASD) is a condition characterized by challenges in social interaction and understanding communication signals.
Researchers have been exploring the use of deep learning-powered computer vision techniques to monitor progress in therapeutic interventions for ASD.
However, these models are usually trained on private datasets that cannot be shared for privacy reasons, which makes it hard to compare results across different models.

Plain English Explanation

The paper introduces MMASD+, an enhanced version of the open-source Multimodal ASD (MMASD) dataset, to address these challenges. MMASD+ includes diverse data modalities such as 3D-Skeleton, 3D Body Mesh, and Optical Flow data. It also uses the YOLOv8 and Deep SORT algorithms to distinguish the therapist from the children, addressing a significant barrier in the original dataset.

Additionally, the researchers propose a Multimodal Transformer framework to predict 11 action types and the presence of ASD. This framework achieves high accuracy, demonstrating over a 10% improvement compared to models trained on single data modalities. The findings highlight the advantages of integrating multiple data modalities within the Multimodal Transformer framework.

Technical Explanation

MMASD+ consists of diverse data modalities, including 3D-Skeleton, 3D Body Mesh, and Optical Flow data. To distinguish the therapist from the children, a significant barrier in the original dataset, the researchers combine the YOLOv8 object detector with the Deep SORT multi-object tracking algorithm.
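The detect-then-track idea behind this step can be illustrated with a small sketch. It is not the paper's pipeline: the per-frame boxes stand in for detector output, the greedy IoU matcher is a much simpler substitute for Deep SORT (which also uses motion and appearance cues), and the rule "tallest track is the therapist" is a purely hypothetical heuristic. All names (`Track`, `track_people`, `label_roles`) are made up for illustration.

```python
from dataclasses import dataclass, field

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    track_id: int
    boxes: list[Box] = field(default_factory=list)

    def mean_height(self) -> float:
        return sum(b[3] - b[1] for b in self.boxes) / len(self.boxes)


def track_people(frames: list[list[Box]], min_iou: float = 0.3) -> list[Track]:
    """Greedy IoU association (a simplified stand-in for Deep SORT)."""
    tracks: list[Track] = []
    for detections in frames:
        unmatched = list(tracks)  # tracks not yet extended in this frame
        for box in detections:
            best = max(unmatched, key=lambda t: iou(t.boxes[-1], box), default=None)
            if best is not None and iou(best.boxes[-1], box) >= min_iou:
                best.boxes.append(box)
                unmatched.remove(best)
            else:
                tracks.append(Track(track_id=len(tracks), boxes=[box]))
    return tracks


def label_roles(tracks: list[Track]) -> dict[int, str]:
    """Hypothetical rule: the tallest track is the adult therapist.

    Expects at least one track.
    """
    tallest = max(tracks, key=Track.mean_height)
    return {t.track_id: "therapist" if t is tallest else "child" for t in tracks}


if __name__ == "__main__":
    # Two frames, each with one tall and one short person box (made-up values).
    frames = [
        [(100, 50, 200, 450), (300, 250, 380, 450)],
        [(105, 52, 205, 452), (298, 252, 378, 452)],
    ]
    print(label_roles(track_people(frames)))  # {0: 'therapist', 1: 'child'}
```

The point of the sketch is that a stable track ID per person is what makes a per-person label possible at all; how MMASD+ actually assigns the therapist and child roles is described in the original paper.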

Furthermore, the paper proposes a Multimodal Transformer framework to predict 11 action types and the presence of ASD. This framework achieved an accuracy of 95.03% for predicting action types and 96.42% for predicting ASD presence, demonstrating over a 10% improvement compared to models trained on single data modalities.
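To make the idea of attention-based multimodal fusion concrete, here is a minimal sketch in dependency-free Python. It is not the paper's architecture: it assumes each modality has already been encoded into one fixed-size vector, uses a single attention head with random, untrained weights, and attaches two illustrative heads (11 action classes and one ASD probability). The function names and the embedding size `dim` are assumptions.

```python
import math
import random

Vector = list[float]
Matrix = list[Vector]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: Vector) -> Vector:
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def self_attention(tokens: Matrix, wq: Matrix, wk: Matrix, wv: Matrix):
    """Single-head scaled dot-product attention over the modality tokens."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


def fuse_and_predict(skeleton: Vector, mesh: Vector, flow: Vector,
                     n_actions: int = 11, seed: int = 0):
    """Fuse three modality embeddings; return action probs, ASD prob, attention."""
    rng = random.Random(seed)
    tokens = [skeleton, mesh, flow]  # one token per modality (an assumption)
    dim = len(skeleton)
    wq, wk, wv = (random_matrix(dim, dim, rng) for _ in range(3))
    attended, weights = self_attention(tokens, wq, wk, wv)
    # Residual connection, then mean-pool the three tokens into one vector.
    fused = [sum(attended[i][j] + tokens[i][j] for i in range(3)) / 3
             for j in range(dim)]
    action_logits = matmul([fused], random_matrix(dim, n_actions, rng))[0]
    asd_logit = matmul([fused], random_matrix(dim, 1, rng))[0][0]
    asd_prob = 1.0 / (1.0 + math.exp(-asd_logit))
    return softmax(action_logits), asd_prob, weights


if __name__ == "__main__":
    rng = random.Random(1)
    emb = [[rng.random() for _ in range(8)] for _ in range(3)]  # fake embeddings
    action_probs, asd_prob, attn = fuse_and_predict(*emb)
    print(len(action_probs), len(attn), len(attn[0]))  # 11 3 3
```

Because the weights are untrained, the outputs carry no meaning; the sketch only shows how tokens from several modalities can attend to one another before a shared representation feeds both prediction heads. A real implementation would use a deep learning framework and learned encoders for each modality.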

Critical Analysis

The paper highlights the advantages of integrating multiple data modalities within the Multimodal Transformer framework for ASD research. However, the researchers do not address potential limitations or caveats of their approach, such as the generalizability of the dataset or the interpretability of the Transformer model's predictions.

Additionally, the paper does not discuss any potential ethical considerations or privacy concerns related to the use of private, non-public datasets in ASD research. Further exploration of these issues could provide valuable insights for the field.

Conclusion

The MMASD+ dataset and the Multimodal Transformer framework proposed in this paper demonstrate the potential of integrating multiple data modalities for improving the accuracy of ASD-related predictions. These findings could have significant implications for the development of more effective therapeutic interventions and monitoring tools for individuals with autism spectrum disorder.

Related Papers

MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder

Pavan Uttej Ravva, Behdokht Kiafar, Pinar Kullu, Jicheng Li, Anjana Bhat, Roghayeh Leila Barmaki

Autism spectrum disorder (ASD) is characterized by significant challenges in social interaction and comprehending communication signals. Recently, therapeutic interventions for ASD have increasingly utilized deep learning-powered computer vision techniques to monitor individual progress over time. These models are trained on private, non-public datasets from the autism community, creating challenges in comparing results across different models due to privacy-preserving data-sharing issues. This work introduces MMASD+, an enhanced version of the novel open-source dataset called Multimodal ASD (MMASD). MMASD+ consists of diverse data modalities, including 3D-Skeleton, 3D Body Mesh, and Optical Flow data. It integrates the capabilities of YOLOv8 and Deep SORT algorithms to distinguish between the therapist and children, addressing a significant barrier in the original dataset. Additionally, a Multimodal Transformer framework is proposed to predict 11 action types and the presence of ASD. This framework achieves an accuracy of 95.03% for predicting action types and 96.42% for predicting ASD presence, demonstrating over a 10% improvement compared to models trained on single data modalities. These findings highlight the advantages of integrating multiple data modalities within the Multimodal Transformer framework.

8/30/2024

A Novel Dataset for Video-Based Autism Classification Leveraging Extra-Stimulatory Behavior

Manuel Serna-Aguilera, Xuan Bac Nguyen, Han-Seok Seo, Khoa Luu

Autism Spectrum Disorder (ASD) can affect individuals at varying degrees of intensity, with challenges in overall health, communication, and sensory processing, and this often begins at a young age. Thus, it is critical for medical professionals to be able to accurately diagnose ASD in young children, but doing so is difficult. Deep learning can be responsibly leveraged to improve productivity in addressing this task. The availability of data, however, remains a considerable obstacle. Hence, in this work, we introduce the Video ASD dataset, which contains video frame convolutional and attention map feature data, to foster further progress in the task of ASD classification. The original videos showcase children reacting to chemo-sensory stimuli, among auditory, touch, and vision stimuli. This dataset contains the features of the frames spanning 2,467 videos, for a total of approximately 1.4 million frames. Additionally, head pose angles are included to account for head movement noise, as well as full-sentence text labels for the taste and smell videos that describe how the facial expression changes before, immediately after, and long after interaction with the stimuli. In addition to providing features, we also test foundation models on this data to showcase how movement noise affects performance and the need for more data and more complex labels.

9/10/2024

Ensemble Modeling of Multiple Physical Indicators to Dynamically Phenotype Autism Spectrum Disorder

Marie Huynh (Stanford University), Aaron Kline (Stanford University), Saimourya Surabhi (Stanford University), Kaitlyn Dunlap (Stanford University), Onur Cezmi Mutlu (Stanford University), Mohammadmahdi Honarmand (Stanford University), Parnian Azizian (Stanford University), Peter Washington (University of Hawaii at Manoa), Dennis P. Wall (Stanford University)

Early detection of autism, a neurodevelopmental disorder marked by social communication challenges, is crucial for timely intervention. Recent advancements have utilized naturalistic home videos captured via the mobile application GuessWhat. Through interactive games played between children and their guardians, GuessWhat has amassed over 3,000 structured videos from 382 children, both diagnosed with and without Autism Spectrum Disorder (ASD). This collection provides a robust dataset for training computer vision models to detect ASD-related phenotypic markers, including variations in emotional expression, eye contact, and head movements. We have developed a protocol to curate high-quality videos from this dataset, forming a comprehensive training set. Utilizing this set, we trained individual LSTM-based models using eye gaze, head positions, and facial landmarks as input features, achieving test AUCs of 86%, 67%, and 78%, respectively. To boost diagnostic accuracy, we applied late fusion techniques to create ensemble models, improving the overall AUC to 90%. This approach also yielded more equitable results across different genders and age groups. Our methodology offers a significant step forward in the early detection of ASD by potentially reducing the reliance on subjective assessments and making early identification more accessible and equitable.

8/26/2024

MADE-for-ASD: A Multi-Atlas Deep Ensemble Network for Diagnosing Autism Spectrum Disorder

Xuehan Liu, Md Rakibul Hasan, Tom Gedeon, Md Zakir Hossain

In response to the global need for efficient early diagnosis of Autism Spectrum Disorder (ASD), this paper bridges the gap between traditional, time-consuming diagnostic methods and potential automated solutions. We propose a multi-atlas deep ensemble network, MADE-for-ASD, that integrates multiple atlases of the brain's functional magnetic resonance imaging (fMRI) data through a weighted deep ensemble network. Our approach integrates demographic information into the prediction workflow, which enhances ASD diagnosis performance and offers a more holistic perspective on patient profiling. We experiment with the well-known publicly available ABIDE (Autism Brain Imaging Data Exchange) I dataset, consisting of resting state fMRI data from 17 different laboratories around the globe. Our proposed system achieves 75.20% accuracy on the entire dataset and 96.40% on a specific subset, both surpassing reported ASD diagnosis accuracy in ABIDE I fMRI studies. Specifically, our model improves by 4.4 percentage points over prior works on the same amount of data. The model exhibits a sensitivity of 82.90% and a specificity of 69.70% on the entire dataset, and 91.00% and 99.50%, respectively, on the specific subset. We leverage the F-score to pinpoint the top 10 ROI in ASD diagnosis, such as precuneus and anterior cingulate/ventromedial. The proposed system can potentially pave the way for more cost-effective, efficient and scalable strategies in ASD diagnosis. Codes and evaluations are publicly available at https://github.com/hasan-rakibul/MADE-for-ASD.

9/5/2024