Introducing 3DCNN ResNets for ASD full-body kinematic assessment: a comparison with hand-crafted features

Read original: arXiv:2311.14533 - Published 6/27/2024 by Alberto Altozano, Maria Eleonora Minissi, Mariano Alcañiz, Javier Marín-Morales

Materials

The paper discusses a study that compares two approaches for assessing autism spectrum disorder (ASD) using full-body tracking data: feature engineering and end-to-end deep learning.

Overview

The study investigates the effectiveness of feature engineering versus end-to-end deep learning for ASD assessment using full-body tracking data.
The data came from a virtual reality environment with multiple motor tasks that the authors developed, and models were trained on it using both approaches.
The feature engineering approach involves extracting and selecting relevant features from the full-body tracking data, while the end-to-end deep learning approach directly learns features and a classification model from the raw data.
The performance of the two approaches is compared on the task of ASD assessment.

Plain English Explanation

This study is looking at two different ways to use full-body tracking data to assess whether someone has autism spectrum disorder (ASD). The first approach is called "feature engineering," where the researchers take the full-body tracking data and carefully select and extract certain features that they think are relevant for identifying ASD. The second approach is "end-to-end deep learning," where the researchers use a deep neural network to automatically learn the relevant features and how to use them to identify ASD, without the manual feature selection step.

The researchers tested both approaches on full-body tracking data from people with and without ASD, recorded while they performed several motor tasks in a virtual reality environment. The goal was to see which approach works better for accurately assessing whether someone has ASD based on their full-body movements.

Technical Explanation

The paper evaluates and compares two approaches for ASD assessment using full-body tracking data:

Feature Engineering: This approach involves carefully selecting and extracting relevant features from the full-body tracking data, such as measures of movement, posture, and coordination. The researchers then use these engineered features as input to a machine learning classifier to predict ASD.
End-to-End Deep Learning: This approach uses a deep neural network to directly learn relevant features and a classification model from the raw full-body tracking data, without the manual feature engineering step. The deep learning model takes the raw data as input and outputs a prediction of whether the individual has ASD.
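
To make the contrast between the two approaches concrete, here is a minimal Python sketch of the feature engineering side. It is an illustration under assumptions, not the paper's actual feature set: the function names, the choice of speed and acceleration statistics, and the toy trajectory are all hypothetical. An end-to-end model would skip this step and receive the raw position sequence directly.

```python
import math
from statistics import mean, pstdev

def joint_speeds(positions, dt):
    """Frame-to-frame speed of one joint from a list of (x, y, z) positions."""
    return [math.dist(a, b) / dt for a, b in zip(positions, positions[1:])]

def handcrafted_features(positions, dt):
    """Summarize one joint trajectory as a small fixed-length feature set.

    The specific statistics are illustrative assumptions, not the features
    used in the paper.
    """
    speeds = joint_speeds(positions, dt)
    accels = [(v2 - v1) / dt for v1, v2 in zip(speeds, speeds[1:])]
    return {
        "mean_speed": mean(speeds),
        "std_speed": pstdev(speeds),
        "max_speed": max(speeds),
        "mean_abs_accel": mean(abs(a) for a in accels),
    }

# Toy trajectory (hypothetical): one joint moving along x, sampled at 30 Hz.
trajectory = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0), (0.6, 0.0, 0.0)]
features = handcrafted_features(trajectory, dt=1 / 30)
```

The resulting dictionary is what a conventional classifier would consume. The design trade-off is that every statistic here encodes a prior belief about which aspects of movement matter, which is exactly the domain knowledge an end-to-end model tries to do without.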

The researchers developed a virtual reality environment with multiple motor tasks and trained models using both approaches on the resulting full-body tracking data. They validated the models with repeated cross-validation and compared the performance of the feature engineering and end-to-end deep learning approaches on the task of ASD assessment.
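
The paper's abstract says the authors prioritized a reliable validation framework with repeated cross-validation. The sketch below shows the general idea in plain Python: reshuffle the samples several times, run k-fold on each shuffle, and report the mean and spread of the scores. The fold count, repeat count, and the placeholder scorer are assumptions for illustration. Whether the paper's splits are stratified or grouped by participant is not stated in this summary, so neither is implemented here.

```python
import random
from statistics import mean, stdev

def repeated_kfold_indices(n_samples, n_splits, n_repeats, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        folds = [order[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            test_idx = folds[k]
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx

def evaluate(train_and_score, n_samples, n_splits=5, n_repeats=10, seed=0):
    """Call a user-supplied train_and_score(train_idx, test_idx) on every split
    and return the mean and sample standard deviation of the scores."""
    scores = [
        train_and_score(train_idx, test_idx)
        for train_idx, test_idx in repeated_kfold_indices(
            n_samples, n_splits, n_repeats, seed
        )
    ]
    return mean(scores), stdev(scores)

# Placeholder scorer (hypothetical): NOT a real model. It only returns the
# fraction of samples held out, so the example runs without any data.
def dummy_scorer(train_idx, test_idx):
    return len(test_idx) / (len(train_idx) + len(test_idx))

avg, spread = evaluate(dummy_scorer, n_samples=20, n_splits=5, n_repeats=3)
```

Reporting a mean together with its spread over many splits is what makes figures of the form "accuracy ± deviation" meaningful, and it is also what allows the variance of two modeling approaches to be compared.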

Critical Analysis

The paper provides a thorough comparison of feature engineering and end-to-end deep learning approaches for ASD assessment using full-body tracking data. However, the study does not address certain limitations:

The datasets used are relatively small, which may limit the generalizability of the results. Larger and more diverse datasets would be needed to further validate the findings.
The paper does not discuss the interpretability of the deep learning models, which is an important consideration for clinical applications where model decisions need to be explained.
The study focuses on binary classification (ASD vs. non-ASD), but it would be valuable to explore the ability of these approaches to provide more nuanced assessments, such as identifying different severity levels of ASD.

Additional research is needed to better understand the trade-offs between feature engineering and end-to-end deep learning for ASD assessment, particularly in terms of model interpretability, generalizability, and ability to provide detailed assessments.

Conclusion

This study compares two approaches, feature engineering and end-to-end deep learning, for assessing autism spectrum disorder (ASD) using full-body tracking data. According to the abstract, the proposed 3DCNN ResNet reaches a maximum accuracy of 85±3% and a higher mean AUC of 0.80±0.03, with less variability across VR tasks, while feature-engineered models outperform it in certain tasks. More research is needed to fully understand the strengths and limitations of each approach, especially in terms of model interpretability, dataset size and diversity, and ability to provide nuanced assessments. The findings of this study contribute to the ongoing efforts to develop more objective and accurate tools for ASD diagnosis and monitoring.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Introducing 3DCNN ResNets for ASD full-body kinematic assessment: a comparison with hand-crafted features

Alberto Altozano, Maria Eleonora Minissi, Mariano Alcañiz, Javier Marín-Morales

Autism Spectrum Disorder (ASD) is characterized by challenges in social communication and restricted patterns, with motor abnormalities gaining traction for early detection. However, kinematic analysis in ASD is limited, often lacking robust validation and relying on hand-crafted features for single tasks, leading to inconsistencies across studies. End-to-end models have emerged as promising methods to overcome the need for feature engineering. Our aim is to propose a newly adapted 3DCNN ResNet and compare it to widely used hand-crafted features for motor ASD assessment. Specifically, we developed a virtual reality environment with multiple motor tasks and trained models using both approaches. We prioritized a reliable validation framework with repeated cross-validation. Results show the proposed model achieves a maximum accuracy of 85±3%, outperforming state-of-the-art end-to-end models with short 1-to-3 minute samples. Our comparative analysis with hand-crafted features shows feature-engineered models outperformed our end-to-end model in certain tasks. However, our end-to-end model achieved a higher mean AUC of 0.80±0.03. Additionally, statistical differences were found in model variance, with our end-to-end model providing more consistent results with less variability across all VR tasks, demonstrating domain generalization and reliability. These findings show that end-to-end models enable less variable and context-independent ASD classification without requiring domain knowledge or task specificity. However, they also recognize the effectiveness of hand-crafted features in specific task scenarios.

6/27/2024

Ensemble Modeling of Multiple Physical Indicators to Dynamically Phenotype Autism Spectrum Disorder

Marie Huynh (Stanford University), Aaron Kline (Stanford University), Saimourya Surabhi (Stanford University), Kaitlyn Dunlap (Stanford University), Onur Cezmi Mutlu (Stanford University), Mohammadmahdi Honarmand (Stanford University), Parnian Azizian (Stanford University), Peter Washington (University of Hawaii at Manoa), Dennis P. Wall (Stanford University)

Early detection of autism, a neurodevelopmental disorder marked by social communication challenges, is crucial for timely intervention. Recent advancements have utilized naturalistic home videos captured via the mobile application GuessWhat. Through interactive games played between children and their guardians, GuessWhat has amassed over 3,000 structured videos from 382 children, both diagnosed with and without Autism Spectrum Disorder (ASD). This collection provides a robust dataset for training computer vision models to detect ASD-related phenotypic markers, including variations in emotional expression, eye contact, and head movements. We have developed a protocol to curate high-quality videos from this dataset, forming a comprehensive training set. Utilizing this set, we trained individual LSTM-based models using eye gaze, head positions, and facial landmarks as input features, achieving test AUCs of 86%, 67%, and 78%, respectively. To boost diagnostic accuracy, we applied late fusion techniques to create ensemble models, improving the overall AUC to 90%. This approach also yielded more equitable results across different genders and age groups. Our methodology offers a significant step forward in the early detection of ASD by potentially reducing the reliance on subjective assessments and making early identification more accessible and equitable.

8/26/2024

A Novel Dataset for Video-Based Autism Classification Leveraging Extra-Stimulatory Behavior

Manuel Serna-Aguilera, Xuan Bac Nguyen, Han-Seok Seo, Khoa Luu

Autism Spectrum Disorder (ASD) can affect individuals at varying degrees of intensity, with challenges in overall health, communication, and sensory processing, and this often begins at a young age. Thus, it is critical for medical professionals to be able to accurately diagnose ASD in young children, but doing so is difficult. Deep learning can be responsibly leveraged to improve productivity in addressing this task. The availability of data, however, remains a considerable obstacle. Hence, in this work, we introduce the Video ASD dataset--a dataset that contains video frame convolutional and attention map feature data--to foster further progress in the task of ASD classification. The original videos showcase children reacting to chemo-sensory stimuli, among auditory, touch, and vision. This dataset contains the features of the frames spanning 2,467 videos, for a total of approximately 1.4 million frames. Additionally, head pose angles are included to account for head movement noise, as well as full-sentence text labels for the taste and smell videos that describe how the facial expression changes before, immediately after, and long after interaction with the stimuli. In addition to providing features, we also test foundation models on this data to showcase how movement noise affects performance and the need for more data and more complex labels.

9/10/2024

MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder

Pavan Uttej Ravva, Behdokht Kiafar, Pinar Kullu, Jicheng Li, Anjana Bhat, Roghayeh Leila Barmaki

Autism spectrum disorder (ASD) is characterized by significant challenges in social interaction and comprehending communication signals. Recently, therapeutic interventions for ASD have increasingly utilized deep learning-powered computer vision techniques to monitor individual progress over time. These models are trained on private, non-public datasets from the autism community, creating challenges in comparing results across different models due to privacy-preserving data-sharing issues. This work introduces MMASD+, an enhanced version of the novel open-source dataset called Multimodal ASD (MMASD). MMASD+ consists of diverse data modalities, including 3D-Skeleton, 3D Body Mesh, and Optical Flow data. It integrates the capabilities of Yolov8 and Deep SORT algorithms to distinguish between the therapist and children, addressing a significant barrier in the original dataset. Additionally, a Multimodal Transformer framework is proposed to predict 11 action types and the presence of ASD. This framework achieves an accuracy of 95.03% for predicting action types and 96.42% for predicting ASD presence, demonstrating over a 10% improvement compared to models trained on single data modalities. These findings highlight the advantages of integrating multiple data modalities within the Multimodal Transformer framework.

8/30/2024