Assessing the Efficacy of Deep Learning Approaches for Facial Expression Recognition in Individuals with Intellectual Disabilities

2401.11877

Published 5/30/2024 by F. Xavier Gaya-Morey, Silvia Ramis, Jose M. Buades-Rubio, Cristina Manresa-Yee

Abstract

Facial expression recognition has gained significance as a means of giving social robots the capacity to discern the emotional states of users. Social robots are used in a variety of settings, including homes, nursing homes, and daycare centers, and serve a wide range of users. Deep learning approaches have achieved remarkable performance; however, to the best of our knowledge, their direct use for recognizing facial expressions in individuals with intellectual disabilities has not yet been studied in the literature. To address this gap, we train a set of 12 convolutional neural networks using different approaches, including an ensemble of datasets without individuals with intellectual disabilities and a dataset featuring such individuals. Our examination of the outcomes, covering both the performance and the image regions that are important for the models, reveals significant distinctions in facial expressions between individuals with and without intellectual disabilities, as well as among individuals with intellectual disabilities. Remarkably, our findings show the need for facial expression recognition within this population through tailored user-specific training methodologies, which enable the models to effectively address the unique expressions of each user.

Overview

Facial expression recognition is an important capability for social robots to understand the emotional states of users.
Deep learning has achieved remarkable performance in facial expression recognition, but, to the authors' knowledge, its direct use for individuals with intellectual disabilities has not yet been studied.
This research examines how well deep learning models can recognize facial expressions in individuals with and without intellectual disabilities, and explores the key differences in facial expression patterns between the two groups.

Plain English Explanation

Facial expression recognition is a way for social robots to understand how people are feeling. Deep learning, a type of artificial intelligence, has been very successful at recognizing facial expressions. However, the research team noticed that most of this work has focused on people without intellectual disabilities, and it's not clear how well these models would work for people with intellectual disabilities.

To address this, the researchers trained a set of 12 deep learning models in different ways. Some models were trained on data from people without intellectual disabilities, while others included data from people with intellectual disabilities. The researchers then looked at how well the models performed and what parts of the face were most important for the models' decisions.

Interestingly, the results showed that there are significant differences in facial expressions between people with and without intellectual disabilities, as well as among individuals with intellectual disabilities. This suggests that tailored user-specific training methodologies may be needed to help social robots effectively recognize the unique facial expressions of people with intellectual disabilities.

Technical Explanation

The researchers trained a set of 12 convolutional neural networks, a type of deep learning model, using different approaches. Some models were trained on datasets that did not include individuals with intellectual disabilities, while others were trained on a dataset that did include such individuals.

By examining the performance of the models and the important regions of the face that the models focused on, the researchers found that there are significant differences in facial expressions between individuals with and without intellectual disabilities, as well as among individuals with intellectual disabilities.
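To make the idea of "important regions" concrete, here is a minimal sketch of occlusion sensitivity, one common way to estimate which parts of an image a classifier relies on. It is an illustration only: this summary does not say which explainability technique the authors used, and `occlusion_map`, `score_fn`, and `toy_mouth_score` are hypothetical names, with the toy scorer standing in for a trained CNN. The approach masks one patch at a time and records how much the class score drops.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def occlusion_map(image: Image, score_fn: Callable[[Image], float],
                  patch: int = 2, fill: float = 0.0) -> List[List[float]]:
    """Return a grid of score drops, one entry per occluded patch.

    Larger values mean the scorer relied more on that region.
    `score_fn` is a hypothetical stand-in for a trained model's class score.
    """
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    heatmap = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            occluded = [r[:] for r in image]  # copy so the input stays intact
            for y in range(top, min(top + patch, height)):
                for x in range(left, min(left + patch, width)):
                    occluded[y][x] = fill
            row.append(baseline - score_fn(occluded))
        heatmap.append(row)
    return heatmap


def toy_mouth_score(image: Image) -> float:
    """Toy scorer (assumption): mean intensity of the bottom two rows only."""
    bottom = image[-2:]
    return sum(sum(r) for r in bottom) / (len(bottom) * len(bottom[0]))


if __name__ == "__main__":
    face = [[1.0] * 4 for _ in range(4)]
    for heat_row in occlusion_map(face, toy_mouth_score):
        print(heat_row)
```

With the toy scorer, only the patches covering the bottom rows produce a non-zero drop. Comparing such maps across users is one way to see whether a model attends to different facial regions for different people.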

The findings suggest that tailored user-specific training methodologies may be necessary to enable deep learning models to effectively recognize the unique facial expressions of people with intellectual disabilities. This could involve exploring multimodal fusion-based deep learning networks that incorporate additional sensory data, or testing the performance of face recognition on people with Down syndrome to better understand the specific challenges.
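The contrast between general and user-specific training can be sketched as two ways of splitting the same data. This is a sketch under stated assumptions, not the authors' protocol: the summary does not describe their exact splits, and the `(user_id, image_path)` sample format, the function names, and the 80/20 default are all hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[str, str]  # (user_id, image_path); hypothetical format


def leave_one_user_out(samples: List[Sample],
                       held_out: str) -> Tuple[List[Sample], List[Sample]]:
    """General setting: train on every other user, test on the held-out user."""
    train = [s for s in samples if s[0] != held_out]
    test = [s for s in samples if s[0] == held_out]
    return train, test


def within_user_split(samples: List[Sample], user: str,
                      train_fraction: float = 0.8
                      ) -> Tuple[List[Sample], List[Sample]]:
    """User-specific setting: train and test on the same user's images."""
    own = [s for s in samples if s[0] == user]
    if len(own) < 2:
        raise ValueError("need at least two samples to split for one user")
    # Keep at least one sample on each side of the split.
    cut = min(max(1, int(len(own) * train_fraction)), len(own) - 1)
    return own[:cut], own[cut:]


if __name__ == "__main__":
    data = [("u1", "a.png"), ("u1", "b.png"), ("u1", "c.png"),
            ("u1", "d.png"), ("u2", "e.png")]
    print(leave_one_user_out(data, "u1"))
    print(within_user_split(data, "u1", 0.75))
```

If a model scores much better under the within-user split than under the leave-one-user-out split, that gap is the kind of evidence that would support user-specific training.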

Critical Analysis

The paper provides valuable insights into the challenges of using deep learning for facial expression recognition in individuals with intellectual disabilities. The researchers acknowledge the need for further research to alleviate catastrophic forgetting in facial expression recognition and develop more effective user-specific training methodologies.

One potential limitation of the study is the size and diversity of the dataset featuring individuals with intellectual disabilities. Expanding the dataset and including a wider range of intellectual disabilities could help validate the findings and provide a more comprehensive understanding of the differences in facial expression patterns.

Additionally, the paper does not delve into the potential ethical considerations of deploying facial expression recognition systems for individuals with intellectual disabilities. Issues such as privacy, consent, and the risk of perpetuating biases or stereotypes should be carefully explored in future research.

Conclusion

This research highlights the need for tailored approaches to facial expression recognition for individuals with intellectual disabilities. The significant differences in facial expression patterns between those with and without intellectual disabilities, as well as among individuals with intellectual disabilities, suggest that user-specific training methodologies are crucial for enabling social robots to effectively understand the emotional states of this population.

By addressing these challenges, researchers can help social robots become more inclusive and enhance their ability to provide personalized support and care for individuals with intellectual disabilities in various settings, such as homes, nursing homes, and daycare centers.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Explainable Facial Expression Recognition for People with Intellectual Disabilities

Silvia Ramis Guarinos, Cristina Manresa Yee, Jose Maria Buades Rubio, Francesc Xavier Gaya-Morey

Facial expression recognition plays an important role in human behaviour, communication, and interaction. Recent neural networks have been shown to perform well at recognizing facial expressions automatically, with different explainability techniques available to make them more transparent. In this work, we propose a facial expression recognition study for people with intellectual disabilities that would be integrated into a social robot. We train two well-known neural networks with five databases of facial expressions and test them with two databases containing people with and without intellectual disabilities. Finally, we study which regions the models focus on to perceive a particular expression using two different explainability techniques, LIME and RISE, assessing the differences when used on images containing disabled and non-disabled people.

5/21/2024

cs.HC

Post-hoc and manifold explanations analysis of facial expression data based on deep learning

Yang Xiao

The complex information processing system of humans generates a lot of objective and subjective evaluations, making the exploration of human cognitive products of great cutting-edge theoretical value. In recent years, deep learning technologies, which are inspired by biological brain mechanisms, have made significant strides in the application of psychological or cognitive scientific research, particularly in the memorization and recognition of facial data. This paper investigates through experimental research how neural networks process and store facial expression data and associate these data with a range of psychological attributes produced by humans. Researchers utilized the deep learning model VGG16, demonstrating that neural networks can learn and reproduce key features of facial data, thereby storing image memories. Moreover, the experimental results reveal the potential of deep learning models in understanding human emotions and cognitive processes and establish a manifold visualization interpretation of cognitive products or psychological attributes from a non-Euclidean space perspective, offering new insights into enhancing the explainability of AI. This study not only advances the application of AI technology in the field of psychology but also provides a new psychological theoretical understanding of the information processing of AI. The code is available here: https://github.com/NKUShaw/Psychoinformatics.

4/30/2024

cs.CV cs.AI

Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study

Peter Washington, Haik Kalantarian, John Kent, Arman Husic, Aaron Kline, Emilie Leblanc, Cathy Hou, Onur Cezmi Mutlu, Kaitlyn Dunlap, Yordan Penev, Maya Varma, Nate Tyler Stockham, Brianna Chrisman, Kelley Paskov, Min Woo Sun, Jae-Yoon Jung, Catalin Voss, Nick Haber, Dennis Paul Wall

Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emotion-enriched images to boost the performance of automatic child emotion recognition models to a level closer to what will be needed for digital health care approaches. Methods: We leveraged our prototype therapeutic smartphone game, GuessWhat, which was designed in large part for children with developmental and behavioral conditions, to gamify the secure collection of video data of children expressing a variety of emotions prompted by the game. Independently, we created a secure web interface to gamify the human labeling effort, called HollywoodSquares, tailored for use by any qualified labeler. We gathered and labeled 2155 videos, 39,968 emotion frames, and 106,001 labels on all images. With this drastically expanded pediatric emotion-centric database (>30 times larger than existing public pediatric emotion data sets), we trained a convolutional neural network (CNN) computer vision classifier of happy, sad, surprised, fearful, angry, disgust, and neutral expressions evoked by children. Results: The classifier achieved a 66.9% balanced accuracy and 67.4% F1-score on the entirety of the Child Affective Facial Expression (CAFE) as well as a 79.1% balanced accuracy and 78% F1-score on CAFE Subset A, a subset containing at least 60% human agreement on emotions labels. This performance is at least 10% higher than all previously developed classifiers evaluated against CAFE, the best of which reached a 56% balanced accuracy even when combining anger and disgust into a single class.

6/5/2024

cs.CV cs.CY cs.HC

Exploring a Multimodal Fusion-based Deep Learning Network for Detecting Facial Palsy

Nicole Heng Yim Oo, Min Hun Lee, Jeong Hoon Lim

Algorithmic detection of facial palsy offers the potential to improve current practices, which usually involve labor-intensive and subjective assessment by clinicians. In this paper, we present a multimodal fusion-based deep learning model that utilizes unstructured data (i.e. an image frame with facial line segments) and structured data (i.e. features of facial expressions) to detect facial palsy. We then contribute to a study to analyze the effect of different data modalities and the benefits of a multimodal fusion-based approach using videos of 21 facial palsy patients. Our experimental results show that among various data modalities (i.e. unstructured data - RGB images and images of facial line segments and structured data - coordinates of facial landmarks and features of facial expressions), the feed-forward neural network using features of facial expression achieved the highest precision of 76.22 while the ResNet-based model using images of facial line segments achieved the highest recall of 83.47. When we leveraged both images of facial line segments and features of facial expressions, our multimodal fusion-based deep learning model slightly improved the precision score to 77.05 at the expense of a decrease in the recall score.

5/28/2024

cs.CV cs.AI cs.LG