Explainable Facial Expression Recognition for People with Intellectual Disabilities

2405.11482

Published 5/21/2024 by Silvia Ramis Guarinos, Cristina Manresa Yee, Jose Maria Buades Rubio, Francesc Xavier Gaya-Morey

Abstract

Facial expression recognition plays an important role in human behaviour, communication, and interaction. Recent neural networks have been shown to perform well at recognising facial expressions automatically, and several explainability techniques are available to make them more transparent. In this work, we propose a study of facial expression recognition for people with intellectual disabilities, intended for integration into a social robot. We train two well-known neural networks with five databases of facial expressions and test them with two databases containing people with and without intellectual disabilities. Finally, we study which regions the models focus on to perceive a particular expression using two different explainability techniques, LIME and RISE, and assess the differences when they are applied to images of people with and without disabilities.

Overview

This paper explores using neural networks for facial expression recognition, with a focus on people with intellectual disabilities.
The researchers trained two popular neural network models on five different facial expression databases, then tested them on two databases containing people with and without intellectual disabilities.
They used two explainability techniques, LIME and RISE, to understand which regions of the face the models were focusing on to recognize different expressions.

Plain English Explanation

Facial expression recognition is an important skill for understanding human behavior, communication, and social interaction. Recent advances in neural networks have shown that computers can be quite good at automatically recognizing facial expressions. However, it's not always clear how these neural networks are making their decisions.

In this study, the researchers wanted to see how well neural networks would perform at facial expression recognition for people with intellectual disabilities. They trained two well-known neural network models on five different databases of facial expressions. Then, they tested the models on two new databases - one with people who have intellectual disabilities and one with people who don't.

The researchers also used two different techniques, called LIME and RISE, to try to understand which parts of the face the models were focusing on to recognize each expression. This can help make the models more transparent and explainable.

The results of this study could be important for developing social robots or other technologies that need to be able to interact with and understand people with intellectual disabilities. By understanding how the models perceive facial expressions in this population, the researchers can work to improve the technology and make it more inclusive.

Technical Explanation

The researchers trained two popular neural network architectures, VGG-16 and ResNet-50, on five different databases of facial expressions: CK+, JAFFE, RAF-DB, ExpW, and FER2013. They then tested the models on two new databases: one containing people with intellectual disabilities and one containing people without disabilities.

To understand how the models were perceiving the facial expressions, the researchers used two explainability techniques: LIME and RISE. These methods identify the regions of the face that the models are focusing on when making their predictions. By comparing the results for the two test databases, the researchers could see if the models were prioritizing different facial features for the two groups.
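As a rough illustration of how a RISE-style explanation is computed, the Python sketch below masks an image with random occlusion patterns, scores each masked image, and averages the masks weighted by those scores. It is a simplified sketch, not the authors' code: `rise_saliency`, `score_fn`, and all parameter values are hypothetical, the image is assumed to be a single-channel array, and the masks are upsampled with nearest-neighbour blocks rather than the bilinear interpolation used in the original RISE method.

```python
import numpy as np


def rise_saliency(image, score_fn, n_masks=500, grid=4, p_keep=0.5, seed=0):
    """Estimate a RISE-style saliency map for a single-channel image.

    image    : 2-D array of shape (H, W)
    score_fn : callable mapping a masked image to the model's score for the
               expression being explained (hypothetical stand-in for a CNN)
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    # Each coarse cell covers ceil(h / grid) x ceil(w / grid) pixels.
    cell_h, cell_w = -(-h // grid), -(-w // grid)
    saliency = np.zeros((h, w))
    for _ in range(n_masks):
        # Coarse binary mask with one extra cell per axis so it can be shifted.
        coarse = (rng.random((grid + 1, grid + 1)) < p_keep).astype(float)
        # Nearest-neighbour upsampling (a simplification of RISE's bilinear step).
        full = np.kron(coarse, np.ones((cell_h, cell_w)))
        # Random shift so that cell borders fall at different pixels each time.
        dy, dx = rng.integers(0, cell_h), rng.integers(0, cell_w)
        mask = full[dy:dy + h, dx:dx + w]
        # Pixels that are visible when the score is high accumulate importance.
        saliency += score_fn(image * mask) * mask
    return saliency / (n_masks * p_keep)


# Toy stand-in for a classifier: it only "looks" at one 4x4 patch.
toy_score = lambda img: float(img[4:8, 4:8].mean())
heatmap = rise_saliency(np.ones((16, 16)), toy_score, n_masks=2000)
print(heatmap[4:8, 4:8].mean(), heatmap[12:16, 12:16].mean())
```

In practice `score_fn` would wrap the trained network and return its probability for the expression of interest. With the toy function above, the patch the "model" looks at should receive a higher average saliency than a distant patch.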

Critical Analysis

The researchers acknowledge several limitations of their study. First, the databases used for training and testing were relatively small, which may limit the generalizability of the results. Additionally, the databases did not contain a large number of individuals with intellectual disabilities, so the conclusions about this population may not be robust.

Another potential issue is that the explainability techniques used, LIME and RISE, have their own limitations and biases. It's possible that these methods do not fully capture the decision-making process of the neural networks, which could lead to incomplete or misleading insights.
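One way to see why such explanations can be incomplete: a LIME-style explanation is only a weighted linear surrogate fitted to the model's scores on perturbed copies of the image, so effects that depend on several regions jointly cannot be represented. The Python sketch below illustrates the idea under simplifying assumptions. The function name `lime_grid_explanation`, the `score_fn` stand-in for the classifier, the regular grid used in place of a real superpixel segmentation, and the distance kernel are all illustrative choices, not the setup used in the paper or the API of the `lime` library.

```python
import numpy as np


def lime_grid_explanation(image, score_fn, grid=4, n_samples=1000,
                          kernel_width=0.25, seed=0):
    """Fit a local linear surrogate over grid cells of a single-channel image.

    Returns a (grid, grid) array with one coefficient per cell; larger values
    mean the cell pushes the score up more when it is visible.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    # Assign every pixel to one of grid * grid rectangular segments
    # (assumption: a plain grid instead of a superpixel algorithm).
    rows = np.arange(h) * grid // h
    cols = np.arange(w) * grid // w
    segments = rows[:, None] * grid + cols[None, :]
    n_segments = grid * grid

    # Random on/off patterns over segments; row 0 is the unperturbed image.
    z = rng.integers(0, 2, size=(n_samples, n_segments))
    z[0] = 1
    scores = np.array([score_fn(image * zi[segments]) for zi in z])

    # Samples closer to the original image get more weight
    # (assumption: distance = fraction of segments switched off).
    dist = 1.0 - z.mean(axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)

    # Weighted least squares with an intercept column.
    X = np.hstack([np.ones((n_samples, 1)), z])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(X * sw, scores * sw[:, 0], rcond=None)
    return coef[1:].reshape(grid, grid)


# Toy stand-in for a classifier: its score depends on a single grid cell.
toy_score = lambda img: float(img[4:8, 4:8].mean())
explanation = lime_grid_explanation(np.ones((16, 16)), toy_score)
print(np.round(explanation, 3))
```

Here the toy score depends on one grid cell only, so a linear surrogate can reproduce it. For a real network the fit is only approximate, and the explanation is only as faithful as that approximation.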

Finally, the study does not address the potential ethical implications of using facial expression recognition on marginalized populations, such as concerns about privacy, bias, and the potential for misuse of the technology. These are important considerations that should be carefully examined in future research.

Conclusion

Overall, this study provides valuable insights into the performance of facial expression recognition models on people with intellectual disabilities. The researchers demonstrated that popular neural network architectures can be trained to recognize expressions in this population, but the LIME and RISE explanations point to differences in which facial regions the models rely on for people with and without intellectual disabilities.

The use of explainability techniques like LIME and RISE can help make these models more transparent and accountable, which is crucial for developing inclusive and ethical technologies. Future research should build on these findings to further improve facial expression recognition for people with intellectual disabilities and other marginalized groups.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Assessing the Efficacy of Deep Learning Approaches for Facial Expression Recognition in Individuals with Intellectual Disabilities

F. Xavier Gaya-Morey, Silvia Ramis, Jose M. Buades-Rubio, Cristina Manresa-Yee

Facial expression recognition has gained significance as a means of giving social robots the capacity to discern the emotional states of users. Social robotics is used in a variety of settings, including homes, nursing homes, and daycare centers, and serves a wide range of users. Deep learning approaches have achieved remarkable performance; however, to the best of our knowledge, their direct use for recognizing facial expressions in individuals with intellectual disabilities has not yet been studied in the literature. To address this gap, we train a set of 12 convolutional neural networks under different approaches, including training on an ensemble of datasets without individuals with intellectual disabilities and on a dataset featuring such individuals. Our examination of the outcomes, covering both performance and the image regions that matter to the models, reveals significant distinctions in facial expressions between individuals with and without intellectual disabilities, as well as among individuals with intellectual disabilities. Notably, our findings show the need for tailored, user-specific training methodologies for facial expression recognition within this population, which enable the models to effectively address the unique expressions of each user.

5/30/2024

cs.CV

Post-hoc and manifold explanations analysis of facial expression data based on deep learning

Yang Xiao

The complex information processing system of humans generates many objective and subjective evaluations, making the exploration of human cognitive products of great cutting-edge theoretical value. In recent years, deep learning technologies, which are inspired by biological brain mechanisms, have made significant strides in psychological and cognitive science research, particularly in the memorization and recognition of facial data. This paper investigates experimentally how neural networks process and store facial expression data and associate these data with a range of psychological attributes produced by humans. The researchers used the deep learning model VGG16, demonstrating that neural networks can learn and reproduce key features of facial data, thereby storing image memories. Moreover, the experimental results reveal the potential of deep learning models in understanding human emotions and cognitive processes and establish a manifold visualization interpretation of cognitive products or psychological attributes from a non-Euclidean space perspective, offering new insights into enhancing the explainability of AI. This study not only advances the application of AI technology in the field of psychology but also provides a new psychological and theoretical understanding of information processing in AI. The code is available here: https://github.com/NKUShaw/Psychoinformatics.

4/30/2024

cs.CV cs.AI

Testing the Performance of Face Recognition for People with Down Syndrome

Christian Rathgeb, Mathias Ibsen, Denise Hartmann, Simon Hradetzky, Berglind Ólafsdóttir

The fairness of biometric systems, in particular facial recognition, is often analysed for larger demographic groups, e.g. female vs. male or black vs. white. In contrast to this, minority groups are commonly ignored. This paper investigates the performance of facial recognition algorithms on individuals with Down syndrome, a common chromosomal abnormality that affects approximately one in 1,000 births per year. To do so, a database of 98 individuals with Down syndrome, each represented by at least five facial images, is semi-automatically collected from YouTube. Subsequently, two facial image quality assessment algorithms and five recognition algorithms are evaluated on the newly collected database and on the public facial image databases CelebA and FRGCv2. The results show that the quality scores of facial images for individuals with Down syndrome are comparable to those of individuals without Down syndrome captured under similar conditions. Furthermore, it is observed that face recognition performance decreases significantly for individuals with Down syndrome, which is largely attributed to the increased likelihood of false matches.

5/21/2024

cs.CV

Contextual Emotion Recognition using Large Vision Language Models

Yasaman Etesam, Ozge Nilay Yalçın, Chuxuan Zhang, Angelica Lim

How does the person in the bounding box feel? Achieving human-level recognition of the apparent emotion of a person in real world situations remains an unsolved task in computer vision. Facial expressions are not enough: body pose, contextual knowledge, and commonsense reasoning all contribute to how humans perform this emotional theory of mind task. In this paper, we examine two major approaches enabled by recent large vision language models: 1) image captioning followed by a language-only LLM, and 2) vision language models, under zero-shot and fine-tuned setups. We evaluate the methods on the Emotions in Context (EMOTIC) dataset and demonstrate that a vision language model, fine-tuned even on a small dataset, can significantly outperform traditional baselines. The results of this work aim to help robots and agents perform emotionally sensitive decision-making and interaction in the future.

5/16/2024

cs.CV