An Explainable Fast Deep Neural Network for Emotion Recognition

Read original: arXiv:2407.14865 - Published 7/23/2024 by Francesco Di Luzio, Antonello Rosato, Massimo Panella

Overview

This paper presents a fast and explainable deep neural network for emotion recognition.
The model aims to achieve high accuracy while maintaining interpretability and low computational cost.
Experiments on benchmark datasets demonstrate the model's effectiveness compared to existing approaches.

Plain English Explanation

The paper describes a new deep learning model designed for recognizing emotions from facial expressions. The key goals are to:

Achieve high accuracy in classifying emotions such as happiness, sadness, and anger.
Provide explanations for how the model makes its predictions, making it more transparent and interpretable.
Run efficiently with low computational requirements, allowing the model to be used in real-time applications.

The researchers developed a neural network approach that meets these objectives. According to the paper's abstract, an explainability method is used to work out which facial landmarks matter for each emotion, and the set of landmarks given to the classifiers is then progressively reduced, which lowers the influence of noisy inputs. Experiments on standard emotion datasets showed this new model outperformed existing approaches in accuracy while maintaining the desired speed and interpretability.

Technical Explanation

The paper proposes an "Explainable Fast Deep Neural Network (EF-DNN)" for emotion recognition. As the abstract describes it, the approach uses binary deep neural classifiers for video-based emotion classification, with detected facial landmarks as input features and an improved version of the Integrated Gradients explainability method to identify the landmarks most relevant to each emotion.

This attribution-based design allows the model to not only classify emotions, but also indicate the specific facial landmarks that contributed to its predictions, which makes the decision-making process more transparent. The same information is used to optimize the number and position of the input landmarks, reducing the impact of noisy ones.
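The paper's abstract names an improved version of Integrated Gradients as the attribution method. The sketch below shows only the standard Integrated Gradients computation, not the authors' improved variant, on a toy logistic classifier. The classifier, the weights, and all function names are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Toy binary classifier (hypothetical): logistic model over landmark features."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def gradient(weights, bias, x):
    """Analytic gradient of the predicted probability with respect to each input."""
    p = predict(weights, bias, x)
    return [p * (1.0 - p) * w for w in weights]

def integrated_gradients(weights, bias, x, baseline, steps=200):
    """Midpoint-rule approximation of Integrated Gradients along the
    straight path from `baseline` to `x`."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = gradient(weights, bias, point)
        total = [t + g for t, g in zip(total, grads)]
    # Scale the averaged gradient by the input's displacement from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

# Illustrative values only: four made-up landmark features.
weights = [2.0, -1.0, 0.0, 0.5]
bias = 0.1
x = [1.0, 0.5, 3.0, -1.0]
baseline = [0.0, 0.0, 0.0, 0.0]
attrs = integrated_gradients(weights, bias, x, baseline)
```

A useful sanity check is the completeness property: the attributions should sum, up to numerical error, to the difference between the prediction at `x` and the prediction at the baseline. A feature the model ignores (weight 0 here) receives zero attribution.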

Experiments on the Aff-Wild2 and AFEW emotion recognition datasets showed the EF-DNN achieved state-of-the-art accuracy while running significantly faster than other deep learning approaches. This efficiency makes the model suitable for real-time applications like facial expression analysis for people with intellectual disabilities.
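The abstract links the lower computational cost to progressively reducing the set of input landmarks. One simple way such a reduction could work is sketched below: rank the landmarks by mean absolute attribution and keep the top few. The top-k rule and the function names are hypothetical, since the paper's actual optimization procedure is not described in this summary.

```python
def rank_landmarks(attributions):
    """Rank landmark indices from most to least relevant.

    `attributions` is a list of per-sample lists, one attribution score
    per landmark. Relevance is the mean absolute attribution (an assumed
    criterion, not necessarily the paper's).
    """
    n_samples = len(attributions)
    n_landmarks = len(attributions[0])
    relevance = [
        sum(abs(sample[j]) for sample in attributions) / n_samples
        for j in range(n_landmarks)
    ]
    return sorted(range(n_landmarks), key=lambda j: relevance[j], reverse=True)

def reduce_landmarks(attributions, keep):
    """Return the indices of the `keep` most relevant landmarks, in index order."""
    return sorted(rank_landmarks(attributions)[:keep])

# Illustrative attribution scores for two samples and four landmarks.
scores = [
    [0.9, -0.1, 0.0, 0.4],
    [-0.7, 0.2, 0.0, 0.5],
]
selected = reduce_landmarks(scores, keep=2)
```

In a progressive scheme, a classifier would be retrained on the selected landmarks and the ranking repeated with a smaller `keep` until accuracy stops improving.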

Critical Analysis

The paper presents a well-designed and thorough evaluation of the proposed EF-DNN model. The authors acknowledge that while the model achieves high accuracy and efficiency, it may not generalize as well to in-the-wild scenarios with more diverse facial expressions and environments.

Additionally, the paper does not delve into the potential biases or fairness issues that could arise from deploying such an emotion recognition system in real-world applications. Further research is needed to ensure the model performs equitably across different demographics and use cases.

Conclusion

This paper introduces a deep learning approach for fast and explainable emotion recognition from facial expressions. By using an explainability method to select the most informative facial landmarks, the model is reported to achieve state-of-the-art accuracy while also indicating which landmarks drive its predictions, making it more transparent and trustworthy.

The efficiency of the EF-DNN model enables its use in real-time applications, such as facial expression analysis for individuals with intellectual disabilities. However, further research is needed to address potential biases and ensure the model's fairness and robustness in diverse real-world settings.

Related Papers

An Explainable Fast Deep Neural Network for Emotion Recognition

Francesco Di Luzio, Antonello Rosato, Massimo Panella

In the context of artificial intelligence, the inherent human attribute of engaging in logical reasoning to facilitate decision-making is mirrored by the concept of explainability, which pertains to the ability of a model to provide a clear and interpretable account of how it arrived at a particular outcome. This study explores explainability techniques for binary deep neural architectures in the framework of emotion classification through video analysis. We investigate the optimization of input features to binary classifiers for emotion recognition, with face landmarks detection using an improved version of the Integrated Gradients explainability method. The main contribution of this paper consists in the employment of an innovative explainable artificial intelligence algorithm to understand the crucial facial landmarks movements during emotional feeling, using this information also for improving the performances of deep learning-based emotion classifiers. By means of explainability, we can optimize the number and the position of the facial landmarks used as input features for facial emotion recognition, lowering the impact of noisy landmarks and thus increasing the accuracy of the developed models. In order to test the effectiveness of the proposed approach, we considered a set of deep binary models for emotion classification trained initially with a complete set of facial landmarks, which are progressively reduced based on a suitable optimization procedure. The obtained results prove the robustness of the proposed explainable approach in terms of understanding the relevance of the different facial points for the different emotions, also improving the classification accuracy and diminishing the computational cost.

7/23/2024

Explainable Emotion Decoding for Human and Computer Vision

Alessio Borriero, Martina Milazzo, Matteo Diano, Davide Orsenigo, Maria Chiara Villa, Chiara Di Fazio, Marco Tamietto, Alan Perotti

Modern Machine Learning (ML) has significantly advanced various research fields, but the opaque nature of ML models hinders their adoption in several domains. Explainable AI (XAI) addresses this challenge by providing additional information to help users understand the internal decision-making process of ML models. In the field of neuroscience, enriching a ML model for brain decoding with attribution-based XAI techniques means being able to highlight which brain areas correlate with the task at hand, thus offering valuable insights to domain experts. In this paper, we analyze human and Computer Vision (CV) systems in parallel, training and explaining two ML models based respectively on functional Magnetic Resonance Imaging (fMRI) and movie frames. We do so by leveraging the StudyForrest dataset, which includes functional Magnetic Resonance Imaging (fMRI) scans of subjects watching the Forrest Gump movie, emotion annotations, and eye-tracking data. For human vision the ML task is to link fMRI data with emotional annotations, and the explanations highlight the brain regions strongly correlated with the label. On the other hand, for computer vision, the input data is movie frames, and the explanations are pixel-level heatmaps. We cross-analyzed our results, linking human attention (obtained through eye-tracking) with XAI saliency on CV models and brain region activations. We show how a parallel analysis of human and computer vision can provide useful information for both the neuroscience community (allocation theory) and the ML community (biological plausibility of convolutional models).

8/2/2024

Post-hoc and manifold explanations analysis of facial expression data based on deep learning

Yang Xiao

The complex information processing system of humans generates a lot of objective and subjective evaluations, making the exploration of human cognitive products of great cutting-edge theoretical value. In recent years, deep learning technologies, which are inspired by biological brain mechanisms, have made significant strides in the application of psychological or cognitive scientific research, particularly in the memorization and recognition of facial data. This paper investigates through experimental research how neural networks process and store facial expression data and associate these data with a range of psychological attributes produced by humans. Researchers utilized the deep learning model VGG16, demonstrating that neural networks can learn and reproduce key features of facial data, thereby storing image memories. Moreover, the experimental results reveal the potential of deep learning models in understanding human emotions and cognitive processes and establish a manifold visualization interpretation of cognitive products or psychological attributes from a non-Euclidean space perspective, offering new insights into enhancing the explainability of AI. This study not only advances the application of AI technology in the field of psychology but also provides a new psychological and theoretical understanding of the information processing of AI. The code is available here: https://github.com/NKUShaw/Psychoinformatics.

4/30/2024

Explainable Facial Expression Recognition for People with Intellectual Disabilities

Silvia Ramis Guarinos, Cristina Manresa Yee, Jose Maria Buades Rubio, Francesc Xavier Gaya-Morey

Facial expression recognition plays an important role in human behaviour, communication, and interaction. Recent neural networks have been shown to perform well at recognizing facial expressions automatically, and different explainability techniques are available to make them more transparent. In this work, we propose a facial expression recognition study for people with intellectual disabilities that would be integrated into a social robot. We train two well-known neural networks with five databases of facial expressions and test them with two databases containing people with and without intellectual disabilities. Finally, we study which regions the models focus on to perceive a particular expression using two different explainability techniques, LIME and RISE, assessing the differences when used on images containing disabled and non-disabled people.

5/21/2024