From Macro to Micro: Boosting micro-expression recognition via pre-training on macro-expression videos

Read original: arXiv:2405.16451 - Published 6/5/2024 by Hanting Li, Hongjing Niu, Feng Zhao

Overview

This research paper explores a novel approach to boosting micro-expression recognition by pre-training on macro-expression videos.
Micro-expressions are subtle, fleeting facial movements that can reveal a person's true emotions, while macro-expressions are more obvious and longer-lasting.
The authors propose a transfer learning method that leverages the knowledge gained from recognizing macro-expressions to improve the performance of micro-expression recognition models.

Plain English Explanation

Facial expressions can reveal a lot about how a person is feeling. Macro-expressions are big, obvious expressions that last for a while, like a wide smile or a frown. Micro-expressions, on the other hand, are tiny, quick changes in the face that are harder to spot. They can show a person's true emotions, even if they're trying to hide them.

The researchers in this paper had an idea to use what a model learns from recognizing macro-expressions to help it get better at recognizing micro-expressions. This general technique is known as transfer learning. The idea is that the model can take what it has learned about facial movements and expressions from the macro-expression videos and apply that knowledge to the micro-expression videos, making it better at spotting those subtle changes in the face.

By pre-training the model on the macro-expression videos first, the researchers were able to boost the model's performance on micro-expression recognition tasks. This could be really useful for applications like lie detection, emotional intelligence, and understanding human behavior.

Technical Explanation

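The authors propose a generalized transfer learning paradigm, MA2MI (MAcro-expression TO MIcro-expression), to improve micro-expression recognition by pre-training on macro-expression videos. Because the facial patterns of macro-expressions and micro-expressions differ significantly, naive transfer learning is difficult to apply directly. Under MA2MI, the network instead learns to represent subtle facial movement by reconstructing future frames. The authors also propose a two-branch micro-action network (MIACNet) that decouples facial position features from facial action features, which helps the network locate facial actions more accurately.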

The key steps of their method are:

1. Pre-train a facial expression recognition model on a large dataset of macro-expression videos.
2. Fine-tune the pre-trained model on a smaller dataset of micro-expression videos.
3. Evaluate the fine-tuned model's performance on micro-expression recognition tasks.
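
The three steps above can be sketched on a toy problem. What follows is a minimal, hypothetical illustration in plain Python, not the authors' implementation: a four-weight linear model stands in for the network, `macro_w` and `micro_w` are made-up target weights for a large source task and a small, related target task, and every name, learning rate, and sample count is an assumption chosen for the example.

```python
import random

def predict(weights, x):
    """Linear model: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))

def train(weights, data, lr, epochs):
    """Plain SGD on squared error; returns a new weight list."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def mse(weights, data):
    """Mean squared error of the model on a dataset."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def make_data(true_w, n, rng):
    """Noise-free synthetic samples generated from a hidden weight vector."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        data.append((x, predict(true_w, x)))
    return data

rng = random.Random(0)
macro_w = [1.0, -2.0, 0.5, 3.0]   # hypothetical stand-in for the macro-expression task
micro_w = [1.1, -1.9, 0.4, 3.2]   # hypothetical related task with very little data

macro_train = make_data(macro_w, 500, rng)   # large source dataset
micro_train = make_data(micro_w, 5, rng)     # small target dataset
micro_test = make_data(micro_w, 200, rng)

zeros = [0.0] * len(macro_w)

# Step 1: pre-train on the large source dataset.
pretrained = train(zeros, macro_train, lr=0.1, epochs=5)
# Step 2: fine-tune the pre-trained weights on the small target dataset.
finetuned = train(pretrained, micro_train, lr=0.05, epochs=3)
# Baseline: same small dataset and same budget, but starting from scratch.
scratch = train(zeros, micro_train, lr=0.05, epochs=3)

# Step 3: evaluate both models on held-out target data.
finetuned_mse = mse(finetuned, micro_test)
scratch_mse = mse(scratch, micro_test)
print("fine-tuned:", finetuned_mse, "from scratch:", scratch_mse)
```

In this toy setting the pre-trained weights already sit close to the target solution, so a handful of samples and a few update steps are enough, while the from-scratch model is still far away after the same budget. Real MER networks and datasets are far more complex, so the sketch only illustrates the workflow.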

The authors' experiments on three popular MER benchmarks show that this transfer learning approach outperforms training the micro-expression recognition model from scratch. By starting with the knowledge gained from macro-expressions, the model is better equipped to handle the more challenging task of micro-expression recognition.
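
This summary does not state which metrics are used for the comparison. MER benchmarks tend to have imbalanced classes, so results are often reported as unweighted F1 (UF1) and unweighted average recall (UAR), which average per-class scores so that rare classes count as much as common ones. Assuming such metrics, a minimal sketch in plain Python could look like this; the function name and the example labels are hypothetical.

```python
def uf1_and_uar(y_true, y_pred):
    """Return (UF1, UAR): per-class F1 and recall averaged with equal class weight."""
    classes = sorted(set(y_true))  # only classes that occur in the ground truth
    f1_scores, recalls = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
        recalls.append(tp / (tp + fn))
    return sum(f1_scores) / len(classes), sum(recalls) / len(classes)

# Hypothetical predictions on six clips with an imbalanced label distribution.
y_true = ["negative", "negative", "negative", "negative", "positive", "surprise"]
y_pred = ["negative", "negative", "negative", "positive", "positive", "negative"]

uf1, uar = uf1_and_uar(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("UF1:", uf1, "UAR:", uar, "accuracy:", accuracy)
```

Plain accuracy rewards a model for getting the majority class right, whereas the class-averaged scores drop sharply when a rare class such as "surprise" is missed entirely.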

The authors also explore different pre-training and fine-tuning strategies, as well as the impact of dataset size and model architecture on the overall performance.
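
According to the paper's abstract, the network learns to represent subtle facial movement by reconstructing future frames. The sketch below illustrates the general shape of such an objective under strong simplifying assumptions: frames are flat lists of pixel intensities, and the "network" is a hypothetical one-parameter predictor that extrapolates the next frame from the previous two. None of the names or values come from the paper.

```python
def predict_next(prev_frame, curr_frame, alpha):
    """Extrapolate the next frame: current frame plus alpha times the last motion."""
    return [c + alpha * (c - p) for p, c in zip(prev_frame, curr_frame)]

def reconstruction_loss(clip, alpha):
    """Mean squared error between predicted and actual future frames in a clip."""
    total, count = 0.0, 0
    for t in range(1, len(clip) - 1):
        predicted = predict_next(clip[t - 1], clip[t], alpha)
        total += sum((a - b) ** 2 for a, b in zip(predicted, clip[t + 1]))
        count += len(predicted)
    return total / count

def fit_alpha(clip, lr=0.5, steps=100):
    """Gradient descent on the reconstruction loss with respect to alpha."""
    alpha = 0.0
    for _ in range(steps):
        grad, count = 0.0, 0
        for t in range(1, len(clip) - 1):
            for p, c, n in zip(clip[t - 1], clip[t], clip[t + 1]):
                motion = c - p
                # d/d(alpha) of (c + alpha * motion - n) ** 2
                grad += 2.0 * (c + alpha * motion - n) * motion
                count += 1
        alpha -= lr * grad / count
    return alpha

# Toy clip: three "pixels" whose intensity changes at a constant rate per frame.
clip = [[1.0 * t, 5.0 - 0.5 * t, 2.0 + 0.2 * t] for t in range(6)]

alpha = fit_alpha(clip)
print("learned alpha:", alpha)
print("loss before:", reconstruction_loss(clip, 0.0))
print("loss after:", reconstruction_loss(clip, alpha))
```

The supervision signal here comes only from the frames themselves: to lower the loss, the predictor has to capture how the face moves between frames, which is the kind of motion sensitivity that micro-expression recognition depends on.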

Critical Analysis

The authors acknowledge several limitations of their work. First, the performance gains may depend on the specific datasets and model architectures used. Additionally, the transfer learning approach assumes there is some commonality between macro- and micro-expressions, which may not always be the case.

Another potential issue is the reliance on pre-labelled datasets for both macro- and micro-expressions. In real-world scenarios, obtaining accurate labels for micro-expressions can be very challenging, which could limit the practical application of this method.

The authors also note that further research is needed to better understand the relationship between macro- and micro-expressions, and how best to exploit this connection for micro-expression recognition. Exploring alternative transfer learning techniques or integrating other modalities, such as body language or context, may also help improve the generalization of these models.

Conclusion

This research paper presents a novel approach to boosting micro-expression recognition by leveraging pre-training on macro-expression videos. The transfer learning method demonstrated promising results, highlighting the potential of utilizing knowledge gained from more easily recognizable facial expressions to enhance the detection of subtle, fleeting micro-expressions.

The findings of this study could have significant implications for a variety of applications, such as emotional intelligence, lie detection, and human behavior understanding. By improving micro-expression recognition, researchers can gain deeper insights into human emotions and social interactions, paving the way for more advanced AI systems that can better understand and respond to the nuances of human communication.

Related Papers

From Macro to Micro: Boosting micro-expression recognition via pre-training on macro-expression videos

Hanting Li, Hongjing Niu, Feng Zhao

Micro-expression recognition (MER) has drawn increasing attention in recent years due to its potential applications in intelligent medical and lie detection. However, the shortage of annotated data has been the major obstacle to further improve deep-learning based MER methods. Intuitively, utilizing sufficient macro-expression data to promote MER performance seems to be a feasible solution. However, the facial patterns of macro-expressions and micro-expressions are significantly different, which makes naive transfer learning methods difficult to deploy directly. To tackle this issue, we propose a generalized transfer learning paradigm, called MAcro-expression TO MIcro-expression (MA2MI). Under our paradigm, networks can learn the ability to represent subtle facial movement by reconstructing future frames. In addition, we also propose a two-branch micro-action network (MIACNet) to decouple facial position features and facial action features, which can help the network more accurately locate facial action locations. Extensive experiments on three popular MER benchmarks demonstrate the superiority of our method.

6/5/2024

Micro-Expression Recognition by Motion Feature Extraction based on Pre-training

Ruolin Li, Lu Wang, Tingting Yang, Lisheng Xu, Bingyang Ma, Yongchun Li, Hongchao Wei

Micro-expressions (MEs) are spontaneous, unconscious facial expressions that have promising applications in various fields such as psychotherapy and national security. Thus, micro-expression recognition (MER) has attracted more and more attention from researchers. Although various MER methods have emerged especially with the development of deep learning techniques, the task still faces several challenges, e.g. subtle motion and limited training data. To address these problems, we propose a novel motion extraction strategy (MoExt) for the MER task and use additional macro-expression data in the pre-training process. We primarily pretrain the feature separator and motion extractor using the contrastive loss, thus enabling them to extract representative motion features. In MoExt, shape features and texture features are first extracted separately from onset and apex frames, and then motion features related to MEs are extracted based on the shape features of both frames. To enable the model to more effectively separate features, we utilize the extracted motion features and the texture features from the onset frame to reconstruct the apex frame. Through pre-training, the module is enabled to extract inter-frame motion features of facial expressions while excluding irrelevant information. The feature separator and motion extractor are ultimately integrated into the MER network, which is then fine-tuned using the target ME data. The effectiveness of proposed method is validated on three commonly used datasets, i.e., CASME II, SMIC, SAMM, and CAS(ME)3 dataset. The results show that our method performs favorably against state-of-the-art methods.

7/11/2024

Meta-Auxiliary Learning for Micro-Expression Recognition

Jingyao Wang, Yunhan Tian, Yuxuan Yang, Xiaoxin Chen, Changwen Zheng, Wenwen Qiang

Micro-expressions (MEs) are involuntary movements revealing people's hidden feelings, which has attracted numerous interests for its objectivity in emotion detection. However, despite its wide applications in various scenarios, micro-expression recognition (MER) remains a challenging problem in real life due to three reasons, including (i) data-level: lack of data and imbalanced classes, (ii) feature-level: subtle, rapid changing, and complex features of MEs, and (iii) decision-making-level: impact of individual differences. To address these issues, we propose a dual-branch meta-auxiliary learning method, called LightmanNet, for fast and robust micro-expression recognition. Specifically, LightmanNet learns general MER knowledge from limited data through a dual-branch bi-level optimization process: (i) In the first level, it obtains task-specific MER knowledge by learning in two branches, where the first branch is for learning MER features via primary MER tasks, while the other branch is for guiding the model obtain discriminative features via auxiliary tasks, i.e., image alignment between micro-expressions and macro-expressions since their resemblance in both spatial and temporal behavioral patterns. The two branches of learning jointly constrain the model of learning meaningful task-specific MER knowledge while avoiding learning noise or superficial connections between MEs and emotions that may damage its generalization ability. (ii) In the second level, LightmanNet further refines the learned task-specific knowledge, improving model generalization and efficiency. Extensive experiments on various benchmark datasets demonstrate the superior robustness and efficiency of LightmanNet.

4/19/2024

Synergistic Spotting and Recognition of Micro-Expression via Temporal State Transition

Bochao Zou, Zizheng Guo, Wenfeng Qin, Xin Li, Kangsheng Wang, Huimin Ma

Micro-expressions are involuntary facial movements that cannot be consciously controlled, conveying subtle cues with substantial real-world applications. The analysis of micro-expressions generally involves two main tasks: spotting micro-expression intervals in long videos and recognizing the emotions associated with these intervals. Previous deep learning methods have primarily relied on classification networks utilizing sliding windows. However, fixed window sizes and window-level hard classification introduce numerous constraints. Additionally, these methods have not fully exploited the potential of complementary pathways for spotting and recognition. In this paper, we present a novel temporal state transition architecture grounded in the state space model, which replaces conventional window-level classification with video-level regression. Furthermore, by leveraging the inherent connections between spotting and recognition tasks, we propose a synergistic strategy that enhances overall analysis performance. Extensive experiments demonstrate that our method achieves state-of-the-art performance. The codes and pre-trained models are available at https://github.com/zizheng-guo/ME-TST.

9/17/2024