Deep Domain Adaptation for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labelled Videos

Read original: arXiv:2008.06392 - Published 7/9/2024 by R. Gnana Praveen, Eric Granger, Patrick Cardinal

Overview

This paper introduces a new deep learning model called WSDA-OR (Weakly-Supervised Domain Adaptation with Ordinal Regression) for estimating pain intensity from facial expressions in videos.
The model is designed to address challenges in facial expression recognition, such as subjective variations and differences in capture conditions across datasets.
The paper explores the use of weakly-supervised learning and domain adaptation techniques to improve the accuracy of pain intensity estimation.

Plain English Explanation

The paper describes a new deep learning model that can estimate the intensity of pain experienced by a person based on the expressions on their face in a video. This is an important capability for healthcare applications, as being able to accurately assess pain levels can help with diagnosis and treatment.

One of the challenges in this area is that people express pain in different ways, and the quality of the video footage can vary depending on how and where it was captured. This can make it difficult for AI models to accurately recognize the facial expressions associated with different levels of pain. To address this, the researchers used weakly-supervised learning and domain adaptation techniques.

Weakly-supervised learning means the model is trained on data that has only coarse or incomplete labels, rather than needing detailed annotations for every frame. This lowers the annotation cost, since collecting and labeling large amounts of video data is a laborious task.

Domain adaptation allows the model to adapt to differences between the training data (the "source" domain) and the data it will be used on in the real world (the "target" domain). This helps the model perform well even when the new data has characteristics that differ from the original training data.

The key innovation in this paper is the WSDA-OR model, which combines weakly-supervised domain adaptation with ordinal regression - a technique that recognizes the inherent ordering of pain intensity levels. The model also leverages the temporal coherence of multiple consecutive frames in a video sequence to improve its estimates.

Technical Explanation

The WSDA-OR model is designed to address the challenges of domain shift and weak supervision in facial expression-based pain intensity estimation. It learns discriminant and domain-invariant feature representations by integrating multiple instance learning with deep adversarial domain adaptation.
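The adversarial part of this description can be written as a min-max objective. The formulation below is a generic sketch in the style of domain-adversarial training, not the paper's exact loss: the symbols $\theta_f$ (feature extractor), $\theta_y$ (intensity predictor), $\theta_d$ (domain discriminator), the trade-off weight $\lambda$, and the three loss names are all assumptions introduced here for illustration.

```latex
\min_{\theta_f,\,\theta_y}\;\max_{\theta_d}\;
  \mathcal{L}_{\text{src}}(\theta_f,\theta_y)
  + \mathcal{L}_{\text{weak}}(\theta_f,\theta_y)
  - \lambda\,\mathcal{L}_{\text{dom}}(\theta_f,\theta_d)
```

Under these assumptions, the discriminator minimizes the domain classification loss $\mathcal{L}_{\text{dom}}$, while the feature extractor is pushed to increase it, which encourages features that look the same for source and target videos. The source loss uses the full labels and the weak loss uses only the coarse sequence-level target labels.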

The model is trained on a fully-labeled source domain dataset (RECOLA) and then adapted to a weakly-labeled target domain dataset (UNBC-McMaster shoulder pain). The target domain labels only provide coarse information about the pain intensity levels, rather than precise annotations.
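Because the target labels are available only per sequence, frame-level predictions have to be aggregated before they can be compared with a label. The sketch below shows one common multiple instance learning choice, averaging the k highest-scoring frames. The function name `topk_mean_pool`, the value of `k`, and the example scores are hypothetical, and the paper's actual pooling rule may differ.

```python
from typing import List


def topk_mean_pool(frame_scores: List[float], k: int = 3) -> float:
    """Aggregate frame-level scores into one sequence-level score.

    Assumption: the k highest-scoring frames are treated as the
    "relevant" instances of the bag (the video sequence).
    """
    if not frame_scores:
        raise ValueError("a sequence needs at least one frame")
    # Clamp k so that short sequences still work.
    k = max(1, min(k, len(frame_scores)))
    top = sorted(frame_scores, reverse=True)[:k]
    return sum(top) / k


# Hypothetical frame-level pain scores for one short video sequence.
scores = [0.2, 0.1, 2.8, 3.1, 3.0, 0.4]
sequence_score = topk_mean_pool(scores, k=3)
print(round(sequence_score, 2))
```

With `k=1` this reduces to max pooling, where a single frame decides the sequence score. A larger `k` ties the sequence label to several frames at once, which is closer in spirit to associating multiple relevant frames with each label.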

WSDA-OR enforces the ordinal relationships among the intensity levels assigned to the target sequences and associates multiple relevant frames to the sequence-level labels. It uses soft Gaussian labels to efficiently represent the weak ordinal sequence-level labels from the target domain.
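The idea of soft Gaussian labels can be illustrated with a minimal sketch. It assumes a fixed number of discrete intensity levels and a Gaussian width `sigma`; both values, along with the function names `soft_gaussian_label` and `soft_cross_entropy`, are hypothetical and not taken from the paper.

```python
import math
from typing import List


def soft_gaussian_label(level: int, num_levels: int, sigma: float = 1.0) -> List[float]:
    """Spread a hard ordinal label into a normalized Gaussian over all levels."""
    if not 0 <= level < num_levels:
        raise ValueError("level must lie in [0, num_levels)")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    weights = [
        math.exp(-((j - level) ** 2) / (2.0 * sigma ** 2))
        for j in range(num_levels)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def soft_cross_entropy(pred: List[float], target: List[float]) -> float:
    """Cross-entropy between a predicted distribution and a soft target."""
    eps = 1e-12  # guards against log(0)
    return -sum(t * math.log(p + eps) for p, t in zip(pred, target))


# Hypothetical setup: 6 intensity levels, true sequence-level label 3.
target = soft_gaussian_label(level=3, num_levels=6, sigma=1.0)
near_miss = soft_gaussian_label(level=2, num_levels=6, sigma=1.0)
far_miss = soft_gaussian_label(level=0, num_levels=6, sigma=1.0)

# A prediction centered one level away is penalized less than one three levels away.
print(soft_cross_entropy(near_miss, target) < soft_cross_entropy(far_miss, target))
```

Unlike a one-hot label, the soft target gives partial credit to adjacent levels, so the loss grows with the distance between predicted and true intensity. This is one simple way to encode an ordinal relationship in a classification-style loss.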

The proposed approach was validated with RECOLA as the source domain and UNBC-McMaster as the target domain, and was also evaluated on the BIOVID and Fatigue datasets for sequence-level estimation.

Critical Analysis

The paper presents a compelling approach to addressing the challenges of weakly-supervised learning and domain adaptation in the context of pain intensity estimation from facial expressions.

One potential limitation is the reliance on coarse, sequence-level labels in the target domain, which may not capture the full complexity of pain expression. The authors acknowledge this and suggest that weakly-supervised test-time domain adaptation could be explored to further improve performance.

Additionally, the paper does not provide extensive details on the architectural choices and hyperparameter tuning of the WSDA-OR model. This makes it difficult to fully evaluate the model's complexity and the potential for overfitting or other issues.

Overall, the research presented in this paper represents a valuable contribution to the field of facial expression-based pain estimation, and the WSDA-OR model shows promise for practical healthcare applications. Further research and validation on larger and more diverse datasets would help to solidify the model's capabilities and generalizability.

Conclusion

This paper introduces a new deep learning model, WSDA-OR, that estimates pain intensity from facial expressions in video data, even when the training and test data come from different domains and the target domain has only weak, sequence-level labels.

The key innovations of the WSDA-OR model include the integration of weakly-supervised learning, ordinal regression, and deep adversarial domain adaptation. This allows the model to learn discriminant and domain-invariant feature representations that can accurately assess pain levels, while overcoming the challenges of subjective variations in facial expressions and differences in data capture conditions.

The validation of the WSDA-OR model on multiple datasets suggests that it could improve the accuracy and practical deployment of facial expression-based pain estimation systems in healthcare applications. Further research and development in this area could lead to more effective pain monitoring and management solutions, benefiting both patients and healthcare providers.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Deep Domain Adaptation for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labelled Videos

R. Gnana Praveen, Eric Granger, Patrick Cardinal

Estimation of pain intensity from facial expressions captured in videos has an immense potential for health care applications. Given the challenges related to subjective variations of facial expressions, and operational capture conditions, the accuracy of state-of-the-art DL models for recognizing facial expressions may decline. Domain adaptation has been widely explored to alleviate the problem of domain shifts that typically occur between video data captured across various source and target domains. Moreover, given the laborious task of collecting and annotating videos, and subjective bias due to ambiguity among adjacent intensity levels, weakly-supervised learning is gaining attention in such applications. State-of-the-art WSL models are typically formulated as regression problems, and do not leverage the ordinal relationship among pain intensity levels, nor temporal coherence of multiple consecutive frames. This paper introduces a new DL model for weakly-supervised DA with ordinal regression that can be adapted using target domain videos with coarse labels provided on a periodic basis. The WSDA-OR model enforces ordinal relationships among intensity levels assigned to target sequences, and associates multiple relevant frames to sequence-level labels. In particular, it learns discriminant and domain-invariant feature representations by integrating multiple instance learning with deep adversarial DA, where soft Gaussian labels are used to efficiently represent the weak ordinal sequence-level labels from target domain. The proposed approach was validated using RECOLA video dataset as fully-labeled source domain data, and UNBC-McMaster shoulder pain video dataset as weakly-labeled target domain data. We have also validated WSDA-OR on BIOVID and Fatigue datasets for sequence level estimation.

7/9/2024

Deep Weakly-Supervised Domain Adaptation for Pain Localization in Videos

R. Gnana Praveen, Eric Granger, Patrick Cardinal

Automatic pain assessment has an important potential diagnostic value for populations that are incapable of articulating their pain experiences. As one of the dominating nonverbal channels for eliciting pain expression events, facial expressions have been widely investigated for estimating the pain intensity of individuals. However, using state-of-the-art deep learning (DL) models in real-world pain estimation applications poses several challenges related to the subjective variations of facial expressions, operational capture conditions, and lack of representative training videos with labels. Given the cost of annotating intensity levels for every video frame, we propose a weakly-supervised domain adaptation (WSDA) technique that allows for training 3D CNNs for spatio-temporal pain intensity estimation using weakly labeled videos, where labels are provided on a periodic basis. In particular, WSDA integrates multiple instance learning into an adversarial deep domain adaptation framework to train an Inflated 3D-CNN (I3D) model such that it can accurately estimate pain intensities in the target operational domain. The training process relies on weak target loss, along with domain loss and source loss for domain adaptation of the I3D model. Experimental results obtained using labeled source domain RECOLA videos and weakly-labeled target domain UNBC-McMaster videos indicate that the proposed deep WSDA approach can achieve a significantly higher level of sequence (bag)-level and frame (instance)-level pain localization accuracy than related state-of-the-art approaches.

7/9/2024

Subject-Based Domain Adaptation for Facial Expression Recognition

Muhammad Osama Zeeshan, Muhammad Haseeb Aslam, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger

Adapting a deep learning model to a specific target individual is a challenging facial expression recognition (FER) task that may be achieved using unsupervised domain adaptation (UDA) methods. Although several UDA methods have been proposed to adapt deep FER models across source and target data sets, multiple subject-specific source domains are needed to accurately represent the intra- and inter-person variability in subject-based adaption. This paper considers the setting where domains correspond to individuals, not entire datasets. Unlike UDA, multi-source domain adaptation (MSDA) methods can leverage multiple source datasets to improve the accuracy and robustness of the target model. However, previous methods for MSDA adapt image classification models across datasets and do not scale well to a more significant number of source domains. This paper introduces a new MSDA method for subject-based domain adaptation in FER. It efficiently leverages information from multiple source subjects (labeled source domain data) to adapt a deep FER model to a single target individual (unlabeled target domain data). During adaptation, our subject-based MSDA first computes a between-source discrepancy loss to mitigate the domain shift among data from several source subjects. Then, a new strategy is employed to generate augmented confident pseudo-labels for the target subject, allowing a reduction in the domain shift between source and target subjects. Experiments performed on the challenging BioVid heat and pain dataset with 87 subjects and the UNBC-McMaster shoulder pain dataset with 25 subjects show that our subject-based MSDA can outperform state-of-the-art methods yet scale well to multiple subject-based source domains.

4/30/2024

Source-Free Domain Adaptation of Weakly-Supervised Object Localization Models for Histology

Alexis Guichemerre, Soufiane Belharbi, Tsiry Mayet, Shakeeb Murtaza, Pourya Shamsolmoali, Luke McCaffrey, Eric Granger

Given the emergence of deep learning, digital pathology has gained popularity for cancer diagnosis based on histology images. Deep weakly supervised object localization (WSOL) models can be trained to classify histology images according to cancer grade and identify regions of interest (ROIs) for interpretation, using inexpensive global image-class annotations. A WSOL model initially trained on some labeled source image data can be adapted using unlabeled target data in cases of significant domain shifts caused by variations in staining, scanners, and cancer type. In this paper, we focus on source-free (unsupervised) domain adaptation (SFDA), a challenging problem where a pre-trained source model is adapted to a new target domain without using any source domain data for privacy and efficiency reasons. SFDA of WSOL models raises several challenges in histology, most notably because they are not intended to adapt for both classification and localization tasks. In this paper, 4 state-of-the-art SFDA methods, each one representative of a main SFDA family, are compared for WSOL in terms of classification and localization accuracy. They are the SFDA-Distribution Estimation, Source HypOthesis Transfer, Cross-Domain Contrastive Learning, and Adaptively Domain Statistics Alignment. Experimental results on the challenging Glas (smaller, colon cancer) and Camelyon16 (larger, breast cancer) histology datasets indicate that these SFDA methods typically perform poorly for localization after adaptation when optimized for classification.

5/14/2024