Unsupervised Statistical Feature-Guided Diffusion Model for Sensor-based Human Activity Recognition

2306.05285

Published 5/21/2024 by Si Zuo, Vitor Fortes Rey, Sungho Suh, Stephan Sigg, Paul Lukowicz

Abstract

Human activity recognition (HAR) from on-body sensors is a core functionality in many AI applications: from personal health, through sports and wellness to Industry 4.0. A key problem holding up progress in wearable sensor-based HAR, compared to other ML areas, such as computer vision, is the unavailability of diverse and labeled training data. Particularly, while there are innumerable annotated images available in online repositories, freely available sensor data is sparse and mostly unlabeled. We propose an unsupervised statistical feature-guided diffusion model specifically optimized for wearable sensor-based human activity recognition with devices such as inertial measurement unit (IMU) sensors. The method generates synthetic labeled time-series sensor data without relying on annotated training data. Thereby, it addresses the scarcity and annotation difficulties associated with real-world sensor data. By conditioning the diffusion model on statistical information such as mean, standard deviation, Z-score, and skewness, we generate diverse and representative synthetic sensor data. We conducted experiments on public human activity recognition datasets and compared the method to conventional oversampling and state-of-the-art generative adversarial network methods. Experimental results demonstrate that this can improve the performance of human activity recognition and outperform existing techniques.

Overview

Human activity recognition (HAR) from wearable sensors is a crucial capability for many AI applications, but progress has been hampered by a lack of diverse and labeled training data.
Researchers propose an unsupervised statistical feature-guided diffusion model to generate synthetic labeled sensor data for wearable HAR, addressing the data scarcity problem.
The method conditions the diffusion model on statistical properties of real sensor data to generate diverse and representative synthetic data without relying on annotated training samples.

Plain English Explanation

Recognizing human activities from data collected by wearable sensors, like smartwatches or fitness trackers, has many applications in areas like personal health, sports, and industrial automation. However, sensor-based HAR has lagged behind computer vision because far less labeled training data is available for sensors than for images.

To address this data scarcity issue, the researchers developed a new method that can generate synthetic, labeled sensor data without relying on real annotated samples. Their approach uses a type of machine learning model called a "diffusion model," which learns to transform random noise into realistic data by following statistical patterns in the training data.

By conditioning the diffusion model on simple statistical properties of real sensor data, like the average and spread of the measurements, the researchers were able to generate diverse and representative synthetic sensor data for HAR. This allows training machine learning models for HAR even when there is limited real annotated sensor data available.

Technical Explanation

The key innovation of this research is the use of a diffusion model, a type of generative model, that is conditioned on statistical features of the target sensor data distribution. Diffusion models work by learning to reverse a process of gradually adding noise to data, allowing them to generate new samples that match the statistics of the original training data.
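The noising half of that process can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the standard DDPM closed form with a linear noise schedule, and the names `linear_beta_schedule` and `forward_noise`, the schedule endpoints, and the toy sine-wave window are all hypothetical choices made for the example.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances (assumed linear schedule, as in DDPM)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(x0, t, alpha_bar, rng):
    """Jump directly to step t: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps  # eps is what a denoising network would be trained to predict

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative fraction of signal that survives

# Toy stand-in for one sensor window: 128 time steps, 3 channels.
x0 = np.sin(np.linspace(0, 4 * np.pi, 128))[:, None] * np.ones((1, 3))

x_early, _ = forward_noise(x0, 10, alpha_bar, rng)   # still close to x0
x_late, _ = forward_noise(x0, 999, alpha_bar, rng)   # almost pure noise
```

A generative model is then trained to undo these steps, so that sampling can start from pure noise and end at a realistic window.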

The researchers trained their diffusion model on publicly available HAR datasets, but instead of just feeding in the raw sensor data, they also provided the model with summary statistics like the mean, standard deviation, z-score, and skewness of the data. This guided the diffusion process to generate synthetic sensor data that matched the key statistical properties of real activity data, without needing any labeled training samples.
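The four statistics named above are cheap to compute from an unlabeled window. The sketch below is one plausible way to do it, under assumptions: the summary does not say how the paper windows the signal or feeds the features to the model, so computing per-channel statistics over time and stacking them into a `(time, 4 * channels)` array is purely illustrative, and `statistical_features` and `conditioning_matrix` are hypothetical names.

```python
import numpy as np

def statistical_features(window, eps=1e-8):
    """Per-channel statistics of one window shaped (time, channels)."""
    mean = window.mean(axis=0)
    std = window.std(axis=0)
    z_score = (window - mean) / (std + eps)   # per-sample standardized signal
    skewness = (z_score ** 3).mean(axis=0)    # third standardized moment
    return {"mean": mean, "std": std, "z_score": z_score, "skewness": skewness}

def conditioning_matrix(window):
    """Stack the features into one (time, 4 * channels) conditioning array.

    Assumption: per-channel scalars are repeated along the time axis so they
    can sit next to the per-sample z-score.
    """
    f = statistical_features(window)
    repeat = lambda v: np.broadcast_to(v, window.shape)
    return np.concatenate(
        [repeat(f["mean"]), repeat(f["std"]), f["z_score"], repeat(f["skewness"])],
        axis=1,
    )

rng = np.random.default_rng(0)
window = rng.normal(size=(128, 3))   # toy unlabeled 3-axis window
cond = conditioning_matrix(window)   # shape (128, 12)
```

None of these features requires an activity label, which is what lets the conditioning signal come from unannotated recordings.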

The researchers compared their approach to conventional oversampling as well as state-of-the-art generative adversarial network (GAN) models for sensor data synthesis. Physical-Aware Cross-Modal Adversarial Network for Wearable is an example of a GAN-based method for generating synthetic sensor data. Experiments showed that the proposed diffusion model outperformed these alternatives in boosting the performance of downstream HAR models.
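For context, the abstract names conventional oversampling as one of the baselines. The sketch below shows the simplest variant, random oversampling, which duplicates minority-class windows until every class is equally represented. Whether the paper used exactly this variant is an assumption, and `random_oversample` is a hypothetical helper written for illustration.

```python
import numpy as np

def random_oversample(X, y, rng):
    """Duplicate randomly chosen minority-class samples until classes are balanced."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw the missing number of samples with replacement from this class.
        picks = rng.integers(0, len(cls_idx), size=target - count)
        parts.append(np.concatenate([cls_idx, cls_idx[picks]]))
    idx = np.concatenate(parts)
    return X[idx], y[idx]

# Toy data: six windows of shape (4 time steps, 3 channels), two classes.
X = np.arange(6 * 4 * 3, dtype=float).reshape(6, 4, 3)
y = np.array([0, 0, 0, 0, 1, 1])
Xb, yb = random_oversample(X, y, np.random.default_rng(0))
```

Oversampling only repeats existing windows, so it adds no new variation. A generative model, by contrast, aims to produce windows that were never recorded.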

Critical Analysis

The main strength of this research is its ability to generate diverse and representative synthetic sensor data for HAR without requiring any labeled training samples. This addresses a key bottleneck in advancing sensor-based activity recognition that has limited progress compared to other machine learning domains like computer vision.

However, the paper does not provide a deep analysis of the limitations of the proposed method. For example, it is unclear how well the synthetic data would generalize to real-world sensor data collected in unconstrained environments, as the experiments were conducted on relatively curated public datasets. Additionally, the paper does not explore how sensitive the diffusion model's performance is to the choice of statistical features used to condition the generation process.

Further research could investigate the robustness of this approach, exploring its ability to handle noisy, incomplete, or heterogeneous sensor data, as is often the case in real-world wearable HAR applications. Comparisons to other data synthesis techniques, such as variational autoencoders or flow-based models, could also provide additional insights.

Conclusion

This research presents a promising approach to address the data scarcity problem in wearable sensor-based human activity recognition. By leveraging a statistical feature-guided diffusion model, the method can generate diverse and representative synthetic sensor data without relying on annotated training samples.

The ability to create labeled synthetic data has the potential to significantly accelerate the development of practical HAR systems, enabling their deployment in a wide range of applications, from personal wellness tracking to industrial automation. As the field of wearable computing continues to evolve, techniques like the one proposed in this paper could play a crucial role in unlocking the full potential of sensor-based human activity recognition.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Sensor Data Augmentation from Skeleton Pose Sequences for Improving Human Activity Recognition

Parham Zolfaghari, Vitor Fortes Rey, Lala Ray, Hyun Kim, Sungho Suh, Paul Lukowicz

The proliferation of deep learning has significantly advanced various fields, yet Human Activity Recognition (HAR) has not fully capitalized on these developments, primarily due to the scarcity of labeled datasets. Despite the integration of advanced Inertial Measurement Units (IMUs) in ubiquitous wearable devices like smartwatches and fitness trackers, which offer self-labeled activity data from users, the volume of labeled data remains insufficient compared to domains where deep learning has achieved remarkable success. Addressing this gap, in this paper, we propose a novel approach to improve wearable sensor-based HAR by introducing a pose-to-sensor network model that generates sensor data directly from 3D skeleton pose sequences. Our method simultaneously trains the pose-to-sensor network and a human activity classifier, optimizing both data reconstruction and activity recognition. Our contributions include the integration of simultaneous training, direct pose-to-sensor generation, and a comprehensive evaluation on the MM-Fit dataset. Experimental results demonstrate the superiority of our framework with significant performance improvements over baseline methods.

6/26/2024

eess.SP cs.CV cs.LG

Wearable-based behaviour interpolation for semi-supervised human activity recognition

Haoran Duan, Shidong Wang, Varun Ojha, Shizheng Wang, Yawen Huang, Yang Long, Rajiv Ranjan, Yefeng Zheng

While traditional feature engineering for Human Activity Recognition (HAR) involves a trial-and-error process, deep learning has emerged as a preferred method for high-level representations of sensor-based human activities. However, most deep learning-based HAR requires a large amount of labelled data and extracting HAR features from unlabelled data for effective deep learning training remains challenging. We, therefore, introduce a deep semi-supervised HAR approach, MixHAR, which concurrently uses labelled and unlabelled activities. Our MixHAR employs a linear interpolation mechanism to blend labelled and unlabelled activities while addressing both inter- and intra-activity variability. A unique challenge identified is the activity-intrusion problem during mixing, for which we propose a mixing calibration mechanism to mitigate it in the feature embedding space. Additionally, we rigorously explored and evaluated the five conventional/popular deep semi-supervised technologies on HAR, acting as the benchmark of deep semi-supervised HAR. Our results demonstrate that MixHAR significantly improves performance, underscoring the potential of deep semi-supervised techniques in HAR.

5/28/2024

cs.CV

Self-supervised Learning for Human Activity Recognition Using 700,000 Person-days of Wearable Data

Hang Yuan, Shing Chan, Andrew P. Creagh, Catherine Tong, Aidan Acquah, David A. Clifton, Aiden Doherty

Advances in deep learning for human activity recognition have been relatively limited due to the lack of large labelled datasets. In this study, we leverage self-supervised learning techniques on the UK-Biobank activity tracker dataset--the largest of its kind to date--containing more than 700,000 person-days of unlabelled wearable sensor data. Our resulting activity recognition model consistently outperformed strong baselines across seven benchmark datasets, with an F1 relative improvement of 2.5%-100% (median 18.4%), the largest improvements occurring in the smaller datasets. In contrast to previous studies, our results generalise across external datasets, devices, and environments. Our open-source model will help researchers and developers to build customisable and generalisable activity classifiers with high performance.

6/21/2024

eess.SP cs.AI cs.LG

Human Activity Recognition from Wearable Sensor Data Using Self-Attention

Saif Mahmud, M Tanjid Hasan Tonmoy, Kishor Kumar Bhaumik, A K M Mahbubur Rahman, M Ashraful Amin, Mohammad Shoyaib, Muhammad Asif Hossain Khan, Amin Ahsan Ali

Human Activity Recognition from body-worn sensor data poses an inherent challenge in capturing spatial and temporal dependencies of time-series signals. In this regard, the existing recurrent or convolutional or their hybrid models for activity recognition struggle to capture spatio-temporal context from the feature space of sensor reading sequence. To address this complex problem, we propose a self-attention based neural network model that foregoes recurrent architectures and utilizes different types of attention mechanisms to generate higher dimensional feature representation used for classification. We performed extensive experiments on four popular publicly available HAR datasets: PAMAP2, Opportunity, Skoda and USC-HAD. Our model achieves significant performance improvement over recent state-of-the-art models in both benchmark test subjects and Leave-one-subject-out evaluation. We also observe that the sensor attention maps produced by our model are able to capture the importance of the modality and placement of the sensors in predicting the different activity classes.

4/23/2024

cs.CV cs.AI cs.LG stat.ML