A Light-weight Deep Human Activity Recognition Algorithm Using Multi-knowledge Distillation

Read original: arXiv:2107.07331 - Published 9/16/2024 by Runze Chen, Haiyong Luo, Fang Zhao, Xuechun Meng, Zhiqing Xie, Yida Zhu


Overview

This paper presents a novel approach called Stage-Logits-Memory Distillation (SMLDist) for human activity recognition (HAR) using inertial sensors.
HAR is the foundation of many human-centered mobile applications, and deep learning-based models can enable accurate classification in complex scenarios.
However, existing fine-grained deep HAR models have large storage and computational requirements, hindering their deployment on resource-limited platforms.
The authors leverage knowledge distillation to design a more efficient and robust HAR model based on MobileNet.

Plain English Explanation

The paper describes a new technique called SMLDist for recognizing human activities using data from motion sensors. Recognizing human activities is important for many mobile apps that are focused on people, like fitness trackers or activity monitors. Deep learning models can do a great job of classifying activities accurately, even in complex situations.

The problem is that the existing deep learning models for this task are very large and computationally intensive. This makes them difficult to use on devices with limited resources, like smartphones or wearables. The authors of this paper wanted to create a more efficient model that could still maintain high accuracy.

They did this by using a technique called knowledge distillation. This involves training a smaller "student" model to mimic the behavior of a larger, more complex "teacher" model. The key innovation in this work is that they paid special attention to the frequency-related features during the distillation process, which helped make the student model more robust at classifying activities.

They also developed an automatic way to combine different types of classifiers to further improve the performance of the student model. In their experiments, the SMLDist model was more accurate than other state-of-the-art HAR frameworks and ran efficiently on an embedded platform, which makes it well suited for deployment on resource-constrained devices.

Technical Explanation

The authors designed the SMLDist pipeline to address the storage and computational overhead of existing fine-grained deep HAR models. It is based on the widely used MobileNet architecture and leverages knowledge distillation to improve the model's efficiency and robustness.

The key components of SMLDist are:

Stage-level distillation: Distilling knowledge from the teacher model at different stages of the network to capture diverse feature representations.
Logits-level distillation: Matching the logits (pre-softmax activations) of the student and teacher models to preserve the relative class probabilities.
Memory-level distillation: Incorporating the teacher's internal feature representations to guide the student's training.
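As an illustration of the logits-level component, the snippet below is a minimal sketch of the generic temperature-scaled distillation loss (KL divergence between softened teacher and student distributions). It is not taken from the paper: the function names, the temperature of 4.0, and the example logits are all assumptions chosen for the example, and the authors' exact loss may differ.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, computed stably by subtracting the max."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def logits_distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic knowledge-distillation loss; the temperature is a hypothetical
    default, not a value reported in the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl

# Hypothetical logits for three activity classes.
teacher = [6.0, 2.0, 0.5]
close_student = [5.5, 2.2, 0.4]   # ranks the classes like the teacher
far_student = [0.5, 2.0, 6.0]     # ranks the classes in reverse

loss_close = logits_distillation_loss(close_student, teacher)
loss_far = logits_distillation_loss(far_student, teacher)
print(f"close student: {loss_close:.4f}, far student: {loss_far:.4f}")
```

A student whose logits preserve the teacher's relative class ordering gets a much smaller loss than one that reverses it, which is the sense in which matching logits "preserves the relative class probabilities".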

By paying more attention to the frequency-related features during distillation, SMLDist improves the HAR classification robustness of the student model. The authors also propose an auto-search mechanism to combine heterogeneous classifiers, further enhancing the student's performance.
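The summary does not say how frequency-related features enter the distillation objective. Purely as an illustration of the general idea, the sketch below compares a student and a teacher feature sequence by their magnitude spectra rather than sample by sample. Everything here (the function names, the plain DFT, the mean-squared spectral loss, the toy signals) is an assumption for the example and should not be read as the authors' formulation.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitude spectrum of a real 1-D sequence via a direct O(n^2) DFT."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags

def spectral_feature_loss(student_feat, teacher_feat):
    """Mean squared difference between the two magnitude spectra.

    Hypothetical loss; assumes both sequences have the same length.
    """
    s = dft_magnitudes(student_feat)
    t = dft_magnitudes(teacher_feat)
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(t)

n = 16
# Toy "teacher" feature: 2 cycles over the window.
teacher_feat = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
# Student with the same frequency content but a phase offset.
shifted_feat = [math.sin(2 * math.pi * 2 * t / n + 1.0) for t in range(n)]
# Student with the wrong frequency (4 cycles over the window).
wrong_freq_feat = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]

loss_shifted = spectral_feature_loss(shifted_feat, teacher_feat)
loss_wrong = spectral_feature_loss(wrong_freq_feat, teacher_feat)
print(f"phase-shifted: {loss_shifted:.2e}, wrong frequency: {loss_wrong:.4f}")
```

In this toy setup a phase-shifted student incurs essentially no penalty while a student with different frequency content does, which shows why a frequency-domain comparison can be less sensitive to timing offsets in periodic motion signals.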

Extensive simulations show that SMLDist outperforms various state-of-the-art HAR frameworks in accuracy and macro F1 score, and a practical evaluation on the Jetson Xavier AGX platform shows that the model is both energy-efficient and computation-efficient. Comparative experiments on six public datasets also show SMLDist outperforming other advanced knowledge distillation methods, which supports its generalization to classification tasks beyond HAR.
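The paper's abstract reports the F1 macro score alongside accuracy. The short sketch below shows how the two metrics can diverge on imbalanced activity labels; the labels and predictions are made up for illustration and are not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical imbalanced data: the classifier always predicts "walk".
y_true = ["walk"] * 8 + ["run"] * 2
y_pred = ["walk"] * 10

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")   # 0.800
print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")   # 0.444
```

Because macro F1 weights every class equally, a model that ignores a rare activity is penalized even when its overall accuracy looks high, which is why the metric is commonly reported for HAR benchmarks.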

Critical Analysis

The paper presents a well-designed and comprehensive study on improving the efficiency and robustness of deep learning-based HAR models through knowledge distillation.

One potential limitation is that the HAR evaluation covers only inertial sensor data. It would be interesting to see how the approach performs on other sensing modalities, such as vision-based or multimodal HAR.

Additionally, the paper does not provide much insight into the inner workings of the auto-search mechanism for the heterogeneous classifiers. A more detailed explanation of this component would help readers better understand its contribution to the overall performance.

Furthermore, although the authors report comparative distillation experiments on other classification tasks, a closer look at neighbouring settings such as ambient sensor-based activity recognition would further demonstrate the approach's broader applicability.

Overall, the SMLDist framework presents a promising direction for deploying accurate and efficient deep learning models on resource-constrained platforms, and the paper provides a solid foundation for future research in this area.

Conclusion

This paper introduces a novel Stage-Logits-Memory Distillation (SMLDist) approach for human activity recognition using inertial sensors. By leveraging knowledge distillation and an auto-search mechanism for heterogeneous classifiers, the authors were able to develop a highly efficient and robust deep learning model that outperforms state-of-the-art HAR frameworks.

The practical evaluation on the Jetson Xavier AGX platform demonstrated the energy-efficiency and computation-efficiency of the SMLDist model, making it well-suited for deployment on resource-constrained devices. The broader applicability of the SMLDist approach was also validated through comparative experiments on other classification tasks.

This work contributes to the ongoing efforts to enable accurate and efficient human activity recognition for a wide range of mobile and wearable applications, with the potential for significant impact on the field of human-centered computing.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


A Light-weight Deep Human Activity Recognition Algorithm Using Multi-knowledge Distillation

Runze Chen, Haiyong Luo, Fang Zhao, Xuechun Meng, Zhiqing Xie, Yida Zhu

Inertial sensor-based human activity recognition (HAR) is the base of many human-centered mobile applications. Deep learning-based fine-grained HAR models enable accurate classification in various complex application scenarios. Nevertheless, the large storage and computational overhead of the existing fine-grained deep HAR models hinder their widespread deployment on resource-limited platforms. Inspired by the knowledge distillation's reasonable model compression and potential performance improvement capability, we design a multi-level HAR modeling pipeline called Stage-Logits-Memory Distillation (SMLDist) based on the widely-used MobileNet. By paying more attention to the frequency-related features during the distillation process, the SMLDist improves the HAR classification robustness of the students. We also propose an auto-search mechanism in the heterogeneous classifiers to improve classification performance. Extensive simulation results demonstrate that SMLDist outperforms various state-of-the-art HAR frameworks in accuracy and F1 macro score. The practical evaluation of the Jetson Xavier AGX platform shows that the SMLDist model is both energy-efficient and computation-efficient. These experiments validate the reasonable balance between the robustness and efficiency of the proposed model. The comparative experiments of knowledge distillation on six public datasets also demonstrate that the SMLDist outperforms other advanced knowledge distillation methods of students' performance, which verifies the good generalization of the SMLDist on other classification tasks, including but not limited to HAR.

9/16/2024

TSAK: Two-Stage Semantic-Aware Knowledge Distillation for Efficient Wearable Modality and Model Optimization in Manufacturing Lines

Hymalai Bello, Daniel Geißler, Sungho Suh, Bo Zhou, Paul Lukowicz

Smaller machine learning models, with less complex architectures and sensor inputs, can benefit wearable sensor-based human activity recognition (HAR) systems in many ways, from complexity and cost to battery life. In the specific case of smart factories, optimizing human-robot collaboration hinges on the implementation of cutting-edge, human-centric AI systems. To this end, workers' activity recognition enables accurate quantification of performance metrics, improving efficiency holistically. We present a two-stage semantic-aware knowledge distillation (KD) approach, TSAK, for efficient, privacy-aware, and wearable HAR in manufacturing lines, which reduces the input sensor modalities as well as the machine learning model size, while reaching similar recognition performance as a larger multi-modal and multi-positional teacher model. The first stage incorporates a teacher classifier model encoding attention, causal, and combined representations. The second stage encompasses a semantic classifier merging the three representations from the first stage. To evaluate TSAK, we recorded a multi-modal dataset at a smart factory testbed with wearable and privacy-aware sensors (IMU and capacitive) located on both workers' hands. In addition, we evaluated our approach on OpenPack, the only available open dataset mimicking the wearable sensor placements on both hands in the manufacturing HAR scenario. We compared several KD strategies with different representations to regulate the training process of a smaller student model. Compared to the larger teacher model, the student model takes fewer sensor channels from a single hand, has 79% fewer parameters, runs 8.88 times faster, and requires 96.6% less computing power (FLOPS).

8/27/2024

SoK: Behind the Accuracy of Complex Human Activity Recognition Using Deep Learning

Duc-Anh Nguyen, Nhien-An Le-Khac

Human Activity Recognition (HAR) is a well-studied field with research dating back to the 1980s. Over time, HAR technologies have evolved significantly from manual feature extraction, rule-based algorithms, and simple machine learning models to powerful deep learning models, from one sensor type to a diverse array of sensing modalities. The scope has also expanded from recognising a limited set of activities to encompassing a larger variety of both simple and complex activities. However, there still exist many challenges that hinder advancement in complex activity recognition using modern deep learning methods. In this paper, we comprehensively systematise factors leading to inaccuracy in complex HAR, such as data variety and model capacity. Among many sensor types, we give more attention to wearable and camera due to their prevalence. Through this Systematisation of Knowledge (SoK) paper, readers can gain a solid understanding of the development history and existing challenges of HAR, different categorisations of activities, obstacles in deep learning-based complex HAR that impact accuracy, and potential research directions.

5/7/2024

A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities

Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah, Satoshi Nishimura

Human Activity Recognition (HAR) systems aim to understand human behaviour and assign a label to each action, attracting significant attention in computer vision due to their wide range of applications. HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals. Each modality provides unique and complementary information suited to different application scenarios. Consequently, numerous studies have investigated diverse approaches for HAR using these modalities. This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2024, focusing on machine learning (ML) and deep learning (DL) approaches categorized by input data modalities. We review both single-modality and multi-modality techniques, highlighting fusion-based and co-learning frameworks. Additionally, we cover advancements in hand-crafted action features, methods for recognizing human-object interactions, and activity detection. Our survey includes a detailed dataset description for each modality and a summary of the latest HAR systems, offering comparative results on benchmark datasets. Finally, we provide insightful observations and propose effective future research directions in HAR.

9/17/2024