GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition

Read original: arXiv:2305.19700 - Published 6/19/2024 by Haijun Xiong, Yunze Deng, Bin Feng, Xinggang Wang, Wenyu Liu

Overview

Gait recognition is a growing field in biometric identification that uses an individual's distinct walking patterns for accurate identification.
According to the authors, existing gait recognition methods do not fully exploit temporal information, which is crucial for understanding walking patterns.
This paper introduces a novel framework called GaitGS that aggregates temporal features simultaneously in both granularity and span dimensions to improve gait recognition.

Plain English Explanation

The way people walk is unique to each individual, and this can be used to identify them. GaitGS is a new system that recognizes people by their walking patterns. Previous methods in this field didn't fully consider the timing of how people move when walking. GaitGS looks at the timing of walking movements at different levels of detail and over different time periods. This helps capture the nuances of how each person walks, which improves the accuracy of identification.

Technical Explanation

GaitGS aggregates temporal features along two dimensions at once: granularity and span. The Multi-Granularity Feature Extractor (MGFE) captures micro-motion information at a fine level and macro-motion information at a coarse level. The Multi-Span Feature Extractor (MSFE) generates local and global temporal representations. In experiments on two datasets, the authors report state-of-the-art performance: Rank-1 accuracy of 98.2%, 96.5%, and 89.7% on CASIA-B under different conditions (the standard CASIA-B conditions are normal walking, carrying a bag, and wearing a coat), and 97.6% on OU-MVLP.
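To make the two dimensions concrete, here is a minimal sketch in plain Python. It is not the authors' MGFE or MSFE, which are learned neural modules whose internals this summary does not describe. The hand-written operations below (frame differences, segment averaging, windowed and whole-sequence max pooling), the function names, and the toy values are all assumptions chosen only to illustrate what "fine vs. coarse" and "local vs. global" can mean for a sequence of per-frame features.

```python
from statistics import mean

def frame_differences(seq):
    """Fine granularity (assumed proxy for micro-motion): change between consecutive frames."""
    return [[b - a for a, b in zip(prev, cur)] for prev, cur in zip(seq, seq[1:])]

def pooled_segments(seq, size):
    """Coarse granularity (assumed proxy for macro-motion): average each block of `size` frames."""
    return [
        [mean(col) for col in zip(*seq[i:i + size])]
        for i in range(0, len(seq) - size + 1, size)
    ]

def local_span(seq, window):
    """Local span: element-wise max over each sliding window of `window` frames."""
    return [
        [max(col) for col in zip(*seq[i:i + window])]
        for i in range(len(seq) - window + 1)
    ]

def global_span(seq):
    """Global span: element-wise max over the whole sequence."""
    return [max(col) for col in zip(*seq)]

# Toy sequence: 6 frames with 2 made-up features per frame.
frames = [[0.0, 1.0], [1.0, 1.0], [2.0, 3.0], [3.0, 3.0], [4.0, 5.0], [5.0, 5.0]]

fine = frame_differences(frames)     # one vector per pair of adjacent frames
coarse = pooled_segments(frames, 2)  # one vector per 2-frame segment
local = local_span(frames, 3)        # one vector per 3-frame window
global_ = global_span(frames)        # a single vector for the sequence

print(len(fine), len(coarse), len(local), global_)
```

The fine and local outputs keep many short-range descriptors, while the coarse and global outputs summarize longer stretches of the walk. A learned model would replace each hand-written operation with trainable layers and then combine the resulting features.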

Critical Analysis

The paper evaluates the GaitGS framework on two benchmark datasets, CASIA-B and OU-MVLP, and reports better performance than existing methods. However, the authors do not discuss potential limitations or areas for further research. It would be interesting to see how GaitGS performs in real-world scenarios with more diverse and challenging data, and what it implies for privacy and ethics in biometric identification.
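The evaluation is reported as Rank-1 accuracy: the fraction of probe sequences whose closest gallery sequence belongs to the same person. The sketch below shows that computation on made-up 2-D embeddings. The use of Euclidean distance, the function name `rank1_accuracy`, and the toy data are assumptions for illustration. The summary does not state which distance the authors use, and the cross-view protocol used on CASIA-B is not modeled here.

```python
import math

def rank1_accuracy(probes, gallery):
    """Fraction of probes whose nearest gallery embedding has the same subject ID.

    probes, gallery: lists of (subject_id, embedding) pairs.
    Euclidean distance is an assumption made for this sketch.
    """
    hits = 0
    for probe_id, probe_vec in probes:
        nearest_id, _ = min(gallery, key=lambda item: math.dist(probe_vec, item[1]))
        hits += nearest_id == probe_id
    return hits / len(probes)

# Made-up embeddings for three subjects.
gallery = [("A", [0.0, 0.0]), ("B", [1.0, 1.0]), ("C", [2.0, 0.0])]
probes = [
    ("A", [0.1, 0.1]),  # nearest gallery entry is A: hit
    ("B", [0.9, 1.2]),  # nearest gallery entry is B: hit
    ("C", [0.8, 0.9]),  # nearest gallery entry is B: miss
    ("C", [1.9, 0.1]),  # nearest gallery entry is C: hit
]

accuracy = rank1_accuracy(probes, gallery)
print(accuracy)  # 0.75
```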

Conclusion

This paper presents a novel gait recognition framework, GaitGS, that effectively captures temporal information at multiple granularities and spans. By incorporating these temporal features, the system achieves state-of-the-art performance on benchmark datasets, demonstrating the importance of temporal modeling in gait recognition. As biometric identification technologies continue to advance, GaitGS represents a promising step forward in the field of gait recognition.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition

Haijun Xiong, Yunze Deng, Bin Feng, Xinggang Wang, Wenyu Liu

Gait recognition, a growing field in biological recognition technology, utilizes distinct walking patterns for accurate individual identification. However, existing methods lack the incorporation of temporal information. To reach the full potential of gait recognition, we advocate for the consideration of temporal features at varying granularities and spans. This paper introduces a novel framework, GaitGS, which aggregates temporal features simultaneously in both granularity and span dimensions. Specifically, the Multi-Granularity Feature Extractor (MGFE) is designed to capture micro-motion and macro-motion information at fine and coarse levels respectively, while the Multi-Span Feature Extractor (MSFE) generates local and global temporal representations. Through extensive experiments on two datasets, our method demonstrates state-of-the-art performance, achieving Rank-1 accuracy of 98.2%, 96.5%, and 89.7% on CASIA-B under different conditions, and 97.6% on OU-MVLP. The source code will be available at https://github.com/Haijun-Xiong/GaitGS.

6/19/2024

GaitMA: Pose-guided Multi-modal Feature Fusion for Gait Recognition

Fanxu Min, Shaoxiang Guo, Fan Hao, Junyu Dong

Gait recognition is a biometric technology that recognizes the identity of humans through their walking patterns. Existing appearance-based methods utilize CNN or Transformer to extract spatial and temporal features from silhouettes, while model-based methods employ GCN to focus on the special topological structure of skeleton points. However, the quality of silhouettes is limited by complex occlusions, and skeletons lack dense semantic features of the human body. To tackle these problems, we propose a novel gait recognition framework, dubbed Gait Multi-model Aggregation Network (GaitMA), which effectively combines two modalities to obtain a more robust and comprehensive gait representation for recognition. First, skeletons are represented by joint/limb-based heatmaps, and features from silhouettes and skeletons are respectively extracted using two CNN-based feature extractors. Second, a co-attention alignment module is proposed to align the features by element-wise attention. Finally, we propose a mutual learning module, which achieves feature fusion through cross-attention; a Wasserstein loss is further introduced to ensure the effective fusion of the two modalities. Extensive experimental results demonstrate the superiority of our model on Gait3D, OU-MVLP, and CASIA-B.

7/23/2024

Causality-inspired Discriminative Feature Learning in Triple Domains for Gait Recognition

Haijun Xiong, Bin Feng, Xinggang Wang, Wenyu Liu

Gait recognition is a biometric technology that distinguishes individuals by their walking patterns. However, previous methods face challenges when accurately extracting identity features because they often become entangled with non-identity clues. To address this challenge, we propose CLTD, a causality-inspired discriminative feature learning module designed to effectively eliminate the influence of confounders in triple domains, i.e., spatial, temporal, and spectral. Specifically, we utilize the Cross Pixel-wise Attention Generator (CPAG) to generate attention distributions for factual and counterfactual features in spatial and temporal domains. Then, we introduce the Fourier Projection Head (FPH) to project spatial features into the spectral space, which preserves essential information while reducing computational costs. Additionally, we employ an optimization method with contrastive learning to enforce semantic consistency constraints across sequences from the same subject. Our approach has demonstrated significant performance improvements on challenging datasets, proving its effectiveness. Moreover, it can be seamlessly integrated into existing gait recognition methods.

7/18/2024

GLGait: A Global-Local Temporal Receptive Field Network for Gait Recognition in the Wild

Guozhen Peng, Yunhong Wang, Yuwei Zhao, Shaoxiong Zhang, Annan Li

Gait recognition has attracted increasing attention from academia and industry as a human recognition technology from a distance in non-intrusive ways without requiring cooperation. Although advanced methods have achieved impressive success in lab scenarios, most of them perform poorly in the wild. Recently, some Convolution Neural Networks (ConvNets) based methods have been proposed to address the issue of gait recognition in the wild. However, the temporal receptive field obtained by convolution operations is limited for long gait sequences. If convolution blocks are directly replaced with visual transformer blocks, the model may not enhance a local temporal receptive field, which is important for covering a complete gait cycle. To address this issue, we design a Global-Local Temporal Receptive Field Network (GLGait). GLGait employs a Global-Local Temporal Module (GLTM) to establish a global-local temporal receptive field, which mainly consists of a Pseudo Global Temporal Self-Attention (PGTA) and a temporal convolution operation. Specifically, PGTA is used to obtain a pseudo global temporal receptive field with less memory and computation complexity compared with a multi-head self-attention (MHSA). The temporal convolution operation is used to enhance the local temporal receptive field. Besides, it can also aggregate pseudo global temporal receptive field to a true holistic temporal receptive field. Furthermore, we also propose a Center-Augmented Triplet Loss (CTL) in GLGait to reduce the intra-class distance and expand the positive samples in the training stage. Extensive experiments show that our method obtains state-of-the-art results on in-the-wild datasets, i.e., Gait3D and GREW. The code is available at https://github.com/bgdpgz/GLGait.

8/14/2024
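
The GLGait abstract's remark that convolutions give a limited temporal receptive field can be checked with the standard receptive-field recurrence for stacked, undilated 1D convolutions. The sketch below is only that arithmetic. The layer configurations are hypothetical and are not taken from GLGait or any other paper listed here.

```python
def temporal_receptive_field(layers):
    """Receptive field, in frames, of stacked 1D temporal convolutions.

    layers: list of (kernel_size, stride) pairs, applied in order, no dilation.
    """
    field, jump = 1, 1  # jump = spacing, in input frames, between adjacent outputs
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field

# Hypothetical stacks, for illustration only.
three_layers = temporal_receptive_field([(3, 1)] * 3)
with_stride = temporal_receptive_field([(3, 1), (3, 2), (3, 1)])

print(three_layers)  # 7
print(with_stride)   # 9
```

With kernel size 3 and stride 1, each extra layer adds only two frames of temporal context, so covering a long gait sequence by convolution alone takes many layers or aggressive striding.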