Real-time, accurate, and open source upper-limb musculoskeletal analysis using a single RGBD camera

2406.10007

Published 6/17/2024 by Amedeo Ceglia, Kael Facon, Mickael Begon, Lama Seoud

Abstract

Biomechanical biofeedback may enhance rehabilitation and provide clinicians with more objective task evaluation. Such feedback often relies on expensive motion capture systems, which restricts its widespread use and has led to the development of computer vision-based methods. These methods are subject to large joint angle errors at the upper limb and exclude scapula and clavicle motion from the analysis. Our open-source approach offers a user-friendly solution for high-fidelity upper-limb kinematics using a single low-cost RGBD camera and includes semi-automatic skin marker labeling. Real-time biomechanical analysis, ranging from kinematics to muscle force estimation, was conducted on eight participants performing a hand-cycling motion to demonstrate the applicability of our approach on the upper limb. Markers were recorded by the RGBD camera and an optoelectronic camera system, considered as a reference. Muscle activity and external load were recorded using eight EMG and instrumented hand pedals, respectively. Bland-Altman analysis revealed significant agreement in the 3D markers' positions between the two motion capture methods, with errors averaging 3.3 ± 3.9 mm. For the biomechanical analysis, the level of agreement was sensitive to whether the same marker set was used. For example, joint angle differences averaged 2.3 ± 2.8° when using the same marker set, compared to 4.5 ± 2.9° otherwise. Biofeedback from the RGBD camera was provided at 63 Hz. Our study introduces a novel method for using an RGBD camera as a low-cost motion capture solution, emphasizing its potential for accurate kinematic reconstruction and comprehensive upper-limb biomechanical studies.

Overview

This paper presents a real-time, accurate, and open-source system for upper-limb musculoskeletal analysis using a single RGB-D (color and depth) camera.
The system reconstructs upper-limb kinematics from skin markers and runs biomechanical analysis, from joint angles to muscle force estimation, in real time, with biofeedback provided at 63 Hz.
Markers are labeled semi-automatically, and the approach was validated against an optoelectronic motion capture system on eight participants performing a hand-cycling motion.

Plain English Explanation

The paper describes a new way to analyze the movement and muscle activity of a person's upper body using just a single camera that can sense both color and depth information. This builds on previous research on using motion capture, wearable sensors, and computer vision to track body movements and muscle activity.

The key innovation is that this system can track the 3D position of the shoulders, elbows, and wrists in real-time, and also estimate the muscle activity, all using just a single camera. Previous systems often required multiple cameras or wearable sensors, which can be complex to set up.

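Rather than relying on an expensive multi-camera laboratory, the system tracks markers placed on the skin, which the software labels semi-automatically. According to the abstract, existing computer vision methods suffer from large joint angle errors at the upper limb and leave out the motion of the scapula and clavicle; this approach targets high-fidelity upper-limb kinematics instead. It sits alongside other work using machine learning to predict upper limb motion.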

This system could be useful for applications like physical therapy, sports training, and ergonomics, where being able to accurately monitor arm and muscle movements is important. The open-source nature also allows others to build on this research.

Technical Explanation

The researchers developed an open-source pipeline that takes input from a single RGB-D camera and reconstructs upper-limb kinematics from skin markers in real time. Related work includes hand pose tracking with biomechanical constraints and the fusion of IMUs with video.

The key components include:

Semi-automatic labeling of skin markers recorded by the RGB-D camera, which yields 3D marker positions.
Biomechanical analysis driven by those markers, ranging from joint kinematics to muscle force estimation.
A real-time framework that integrates these steps and delivers biofeedback at 63 Hz.
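Any pipeline built on a single RGB-D camera has to turn a pixel location and its depth reading into a 3D point at some stage. The summary does not say how the authors do this, so the following is only a minimal sketch of the standard pinhole back-projection. The `Intrinsics` values, the function name `deproject`, and the sample pixel are hypothetical, and lens distortion is ignored.

```python
from typing import NamedTuple, Tuple


class Intrinsics(NamedTuple):
    """Pinhole camera parameters (hypothetical values are used below)."""
    fx: float  # focal length along x, in pixels
    fy: float  # focal length along y, in pixels
    cx: float  # principal point x, in pixels
    cy: float  # principal point y, in pixels


def deproject(u: float, v: float, depth_m: float,
              k: Intrinsics) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with depth in metres to camera-frame XYZ."""
    if depth_m <= 0:
        # Depth sensors commonly report 0 where no measurement is available.
        raise ValueError("invalid depth reading")
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


# Assumed intrinsics for a 640x480 image; real values come from calibration.
K = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)

# A marker detected 60 pixels right of the image centre, 1.2 m from the camera.
marker_xyz = deproject(380.0, 240.0, 1.2, K)
```

With these assumed numbers the marker lies about 0.12 m to the right of the optical axis at a depth of 1.2 m. In practice the intrinsics would be read from the camera or obtained by calibration rather than hard-coded.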

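The researchers evaluated the system on eight participants performing a hand-cycling motion, using an optoelectronic camera system as the reference, eight EMG for muscle activity, and instrumented hand pedals for external load. Bland-Altman analysis showed 3D marker position errors averaging 3.3 ± 3.9 mm between the two motion capture methods. Joint angle differences averaged 2.3 ± 2.8° when both systems used the same marker set, compared to 4.5 ± 2.9° otherwise.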
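The abstract reports that agreement between the RGBD camera and the optoelectronic system was assessed with Bland-Altman analysis. As a rough illustration of what that analysis computes, the sketch below derives the bias (mean difference) and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences) for paired measurements. The function name and the sample values are made up for illustration and are not data from the paper.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(method_a: Sequence[float],
                 method_b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    # Paired differences between the two measurement methods.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    # Sample standard deviation of the differences.
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical marker coordinates in millimetres (not data from the paper).
rgbd_mm = [101.0, 203.0, 298.0, 402.0, 505.0]
reference_mm = [100.0, 200.0, 300.0, 400.0, 500.0]

bias, lower, upper = bland_altman(rgbd_mm, reference_mm)
print(f"bias = {bias:.2f} mm, limits of agreement = [{lower:.2f}, {upper:.2f}] mm")
```

A bias near zero with narrow limits indicates that the two methods can be used interchangeably for the quantity measured. A full analysis would also plot each difference against the mean of its pair to check whether the error depends on the magnitude of the measurement.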

Critical Analysis

The paper presents a compelling approach for real-time upper-limb analysis using a single RGB-D camera and skin markers. Reaching marker position errors of a few millimeters against an optoelectronic reference with low-cost hardware is a notable result. This relates to previous research exploring automated skeletal movement assessment.

However, the validation described in the abstract covers eight participants and a single task (hand cycling), so performance on other movements and in diverse real-world environments remains to be shown. Agreement with the reference was also sensitive to the marker set: joint angle differences averaged 4.5 ± 2.9° when the marker sets differed, compared to 2.3 ± 2.8° when they matched.

Further research could explore techniques to optimize the model for efficient inference, as well as more extensive validation of the system's robustness and generalizability. Integrating the system with other modalities, such as wearable sensors, could also enhance its capabilities and provide a more comprehensive understanding of upper-limb biomechanics.

Conclusion

This paper presents a novel, open-source system for real-time, accurate upper-limb musculoskeletal analysis using a single RGB-D camera. It relies on semi-automatic skin marker labeling and provides biofeedback at 63 Hz, making it a practical and accessible option for applications like physical therapy, sports training, and ergonomics. While the paper demonstrates close agreement with an optoelectronic reference system, further research is needed to validate its robustness in diverse real-world settings.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Leveraging Digital Perceptual Technologies for Remote Perception and Analysis of Human Biomechanical Processes: A Contactless Approach for Workload and Joint Force Assessment

Jesudara Omidokun, Darlington Egeonu, Bochen Jia, Liang Yang

This study presents an innovative computer vision framework designed to analyze human movements in industrial settings, aiming to enhance biomechanical analysis by integrating seamlessly with existing software. Through a combination of advanced imaging and modeling techniques, the framework allows for comprehensive scrutiny of human motion, providing valuable insights into kinematic patterns and kinetic data. Utilizing Convolutional Neural Networks (CNNs), Direct Linear Transform (DLT), and Long Short-Term Memory (LSTM) networks, the methodology accurately detects key body points, reconstructs 3D landmarks, and generates detailed 3D body meshes. Extensive evaluations across various movements validate the framework's effectiveness, demonstrating comparable results to traditional marker-based models with minor differences in joint angle estimations and precise estimations of weight and height. Statistical analyses consistently support the framework's reliability, with joint angle estimations showing less than a 5-degree difference for hip flexion, elbow flexion, and knee angle methods. Additionally, weight estimation exhibits an average error of less than 6 % for weight and less than 2 % for height when compared to ground-truth values from 10 subjects. The integration of the Biomech-57 landmark skeleton template further enhances the robustness and reinforces the framework's credibility. This framework shows significant promise for meticulous biomechanical analysis in industrial contexts, eliminating the need for cumbersome markers and extending its utility to diverse research domains, including the study of specific exoskeleton devices' impact on facilitating the prompt return of injured workers to their tasks.

4/3/2024

cs.CV cs.HC

Fusing uncalibrated IMUs and handheld smartphone video to reconstruct knee kinematics

J. D. Peiffer, Kunal Shah, Shawana Anarwala, Kayan Abdou, R. James Cotton

Video and wearable sensor data provide complementary information about human movement. Video provides a holistic understanding of the entire body in the world while wearable sensors provide high-resolution measurements of specific body segments. A robust method to fuse these modalities and obtain biomechanically accurate kinematics would have substantial utility for clinical assessment and monitoring. While multiple video-sensor fusion methods exist, most assume that a time-intensive, and often brittle, sensor-body calibration process has already been performed. In this work, we present a method to combine handheld smartphone video and uncalibrated wearable sensor data at their full temporal resolution. Our monocular, video-only, biomechanical reconstruction already performs well, with only several degrees of error at the knee during walking compared to markerless motion capture. Reconstructing from a fusion of video and wearable sensor data further reduces this error. We validate this in a mixture of people with no gait impairments, lower limb prosthesis users, and individuals with a history of stroke. We also show that sensor data allows tracking through periods of visual occlusion.

5/28/2024

cs.CV

A Machine Learning Approach for Predicting Upper Limb Motion Intentions with Multimodal Data in Virtual Reality

Pavan Uttej Ravva, Pinar Kullu, Mohammad Fahim Abrar, Roghayeh Leila Barmaki

Over the last decade, there has been significant progress in the field of interactive virtual rehabilitation. Physical therapy (PT) stands as a highly effective approach for enhancing physical impairments. However, patient motivation and progress tracking in rehabilitation outcomes remain a challenge. This work addresses the gap through a machine learning-based approach to objectively measure outcomes of the upper limb virtual therapy system in a user study with non-clinical participants. In this study, we use virtual reality to perform several tracing tasks while collecting motion and movement data using a KinArm robot and a custom-made wearable sleeve sensor. We introduce a two-step machine learning architecture to predict the motion intention of participants. The first step predicts reaching task segments to which the participant-marked points belonged using gaze, while the second step employs a Long Short-Term Memory (LSTM) model to predict directional movements based on resistance change values from the wearable sensor and the KinArm. We specifically propose to transpose our raw resistance data to the time-domain which significantly improves the accuracy of the models by 34.6%. To evaluate the effectiveness of our model, we compared different classification techniques with various data configurations. The results show that our proposed computational method is exceptional at predicting participant's actions with accuracy values of 96.72% for diamond reaching task, and 97.44% for circle reaching task, which demonstrates the great promise of using multimodal data, including eye-tracking and resistance change, to objectively measure the performance and intention in virtual rehabilitation settings.

5/24/2024

cs.HC

MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

Pengfei Xie, Wenqiang Xu, Tutian Tang, Zhenjun Yu, Cewu Lu

This work proposes a novel learning framework for visual hand dynamics analysis that takes into account the physiological aspects of hand motion. The existing models, which are simplified joint-actuated systems, often produce unnatural motions. To address this, we integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create a new model, MS-MANO. This model emulates the dynamics of muscles and tendons to drive the skeletal system, imposing physiologically realistic constraints on the resulting torque trajectories. We further propose a simulation-in-the-loop pose refinement framework, BioPR, that refines the initial estimated pose through a multi-layer perceptron (MLP) network. Our evaluation of the accuracy of MS-MANO and the efficacy of the BioPR is conducted in two separate parts. The accuracy of MS-MANO is compared with MyoSuite, while the efficacy of BioPR is benchmarked against two large-scale public datasets and two recent state-of-the-art methods. The results demonstrate that our approach consistently improves the baseline methods both quantitatively and qualitatively.

4/17/2024

cs.CV cs.RO