HiSC4D: Human-centered interaction and 4D Scene Capture in Large-scale Space Using Wearable IMUs and LiDAR

Read original: arXiv:2409.04398 - Published 9/17/2024 by Yudi Dai, Zhiyong Wang, Xiping Lin, Chenglu Wen, Lan Xu, Siqi Shen, Yuexin Ma, Cheng Wang

Overview

  • Focuses on human-centered interaction and 4D scene capture in large-scale spaces using wearable inertial measurement units (IMUs) and LiDAR sensors.
  • Captures the dynamic 4D (3D + time) scene and the human's movements within it.
  • Presents a dataset of human-scene interactions in large-scale environments.

Plain English Explanation

This research aims to better understand how people interact with their surroundings in large indoor and outdoor spaces. The researchers equipped people with two kinds of wearable sensors: inertial measurement units (IMUs) mounted on the body, and a head-mounted LiDAR, a laser-based 3D scanner. Together these capture the 3D shape of the environment over time, as well as the movements and actions of the people within it.

By combining these two types of data - the 3D scene and the human movements - the researchers were able to create a comprehensive 4D (3D + time) model that shows how people navigate and interact with their physical world. This could be useful for applications like augmented reality, motion capture, and robotics, where understanding human-environment interactions is important.

The researchers also created a new dataset of these 4D scenes and human interactions, which can be used by other researchers to further study this area.

Technical Explanation

The HiSC4D system combines body-mounted IMUs with a head-mounted LiDAR to capture the 4D (3D + time) scene and the human movements within it. The IMUs track the wearer's body pose and limb movements, while the LiDAR localizes the wearer and maps the surrounding environment. Because every sensor is worn, capture requires no external devices or pre-built maps.
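To make the relationship between the two sensors concrete, the following minimal Python sketch places joint positions expressed relative to the wearer into the scene's world frame, using a pose of the kind a LiDAR localizer might supply. It is an illustration under stated assumptions, not the HiSC4D pipeline: the names `yaw_matrix` and `to_world` are hypothetical, and the rotation is restricted to yaw (heading) for brevity.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def yaw_matrix(yaw_rad: float) -> List[List[float]]:
    """Rotation about the vertical (z) axis. Assumption: heading-only rotation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def to_world(joints_local: Sequence[Vec3],
             rotation: List[List[float]],
             translation: Vec3) -> List[Vec3]:
    """Apply the rigid transform p_world = R * p_local + t to every joint."""
    world = []
    for p in joints_local:
        world.append(tuple(
            sum(rotation[r][k] * p[k] for k in range(3)) + translation[r]
            for r in range(3)
        ))
    return world


if __name__ == "__main__":
    # Hypothetical values: a hand 1 m in front of the wearer, who stands
    # 10 m along the world x axis and has turned 90 degrees to the left.
    hand_local = [(1.0, 0.0, 0.0)]
    print(to_world(hand_local, yaw_matrix(math.pi / 2), (10.0, 0.0, 0.0)))
```

A real system would use a full 3D rotation and would also need to calibrate the offset between the LiDAR and the body; both are omitted here.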

By aligning the data from the IMUs and LiDAR, the researchers are able to create a unified 4D representation that shows how people interact with and move through the space over time. This includes information about the human's position, orientation, limb movements, and interactions with objects in the environment.
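The paper's abstract, quoted under Related Papers on this page, notes that IMUs drift over long captures while LiDAR is globally stable but coarse locally, and that HiSC4D resolves this with a joint optimization. The sketch below does not reproduce that optimization. It swaps in a much simpler, well-known technique, a one-dimensional complementary filter, only to show why pairing a drifting sensor with a drift-free one helps. The function names, the gain value, and the simulated numbers are all assumptions made for illustration.

```python
from typing import List


def integrate(start: float, steps: List[float]) -> List[float]:
    """Dead reckoning: accumulate per-frame displacement estimates."""
    positions = [start]
    for step in steps:
        positions.append(positions[-1] + step)
    return positions


def fuse(imu_steps: List[float],
         lidar_positions: List[float],
         gain: float = 0.2) -> List[float]:
    """Complementary filter (a stand-in, not the paper's method).

    Each frame is predicted from the IMU step, then pulled a fraction
    `gain` of the way toward the LiDAR position for that frame.
    """
    if len(lidar_positions) != len(imu_steps) + 1:
        raise ValueError("need one LiDAR position per frame, including the first")
    fused = [lidar_positions[0]]
    for step, lidar in zip(imu_steps, lidar_positions[1:]):
        predicted = fused[-1] + step
        fused.append(predicted + gain * (lidar - predicted))
    return fused


if __name__ == "__main__":
    # Simulated walk: the true motion is 1.0 m per frame for 50 frames.
    # Assumption: the IMU over-reports every step by 0.1 m (a constant bias),
    # and the LiDAR positions are exact to keep the demo deterministic.
    frames = 50
    imu_steps = [1.1] * frames
    lidar = [float(i) for i in range(frames + 1)]

    imu_only = integrate(0.0, imu_steps)
    fused = fuse(imu_steps, lidar)
    print("IMU-only final error:", abs(imu_only[-1] - lidar[-1]))
    print("Fused final error:   ", abs(fused[-1] - lidar[-1]))
```

In this toy setup the IMU-only error keeps growing with every frame, whereas the fused error settles near bias × (1 − gain) / gain, which is about 0.4 m for the values above. A joint optimization over all frames, as the paper describes, can additionally exploit environment cues, which a per-frame filter like this one cannot.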

The researchers evaluated their system in large-scale indoor and outdoor settings, including a basketball gym and a commercial street. According to the paper's abstract, the resulting dataset contains 8 sequences recorded in 4 large scenes (200 to 5,000 square meters), with 36k frames of 4D human motion annotated with SMPL body-model parameters, 31k frames of cropped human point clouds, and a scene mesh of the environment.
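The abstract quoted under Related Papers lists SMPL-annotated human motion and cropped human point clouds among the dataset's contents. As a rough picture of what one time step of such data could look like in code, here is a hypothetical record type. The released file format is not described on this page, so the class `Frame4D`, its field names, and the helper `frames_in_window` are assumptions. The dimensions follow the standard SMPL body model, which uses 72 pose values (24 joints × 3 axis-angle components) and, commonly, 10 shape coefficients.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

SMPL_POSE_DIM = 72   # 24 joints x 3 axis-angle values (standard SMPL)
SMPL_SHAPE_DIM = 10  # commonly used number of SMPL shape coefficients


@dataclass
class Frame4D:
    """One time step of a capture. Hypothetical layout, not the released format."""
    timestamp: float                 # seconds since the start of the sequence
    smpl_pose: List[float]           # per-joint rotations
    smpl_shape: List[float]          # body shape coefficients
    translation: Vec3                # global position of the body root
    human_points: List[Vec3] = field(default_factory=list)  # cropped point cloud

    def __post_init__(self) -> None:
        if len(self.smpl_pose) != SMPL_POSE_DIM:
            raise ValueError(f"expected {SMPL_POSE_DIM} pose values")
        if len(self.smpl_shape) != SMPL_SHAPE_DIM:
            raise ValueError(f"expected {SMPL_SHAPE_DIM} shape values")


def frames_in_window(frames: List[Frame4D], start: float, end: float) -> List[Frame4D]:
    """Select frames with start <= timestamp < end (the '+ time' part of 4D)."""
    return [f for f in frames if start <= f.timestamp < end]


if __name__ == "__main__":
    # Five placeholder frames, one per second, with a rest pose.
    sequence = [
        Frame4D(float(i), [0.0] * SMPL_POSE_DIM, [0.0] * SMPL_SHAPE_DIM, (float(i), 0.0, 0.0))
        for i in range(5)
    ]
    print(len(frames_in_window(sequence, 1.0, 3.0)))
```

Validating the dimensions at construction time catches malformed annotations early, before they reach any downstream processing.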

Critical Analysis

The HiSC4D system and dataset represent an important step forward in the study of human-environment interactions. By capturing both the 3D scene and the corresponding human movements, the researchers have created a valuable tool for understanding how people navigate and engage with their surroundings.

However, the system has some limitations. Wearable IMUs provide detailed information about the wearer's movements, but they can be intrusive and may influence natural behavior. LiDAR hardware, while effective, can also be expensive, which may limit how widely such a setup can be deployed.

Future research could explore alternative sensing modalities, such as video-based motion capture or egocentric cameras, that could provide similar insights while potentially being more scalable and less obtrusive.

Conclusion

The HiSC4D system and dataset contribute to human-environment interaction research by capturing the 4D (3D + time) scene together with the movements of the people in it.

This research has applications in areas such as augmented reality, motion capture, and robotics, where understanding human-environment interactions is crucial. The dataset provided by the researchers can also serve as a valuable resource for further study and exploration in this field.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

HiSC4D: Human-centered interaction and 4D Scene Capture in Large-scale Space Using Wearable IMUs and LiDAR

Yudi Dai, Zhiyong Wang, Xiping Lin, Chenglu Wen, Lan Xu, Siqi Shen, Yuexin Ma, Cheng Wang

We introduce HiSC4D, a novel Human-centered interaction and 4D Scene Capture method, aimed at accurately and efficiently creating a dynamic digital world, containing large-scale indoor-outdoor scenes, diverse human motions, rich human-human interactions, and human-environment interactions. By utilizing body-mounted IMUs and a head-mounted LiDAR, HiSC4D can capture egocentric human motions in unconstrained space without the need for external devices and pre-built maps. This affords great flexibility and accessibility for human-centered interaction and 4D scene capturing in various environments. Taking into account that IMUs can capture spatially unrestricted human poses but are prone to drift over long periods of use, while LiDAR is stable for global localization but rough for local positions and orientations, HiSC4D employs a joint optimization method, harmonizing all sensors and utilizing environment cues, yielding promising results for long-term capture in large scenes. To promote research of egocentric human interaction in large scenes and facilitate downstream tasks, we also present a dataset, containing 8 sequences in 4 large scenes (200 to 5,000 $m^2$), providing 36k frames of accurate 4D human motions with SMPL annotations and dynamic scenes, 31k frames of cropped human point clouds, and scene mesh of the environment. A variety of scenarios, such as the basketball gym and commercial street, alongside challenging human motions, such as daily greeting, one-on-one basketball playing, and tour guiding, demonstrate the effectiveness and the generalization ability of HiSC4D. The dataset and code will be made public at www.lidarhumanmotion.net/hisc4d for research purposes.

9/17/2024

Revisit Human-Scene Interaction via Space Occupancy

Xinpeng Liu, Haowen Hou, Yanchao Yang, Yong-Lu Li, Cewu Lu

Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks. However, one of the major obstacles is its limited data scale. High-quality data with simultaneously captured human and 3D environments is hard to acquire, resulting in limited data diversity and complexity. In this work, we argue that interaction with a scene is essentially interacting with the space occupancy of the scene from an abstract physical perspective, leading us to a unified novel view of Human-Occupancy Interaction. By treating pure motion sequences as records of humans interacting with invisible scene occupancy, we can aggregate motion-only data into a large-scale paired human-occupancy interaction database: Motion Occupancy Base (MOB). Thus, the need for costly paired motion-scene datasets with high-quality scene scans can be substantially alleviated. With this new unified view of Human-Occupancy interaction, a single motion controller is proposed to reach the target state given the surrounding occupancy. Once trained on MOB with complex occupancy layout, which is stringent to human movements, the controller could handle cramped scenes and generalize well to general scenes with limited complexity like regular living rooms. With no GT 3D scenes for training, our method can generate realistic and stable HSI motions in diverse scenarios, including both static and dynamic scenes. The project is available at https://foruck.github.io/occu-page/.

7/16/2024

EgoHDM: An Online Egocentric-Inertial Human Motion Capture, Localization, and Dense Mapping System

Bonan Liu, Handi Yin, Manuel Kaufmann, Jinhao He, Sammy Christen, Jie Song, Pan Hui

We present EgoHDM, an online egocentric-inertial human motion capture (mocap), localization, and dense mapping system. Our system uses 6 inertial measurement units (IMUs) and a commodity head-mounted RGB camera. EgoHDM is the first human mocap system that offers dense scene mapping in near real-time. Further, it is fast and robust to initialize and fully closes the loop between physically plausible map-aware global human motion estimation and mocap-aware 3D scene reconstruction. Our key idea is integrating camera localization and mapping information with inertial human motion capture bidirectionally in our system. To achieve this, we design a tightly coupled mocap-aware dense bundle adjustment and physics-based body pose correction module leveraging a local body-centric elevation map. The latter introduces a novel terrain-aware contact PD controller, which enables characters to physically contact the given local elevation map thereby reducing human floating or penetration. We demonstrate the performance of our system on established synthetic and real-world benchmarks. The results show that our method reduces human localization, camera pose, and mapping accuracy error by 41%, 71%, 46%, respectively, compared to the state of the art. Our qualitative evaluations on newly captured data further demonstrate that EgoHDM can cover challenging scenarios in non-flat terrain including stepping over stairs and outdoor scenes in the wild.

9/6/2024

Motion Capture from Inertial and Vision Sensors

Xiaodong Chen, Wu Liu, Qian Bao, Xinchen Liu, Quanwei Yang, Ruoli Dai, Tao Mei

Human motion capture is the foundation for many computer vision and graphics tasks. While industrial motion capture systems with complex camera arrays or expensive wearable sensors have been widely adopted in movie and game production, consumer-affordable and easy-to-use solutions for personal applications are still far from mature. To utilize a mixture of a monocular camera and very few inertial measurement units (IMUs) for accurate multi-modal human motion capture in daily life, we contribute MINIONS in this paper, a large-scale Motion capture dataset collected from INertial and visION Sensors. MINIONS has several featured properties: 1) large scale of over five million frames and 400 minutes duration; 2) multi-modality data of IMUs signals and RGB videos labeled with joint positions, joint rotations, SMPL parameters, etc.; 3) a diverse set of 146 fine-grained single and interactive actions with textual descriptions. With the proposed MINIONS, we conduct experiments on multi-modal motion capture and explore the possibilities of consumer-affordable motion capture using a monocular camera and very few IMUs. The experiment results emphasize the unique advantages of inertial and vision sensors, showcasing the promise of consumer-affordable multi-modal motion capture and providing a valuable resource for further research and development.

7/24/2024