MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

He Zhang1,*     Shenghao Ren3,*     Haolei Yuan1,*     Jianhui Zhao1     Fan Li1
Shuangpeng Sun2     Zhenghao Liang4     Tao Yu2,✉     Qiu Shen3,✉     Xun Cao3,✉    
1Beihang University      2Tsinghua University
3Nanjing University      4Weilan Tech, Beijing
*Equal Contribution    ✉Corresponding Author
MMVP is a Multimodal MoCap Dataset with Vision and Pressure sensors. It provides accurate and dense plantar pressure signals synchronized with RGBD observations.

Abstract

Foot contact is an important cue not only for human motion capture but also for motion understanding and physically plausible motion generation. However, most foot-contact annotations in existing datasets are estimated by purely visual matching and distance thresholding, which results in low accuracy and coarse granularity. Existing multimodal datasets do capture plantar pressure (foot contact) together with visual signals, but they are designed specifically for small-range, slow motions such as Taiji Quan and yoga. There is therefore still no vision-pressure multimodal dataset that covers large-range, fast human motion with accurate and dense foot-contact annotation. To fill this gap, we propose a Multimodal MoCap Dataset with Vision and Pressure sensors, named MMVP. MMVP provides accurate and dense plantar pressure signals synchronized with RGBD observations, which is especially useful for plausible shape estimation, robust pose fitting without foot drifting, and accurate global translation tracking. To validate the dataset, we propose an RGBD-P SMPL fitting method and a monocular-video-based baseline framework, VP-MoCap, for human motion capture. Experiments demonstrate that our RGBD-P SMPL fitting results significantly outperform pure visual motion capture. Moreover, VP-MoCap outperforms SOTA methods in foot-contact and global translation estimation accuracy. We believe the configuration of the dataset and the baseline frameworks will stimulate research in this direction and provide a good reference for MoCap applications in various domains.

Video

Examples in MMVP

BibTeX

@article{Zhang2024MMVP,
  author    = {He Zhang and Shenghao Ren and Haolei Yuan and Jianhui Zhao and Fan Li and Shuangpeng Sun and Zhenghao Liang and Tao Yu and Qiu Shen and Xun Cao},
  title     = {MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors},
  journal   = {CVPR},
  year      = {2024},
}