Geometric Pose Affordance: 3D Human Pose with Scene Constraints

doi:10.48550/arXiv.1905.07718

Full 3D estimation of human pose from a single image remains a challenging task despite many recent advances. In this paper, we explore the hypothesis that strong prior information about scene geometry can be used to improve pose estimation accuracy. To tackle this question empirically, we have assembled a novel $\textbf{Geometric Pose Affordance}$ dataset, consisting of multi-view imagery of people interacting with a variety of rich 3D environments. We utilized a commercial motion capture system to collect gold-standard estimates of pose and construct accurate geometric 3D CAD models of the scene itself. To inject prior knowledge of scene constraints into existing frameworks for pose estimation from images, we introduce a novel, view-based representation of scene geometry, a $\textbf{multi-layer depth map}$, which employs multi-hit ray tracing to concisely encode multiple surface entry and exit points along each camera view ray direction. We propose two different mechanisms for integrating multi-layer depth information into pose estimation: first as input, in the form of encoded ray features used in lifting 2D pose to full 3D, and second as a differentiable loss that encourages learned models to favor geometrically consistent pose estimates. We show experimentally that these techniques can improve the accuracy of 3D pose estimates, particularly in the presence of occlusion and complex scene geometry.
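The multi-layer depth idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes the scene is a set of disjoint axis-aligned boxes (the paper uses CAD models), that the ray direction is unit length so the ray parameter equals distance along the ray, and that a simple piecewise-linear penalty stands in for the differentiable loss. The names `ray_box_interval`, `multi_layer_depth`, and `inside_penalty` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical scene primitive: (min corner, max corner) of an axis-aligned box.
Box = Tuple[Sequence[float], Sequence[float]]


def ray_box_interval(origin: Sequence[float], direction: Sequence[float],
                     box: Box) -> Optional[Tuple[float, float]]:
    """Slab test: return (t_enter, t_exit) along the ray, or None on a miss."""
    t_enter, t_exit = 0.0, float("inf")
    lo, hi = box
    for o, d, a, b in zip(origin, direction, lo, hi):
        if abs(d) < 1e-12:
            # Ray is parallel to this slab: it misses unless it starts inside.
            if o < a or o > b:
                return None
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return None
    return t_enter, t_exit


def multi_layer_depth(origin: Sequence[float], direction: Sequence[float],
                      boxes: Sequence[Box], max_layers: int = 4) -> List[float]:
    """Collect every entry and exit depth along one camera ray, nearest first.

    Assumes the boxes do not overlap, so the sorted depths alternate
    entry, exit, entry, exit. Keep max_layers even to preserve that pairing.
    """
    intervals = []
    for box in boxes:
        interval = ray_box_interval(origin, direction, box)
        if interval is not None:
            intervals.append(interval)
    intervals.sort()
    depths = [t for interval in intervals for t in interval]
    return depths[:max_layers]


def inside_penalty(joint_depth: float, layers: Sequence[float]) -> float:
    """Penalize a joint whose depth lies between an entry and its exit.

    The penalty is the distance to the nearest surface of the occupied
    interval, and zero when the joint is in free space.
    """
    penalty = 0.0
    for entry, exit_ in zip(layers[0::2], layers[1::2]):
        if entry < joint_depth < exit_:
            penalty += min(joint_depth - entry, exit_ - joint_depth)
    return penalty


# Two boxes stacked along the viewing direction of a camera at the origin.
scene = [((-1, -1, 2), (1, 1, 3)), ((-1, -1, 5), (1, 1, 6))]
layers = multi_layer_depth((0, 0, 0), (0, 0, 1), scene)
```

Here `layers` holds the entry and exit depths of both boxes along the central ray, and `inside_penalty` is zero for a joint in the free space between them and positive for a joint placed inside either box. A learned model would need a differentiable version evaluated per joint over the ray through its 2D location.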

Publication:

arXiv e-prints

Pub Date:

May 2019

DOI:

10.48550/arXiv.1905.07718

arXiv:

arXiv:1905.07718

Bibcode:

2019arXiv190507718W

Keywords:

Computer Science - Computer Vision and Pattern Recognition

E-Print:

Project page: https://wangzheallen.github.io/GPA.html; in submission to CVIU
