Panoptic Studio: A Massively Multiview System for Social Interaction Capture
Abstract
We present an approach to capture the 3D motion of a group of people engaged in a social interaction. The core challenges in capturing social interactions are: (1) occlusion is functional and frequent; (2) subtle motion needs to be measured over a space large enough to host a social group; (3) human appearance and configuration variation is immense; and (4) attaching markers to the body may prime the nature of interactions. The Panoptic Studio is a system organized around the thesis that social interactions should be measured through the integration of perceptual analyses over a large variety of viewpoints. We present a modularized system designed around this principle, consisting of integrated structural, hardware, and software innovations. The system takes, as input, 480 synchronized video streams of multiple people engaged in social activities, and produces, as output, the labeled time-varying 3D structure of anatomical landmarks on individuals in the space. Our algorithm is designed to fuse the "weak" perceptual processes in the large number of views by progressively generating skeletal proposals from low-level appearance cues. We also present a framework for temporal refinement that associates body parts with a reconstructed dense 3D trajectory stream. Our system and method are the first to reconstruct the full-body motion of more than five people engaged in social interactions without using markers. We also empirically demonstrate the impact of the number of views in achieving this goal.
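The abstract's point about the number of views can be illustrated with a generic multi-view triangulation sketch. This is not the paper's algorithm (which builds skeletal proposals from appearance cues); it is a standard least-squares ray intersection, and the camera ring, noise model, and the names `triangulate` and `simulate_rays` are assumptions made up for the illustration.

```python
import math
import random


def solve3(a, b):
    """Solve the 3x3 system a @ x = b by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("rays are degenerate (for example, all parallel)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def triangulate(rays):
    """Return the 3D point minimizing the summed squared distance to all rays.

    Each ray is an (origin, direction) pair. With P = I - d d^T for a unit
    direction d, the minimizer x satisfies sum(P) x = sum(P origin).
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for origin, direction in rays:
        norm = math.sqrt(sum(v * v for v in direction))
        d = [v / norm for v in direction]
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += p
                b[i] += p * origin[j]
    return solve3(a, b)


def simulate_rays(point, n_views, noise, rng):
    """Hypothetical setup: cameras on a ring of radius 3 looking at `point`.

    Gaussian noise on each ray direction stands in for a "weak" 2D detection.
    """
    rays = []
    for k in range(n_views):
        angle = 2.0 * math.pi * k / n_views
        origin = [3.0 * math.cos(angle), 1.5, 3.0 * math.sin(angle)]
        direction = [p - o + rng.gauss(0.0, noise) for p, o in zip(point, origin)]
        rays.append((origin, direction))
    return rays


if __name__ == "__main__":
    rng = random.Random(0)
    landmark = [0.2, 1.0, -0.1]  # made-up landmark position in metres
    for n_views in (3, 30, 480):
        estimate = triangulate(simulate_rays(landmark, n_views, 0.05, rng))
        error = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, landmark)))
        print(f"{n_views:3d} views: error {error:.4f}")
```

Running the script prints the reconstruction error for 3, 30, and 480 simulated views. In this toy setup, independent noise tends to average out as views are added, which is the intuition behind combining many individually unreliable detections.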
- Publication: arXiv e-prints
- Pub Date: December 2016
- DOI: 10.48550/arXiv.1612.03153
- arXiv: arXiv:1612.03153
- Bibcode: 2016arXiv161203153J
- Keywords: Computer Science - Computer Vision and Pattern Recognition
- E-Print: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence