D$^3$-Human: Dynamic Disentangled Digital Human from Monocular Video

doi:10.48550/arXiv.2501.01589

We introduce D$^3$-Human, a method for reconstructing Dynamic Disentangled Digital Human geometry from monocular videos. Previous work on monocular video human reconstruction either reconstructs the clothed body as a single, non-decoupled surface or reconstructs only the clothing, which makes the results difficult to use directly in applications such as animation production. The main challenge in reconstructing decoupled clothing and body is that the clothing occludes the body. The reconstruction must therefore preserve detail in the visible regions while keeping the invisible regions plausible. Our method combines explicit and implicit representations to model the decoupled clothed human body, leveraging the robustness of explicit representations and the flexibility of implicit representations. Specifically, we reconstruct the visible region as a signed distance field (SDF) and propose a novel human manifold signed distance field (hmSDF) to segment the visible clothing from the visible body; we then merge the visible and invisible parts of the body. Extensive experiments demonstrate that, compared with existing reconstruction methods, D$^3$-Human achieves high-quality decoupled reconstruction of human bodies wearing different clothing, and that its results can be directly applied to clothing transfer and animation.
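
For readers unfamiliar with signed distance fields, the general idea can be sketched as follows. This is only a toy illustration, not the paper's hmSDF: the sphere shape, the planar `toy_region_field`, and the function names are all assumptions made for the example. It shows a scalar field whose zero level set is a surface, and a second scalar field whose sign splits points on that surface into two labeled regions.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere.

    Negative inside, zero on the surface, positive outside.
    """
    return math.dist(p, center) - radius


def toy_region_field(p):
    """Hypothetical segmentation field: the signed height above the plane z = 0."""
    return p[2]


def label_surface_points(points, region_field):
    """Label each point by the sign of a second scalar field.

    The "clothing"/"body" names are illustrative; the rule here is simply
    negative -> "clothing", otherwise -> "body".
    """
    return ["clothing" if region_field(p) < 0.0 else "body" for p in points]


# Three points that lie on the unit sphere.
surface_points = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.0, 0.0)]

# Each distance is zero because the points are on the surface.
distances = [sphere_sdf(p) for p in surface_points]

# Split the surface points into two regions using the sign of the toy field.
labels = label_surface_points(surface_points, toy_region_field)
```

In this sketch the surface and its segmentation are both defined implicitly, so a region boundary can fall anywhere on the surface rather than being tied to a fixed mesh layout. How D$^3$-Human actually defines and optimizes hmSDF is described in the paper itself.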

Publication: arXiv e-prints

Pub Date: January 2025

DOI: 10.48550/arXiv.2501.01589

arXiv: arXiv:2501.01589

Bibcode: 2025arXiv250101589C

Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Graphics

E-Print: Project Page: https://ustc3dv.github.io/D3Human/
