FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis
Abstract
Text-to-motion synthesis is a crucial task in computer vision. Existing methods are limited in their universality: they are tailored to single-person or two-person scenarios and cannot be applied to generate motions for more individuals. To achieve number-free motion synthesis, this paper reconsiders motion generation and proposes to unify single-person and multi-person motion through the conditional motion distribution. Furthermore, a generation module and an interaction module are designed for our FreeMotion framework to decouple the process of conditional motion generation, which ultimately supports number-free motion synthesis. In addition, existing single-person spatial motion control methods can be seamlessly integrated into our framework, achieving precise control of multi-person motion. Extensive experiments demonstrate the superior performance of our method and its capability to infer single-human and multi-human motions simultaneously.
- Publication: arXiv e-prints
- Pub Date: May 2024
- arXiv: arXiv:2405.15763
- Bibcode: 2024arXiv240515763F
- Keywords: Computer Science - Computer Vision and Pattern Recognition