Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation



Text-to-3D generation is a valuable technology in virtual reality and digital content creation. While recent works have pushed the boundaries of text-to-3D generation, producing high-fidelity 3D objects from inefficient prompts and accurately simulating their physics-grounded motion remain unsolved challenges. To address these challenges, we present a framework that uses Large Language Model (LLM)-refined prompts and diffusion-prior-guided Gaussian Splatting (GS) to generate 3D models with accurate appearance and geometric structure. We also incorporate a continuum-mechanics-based deformation map and color regularization to synthesize vivid physics-grounded motion for the generated 3D Gaussians, adhering to the conservation of mass and momentum. By integrating text-to-3D generation with physics-grounded motion synthesis, our framework renders photo-realistic 3D objects that exhibit physics-aware motion, reflecting how the objects behave under various forces and constraints across different materials. Extensive experiments demonstrate that our approach achieves high-quality 3D generation with realistic physics-grounded motion.

Publication: arXiv e-prints

Pub Date: December 2024

DOI: 10.48550/arXiv.2412.05560

arXiv: arXiv:2412.05560

Bibcode: 2024arXiv241205560W

Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Graphics; Computer Science - Machine Learning; Electrical Engineering and Systems Science - Image and Video Processing
