Pointmap-Conditioned Diffusion for Consistent Novel View Synthesis
Abstract
In this paper, we present PointmapDiffusion, a novel framework for single-image novel view synthesis (NVS) that utilizes pre-trained 2D diffusion models. Our method is the first to leverage pointmaps (i.e., rasterized 3D scene coordinates) as a conditioning signal, capturing geometric priors from the reference image to guide the diffusion process. By embedding reference attention blocks and a ControlNet for pointmap features, our model balances generative capability with geometric consistency, enabling accurate view synthesis across varying viewpoints. Extensive experiments on diverse real-world datasets demonstrate that PointmapDiffusion achieves high-quality, multi-view consistent results with significantly fewer trainable parameters than other baselines on single-image NVS tasks.
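To make the conditioning signal concrete, below is a minimal sketch of how a pointmap, i.e., an image-shaped grid of per-pixel 3D scene coordinates, might be rasterized from a depth map and camera parameters. This assumes a pinhole camera model with known depth; the function name and interface are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

def pointmap_from_depth(depth, K, cam_to_world):
    """Rasterize per-pixel 3D scene coordinates (a 'pointmap').

    Hypothetical helper illustrating the conditioning signal described
    in the abstract, assuming a pinhole camera model.

    depth:        (H, W) per-pixel depth in camera space
    K:            (3, 3) camera intrinsics
    cam_to_world: (4, 4) camera-to-world extrinsics
    returns:      (H, W, 3) world-space XYZ coordinates per pixel
    """
    H, W = depth.shape
    # Build a homogeneous pixel grid [u, v, 1] for every pixel.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # Back-project: X_cam = depth * K^{-1} [u, v, 1]^T.
    rays = pix @ np.linalg.inv(K).T
    xyz_cam = rays * depth[..., None]
    # Transform camera-space points into world space with the camera pose.
    xyz_h = np.concatenate([xyz_cam, np.ones((H, W, 1))], axis=-1)
    xyz_world = xyz_h @ cam_to_world.T
    return xyz_world[..., :3]
```

The resulting (H, W, 3) tensor is image-aligned, so it can plausibly be fed to a ControlNet-style conditioning branch in the same way as other dense spatial controls (e.g., depth or normal maps).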
- Publication: arXiv e-prints
- Pub Date: January 2025
- arXiv: arXiv:2501.02913
- Bibcode: 2025arXiv250102913N
- Keywords: Computer Science - Computer Vision and Pattern Recognition