GeoGuide: Geometric guidance of diffusion models
Abstract
Diffusion models are among the most effective methods for image generation. This is in particular because, unlike GANs, they can be easily conditioned during training to produce elements with desired class or properties. However, guiding a pre-trained diffusion model to generate elements from previously unlabeled data is significantly more challenging. One of the possible solutions was given by the ADM-G guiding approach. Although ADM-G successfully generates elements from the given class, there is a significant quality gap compared to a model originally conditioned on this class. In particular, the FID score obtained by the ADM-G-guided diffusion model is nearly three times lower than the class-conditioned guidance. We demonstrate that this issue is partly due to ADM-G providing minimal guidance during the final stage of the denoising process. To address this problem, we propose GeoGuide, a guidance model based on tracing the distance of the diffusion model's trajectory from the data manifold. The main idea of GeoGuide is to produce normalized adjustments during the backward denoising process. As shown in the experiments, GeoGuide surpasses the probabilistic approach ADM-G with respect to both the FID scores and the quality of the generated images.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2024
- DOI:
- 10.48550/arXiv.2407.12889
- arXiv:
- arXiv:2407.12889
- Bibcode:
- 2024arXiv240712889P
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition