Dynamic Attention-Guided Diffusion for Image Super-Resolution
Abstract
Diffusion models in image Super-Resolution (SR) treat all image regions with uniform intensity, which risks compromising the overall image quality. To address this, we introduce "You Only Diffuse Areas" (YODA), a dynamic attention-guided diffusion method for image SR. YODA selectively focuses on spatial regions using attention maps derived from the low-resolution image and the current time step in the diffusion process. This time-dependent targeting enables a more efficient conversion to high-resolution outputs by focusing on the areas that benefit most from the iterative refinement process, i.e., detail-rich objects. We empirically validate YODA by extending the leading diffusion-based methods SR3 and SRDiff. Our experiments demonstrate new state-of-the-art performance in face and general SR across the PSNR, SSIM, and LPIPS metrics. A notable finding is that YODA has a stabilizing effect, reducing color shifts, especially when training with small batch sizes.
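The abstract describes time-dependent targeting but does not state the masking rule, so the following is only a minimal sketch of one way such a rule could look, not the authors' implementation. It assumes an attention map with values in [0, 1], a reverse process that counts the time step t down from total_steps, and a thresholding rule under which high-attention pixels become active earlier and are therefore refined for more steps. The names time_dependent_mask, masked_update, and floor are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def time_dependent_mask(attention: Grid, t: int, total_steps: int,
                        floor: float = 0.0) -> Grid:
    """Return a 0/1 mask of the pixels refined at reverse step t.

    Assumed rule (not taken from the abstract): a pixel with attention a
    is active once t <= total_steps * (a + floor). `floor` is a
    hypothetical lower bound that guarantees every pixel some refinement.
    """
    return [
        [1.0 if total_steps * (a + floor) >= t else 0.0 for a in row]
        for row in attention
    ]


def masked_update(refined: Grid, fallback: Grid, mask: Grid) -> Grid:
    """Keep the refined value where the mask is 1, the fallback elsewhere."""
    return [
        [m * r + (1.0 - m) * f for r, f, m in zip(r_row, f_row, m_row)]
        for r_row, f_row, m_row in zip(refined, fallback, mask)
    ]


if __name__ == "__main__":
    # Toy 2x2 attention map; the top-left pixel is the most detail-rich.
    attention = [[0.9, 0.2], [0.5, 0.0]]
    for step in (95, 55, 10):
        print(step, time_dependent_mask(attention, step, total_steps=100))
```

Under these assumptions the active region grows as t decreases, so the iterative refinement is spent mostly on the high-attention pixels, while masked_update leaves the remaining pixels at their fallback values.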
- Publication: arXiv e-prints
- Pub Date: August 2023
- DOI: 10.48550/arXiv.2308.07977
- arXiv: arXiv:2308.07977
- Bibcode: 2023arXiv230807977M
- Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Machine Learning
- E-Print: Brian B. Moser and Stanislav Frolov contributed equally