FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration
Abstract
Despite the significant progress made by all-in-one models in universal image restoration, existing methods suffer from a generalization bottleneck in real-world scenarios, as they are mostly trained on small-scale synthetic datasets with limited degradations. Large-scale, high-quality real-world training data is therefore urgently needed to facilitate the emergence of foundation models for image restoration. To advance this field, we contribute a million-scale dataset with two notable advantages over existing training data: larger-scale real-world samples and more diverse degradation types. By adjusting internal camera settings and external imaging conditions, we capture aligned image pairs over multiple rounds with our well-designed data acquisition system and data alignment criterion. Moreover, we propose a robust model, FoundIR, that addresses a broader range of restoration tasks in real-world scenarios, taking a further step toward foundation models. Specifically, we first employ a diffusion-based generalist model to remove degradations by learning degradation-agnostic common representations from diverse inputs, where an incremental learning strategy is adopted to better guide model training. To refine the model's restoration capability in complex scenarios, we introduce degradation-aware specialist models that produce the final high-quality results. Extensive experiments demonstrate the value of our dataset and the effectiveness of our method.
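To make the coarse-to-fine pipeline described in the abstract concrete, the sketch below chains a degradation-agnostic generalist restorer with a degradation-aware specialist. This is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the module definitions, the `restore` routine, and the routing by degradation type are placeholder assumptions standing in for FoundIR's diffusion-based generalist and its specialist models.

```python
# Minimal sketch (not the authors' code) of a two-stage restoration pipeline:
# a generalist model produces an intermediate restoration, then a
# degradation-aware specialist refines it. All names are hypothetical.
import torch
import torch.nn as nn


class GeneralistModel(nn.Module):
    """Stand-in for the diffusion-based generalist restorer."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Predict a residual over the degraded input (placeholder for the
        # iterative diffusion sampling used in the paper).
        return x + self.net(x)


class SpecialistModel(nn.Module):
    """Stand-in for a degradation-aware refinement model."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)


def restore(degraded: torch.Tensor,
            generalist: GeneralistModel,
            specialists: dict,
            degradation_type: str) -> torch.Tensor:
    """Run the generalist first, then refine with the matching specialist."""
    coarse = generalist(degraded)
    specialist = specialists.get(degradation_type)
    return specialist(coarse) if specialist is not None else coarse


if __name__ == "__main__":
    generalist = GeneralistModel()
    specialists = {"low_light": SpecialistModel(), "noise": SpecialistModel()}
    x = torch.rand(1, 3, 64, 64)  # dummy degraded image
    y = restore(x, generalist, specialists, "low_light")
    print(y.shape)  # torch.Size([1, 3, 64, 64])
```

In this reading, the specialist lookup is a simple proxy for degradation-aware refinement; how FoundIR actually selects or combines specialists is described in the paper itself.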
- Publication: arXiv e-prints
- Pub Date: December 2024
- DOI: 10.48550/arXiv.2412.01427
- arXiv: arXiv:2412.01427
- Bibcode: 2024arXiv241201427L
- Keywords: Computer Science - Computer Vision and Pattern Recognition
- E-Print: Project website: https://www.foundir.net