Generative Dataset Distillation Based on Self-knowledge Distillation
Abstract
Dataset distillation is an effective technique for reducing the cost and complexity of model training while maintaining performance by compressing large datasets into smaller, more efficient versions. In this paper, we present a novel generative dataset distillation method that can improve the accuracy of aligning prediction logits. Our approach integrates self-knowledge distillation to achieve more precise distribution matching between the synthetic and original data, thereby capturing the overall structure and relationships within the data. To further improve the accuracy of alignment, we introduce a standardization step on the logits before performing distribution matching, ensuring consistency in the range of logits. Through extensive experiments, we demonstrate that our method outperforms existing state-of-the-art methods, resulting in superior distillation performance.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- arXiv:
- arXiv:2501.04202
- Bibcode:
- 2025arXiv250104202L
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Artificial Intelligence;
- Computer Science - Machine Learning
- E-Print:
- Accepted by ICASSP 2025