Data is Overrated: Perceptual Metrics Can Lead Learning in the Absence of Training Data
Abstract
Perceptual metrics are traditionally used to evaluate the quality of natural signals, such as images and audio. They are designed to mimic the perceptual behaviour of human observers and usually reflect structures found in natural signals. This motivates their use as loss functions for training generative models, so that the models learn to capture the structure encoded in the metric. We take this idea to the extreme in the audio domain by training a compressive autoencoder to reconstruct uniform noise, in lieu of natural data. We show that training with perceptual losses improves the reconstruction of spectrograms and re-synthesized audio at test time, compared with models trained with a standard Euclidean loss. This demonstrates that perceptual metrics give better generalisation to unseen natural signals.
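The contrast drawn in the abstract, between a Euclidean loss and a perceptual loss on uniform-noise targets, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the perceptual metrics used, so a simplified SSIM-style score computed from whole-image statistics stands in for them, and the function names and array shapes are hypothetical. No autoencoder or training loop is included; the sketch only shows how the two objectives would score the same reconstruction.

```python
import numpy as np


def uniform_noise_batch(rng, batch=8, height=64, width=64):
    """Draw i.i.d. uniform noise in [0, 1) as stand-in training targets.

    The shape (batch, height, width) is a hypothetical spectrogram-like layout.
    """
    return rng.uniform(0.0, 1.0, size=(batch, height, width))


def euclidean_loss(x, y):
    """Standard Euclidean objective: mean squared error over all elements."""
    return float(np.mean((x - y) ** 2))


def global_ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM using whole-image means, variances and covariance.

    Assumption: this is a simplified stand-in for a perceptual metric.
    Standard SSIM uses local windows; this version uses global statistics.
    """
    axes = (1, 2)
    mu_x = x.mean(axis=axes)
    mu_y = y.mean(axis=axes)
    var_x = x.var(axis=axes)
    var_y = y.var(axis=axes)
    cov = ((x - mu_x[:, None, None]) * (y - mu_y[:, None, None])).mean(axis=axes)
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return float(np.mean(1.0 - ssim))


# Score a hypothetical imperfect reconstruction under both objectives.
rng = np.random.default_rng(0)
target = uniform_noise_batch(rng)
reconstruction = np.clip(target + 0.1 * rng.standard_normal(target.shape), 0.0, 1.0)
print("Euclidean loss:", euclidean_loss(target, reconstruction))
print("SSIM-style loss:", global_ssim_loss(target, reconstruction))
```

In a training setup, either function would be replaced by a differentiable implementation in an autodiff framework and minimised over the autoencoder's parameters; the NumPy version above is only meant to make the two objectives concrete.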
- Publication: arXiv e-prints
- Pub Date: December 2023
- DOI: 10.48550/arXiv.2312.03455
- arXiv: arXiv:2312.03455
- Bibcode: 2023arXiv231203455N
- Keywords: Computer Science - Sound; Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning; Electrical Engineering and Systems Science - Audio and Speech Processing; Electrical Engineering and Systems Science - Image and Video Processing
- E-Print: Machine Learning for Audio Workshop, NeurIPS 2023