A cover song, by definition, is a new performance or recording of a previously recorded, commercially released song. It may be by the original artist or by a different artist altogether, and it can vary from the original in unpredictable ways, including key, arrangement, instrumentation, timbre and more. In this work we propose a novel approach to learning audio representations for the task of cover song detection. We train a neural architecture on tens of thousands of cover-song audio clips and test it on a held-out set. We obtain a mean precision@1 of 65% over mini-batches, ten times better than random guessing. Our results indicate that Siamese network configurations show promise for approaching the cover song identification problem.
- Pub Date: May 2020
- Electrical Engineering and Systems Science - Audio and Speech Processing;
- Computer Science - Machine Learning;
- Computer Science - Sound;
- Statistics - Machine Learning
- Code available at https://github.com/markostam/coversongs-dual-convnet
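
The precision@1 figure quoted in the abstract can be illustrated with a minimal sketch. It assumes each clip in a mini-batch has already been mapped to an embedding vector, and that similarity between clips is measured with cosine similarity; the abstract does not state which similarity measure is used, so that choice, the function names `cosine_similarity` and `precision_at_1`, and the toy data are all hypothetical and are not taken from the linked repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def precision_at_1(embeddings, song_ids):
    """Fraction of clips whose most similar other clip in the batch
    is a version of the same song (assumed definition of precision@1)."""
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_sim = None, -math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # a clip may not retrieve itself
            sim = cosine_similarity(query, candidate)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if song_ids[best_j] == song_ids[i]:
            hits += 1
    return hits / len(embeddings)


# Toy mini-batch (made-up values): two versions each of songs "A" and "B".
batch_embeddings = [
    [1.0, 0.1],  # song A, version 1
    [0.9, 0.2],  # song A, version 2
    [0.1, 1.0],  # song B, version 1
    [0.2, 0.9],  # song B, version 2
]
batch_song_ids = ["A", "A", "B", "B"]

score = precision_at_1(batch_embeddings, batch_song_ids)
print(score)
```

Averaging this score over many mini-batches would give a mean precision@1 of the kind reported above, under the assumptions stated.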