We conduct a large-scale study of language models for chord prediction. Specifically, we compare N-gram models to various flavours of recurrent neural networks on a comprehensive dataset comprising all publicly available datasets of annotated chords known to us. This large amount of data allows us to systematically explore hyper-parameter settings for the recurrent neural networks---a crucial step in achieving good results with this model class. Our results show not only a quantitative difference between the models, but also a qualitative one: in contrast to static N-gram models, certain RNN configurations adapt to the songs at test time. This finding constitutes a further step towards the development of chord recognition systems that are more aware of local musical context than what was previously possible.
- Pub Date:
- April 2018
- Computer Science - Machine Learning;
- Computer Science - Sound;
- Electrical Engineering and Systems Science - Audio and Speech Processing;
- Statistics - Machine Learning
- Accepted at ICASSP 2018