From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano

doi:10.48550/arXiv.2407.04518

Our study investigates an approach for understanding musical performances through the lens of audio encoding models, focusing on the domain of solo Western classical piano music. Compared to composition-level attribute understanding such as key or genre, we identify a knowledge gap in performance-level music understanding and address three critical tasks: expertise ranking, difficulty estimation, and piano technique detection. For this purpose we introduce a comprehensive Pianism-Labelling Dataset (PLD). We leverage pre-trained audio encoders, specifically Jukebox, Audio-MAE, MERT, and DAC, which demonstrate varied capabilities in tackling downstream tasks, to explore whether domain-specific fine-tuning enhances capability in capturing performance nuances. Our best approach achieved 93.6% accuracy in expertise ranking, 33.7% in difficulty estimation, and 46.7% in technique detection, with Audio-MAE as the overall most effective encoder. Finally, we conducted a case study on Chopin Piano Competition data using trained models for expertise ranking, which highlights the challenge of accurately assessing top-tier performances.

Publication:

arXiv e-prints

Pub Date:

July 2024

DOI:

10.48550/arXiv.2407.04518

arXiv:

arXiv:2407.04518

Bibcode:

2024arXiv240704518Z

Keywords:

Electrical Engineering and Systems Science - Audio and Speech Processing

E-Print:

Accepted by the 25th International Society for Music Information Retrieval Conference (ISMIR)
