Error metrics for predicting discrimination of original and spectrally altered musical instrument sounds
The correspondence of various error metrics to human discrimination data was investigated. Time-varying harmonic amplitude data were obtained from spectral analysis of eight musical instrument sounds (bassoon, clarinet, flute, horn, oboe, saxophone, trumpet, and violin). The data were altered using fixed random multipliers on the harmonic amplitudes, and the sounds were additively resynthesized with estimated average spectral errors ranging from 1% to 50%. Listeners attempted to discriminate the randomly altered sounds from reference sounds resynthesized from the original data. Various error metrics were then used to calculate the spectral differences between the original and altered sounds, and the R² correspondence between the error metrics and the discrimination data was measured. A relative-amplitude spectral error metric gave the best correspondence to average subject discrimination data, capturing over 90% of the variation relative to a fourth-order regression curve, although other formulas gave similar results. Error metrics that used a small number of representative analysis frames gave results comparable to those obtained using all frames of the analysis.
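The abstract does not give the exact formula for the relative-amplitude spectral error metric, but a common form of such a metric (one plausible reading, sketched here as an assumption, not the paper's definitive formula) normalizes the per-frame RMS difference of harmonic amplitudes by the per-frame RMS of the reference amplitudes and averages over analysis frames. The alteration step below also mirrors the described procedure of applying fixed random multipliers to the harmonic amplitudes; the array shapes and the function name `relative_spectral_error` are illustrative choices:

```python
import numpy as np

def relative_spectral_error(orig, altered):
    """Relative-amplitude spectral error between two sets of
    time-varying harmonic amplitudes, shape (frames, harmonics).

    Assumed form: per-frame RMS amplitude difference normalized by
    the per-frame RMS of the reference amplitudes, averaged over
    all analysis frames.
    """
    orig = np.asarray(orig, dtype=float)
    altered = np.asarray(altered, dtype=float)
    num = np.sqrt(np.sum((orig - altered) ** 2, axis=1))
    den = np.sqrt(np.sum(orig ** 2, axis=1))
    return float(np.mean(num / den))

# Alteration procedure as described: fixed random multipliers on the
# harmonic amplitudes (constant over time for each harmonic).
rng = np.random.default_rng(0)
frames, harmonics = 200, 10
orig = rng.random((frames, harmonics)) + 0.1  # reference amplitudes
multipliers = 1.0 + 0.2 * (rng.random(harmonics) - 0.5)
altered = orig * multipliers

err = relative_spectral_error(orig, altered)
```

Because the multipliers here deviate from 1.0 by at most 10%, the resulting error is a small positive fraction; scaling the multiplier range would sweep the error toward the 1%-50% span used in the listening tests.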