Deep Scattering Spectrum
Abstract
A scattering transform defines a locally translation invariant representation which is stable to time-warping deformations. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A frequency transposition invariant representation is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on the GTZAN and TIMIT databases, respectively.
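The cascade described in the abstract (wavelet convolution, modulus, then local averaging, repeated for a second order) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the Gaussian band-pass filters, the dyadic center frequencies, the bandwidth rule `center / 4`, the averaging bandwidth, and every function name below are hypothetical choices made for the example.

```python
import numpy as np

def gaussian_bandpass(n, center, rel_bw=0.25):
    # Analytic Gaussian band-pass filter, defined in the Fourier domain.
    # Assumption: bandwidth proportional to the center frequency (constant Q).
    freqs = np.fft.fftfreq(n)  # cycles per sample
    return np.exp(-0.5 * ((freqs - center) / (rel_bw * center)) ** 2)

def gaussian_lowpass(n, bandwidth):
    # Averaging filter phi, with unit gain at zero frequency.
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / bandwidth) ** 2)

def filter_fft(x, h_hat):
    # Circular convolution computed as a product in the Fourier domain.
    return np.fft.ifft(np.fft.fft(x) * h_hat)

def scattering(x, centers, avg_bw):
    # First order:  S1 = |x * psi_1| * phi
    # Second order: S2 = ||x * psi_1| * psi_2| * phi, for center_2 < center_1
    n = len(x)
    phi = gaussian_lowpass(n, avg_bw)
    s1, s2 = {}, {}
    for c1 in centers:
        u1 = np.abs(filter_fft(x, gaussian_bandpass(n, c1)))
        s1[c1] = filter_fft(u1, phi).real
        for c2 in centers:
            if c2 < c1:
                u2 = np.abs(filter_fft(u1, gaussian_bandpass(n, c2)))
                s2[(c1, c2)] = filter_fft(u2, phi).real
    return s1, s2

# Toy signals: a pure tone and the same tone with amplitude modulation.
n = 4096
t = np.arange(n)
carrier = np.cos(2 * np.pi * 0.25 * t)
x_plain = carrier
x_mod = (1 + 0.5 * np.cos(2 * np.pi * t / 64)) * carrier

centers = [0.25 * 2.0 ** -j for j in range(6)]  # dyadic center frequencies
s1_plain, s2_plain = scattering(x_plain, centers, avg_bw=1 / 512)
s1_mod, s2_mod = scattering(x_mod, centers, avg_bw=1 / 512)

path = (0.25, 1 / 64)  # carrier band, then modulation band
print("first order :", s1_plain[0.25].mean(), s1_mod[0.25].mean())
print("second order:", s2_plain[path].mean(), s2_mod[path].mean())
```

In this toy setting the averaged first-order coefficient in the carrier band is nearly the same for both signals, while the second-order coefficient along the path (carrier band, modulation band) is much larger for the modulated tone. That is one way to read the abstract's statement that second-order coefficients characterize amplitude modulation.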
- Publication:
- IEEE Transactions on Signal Processing
- Pub Date:
- August 2014
- DOI:
- 10.1109/TSP.2014.2326991
- arXiv:
- arXiv:1304.6763
- Bibcode:
- 2014ITSP...62.4114A
- Keywords:
- Computer Science - Sound;
- Computer Science - Information Theory
- E-Print:
- doi:10.1109/TSP.2014.2326991