Deep Scattering Spectrum
Sound
2015-06-15 v2 Information Theory
math.IT
Abstract
A scattering transform defines a locally translation-invariant representation which is stable to time-warping deformations. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A representation invariant to frequency transposition is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on the GTZAN and TIMIT databases, respectively.
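The cascade described in the abstract (wavelet convolution, modulus, then local averaging, repeated at a second order) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian band-pass filters stand in for the wavelets, the convolutions are circular (FFT-based), and every name and parameter value here (`bandpass`, `lowpass`, `circ_conv`, `scattering`, `rel_bw`, `lowpass_bw`, the dyadic filter centers) is an assumption chosen for illustration.

```python
import numpy as np


def bandpass(n, center, bandwidth):
    """Analytic band-pass filter in the frequency domain (assumed stand-in
    for a wavelet): a Gaussian bump kept on positive frequencies only."""
    freqs = np.fft.fftfreq(n)  # cycles per sample, in [-0.5, 0.5)
    return np.exp(-0.5 * ((freqs - center) / bandwidth) ** 2) * (freqs > 0)


def lowpass(n, bandwidth):
    """Gaussian low-pass filter used for the local time averaging."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / bandwidth) ** 2)


def circ_conv(signal, freq_response):
    """Circular convolution, computed as a product in the Fourier domain."""
    return np.fft.ifft(np.fft.fft(signal) * freq_response)


def scattering(x, num_filters=6, top=0.25, rel_bw=0.4, lowpass_bw=0.002):
    """Zeroth-, first- and second-order scattering coefficients of x.

    All default parameter values are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    phi = lowpass(n, lowpass_bw)
    # One filter per octave, with bandwidth proportional to center frequency.
    centers = top * 2.0 ** (-np.arange(num_filters))
    bank = [bandpass(n, c, rel_bw * c) for c in centers]

    s0 = circ_conv(x, phi).real  # order 0: local average of x
    s1 = []
    s2 = {}
    for i, psi1 in enumerate(bank):
        u1 = np.abs(circ_conv(x, psi1))        # wavelet modulus (envelope)
        s1.append(circ_conv(u1, phi).real)     # order 1: averaged envelope
        # The envelope varies more slowly than its carrier, so only
        # second filters with a lower center frequency are applied.
        for j in range(i + 1, num_filters):
            u2 = np.abs(circ_conv(u1, bank[j]))
            s2[(i, j)] = circ_conv(u2, phi).real  # order 2
    return s0, np.stack(s1), s2


# Usage: a pure tone versus the same tone with amplitude modulation.
n = 4096
t = np.arange(n)
carrier = np.cos(2 * np.pi * 0.125 * t)
am_tone = (1 + 0.8 * np.cos(2 * np.pi * t / 64)) * carrier
_, s1_pure, s2_pure = scattering(carrier)
_, s1_am, s2_am = scattering(am_tone)
```

In this toy setting the averaged first-order coefficients in the carrier band are nearly the same for both signals, because the low-pass filter removes the modulation, while the second-order coefficients are essentially zero for the pure tone and clearly nonzero for the modulated one. This mirrors the abstract's remark that second-order coefficients capture amplitude modulation.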
Cite
@article{arxiv.1304.6763,
title = {Deep Scattering Spectrum},
author = {Joakim Andén and Stéphane Mallat},
journal = {arXiv preprint arXiv:1304.6763},
year = {2015}
}