Deep Scattering Spectrum

Sound 2015-06-15 v2 Information Theory math.IT

Abstract

A scattering transform defines a locally translation invariant representation which is stable to time-warping deformations. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A frequency transposition invariant representation is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on the GTZAN and TIMIT databases, respectively.
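
The cascade described in the abstract (wavelet convolution, modulus, then local averaging, repeated to obtain a second order) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian band-pass filters, the quality factor `q`, the low-pass `width`, and the function names `bandpass_bank`, `lowpass`, and `scattering` are all assumptions chosen for illustration, standing in for the wavelets and averaging window used in the paper.

```python
import numpy as np

def bandpass_bank(n, centers, q=4.0):
    """Analytic Gaussian band-pass filters in the Fourier domain.
    Assumption: a simple stand-in for the paper's wavelet filters."""
    freqs = np.fft.fftfreq(n)  # cycles per sample, in [-0.5, 0.5)
    return np.array([np.exp(-0.5 * ((freqs - xi) / (xi / q)) ** 2)
                     for xi in centers])

def lowpass(n, width):
    """Gaussian low-pass filter in the Fourier domain (averaging window)."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / width) ** 2)

def scattering(x, centers1, centers2, width):
    """First- and second-order coefficients via circular convolutions.

    s1[i]    = | x * psi1_i | * phi
    s2[i, j] = | |x * psi1_i| * psi2_j | * phi
    """
    n = len(x)
    phi = lowpass(n, width)
    psi1 = bandpass_bank(n, centers1)
    psi2 = bandpass_bank(n, centers2)
    x_hat = np.fft.fft(x)
    s1, s2 = [], []
    for h1 in psi1:
        u1 = np.abs(np.fft.ifft(x_hat * h1))       # first-order envelope
        u1_hat = np.fft.fft(u1)
        s1.append(np.real(np.fft.ifft(u1_hat * phi)))
        row = []
        for h2 in psi2:
            u2 = np.abs(np.fft.ifft(u1_hat * h2))  # modulation of the envelope
            row.append(np.real(np.fft.ifft(np.fft.fft(u2) * phi)))
        s2.append(row)
    return np.array(s1), np.array(s2)

# Hypothetical test signals: a pure tone and the same tone with amplitude modulation.
n = 4096
t = np.arange(n)
fc, fm = 0.25, 40 / 4096                  # carrier and modulation frequencies
tone = np.cos(2 * np.pi * fc * t)
am = (1 + 0.8 * np.cos(2 * np.pi * fm * t)) * tone

s1_tone, s2_tone = scattering(tone, [0.125, fc], [fm, 4 * fm], width=0.002)
s1_am, s2_am = scattering(am, [0.125, fc], [fm, 4 * fm], width=0.002)
```

In this toy setting the two signals have nearly the same averaged first-order coefficient in the band centered on the carrier, while the second-order coefficient at the modulation frequency is much larger for the modulated signal. That is the sense in which second-order coefficients capture amplitude modulation that first-order averaging discards.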

Cite

@article{arxiv.1304.6763,
  title  = {Deep Scattering Spectrum},
  author = {Joakim Andén and Stéphane Mallat},
  journal= {arXiv preprint arXiv:1304.6763},
  year   = {2015}
}