Sound Event Detection Using Duration Robust Loss Function

Sound 2020-06-30 v1 Audio and Speech Processing

Abstract

Many machine-learning-based methods of sound event detection (SED) treat each segmented time frame as one data sample for model training. However, the duration of sound events varies greatly with the event class; e.g., the sound event "fan" has a long duration, while the sound event "mouse clicking" is instantaneous. This difference in duration between sound event classes thus causes a serious data imbalance problem in SED. In this paper, we propose a method for SED using a duration robust loss function, which can focus model training on sound events of short duration. In the proposed method, we focus on the relationship between the duration of a sound event and the ease or difficulty of model training. In particular, many sound events of long duration (e.g., the sound event "fan") are stationary sounds, which have less variation in their acoustic features and are therefore easy to model. Meanwhile, some sound events of short duration (e.g., the sound event "object impact") have more than one audio pattern, such as attack, decay, and release parts. We thus apply class-wise reweighting to the binary cross-entropy loss function depending on the ease or difficulty of model training. Evaluation experiments conducted using the TUT Sound Events 2016/2017 and TUT Acoustic Scenes 2016 datasets show that the proposed method improves the detection performance of sound events by 3.15 and 4.37 percentage points in macro- and micro-F-scores, respectively, compared with a conventional method using the binary cross-entropy loss function.
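
The general idea of class-wise reweighting of a frame-wise binary cross-entropy loss can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting formula, so the weighting rule below (inverse of each class's active-frame ratio, normalized to mean 1) is purely an assumption for illustration, and the function names `class_weighted_bce` and `inverse_activity_weights` are hypothetical.

```python
import numpy as np


def class_weighted_bce(y_true, y_prob, class_weights, eps=1e-7):
    """Frame-wise binary cross-entropy with one weight per event class.

    y_true, y_prob: arrays of shape (frames, classes).
    class_weights:  array of shape (classes,), broadcast over frames.
    """
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    bce = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(bce * class_weights))


def inverse_activity_weights(y_true, eps=1e-7):
    """ASSUMED weighting rule (not taken from the paper): classes that are
    active in fewer frames receive larger weights; weights average to 1."""
    active_ratio = y_true.mean(axis=0)        # fraction of active frames per class
    weights = 1.0 / (active_ratio + eps)
    return weights / weights.mean()


# Toy labels: 10 frames, 2 classes.
# Class 0 stands in for a long event ("fan"), class 1 for a short one ("mouse clicking").
y_true = np.zeros((10, 2))
y_true[:8, 0] = 1.0   # active in 8 of 10 frames
y_true[4, 1] = 1.0    # active in 1 of 10 frames
y_prob = np.full((10, 2), 0.5)  # uninformative predictions

weights = inverse_activity_weights(y_true)
loss = class_weighted_bce(y_true, y_prob, weights)
```

With these toy labels the short-duration class receives the larger weight, so errors on its frames contribute more to the loss than errors on the long-duration class.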

Cite

@article{arxiv.2006.15253,
  title  = {Sound Event Detection Using Duration Robust Loss Function},
  author = {Daichi Akiyama and Keisuke Imoto and Noriyuki Tonami and Yuki Okamoto and Ryosuke Yamanishi and Takahiro Fukumori and Yoichi Yamashita},
  journal= {arXiv preprint arXiv:2006.15253},
  year   = {2020}
}

Comments

Submitted to DCASE2020 Workshop
