Learning Multiple Sound Source 2D Localization

Audio and Speech Processing; Machine Learning; Sound (2020-12-11, v1)

Abstract

In this paper, we propose novel deep-learning-based algorithms for multiple sound source localization. Specifically, we aim to find the 2D Cartesian coordinates of multiple sound sources in an enclosed environment using multiple microphone arrays. To this end, we use an encoding-decoding architecture and propose two improvements to it. We also propose two novel localization representations that increase accuracy. Lastly, we develop new metrics based on resolution-based multiple source association, which enable us to evaluate and compare different localization approaches. We tested our method on both synthetic and real-world data. The results show that our method improves upon the previous baseline approach for this problem.

Cite

@article{arxiv.2012.05515,
  title  = {Learning Multiple Sound Source 2D Localization},
  author = {Guillaume Le Moing and Phongtharin Vinayavekhin and Tadanobu Inoue and Jayakorn Vongkulbhisal and Asim Munawar and Ryuki Tachibana and Don Joven Agravante},
  journal= {arXiv preprint arXiv:2012.05515},
  year   = {2020}
}

Comments

Published in: 2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP)