Related papers: Sound Event Detection: A Tutorial

Sound Event Detection Based on Curriculum Learning Considering Learning Difficulty of Events

In conventional sound event detection (SED) models, two types of events, namely, those that are present and those that do not occur in an acoustic scene, are regarded as the same type of events. The conventional SED methods cannot…

Sound · Computer Science 2021-02-11 Noriyuki Tonami, Keisuke Imoto, Yuki Okamoto, Takahiro Fukumori, Yoichi Yamashita

DiffSED: Sound Event Detection with Denoising Diffusion

Sound Event Detection (SED) aims to predict the temporal boundaries of all the events of interest and their class labels, given an unconstrained audio sample. Taking either the split-and-classify (i.e., frame-level) strategy or the more…

Sound · Computer Science 2023-08-21 Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu

Polyphonic Sound Event and Sound Activity Detection: A Multi-task approach

Polyphonic Sound Event Detection (SED) in real-world recordings is a challenging task because of the dynamic polyphony level, intensity, and duration of sound events. Current polyphonic SED systems fail to model the temporal structure of…

Audio and Speech Processing · Electrical Eng. & Systems 2019-08-02 Arjun Pankajakshan, Helen L. Bear, Emmanouil Benetos

Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection

Sound event detection (SED) has gained increasing attention with its wide application in surveillance, video indexing, etc. Existing models in SED mainly generate frame-level prediction, converting it into a sequence multi-label…

Sound · Computer Science 2021-11-15 Zhirong Ye, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi

Towards joint sound scene and polyphonic sound event recognition

Acoustic Scene Classification (ASC) and Sound Event Detection (SED) are two separate tasks in the field of computational sound scene analysis. In this work, we present a new dataset with both sound scene and sound event labels and use this…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-02 Helen L. Bear, Ines Nolasco, Emmanouil Benetos

Enhance Temporal Relations in Audio Captioning with Sound Event Detection

Automated audio captioning aims at generating natural language descriptions for given audio clips, not only detecting and classifying sounds, but also summarizing the relationships between audio events. Recent research advances in audio…

Sound · Computer Science 2024-07-19 Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu

Unified Audio Event Detection

Sound Event Detection (SED) detects regions of sound events, while Speaker Diarization (SD) segments speech conversations attributed to individual speakers. In SED, all speaker segments are classified as a single speech event, while in SD,…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-16 Yidi Jiang, Ruijie Tao, Wen Huang, Qian Chen, Wen Wang

Language Modelling for Sound Event Detection with Teacher Forcing and Scheduled Sampling

A sound event detection (SED) method typically takes as an input a sequence of audio frames and predicts the activities of sound events in each frame. In real-life recordings, the sound events exhibit some temporal structure: for instance,…

Sound · Computer Science 2019-11-07 Konstantinos Drossos, Shayan Gharib, Paul Magron, Tuomas Virtanen

Text-Queried Target Sound Event Localization

Sound event localization and detection (SELD) aims to determine the appearance of sound classes, together with their Direction of Arrival (DOA). However, current SELD systems can only predict the activities of specific classes, for example,…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-25 Jinzheng Zhao, Xinyuan Qian, Yong Xu, Haohe Liu, Yin Cao, Davide Berghi, Wenwu Wang

Evaluation of post-processing algorithms for polyphonic sound event detection

Sound event detection (SED) aims at identifying audio events (audio tagging task) in recordings and then locating them temporally (localization task). This last task ends with the segmentation of the frame-level class predictions, that…

Audio and Speech Processing · Electrical Eng. & Systems 2019-06-25 Leo Cances, Patrice Guyot, Thomas Pellegrini

Sound Event Detection Guided by Semantic Contexts of Scenes

Some studies have revealed that contexts of scenes (e.g., "home," "office," and "cooking") are advantageous for sound event detection (SED). Mobile devices and sensing technologies give useful information on scenes for SED without the use…

Sound · Computer Science 2022-02-18 Noriyuki Tonami, Keisuke Imoto, Ryotaro Nagase, Yuki Okamoto, Takahiro Fukumori, Yoichi Yamashita

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation

Most sound event detection (SED) systems perform well on clean datasets but degrade significantly in noisy environments. Language-queried audio source separation (LASS) models show promise for robust SED by separating target events;…

Sound · Computer Science 2025-08-12 Yuanjian Chen, Yang Xiao, Han Yin, Yadong Guan, Xubo Liu

Sound Event Detection of Weakly Labelled Data with CNN-Transformer and Automatic Threshold Optimization

Sound event detection (SED) is a task to detect sound events in an audio recording. One challenge of the SED task is that many datasets such as the Detection and Classification of Acoustic Scenes and Events (DCASE) datasets are weakly…

Sound · Computer Science 2020-08-25 Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley

A Multi-Task Learning Framework for Sound Event Detection using High-level Acoustic Characteristics of Sounds

Sound event detection (SED) entails identifying the type of sound and estimating its temporal boundaries from acoustic signals. These events are uniquely characterized by their spatio-temporal features, which are determined by the way they…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-19 Tanmay Khandelwal, Rohan Kumar Das

Sound Event Detection and Localization with Distance Estimation

Sound Event Detection and Localization (SELD) is a combined task of identifying sound events and their corresponding direction-of-arrival (DOA). While this task has numerous applications and has been extensively researched in recent years,…

Sound · Computer Science 2024-06-13 Daniel Aleksander Krause, Archontis Politis, Annamaria Mesaros

Towards Open World Sound Event Detection

Sound Event Detection (SED) plays a vital role in audio understanding, with applications in surveillance, smart cities, healthcare, and multimedia indexing. However, conventional SED systems operate under a closed-world assumption, limiting…

Sound · Computer Science 2026-05-22 P. H. Hai, L. T. Minh, L. H. Son

Polyphonic Sound Event Detection and Localization using a Two-Stage Strategy

Sound event detection (SED) and localization refer to recognizing sound events and estimating their spatial and temporal locations. Using neural networks has become the prevailing method for SED. In the area of sound localization, which is…

Sound · Computer Science 2019-11-06 Yin Cao, Qiuqiang Kong, Turab Iqbal, Fengyan An, Wenwu Wang, Mark D. Plumbley

Sound Event Detection by Multitask Learning of Sound Events and Scenes with Soft Scene Labels

Sound event detection (SED) and acoustic scene classification (ASC) are major tasks in environmental sound analysis. Considering that sound events and scenes are closely related to each other, some works have addressed joint analyses of…

Sound · Computer Science 2020-02-17 Keisuke Imoto, Noriyuki Tonami, Yuma Koizumi, Masahiro Yasuda, Ryosuke Yamanishi, Yoichi Yamashita

Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features

In this paper, we propose the use of spatial and harmonic features in combination with long short term memory (LSTM) recurrent neural network (RNN) for automatic sound event detection (SED) task. Real life sound recordings typically have…

Sound · Computer Science 2017-06-09 Sharath Adavanne, Giambattista Parascandolo, Pasi Pertilä, Toni Heittola, Tuomas Virtanen

An Approach for Self-Training Audio Event Detectors Using Web Data

Audio Event Detection (AED) aims to recognize sounds within audio and video recordings. AED employs machine learning algorithms commonly trained and tested on annotated datasets. However, available datasets are limited in number of samples…

Sound · Computer Science 2017-06-28 Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian Lane