Related papers: Sound Event Detection Using Duration Robust Loss F…

Impact of Sound Duration and Inactive Frames on Sound Event Detection Performance

In many methods of sound event detection (SED), a segmented time frame is regarded as one data sample to model training. The durations of sound events greatly depend on the sound event class, e.g., the sound event "fan" has a long duration,…

Sound · Computer Science 2021-02-04 Keisuke Imoto, Sakiko Mishima, Yumi Arai, Reishi Kondo

Sound Event Detection Based on Curriculum Learning Considering Learning Difficulty of Events

In conventional sound event detection (SED) models, two types of events, namely, those that are present and those that do not occur in an acoustic scene, are regarded as the same type of events. The conventional SED methods cannot…

Sound · Computer Science 2021-02-11 Noriyuki Tonami, Keisuke Imoto, Yuki Okamoto, Takahiro Fukumori, Yoichi Yamashita

Conditioned Time-Dilated Convolutions for Sound Event Detection

Sound event detection (SED) is the task of identifying sound events along with their onset and offset times. A recent, convolutional neural networks based SED method, proposed the usage of depthwise separable (DWS) and time-dilated…

Sound · Computer Science 2020-07-13 Konstantinos Drossos, Stylianos I. Mimilakis, Tuomas Virtanen

Language Modelling for Sound Event Detection with Teacher Forcing and Scheduled Sampling

A sound event detection (SED) method typically takes as an input a sequence of audio frames and predicts the activities of sound events in each frame. In real-life recordings, the sound events exhibit some temporal structure: for instance,…

Sound · Computer Science 2019-11-07 Konstantinos Drossos, Shayan Gharib, Paul Magron, Tuomas Virtanen

Active Learning for Sound Event Detection

This paper proposes an active learning system for sound event detection (SED). It aims at maximizing the accuracy of a learned SED model with limited annotation effort. The proposed system analyzes an initially unlabeled audio dataset, from…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-10 Shuyang Zhao, Toni Heittola, Tuomas Virtanen

Towards duration robust weakly supervised sound event detection

Sound event detection (SED) is the task of tagging the absence or presence of audio events and their corresponding interval within a given audio clip. While SED can be done using supervised machine learning, where training data is fully…

Sound · Computer Science 2021-02-08 Heinrich Dinkel, Mengyue Wu, Kai Yu

Polyphonic Sound Event and Sound Activity Detection: A Multi-task approach

Polyphonic Sound Event Detection (SED) in real-world recordings is a challenging task because of the dynamic polyphony level, intensity, and duration of sound events. Current polyphonic SED systems fail to model the temporal structure of…

Audio and Speech Processing · Electrical Eng. & Systems 2019-08-02 Arjun Pankajakshan, Helen L. Bear, Emmanouil Benetos

DiffSED: Sound Event Detection with Denoising Diffusion

Sound Event Detection (SED) aims to predict the temporal boundaries of all the events of interest and their class labels, given an unconstrained audio sample. Taking either the split-and-classify (i.e., frame-level) strategy or the more…

Sound · Computer Science 2023-08-21 Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation

Most sound event detection (SED) systems perform well on clean datasets but degrade significantly in noisy environments. Language-queried audio source separation (LASS) models show promise for robust SED by separating target events;…

Sound · Computer Science 2025-08-12 Yuanjian Chen, Yang Xiao, Han Yin, Yadong Guan, Xubo Liu

Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features

In this paper, we propose the use of spatial and harmonic features in combination with long short term memory (LSTM) recurrent neural network (RNN) for automatic sound event detection (SED) task. Real life sound recordings typically have…

Sound · Computer Science 2017-06-09 Sharath Adavanne, Giambattista Parascandolo, Pasi Pertilä, Toni Heittola, Tuomas Virtanen

Sound Event Detection with Depthwise Separable and Dilated Convolutions

State-of-the-art sound event detection (SED) methods usually employ a series of convolutional neural networks (CNNs) to extract useful features from the input audio signal, and then recurrent neural networks (RNNs) to model longer temporal…

Sound · Computer Science 2020-02-04 Konstantinos Drossos, Stylianos I. Mimilakis, Shayan Gharib, Yanxiong Li, Tuomas Virtanen

Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection

Sound Event Detection (SED) is challenging in noisy environments where overlapping sounds obscure target events. Language-queried audio source separation (LASS) aims to isolate the target sound events from a noisy clip. However, this…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-14 Han Yin, Yang Xiao, Jisheng Bai, Rohan Kumar Das

A Multi-Task Learning Framework for Sound Event Detection using High-level Acoustic Characteristics of Sounds

Sound event detection (SED) entails identifying the type of sound and estimating its temporal boundaries from acoustic signals. These events are uniquely characterized by their spatio-temporal features, which are determined by the way they…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-19 Tanmay Khandelwal, Rohan Kumar Das

Sound Event Detection with Boundary-Aware Optimization and Inference

Temporal detection problems appear in many fields including time-series estimation, activity recognition and sound event detection (SED). In this work, we propose a new approach to temporal event modeling by explicitly modeling event onsets…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-08 Florian Schmid, Chi Ian Tang, Sanjeel Parekh, Vamsi Krishna Ithapu, Juan Azcarreta Ortiz, Giacomo Ferroni, Yijun Qian, Arnoldas Jasonas, Cosmin Frateanu, Camilla Clark, Gerhard Widmer, Çağdaş Bilen

Sound event detection via dilated convolutional recurrent neural networks

Convolutional recurrent neural networks (CRNNs) have achieved state-of-the-art performance for sound event detection (SED). In this paper, we propose to use a dilated CRNN, namely a CRNN with a dilated convolutional kernel, as the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-21 Yanxiong Li, Mingle Liu, Konstantinos Drossos, Tuomas Virtanen

Sound Event Triage: Detecting Sound Events Considering Priority of Classes

We propose a new task for sound event detection (SED): sound event triage (SET). The goal of SET is to detect an arbitrary number of high-priority event classes while allowing misdetections of low-priority event classes where the priority…

Sound · Computer Science 2023-01-12 Noriyuki Tonami, Keisuke Imoto

Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection

Sound event detection (SED) has gained increasing attention with its wide application in surveillance, video indexing, etc. Existing models in SED mainly generate frame-level prediction, converting it into a sequence multi-label…

Sound · Computer Science 2021-11-15 Zhirong Ye, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi

Sound Event Detection and Time-Frequency Segmentation from Weakly Labelled Data

Sound event detection (SED) aims to detect when and recognize what sound events happen in an audio clip. Many supervised SED algorithms rely on strongly labelled data which contains the onset and offset annotations of sound events. However,…

Sound · Computer Science 2019-12-11 Qiuqiang Kong, Yong Xu, Iwona Sobieraj, Wenwu Wang, Mark D. Plumbley

Learning How to Listen: A Temporal-Frequential Attention Model for Sound Event Detection

In this paper, we propose a temporal-frequential attention model for sound event detection (SED). Our network learns how to listen with two attention models: a temporal attention model and a frequential attention model. Proposed system…

Sound · Computer Science 2025-05-06 Yu-Han Shen, Ke-Xin He, Wei-Qiang Zhang

Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks

In this paper, we propose a convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3D) space. The proposed network takes a sequence of…

Sound · Computer Science 2018-12-18 Sharath Adavanne, Archontis Politis, Joonas Nikunen, Tuomas Virtanen