Related papers: Polyphonic audio event detection: multi-label or m…
Polyphonic Sound Event Detection (SED) in real-world recordings is a challenging task because of the dynamic polyphony level, intensity, and duration of sound events. Current polyphonic SED systems fail to model the temporal structure of…
We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events. The framework leverages the power of convolutional recurrent neural network…
Current audio classification models have small class vocabularies relative to the large number of sound event classes of interest in the real world. Thus, they provide a limited view of the world that may miss important yet unexpected or…
This report proposes a polyphonic sound event detection (SED) method for the DCASE 2020 Challenge Task 4. The proposed SED method is based on semi-supervised learning to deal with the different combinations of training datasets such as…
Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, Deep Learning offers valuable techniques for this goal such as Convolutional Neural…
Polyphonic sound event localization and detection involves not only detecting which sound events are occurring but also localizing the corresponding sound sources. This series of tasks was first introduced in DCASE 2019 Task 3. In 2020, the sound event…
Polyphonic sound event localization and detection (SELD), which jointly performs sound event detection (SED) and direction-of-arrival (DoA) estimation, detects the type and occurrence time of sound events as well as their corresponding DoA…
Considering that acoustic scenes and sound events are closely related to each other, some previous papers have proposed a joint analysis of acoustic scenes and sound events using multitask learning (MTL)-based neural networks. In…
Sound event detection is an important facet of audio tagging that aims to identify sounds of interest and define both the sound category and time boundaries for each sound event in a continuous recording. With advances in deep neural…
The challenges of polyphonic sound event detection (PSED) stem from the detection of multiple overlapping events in a time series. Recent efforts exploit Deep Neural Networks (DNNs) on Time-Frequency Representations (TFRs) of audio clips as…
Polyphonic sound event detection (polyphonic SED) is an interesting but challenging task due to the concurrence of multiple sound events. Recently, SED methods based on convolutional neural networks (CNN) and recurrent neural networks (RNN)…
Sound event detection (SED) entails identifying the type of sound and estimating its temporal boundaries from acoustic signals. These events are uniquely characterized by their spatio-temporal features, which are determined by the way they…
Sound event detection (SED) and acoustic scene classification (ASC) are important research topics in environmental sound analysis. Many research groups have addressed SED and ASC using neural-network-based methods, such as the convolutional…
Sound event detection is a challenging task, especially for scenes with multiple simultaneous events. While event classification methods tend to be fairly accurate, event localization presents additional challenges, especially when large…
In this paper, we propose a stacked convolutional and recurrent neural network (CRNN) with a 3D convolutional neural network (CNN) in the first layer for the multichannel sound event detection (SED) task. The 3D CNN enables the network to…
One hour before sunrise, one can experience the dawn chorus, where birds from different species sing together. In this scenario, high levels of polyphony, meaning the number of overlapping sound sources, are likely to occur, resulting in a…
This paper considers a semi-supervised learning framework for weakly labeled polyphonic sound event detection problems for the DCASE 2019 Challenge Task 4 by combining tri-training and adversarial learning. The goal of Task 4…
This paper presents our work on training acoustic event detection (AED) models using an unlabeled dataset. Recent acoustic event detectors are based on large-scale neural networks, which are typically trained with huge amounts of labeled data.…
Sound event detection (SED) and acoustic scene classification (ASC) are major tasks in environmental sound analysis. Considering that sound events and scenes are closely related to each other, some works have addressed joint analyses of…
This paper proposes an effective modelling of sound event spectra with a hidden data-size-imbalance, for improved Acoustic Event Detection (AED). The proposed method models each event as an aggregated representation of a few latent factors,…