Related papers: Multi-label Open-set Audio Classification

200 papers

Polyphonic events are the main error source of audio event detection (AED) systems. In the deep-learning context, the most common approach to dealing with event overlaps is to treat the AED task as a multi-label classification problem. By doing…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-01 Huy Phan, Thi Ngoc Tho Nguyen, Philipp Koch, Alfred Mertins

Zero-shot learning models are capable of classifying new classes by transferring knowledge from the seen classes using auxiliary information. While most of the existing zero-shot learning methods focused on single-label classification…

Sound · Computer Science 2024-09-04 Duygu Dogan, Huang Xie, Toni Heittola, Tuomas Virtanen

Multi-label classification is a common supervised machine learning problem where each instance is associated with multiple classes. The key challenge in this problem is learning the correlations between the classes. An additional challenge…

Machine Learning · Computer Science 2016-04-05 Divya Padmanabhan, Satyanath Bhat, Shirish Shevade, Y. Narahari

In this paper, we propose a multi-level attention model to solve the weakly labelled audio classification problem. The objective of audio classification is to predict the presence or absence of audio events in an audio clip. Recently,…

Audio and Speech Processing · Electrical Eng. & Systems 2018-03-08 Changsong Yu, Karim Said Barsim, Qiuqiang Kong, Bin Yang

In this paper, we propose a multi-label classification framework to detect multiple speaking styles in a speech sample. Unlike previous studies that have primarily focused on identifying a single target style, our framework effectively…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-19 Miseul Kim, Seyun Um, Hyeonjin Cha, Hong-goo Kang

In real-world applications, as data availability increases, obtaining labeled data for machine learning (ML) projects remains challenging due to the high costs and intensive efforts required for data annotation. Many ML projects,…

Machine Learning · Computer Science 2024-12-24 Ismail Hakki Karaman, Gulser Koksal, Levent Eriskin, Salih Salihoglu

Audio-based music classification and tagging is typically based on categorical supervised learning with a fixed set of labels. This intrinsically cannot handle unseen labels such as newly added music genres or semantic words that users…

Machine Learning · Computer Science 2020-03-20 Jeong Choi, Jongpil Lee, Jiyoung Park, Juhan Nam

Acoustic event detection is essential for content analysis and description of multimedia recordings. The majority of current literature on the topic learns the detectors through fully-supervised techniques employing strongly labeled data.…

Sound · Computer Science 2016-07-07 Anurag Kumar, Bhiksha Raj

The study of label noise in sound event recognition has recently gained attention with the advent of larger and noisier datasets. This work addresses the problem of missing labels, one of the big weaknesses of large audio datasets, and one…

We propose a novel training scheme using self-label correction and data augmentation methods designed to deal with noisy labels and improve real-world accuracy on a polyphonic audio content detection task. The augmentation method reduces…

Audio and Speech Processing · Electrical Eng. & Systems 2024-07-23 Sebastian Braun, Hannes Gamper

We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events. The framework leverages the power of convolutional recurrent neural network…

Machine Learning · Computer Science 2019-02-20 Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos

In recent years, multi-label classification has attracted a significant body of research, motivated by real-life applications, such as text classification and medical diagnoses. Although sparsely studied in this context, Learning Classifier…

Neural and Evolutionary Computing · Computer Science 2015-12-29 Fani A. Tzima, Miltiadis Allamanis, Alexandros Filotheou, Pericles A. Mitkas

Acoustic scene classification systems using deep neural networks classify given recordings into pre-defined classes. In this study, we propose a novel scheme for acoustic scene classification which adopts an audio tagging system inspired by…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-21 Jee-weon Jung, Hye-jin Shim, Ju-ho Kim, Seung-bin Kim, Ha-Jin Yu

Multi-label classification (MLC) is a generalization of standard classification where multiple labels may be assigned to a given sample. In the real world, it is more common to deal with noisy datasets than clean datasets, given how modern…

Machine Learning · Computer Science 2021-02-18 Wenting Zhao, Carla Gomes

Accurately detecting dysfluencies in spoken language can help to improve the performance of automatic speech and language processing components and support the development of more inclusive speech and language technologies. Inspired by the…

Perceiving a scene most fully requires all the senses. Yet modeling how objects look and sound is challenging: most natural scenes and events contain multiple objects, and the audio track mixes all the sound sources together. We propose to…

Computer Vision and Pattern Recognition · Computer Science 2018-07-27 Ruohan Gao, Rogerio Feris, Kristen Grauman

Acoustic Scene Classification (ASC) and Sound Event Detection (SED) are two separate tasks in the field of computational sound scene analysis. In this work, we present a new dataset with both sound scene and sound event labels and use this…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-02 Helen L. Bear, Ines Nolasco, Emmanouil Benetos

We learn rich natural sound representations by capitalizing on large amounts of unlabeled sound data collected in the wild. We leverage the natural synchronization between vision and sound to learn an acoustic representation using…

Computer Vision and Pattern Recognition · Computer Science 2016-10-31 Yusuf Aytar, Carl Vondrick, Antonio Torralba

Machine listening systems often rely on fixed taxonomies to organize and label audio data, key for training and evaluating deep neural networks (DNNs) and other supervised algorithms. However, such taxonomies face significant constraints:…

Sound · Computer Science 2024-09-19 Paraskevas Stamatiadis, Michel Olvera, Slim Essid

Sound event detection is an important facet of audio tagging that aims to identify sounds of interest and define both the sound category and time boundaries for each sound event in a continuous recording. With advances in deep neural…

Sound · Computer Science 2024-12-31 Sangwook Park, David K. Han, Mounya Elhilali