Related papers: Unsupervised Feature Learning for Audio Analysis

200 papers

Real-world sound scenes consist of time-varying collections of sound sources, each generating characteristic sound events that are mixed together in audio recordings. The association of these constituent sound events with their mixture and…

Machine hearing or listening represents an emerging area. Conventional approaches rely on handcrafted features that are specialized to a specific audio task and can hardly be generalized to other audio fields. For example,…

Computer Vision and Pattern Recognition · Computer Science 2018-12-13 Imad Rida, Romain Hérault, Gilles Gasso

Unsupervised learning methods for feature extraction are becoming more and more popular. We combine the popular contrastive learning method (prototypical contrastive learning) and the classic representation learning method (autoencoder) to…

Computer Vision and Pattern Recognition · Computer Science 2022-05-11 Zeyu Cao, Xiaorun Li, Liaoying Zhao

Recent progress in network-based audio event classification has shown the benefit of pre-training models on visual data such as ImageNet. While this process allows knowledge transfer across different domains, training a model on large-scale…

Sound · Computer Science 2021-05-21 Sascha Hornauer, Ke Li, Stella X. Yu, Shabnam Ghaffarzadegan, Liu Ren

Sound event detection systems typically consist of two stages: extracting hand-crafted features from the raw audio waveform, and learning a mapping between these features and the target sound events using a classifier. Recently, the focus…

Sound · Computer Science 2018-05-11 Emre Çakır, Tuomas Virtanen

In zero-resource settings where transcribed speech audio is unavailable, unsupervised feature learning is essential for downstream speech processing tasks. Here we compare two recent methods for frame-level acoustic feature learning. For…

Computation and Language · Computer Science 2020-03-31 Petri-Johan Last, Herman A. Engelbrecht, Herman Kamper

High-dimensional data in many areas, such as computer vision and machine learning tasks, brings computational and analytical difficulty. Feature selection, which selects a subset of the observed features, is a widely used approach for…

Machine Learning · Computer Science 2018-04-10 Kai Han, Yunhe Wang, Chao Zhang, Chao Li, Chao Xu

Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on…

Unsupervised spoken term discovery consists of two tasks: finding the acoustic segment boundaries and labeling acoustically similar segments with the same labels. We perform segmentation based on the assumption that the frame feature…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-28 Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Najim Dehak

Modern day audio signal classification techniques lack the ability to classify low feature audio signals in the form of spectrographic temporal frequency data representations. Additionally, currently utilized techniques rely on full diverse…

Sound · Computer Science 2024-10-30 Noel Elias

Audio captioning is an important research area that aims to generate meaningful descriptions for audio clips. Most of the existing research extracts acoustic features of audio clips as input to encoder-decoder and transformer architectures…

Sound · Computer Science 2022-04-20 Ayşegül Özkaya Eren, Mustafa Sert

We introduce a framework for unsupervised learning of structured predictors with overlapping, global features. Each input's latent representation is predicted conditional on the observable data using a feature-rich conditional random field.…

Machine Learning · Computer Science 2014-11-11 Waleed Ammar, Chris Dyer, Noah A. Smith

We focus on automatic feature extraction for raw audio heartbeat sounds, aimed at anomaly detection applications in healthcare. We learn features with the help of an autoencoder composed of a 1D non-causal convolutional encoder and a…

Sound · Computer Science 2021-02-25 Robert-George Colt, Csongor-Huba Várady, Riccardo Volpi, Luigi Malagò

Sound event detection is a core module for acoustic environmental analysis. Semi-supervised learning allows the dataset to be scaled up substantially without increasing the annotation budget, and has recently attracted lots of research…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-02 Xiaofei Li

The effectiveness of learning in massive open online courses (MOOCs) can be significantly enhanced by introducing personalized intervention schemes which rely on building predictive models of student learning behaviors such as some…

Machine Learning · Computer Science 2018-12-20 Mucong Ding, Kai Yang, Dit-Yan Yeung, Ting-Chuen Pong

In this work, a novel deep neural network, designed to enhance the efficiency and effectiveness of unsupervised sound anomaly detection, is presented. The proposed model exploits an attention module and separable convolutions to identify…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-14 Michael Neri, Marco Carli

Environmental audio tagging aims to predict only the presence or absence of certain acoustic events in the acoustic scene of interest. In this paper we make contributions to audio tagging in two parts: acoustic modeling and…

This paper proposes to use low-level spatial features extracted from multichannel audio for sound event detection. We extend the convolutional recurrent neural network to handle more than one type of these multichannel features by learning…

Sound · Computer Science 2017-06-09 Sharath Adavanne, Pasi Pertilä, Tuomas Virtanen

Feature selection methods play an important role in the readability of data and in reducing the complexity of learning algorithms. In recent years, a variety of efforts have investigated feature selection problems based on unsupervised…

Machine Learning · Computer Science 2019-12-12 Mohsen Ghassemi Parsa, Hadi Zare, Mehdi Ghatee

We investigate unsupervised learning of correspondences between sound events and textual phrases through aligning audio clips with textual captions describing the content of a whole audio clip. We align originally unaligned and unannotated…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-22 Huang Xie, Okko Räsänen, Konstantinos Drossos, Tuomas Virtanen