Related papers: Vector-Quantized Autoregressive Predictive Coding

Generative Pre-Training for Speech with Autoregressive Predictive Coding

Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging. In this paper we propose to use autoregressive predictive coding (APC), a recently proposed…

Audio and Speech Processing · Electrical Eng. & Systems 2020-01-28 Yu-An Chung, James Glass

Aligned Contrastive Predictive Coding

We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations. Rather than producing individual predictions for each of the future…

Machine Learning · Computer Science 2024-09-13 Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies

Self-supervised speech representations have been shown to be effective in a variety of speech applications. However, existing representation learning methods generally rely on the autoregressive model and/or observed global dependencies…

Computation and Language · Computer Science 2020-11-03 Alexander H. Liu, Yu-An Chung, James Glass

Variable-rate hierarchical CPC leads to acoustic unit discovery in speech

The success of deep learning comes from its ability to capture the hierarchical structure of data by learning high-level representations defined in terms of low-level ones. In this paper we explore self-supervised learning of hierarchical…

Sound · Computer Science 2022-12-06 Santiago Cuervo, Adrian Łańcucki, Ricard Marxer, Paweł Rychlikowski, Jan Chorowski

Self-supervised Representation Learning with Relative Predictive Coding

This paper introduces Relative Predictive Coding (RPC), a new contrastive representation learning objective that maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance. The key to the…

Machine Learning · Computer Science 2021-04-14 Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Han Zhao, Louis-Philippe Morency, Ruslan Salakhutdinov

Analysis of Predictive Coding Models for Phonemic Representation Learning in Small Datasets

Neural network models using predictive coding are interesting from the viewpoint of computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any…

Computation and Language · Computer Science 2020-07-09 María Andrea Cruz Blandón, Okko Räsänen

Improved Speech Representations with Multi-Target Autoregressive Predictive Coding

Training objectives based on predictive coding have recently been shown to be very effective at learning meaningful representations from unlabeled speech. One example is Autoregressive Predictive Coding (Chung et al., 2019), which trains an…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-14 Yu-An Chung, James Glass

Contrastive Separative Coding for Self-supervised Representation Learning

To extract robust deep representations from long sequential modeling of speech data, we propose a self-supervised learning approach, namely Contrastive Separative Coding (CSC). Our key finding is to learn such representations by separating…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-02 Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu

Meta-Representational Predictive Coding: Biomimetic Self-Supervised Learning

Self-supervised learning has become an increasingly important paradigm in the domain of machine intelligence. Furthermore, evidence for self-supervised adaptation, such as contrastive formulations, has emerged in recent computational…

Neural and Evolutionary Computing · Computer Science 2025-03-31 Alexander Ororbia, Karl Friston, Rajesh P. N. Rao

Wav2vec-C: A Self-supervised Model for Speech Representation Learning

Wav2vec-C introduces a novel representation learning technique combining elements from wav2vec 2.0 and VQ-VAE. Our model learns to reproduce quantized representations from partially masked speech encoding using a contrastive loss in a way…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-25 Samik Sadhu, Di He, Che-Wei Huang, Sri Harish Mallidi, Minhua Wu, Ariya Rastrow, Andreas Stolcke, Jasha Droppo, Roland Maas

Representation Learning for Sequence Data with Deep Autoencoding Predictive Components

We propose Deep Autoencoding Predictive Components (DAPC) -- a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the…

Machine Learning · Computer Science 2021-03-02 Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong

Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge

In this paper, we explore vector quantization for acoustic unit discovery. Leveraging unlabelled data, we aim to learn discrete representations of speech that separate phonetic content from speaker-specific details. We propose two neural…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-20 Benjamin van Niekerk, Leanne Nortje, Herman Kamper

Regularizing Contrastive Predictive Coding for Speech Applications

Self-supervised methods such as Contrastive Predictive Coding (CPC) have greatly improved the quality of the unsupervised representations. These representations significantly reduce the amount of labeled data needed for downstream task…

Audio and Speech Processing · Electrical Eng. & Systems 2023-04-27 Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velazquez, Najim Dehak

Learning Product Codebooks using Vector Quantized Autoencoders for Image Retrieval

Vector-Quantized Variational Autoencoders (VQ-VAE)[1] provide an unsupervised model for learning discrete representations by combining vector quantization and autoencoders. In this paper, we study the use of VQ-VAE for representation…

Image and Video Processing · Electrical Eng. & Systems 2019-03-05 Hanwei Wu, Markus Flierl

Neural Discrete Representation Learning

Learning useful representations without supervision remains a key challenge in machine learning. In this paper, we propose a simple yet powerful generative model that learns such discrete representations. Our model, the Vector…

Machine Learning · Computer Science 2018-05-31 Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu

Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification

In this paper, we propose a novel way of addressing text-dependent automatic speaker verification (TD-ASV) by using a shared-encoder with task-specific decoders. An autoregressive predictive coding (APC) encoder is pre-trained in an…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-11 Vijay Ravi, Ruchao Fan, Amber Afshan, Huanhua Lu, Abeer Alwan

An Unsupervised Autoregressive Model for Speech Representation Learning

This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is…

Computation and Language · Computer Science 2019-06-20 Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass

Guided contrastive self-supervised pre-training for automatic speech recognition

Contrastive Predictive Coding (CPC) is a representation learning method that maximizes the mutual information between intermediate latent representations and the output of a given model. It can be used to effectively initialize the encoder…

Computation and Language · Computer Science 2023-02-06 Aparna Khare, Minhua Wu, Saurabhchand Bhati, Jasha Droppo, Roland Maas

Unsupervised speech representation learning using WaveNet autoencoders

We consider the task of unsupervised extraction of meaningful latent representations of speech by applying autoencoding neural networks to speech waveforms. The goal is to learn a representation able to capture high level semantic content…

Machine Learning · Computer Science 2019-09-12 Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord

Compressed Predictive Information Coding

Unsupervised learning plays an important role in many fields, such as artificial intelligence, machine learning, and neuroscience. Compared to static data, methods for extracting low-dimensional structure for dynamic data are lagging. We…

Machine Learning · Computer Science 2022-03-07 Rui Meng, Tianyi Luo, Kristofer Bouchard