Related papers: Regularizing Contrastive Predictive Coding for Spe…

Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing

Contrastive predictive coding (CPC) aims to learn representations of speech by distinguishing future observations from a set of negative examples. Previous work has shown that linear classifiers trained on CPC features can accurately…

Audio and Speech Processing · Electrical Eng. & Systems 2021-08-03 Benjamin van Niekerk, Leanne Nortje, Matthew Baas, Herman Kamper

Aligned Contrastive Predictive Coding

We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations. Rather than producing individual predictions for each of the future…

Machine Learning · Computer Science 2024-09-13 Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski

Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding

Typically, unsupervised segmentation of speech into the phone and word-like units are treated as separate tasks and are often done via different methods which do not fully leverage the inter-dependence of the two tasks. Here, we unify them…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-12 Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velazquez, Najim Dehak

Contrastive Predictive Coding Based Feature for Automatic Speaker Verification

This thesis describes our ongoing work on Contrastive Predictive Coding (CPC) features for speaker verification. CPC is a recently proposed representation learning framework based on predictive coding and noise contrastive estimation. We…

Computation and Language · Computer Science 2019-04-04 Cheng-I Lai

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

Contrastive Predictive Coding (CPC), based on predicting future segments of speech based on past segments, is emerging as a powerful algorithm for representation learning of speech signal. However, it still under-performs other methods on…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-03 Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

Investigating Enhancements to Contrastive Predictive Coding for Human Activity Recognition

The dichotomy between the challenging nature of obtaining annotations for activities, and the more straightforward nature of data collection from wearables, has resulted in significant interest in the development of techniques that utilize…

Machine Learning · Computer Science 2022-11-14 Harish Haresamudram, Irfan Essa, Thomas Ploetz

Guided contrastive self-supervised pre-training for automatic speech recognition

Contrastive Predictive Coding (CPC) is a representation learning method that maximizes the mutual information between intermediate latent representations and the output of a given model. It can be used to effectively initialize the encoder…

Computation and Language · Computer Science 2023-02-06 Aparna Khare, Minhua Wu, Saurabhchand Bhati, Jasha Droppo, Roland Maas

Analysis of Predictive Coding Models for Phonemic Representation Learning in Small Datasets

Neural network models using predictive coding are interesting from the viewpoint of computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any…

Computation and Language · Computer Science 2020-07-09 María Andrea Cruz Blandón, Okko Räsänen

Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning

Contrastive learning enables learning useful audio and speech representations without ground-truth labels by maximizing the similarity between latent representations of similar signal segments. In this framework various data augmentation…

Audio and Speech Processing · Electrical Eng. & Systems 2022-04-11 Salah Zaiem, Titouan Parcollet, Slim Essid

Contrastive Regularization for Semi-Supervised Learning

Consistency regularization on label predictions becomes a fundamental technique in semi-supervised learning, but it still requires a large number of training iterations for high performance. In this study, we analyze that the consistency…

Machine Learning · Computer Science 2022-06-10 Doyup Lee, Sungwoong Kim, Ildoo Kim, Yeongjae Cheon, Minsu Cho, Wook-Shin Han

Structured Probabilistic Coding

This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC), to learn compact and informative representations from input related to the target task. SPC is an encoder-only…

Computation and Language · Computer Science 2024-05-03 Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, Songlin Hu

Improving Audio Event Recognition with Consistency Regularization

Consistency regularization (CR), which enforces agreement between model predictions on augmented views, has found recent benefits in automatic speech recognition [1]. In this paper, we propose the use of consistency regularization for audio…

Sound · Computer Science 2025-09-15 Shanmuka Sadhu, Weiran Wang

Consistency Regularization for Cross-Lingual Fine-Tuning

Fine-tuning pre-trained cross-lingual language models can transfer task-specific supervision from one language to the others. In this work, we propose to improve cross-lingual fine-tuning with consistency regularization. Specifically, we…

Computation and Language · Computer Science 2021-06-16 Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies

Self-supervised speech representations have been shown to be effective in a variety of speech applications. However, existing representation learning methods generally rely on the autoregressive model and/or observed global dependencies…

Computation and Language · Computer Science 2020-11-03 Alexander H. Liu, Yu-An Chung, James Glass

Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words

We investigate the performance on phoneme categorization and phoneme and word segmentation of several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC). Our experiments show that with the existing…

Machine Learning · Computer Science 2024-09-13 Santiago Cuervo, Maciej Grabias, Jan Chorowski, Grzegorz Ciesielski, Adrian Łańcucki, Paweł Rychlikowski, Ricard Marxer

Vector-Quantized Autoregressive Predictive Coding

Autoregressive Predictive Coding (APC), as a self-supervised objective, has enjoyed success in learning representations from large amounts of unlabeled data, and the learned representations are rich for many downstream tasks. However, the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-19 Yu-An Chung, Hao Tang, James Glass

Cross-lingual Spoken Language Understanding with Regularized Representation Alignment

Despite the promising results of current cross-lingual models for spoken language understanding systems, they still suffer from imperfect cross-lingual representation alignments between the source and target languages, which makes the…

Computation and Language · Computer Science 2020-10-01 Zihan Liu, Genta Indra Winata, Peng Xu, Zhaojiang Lin, Pascale Fung

Joint Masked CPC and CTC Training for ASR

Self-supervised learning (SSL) has shown promise in learning representations of audio that are useful for automatic speech recognition (ASR). But, training SSL models like wav2vec 2.0 requires a two-stage pipeline. In this paper we…

Computation and Language · Computer Science 2021-02-16 Chaitanya Talnikar, Tatiana Likhomanenko, Ronan Collobert, Gabriel Synnaeve

Variable-rate hierarchical CPC leads to acoustic unit discovery in speech

The success of deep learning comes from its ability to capture the hierarchical structure of data by learning high-level representations defined in terms of low-level ones. In this paper we explore self-supervised learning of hierarchical…

Sound · Computer Science 2022-12-06 Santiago Cuervo, Adrian Łańcucki, Ricard Marxer, Paweł Rychlikowski, Jan Chorowski

Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization

In this study, we proposed a novel semi-supervised training method that uses unlabeled data with a class distribution that is completely different from the target data or data without a target label. To this end, we introduce a contrastive…

Sound · Computer Science 2021-09-30 Donmoon Lee, Kyogu Lee