Related papers: Generative Pre-Training for Speech with Autoregres…

200 papers

Self-supervised speech representations have been shown to be effective in a variety of speech applications. However, existing representation learning methods generally rely on the autoregressive model and/or observed global dependencies…

Computation and Language · Computer Science 2020-11-03 Alexander H. Liu, Yu-An Chung, James Glass

Training objectives based on predictive coding have recently been shown to be very effective at learning meaningful representations from unlabeled speech. One example is Autoregressive Predictive Coding (Chung et al., 2019), which trains an…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-14 Yu-An Chung, James Glass

Autoregressive Predictive Coding (APC), as a self-supervised objective, has enjoyed success in learning representations from large amounts of unlabeled data, and the learned representations are rich for many downstream tasks. However, the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-19 Yu-An Chung, Hao Tang, James Glass

Neural network models using predictive coding are interesting from the viewpoint of computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any…

Computation and Language · Computer Science 2020-07-09 María Andrea Cruz Blandón, Okko Räsänen

Building a good speech recognition system usually requires large amounts of transcribed data, which is expensive to collect. To tackle this problem, many unsupervised pre-training methods have been proposed. Among these methods, Masked…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-24 Dongwei Jiang, Wubo Li, Ruixiong Zhang, Miao Cao, Ne Luo, Yang Han, Wei Zou, Xiangang Li

We present a bidirectional unsupervised model pre-training (UPT) method and apply it to children's automatic speech recognition (ASR). An obstacle to improving child ASR is the scarcity of child speech databases. A common approach to…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-16 Ruchao Fan, Amber Afshan, Abeer Alwan

Contrastive Predictive Coding (CPC) is a representation learning method that maximizes the mutual information between intermediate latent representations and the output of a given model. It can be used to effectively initialize the encoder…

Computation and Language · Computer Science 2023-02-06 Aparna Khare, Minhua Wu, Saurabhchand Bhati, Jasha Droppo, Roland Maas

This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is…

Computation and Language · Computer Science 2019-06-20 Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass

In this paper, we propose a novel way of addressing text-dependent automatic speaker verification (TD-ASV) by using a shared-encoder with task-specific decoders. An autoregressive predictive coding (APC) encoder is pre-trained in an…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-11 Vijay Ravi, Ruchao Fan, Amber Afshan, Huanhua Lu, Abeer Alwan

In this paper, we propose the use of self-supervised pretraining on a large unlabelled data set to improve the performance of a personalized voice activity detection (VAD) model in adverse conditions. We pretrain a long short-term memory…

Sound · Computer Science 2024-01-24 Holger Severin Bovbjerg, Jesper Jensen, Jan Østergaard, Zheng-Hua Tan

A large amount of recent research has the far-reaching goal of finding training methods for deep neural networks that can serve as alternatives to backpropagation (BP). A prominent example is predictive coding (PC), which is a…

Machine Learning · Computer Science 2022-11-08 Luca Pinchetti, Tommaso Salvatori, Yordan Yordanov, Beren Millidge, Yuhang Song, Thomas Lukasiewicz

We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations. Rather than producing individual predictions for each of the future…

Self-supervised learning has become an increasingly important paradigm in the domain of machine intelligence. Furthermore, evidence for self-supervised adaptation, such as contrastive formulations, has emerged in recent computational…

Neural and Evolutionary Computing · Computer Science 2025-03-31 Alexander Ororbia, Karl Friston, Rajesh P. N. Rao

This paper introduces Relative Predictive Coding (RPC), a new contrastive representation learning objective that maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance. The key to the…

Machine Learning · Computer Science 2021-04-14 Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Han Zhao, Louis-Philippe Morency, Ruslan Salakhutdinov

Cross-lingual and multi-lingual training of Automatic Speech Recognition (ASR) has been extensively investigated in the supervised setting. This assumes the existence of a parallel corpus of speech and orthographic transcriptions. Recently,…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-10 Morgane Rivière, Armand Joulin, Pierre-Emmanuel Mazaré, Emmanuel Dupoux

While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose…

Machine Learning · Computer Science 2019-01-23 Aaron van den Oord, Yazhe Li, Oriol Vinyals

The dichotomy between the challenging nature of obtaining annotations for activities, and the more straightforward nature of data collection from wearables, has resulted in significant interest in the development of techniques that utilize…

Machine Learning · Computer Science 2022-11-14 Harish Haresamudram, Irfan Essa, Thomas Ploetz

Despite being the best known objective for learning speech representations, the HuBERT objective has not been further developed and improved. We argue that it is the lack of an underlying principle that stalls the development, and, in this…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-05 Sung-Lin Yeh, Peter Bell, Hao Tang

To extract robust deep representations from long sequential modeling of speech data, we propose a self-supervised learning approach, namely Contrastive Separative Coding (CSC). Our key finding is to learn such representations by separating…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-02 Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu

Learning speaker-specific features is vital in many applications like speaker recognition, diarization and speech recognition. This paper provides a novel approach, we term Neural Predictive Coding (NPC), to learn speaker-specific…

Sound · Computer Science 2019-07-18 Arindam Jati, Panayiotis Georgiou