Related papers: Improved Speech Representations with Multi-Target …

200 papers

Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging. In this paper we propose to use autoregressive predictive coding (APC), a recently proposed…

Audio and Speech Processing · Electrical Eng. & Systems · 2020-01-28 · Yu-An Chung, James Glass

This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is…

Computation and Language · Computer Science · 2019-06-20 · Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass

While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose…

Machine Learning · Computer Science · 2019-01-23 · Aaron van den Oord, Yazhe Li, Oriol Vinyals

We introduce here a predictive coding based model that aims to generate accurate and sharp future frames. Inspired by the predictive coding hypothesis and related works, the total model is updated through a combination of bottom-up and…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-12 · Chaofan Ling, Weihua Li, Junpei Zhong

Despite being the best known objective for learning speech representations, the HuBERT objective has not been further developed and improved. We argue that it is the lack of an underlying principle that stalls the development, and, in this…

Audio and Speech Processing · Electrical Eng. & Systems · 2026-01-05 · Sung-Lin Yeh, Peter Bell, Hao Tang

While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning - leveraging unlabeled examples to learn about the structure of a domain - remains a difficult…

Machine Learning · Computer Science · 2017-03-02 · William Lotter, Gabriel Kreiman, David Cox

Pre-training text representations has recently been shown to significantly improve the state-of-the-art in many natural language processing tasks. The central goal of pre-training is to learn text representations that are useful for…

Computation and Language · Computer Science · 2020-04-14 · Shangwen Lv, Yuechen Wang, Daya Guo, Duyu Tang, Nan Duan, Fuqing Zhu, Ming Gong, Linjun Shou, Ryan Ma, Daxin Jiang, Guihong Cao, Ming Zhou, Songlin Hu

Self-supervised speech representations have been shown to be effective in a variety of speech applications. However, existing representation learning methods generally rely on the autoregressive model and/or observed global dependencies…

Computation and Language · Computer Science · 2020-11-03 · Alexander H. Liu, Yu-An Chung, James Glass

Self-supervised learning of image representations by predicting future frames is a promising direction but still remains a challenge. This is because of the under-determined nature of frame prediction; multiple potential futures can arise…

Computer Vision and Pattern Recognition · Computer Science · 2024-08-12 · Huiwon Jang, Dongyoung Kim, Junsu Kim, Jinwoo Shin, Pieter Abbeel, Younggyo Seo

Deep representation learning is a subfield of machine learning that focuses on learning meaningful and useful representations of data through deep neural networks. However, existing methods for semantic classification typically employ…

Computer Vision and Pattern Recognition · Computer Science · 2024-08-30 · Kangjun Liu, Ke Chen, Kui Jia, Yaowei Wang

Neural network models using predictive coding are interesting from the viewpoint of computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any…

Computation and Language · Computer Science · 2020-07-09 · María Andrea Cruz Blandón, Okko Räsänen

Autoregressive Predictive Coding (APC), as a self-supervised objective, has enjoyed success in learning representations from large amounts of unlabeled data, and the learned representations are rich for many downstream tasks. However, the…

Audio and Speech Processing · Electrical Eng. & Systems · 2020-05-19 · Yu-An Chung, Hao Tang, James Glass

This work presents a novel objective function for the unsupervised training of neural network sentence encoders. It exploits signals from paragraph-level discourse coherence to train these models to understand text. Our objective is purely…

Computation and Language · Computer Science · 2017-05-02 · Yacine Jernite, Samuel R. Bowman, David Sontag

Unsupervised pretraining has been widely used to aid human action recognition. However, existing methods focus on reconstructing frames that are already present rather than generating future frames. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science · 2017-12-13 · Yu Runsheng, Shi Zhenyu, Ma Qiongxiong, Qing Laiyun

The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind…

Machine Learning · Computer Science · 2014-04-24 · Yoshua Bengio, Aaron Courville, Pascal Vincent

Autoregressive language models, pretrained using large text corpora to do well on next word prediction, have been successful at solving many downstream tasks, even with zero-shot usage. However, there is little theoretical understanding of…

Computation and Language · Computer Science · 2021-04-15 · Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora

Learning a compact representation of history is critical for planning and generalization in partially observable environments. While meta-reinforcement learning (RL) agents can attain near Bayes-optimal policies, they often fail to learn…

Artificial Intelligence · Computer Science · 2025-10-28 · Po-Chen Kuo, Han Hou, Will Dabney, Edgar Y. Walker

Neural models have become ubiquitous in automatic speech recognition systems. While neural networks are typically used as acoustic models in more complex systems, recent studies have explored end-to-end speech recognition systems based on…

Computation and Language · Computer Science · 2017-09-15 · Yonatan Belinkov, James Glass

Recent advances in pretraining general foundation models have significantly improved performance across diverse downstream tasks. While autoregressive (AR) generative models like GPT have revolutionized NLP, most visual generative…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-25 · Jinghan Li, Yang Jin, Hao Jiang, Yadong Mu, Yang Song, Kun Xu

Existing privacy-preserving speech representation learning methods target a single application domain. In this paper, we present a novel framework to anonymize utterance-level speech embeddings generated by pre-trained encoders and show its…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-10-27 · Minh Tran, Mohammad Soleymani