Related papers: Variable-rate discrete representation learning

200 papers

An ability to model a generative process and learn a latent representation for speech in an unsupervised fashion will be crucial to process vast quantities of unlabelled speech data. Recently, deep probabilistic generative models such as…

Computation and Language · Computer Science 2017-09-25 Wei-Ning Hsu, Yu Zhang, James Glass

We consider the task of unsupervised extraction of meaningful latent representations of speech by applying autoencoding neural networks to speech waveforms. The goal is to learn a representation able to capture high level semantic content…

Machine Learning · Computer Science 2019-09-12 Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord

Learning the latent representation of data in unsupervised fashion is a very interesting process that provides relevant features for enhancing the performance of a classifier. For speech emotion recognition tasks, generating effective…

Sound · Computer Science 2020-07-29 Siddique Latif, Rajib Rana, Junaid Qadir, Julien Epps

Syntactic information contains structures and rules about how text sentences are arranged. Incorporating syntax into text modeling methods can potentially benefit both representation learning and generation. Variational autoencoders (VAEs)…

Computation and Language · Computer Science 2019-08-28 Yijun Xiao, William Yang Wang

State-of-the-art Variational Auto-Encoders (VAEs) for learning disentangled latent representations give impressive results in discovering features like pitch, pause duration, and accent in speech data, leading to highly controllable…

Sound · Computer Science 2021-05-11 Shakti Kumar, Jithin Pradeep, Hussain Zaidi

Variational autoencoders (VAEs) are essential tools in end-to-end representation learning. However, the sequential text generation common pitfall with VAEs is that the model tends to ignore latent variables with a strong auto-regressive…

Machine Learning · Computer Science 2021-02-26 Yang Zhao, Ping Yu, Suchismit Mahapatra, Qinliang Su, Changyou Chen

In this thesis, we develop methods to enhance the interpretability of recent representation learning techniques in natural language processing (NLP) while accounting for the unavailability of annotated data. We choose to leverage…

Computation and Language · Computer Science 2023-05-05 Ghazi Felhi

Recent advancements in learning Discrete Representations as opposed to continuous ones have led to state of art results in tasks that involve Language, Audio and Vision. Some latent factors such as words, phonemes and shapes are better…

Machine Learning · Computer Science 2020-04-14 Iordanis Fostiropoulos

Advancement in speech technology has brought convenience to our life. However, the concern is on the rise as speech signal contains multiple personal attributes, which would lead to either sensitive information leakage or bias toward…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-09 Yu-Lin Huang, Bo-Hao Su, Y.-W. Peter Hong, Chi-Chun Lee

In this work we present an unsupervised approach to summarize sentences in abstractive way using Variational Autoencoder (VAE). VAE are known to learn a semantically rich latent variable, representing high dimensional input. VAEs are…

Computation and Language · Computer Science 2018-09-24 Raphael Schumann

The Variational Autoencoder (VAE) is a powerful deep generative model that is now extensively used to represent high-dimensional complex data via a low-dimensional latent space learned in an unsupervised manner. In the original VAE model,…

Sound · Computer Science 2021-06-15 Xiaoyu Bie, Laurent Girin, Simon Leglaive, Thomas Hueber, Xavier Alameda-Pineda

Neural conversation models such as encoder-decoder models are easy to generate bland and generic responses. Some researchers propose to use the conditional variational autoencoder(CVAE) which maximizes the lower bound on the conditional…

Computation and Language · Computer Science 2019-11-25 Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Guodong Zhou, Shuming Shi

Variational Autoencoders (VAEs) are well-established as a principled approach to probabilistic unsupervised learning with neural networks. Typically, an encoder network defines the parameters of a Gaussian distributed latent space from…

Machine Learning · Computer Science 2025-05-16 Alan Jeffares, Liyuan Liu

Variational autoencoders (VAEs) have been used extensively to discover low-dimensional latent factors governing neural activity and animal behavior. However, without careful model selection, the uncovered latent factors may reflect noise in…

Machine Learning · Computer Science 2023-12-13 Julia Huiming Wang, Dexter Tsin, Tatiana Engel

In recent years, speech emotion recognition (SER) has been used in wide ranging applications, from healthcare to the commercial sector. In addition to signal processing approaches, methods for SER now also use deep learning techniques which…

Audio and Speech Processing · Electrical Eng. & Systems 2022-03-29 Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen

While sparse autoencoders (SAEs) successfully extract interpretable features from language models, applying them to audio generation faces unique challenges: audio's dense nature requires compression that obscures semantic meaning, and…

Machine Learning · Computer Science 2025-10-31 Nathan Paek, Yongyi Zang, Qihui Yang, Randal Leistikow

Variational autoencoders (VAEs) are powerful deep generative models widely used to represent high-dimensional complex data through a low-dimensional latent space learned in an unsupervised manner. In the original VAE model, the input data…

Machine Learning · Computer Science 2022-07-05 Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda

Neural latent variable models enable the discovery of interesting structure in speech audio data. This paper presents a comparison of two different approaches which are broadly based on predicting future time-steps or auto-encoding the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-28 Henry Zhou, Alexei Baevski, Michael Auli

While several self-supervised approaches for learning discrete speech representation have been proposed, it is unclear how these seemingly similar approaches relate to each other. In this paper, we consider a generative model with discrete…

Computation and Language · Computer Science 2022-11-01 Sung-Lin Yeh, Hao Tang

Speech signals, typically sampled at rates in the tens of thousands per second, contain redundancies, evoking inefficiencies in sequence modeling. High-dimensional speech features such as spectrograms are often used as the input for the…
