Related papers: Feedback Recurrent AutoEncoder

Recurrent autoencoder with sequence-aware encoding

Recurrent Neural Networks (RNN) received a vast amount of attention last decade. Recently, the architectures of Recurrent AutoEncoders (RAE) found many applications in practice. RAE can extract the semantically valuable information, called…

Machine Learning · Computer Science 2021-06-14 Robert Susik

Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition

For many Automatic Speech Recognition (ASR) tasks audio features as spectrograms show better results than Mel-frequency Cepstral Coefficients (MFCC), but in practice they are hard to use due to a complex dimensionality of a feature space.…

Sound · Computer Science 2024-10-07 Olga Iakovenko, Ivan Bondarenko

Recurrence Boosts Diversity! Revisiting Recurrent Latent Variable in Transformer-Based Variational AutoEncoder for Diverse Text Generation

Variational Auto-Encoder (VAE) has been widely adopted in text generation. Among many variants, recurrent VAE learns token-wise latent variables with each conditioned on the preceding ones, which captures sequential variability better in…

Computation and Language · Computer Science 2022-11-24 Jinyi Hu, Xiaoyuan Yi, Wenhao Li, Maosong Sun, Xing Xie

Feedback Recurrent Autoencoder for Video Compression

Recent advances in deep generative modeling have enabled efficient modeling of high dimensional data distributions and opened up a new horizon for solving data compression problems. Specifically, autoencoder based learned image or video…

Machine Learning · Computer Science 2020-04-10 Adam Golinski, Reza Pourreza, Yang Yang, Guillaume Sautiere, Taco S Cohen

Autoregressive Video Autoencoder with Decoupled Temporal and Spatial Context

Video autoencoders compress videos into compact latent representations for efficient reconstruction, playing a vital role in enhancing the quality and efficiency of video generation. However, existing video autoencoders often entangle…

Computer Vision and Pattern Recognition · Computer Science 2025-12-15 Cuifeng Shen, Lumin Xu, Xingguo Zhu, Gengdai Liu

RADE: A Neural Codec for Transmitting Speech over HF Radio Channels

Speech compression is commonly used to send voice over radio channels in applications such as mobile telephony and two-way push-to-talk (PTT) radio. In classical systems, the speech codec is combined with forward error correction,…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-29 David Rowe, Jean-Marc Valin

Constrained Variational Autoencoder for improving EEG based Speech Recognition Systems

In this paper we introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function that can generate more meaningful electroencephalography (EEG) features from raw EEG features to…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-05 Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik

Sequential Variational Autoencoders for Collaborative Filtering

Variational autoencoders were proven successful in domains such as computer vision and speech processing. Their adoption for modeling user preferences is still unexplored, although recently it is starting to gain attention in the current…

Machine Learning · Computer Science 2018-11-27 Noveen Sachdeva, Giuseppe Manco, Ettore Ritacco, Vikram Pudi

Learning for Video Compression with Recurrent Auto-Encoder and Recurrent Probability Model

The past few years have witnessed increasing interests in applying deep learning to video compression. However, the existing approaches compress a video frame with only a few number of reference frames, which limits their ability to fully…

Image and Video Processing · Electrical Eng. & Systems 2021-03-18 Ren Yang, Fabian Mentzer, Luc Van Gool, Radu Timofte

RAVE: A variational autoencoder for fast and high-quality neural audio synthesis

Deep generative models applied to audio have improved by a large margin the state-of-the-art in many speech and music related tasks. However, as raw waveform modelling remains an inherently difficult task, audio generative models are either…

Machine Learning · Computer Science 2021-12-16 Antoine Caillon, Philippe Esling

Flipped-Adversarial AutoEncoders

We propose a flipped-Adversarial AutoEncoder (FAAE) that simultaneously trains a generative model G that maps an arbitrary latent code distribution to a data distribution and an encoder E that embodies an "inverse mapping" that encodes a…

Machine Learning · Computer Science 2018-04-05 Jiyi Zhang, Hung Dang, Hwee Kuan Lee, Ee-Chien Chang

A Generative-First Neural Audio Autoencoder

Neural autoencoders underpin generative models. Practical, large-scale use of neural autoencoders for generative modeling necessitates fast encoding, low latent rates, and a single model across representations. Existing approaches are…

Sound · Computer Science 2026-02-23 Jonah Casebeer, Ge Zhu, Zhepei Wang, Nicholas J. Bryan

Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks

Hybrid methods that utilize both content and rating information are commonly used in many recommender systems. However, most of them use either handcrafted features or the bag-of-words representation as a surrogate for the content…

Machine Learning · Computer Science 2016-11-03 Hao Wang, Xingjian Shi, Dit-Yan Yeung

Deep Recurrent Encoder: A scalable end-to-end network to model brain signals

Understanding how the brain responds to sensory inputs is challenging: brain recordings are partial, noisy, and high dimensional; they vary across sessions and subjects and they capture highly nonlinear dynamics. These challenges have led…

Neurons and Cognition · Quantitative Biology 2022-10-03 Omar Chehab, Alexandre Defossez, Jean-Christophe Loiseau, Alexandre Gramfort, Jean-Remi King

Sparse Autoencoders, Again?

Is there really much more to say about sparse autoencoders (SAEs)? Autoencoders in general, and SAEs in particular, represent deep architectures that are capable of modeling low-dimensional latent structure in data. Such structure could…

Machine Learning · Computer Science 2025-06-09 Yin Lu, Xuening Zhu, Tong He, David Wipf

Deep Compressive Autoencoder for Action Potential Compression in Large-Scale Neural Recording

Understanding the coordinated activity underlying brain computations requires large-scale, simultaneous recordings from distributed neuronal structures at a cellular-level resolution. One major hurdle to design high-bandwidth,…

Neural and Evolutionary Computing · Computer Science 2018-09-18 Tong Wu, Wenfeng Zhao, Edward Keefer, Zhi Yang

Variational Recurrent Auto-Encoders

In this paper we propose a model that combines the strengths of RNNs and SGVB: the Variational Recurrent Auto-Encoder (VRAE). Such a model can be used for efficient, large scale unsupervised learning on time series data, mapping the time…

Machine Learning · Statistics 2015-06-16 Otto Fabius, Joost R. van Amersfoort

A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music

The Variational Autoencoder (VAE) has proven to be an effective model for producing semantically meaningful latent representations for natural data. However, it has thus far seen limited application to sequential data, and, as we…

Machine Learning · Computer Science 2019-11-12 Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, Douglas Eck

Disentangled Sequential Autoencoder

We present a VAE architecture for encoding and generating high dimensional sequential data, such as video or audio. Our deep generative model learns a latent representation of the data which is split into a static and dynamic part, allowing…

Machine Learning · Computer Science 2018-06-13 Yingzhen Li, Stephan Mandt

A Recurrent Variational Autoencoder for Speech Enhancement

This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE). The deep generative speech model is trained using clean speech signals only, and it is combined with a nonnegative matrix…

Machine Learning · Computer Science 2020-02-11 Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud