Related papers: Convolutional Variational Autoencoders for Spectro…

Constrained Variational Autoencoder for improving EEG based Speech Recognition Systems

In this paper we introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function that can generate more meaningful electroencephalography (EEG) features from raw EEG features to…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-05 Gautam Krishna , Co Tran , Mason Carnahan , Ahmed Tewfik

Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoders

Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data. VAEs have been successfully used to learn a probabilistic prior over speech signals, which is then…

Sound · Computer Science 2020-12-18 Mostafa Sadeghi , Simon Leglaive , Xavier Alameda-Pineda , Laurent Girin , Radu Horaud

A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling

The Variational Autoencoder (VAE) is a powerful deep generative model that is now extensively used to represent high-dimensional complex data via a low-dimensional latent space learned in an unsupervised manner. In the original VAE model,…

Sound · Computer Science 2021-06-15 Xiaoyu Bie , Laurent Girin , Simon Leglaive , Thomas Hueber , Xavier Alameda-Pineda

A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders

Recent studies have explored the use of deep generative models of speech spectra based on variational autoencoders (VAEs), combined with unsupervised noise models, to perform speech enhancement. These studies developed iterative algorithms…

Sound · Computer Science 2019-05-15 Manuel Pariente , Antoine Deleforge , Emmanuel Vincent

Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks

The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiovascular assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals (typically a…

Machine Learning · Computer Science 2024-10-31 Christopher J. Harvey , Sumaiya Shomaji , Zijun Yao , Amit Noheria

ECG Latent Feature Extraction with Autoencoders for Downstream Prediction Tasks

The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiac assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals (typically a…

Machine Learning · Computer Science 2025-08-04 Christopher Harvey , Sumaiya Shomaji , Zijun Yao , Amit Noheria

Wavelet-based Variational Autoencoders for High-Resolution Image Generation

Variational Autoencoders (VAEs) are powerful generative models capable of learning compact latent representations. However, conventional VAEs often generate relatively blurry images due to their assumption of an isotropic Gaussian latent…

Computer Vision and Pattern Recognition · Computer Science 2025-04-21 Andrew Kiruluta

Self-Supervised Variational Auto-Encoders

Density estimation, compression and data generation are crucial tasks in artificial intelligence. Variational Auto-Encoders (VAEs) constitute a single framework to achieve these goals. Here, we present a novel class of generative models,…

Machine Learning · Statistics 2021-07-07 Ioannis Gatopoulos , Jakub M. Tomczak

Feedback Recurrent AutoEncoder

In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently…

Machine Learning · Computer Science 2020-02-18 Yang Yang , Guillaume Sautière , J. Jon Ryu , Taco S Cohen

Convolutional variational autoencoders for secure lossy image compression in remote sensing

The volume of remote sensing data is experiencing rapid growth, primarily due to the plethora of space and air platforms equipped with an array of sensors. Due to limited hardware and battery constraints the data is transmitted back to…

Image and Video Processing · Electrical Eng. & Systems 2024-04-18 Alessandro Giuliano , S. Andrew Gadsden , Waleed Hilal , John Yawney

Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition

Dysarthric speech recognition is a challenging task due to acoustic variability and limited amount of available data. Diverse conditions of dysarthric speakers account for the acoustic variability, which make the variability difficult to be…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-17 Xurong Xie , Rukiye Ruzi , Xunying Liu , Lan Wang

Learning Latent Subspaces in Variational Autoencoders

Variational autoencoders (VAEs) are widely used deep generative models capable of learning unsupervised latent representations of data. Such representations are often difficult to interpret or control. We consider the problem of…

Machine Learning · Computer Science 2018-12-18 Jack Klys , Jake Snell , Richard Zemel

A Hybrid Convolutional Variational Autoencoder for Text Generation

In this paper we explore the effect of architectural choices on learning a Variational Autoencoder (VAE) for text generation. In contrast to the previously introduced VAE model for text where both the encoder and decoder are RNNs, we…

Computation and Language · Computer Science 2017-02-09 Stanislau Semeniuta , Aliaksei Severyn , Erhardt Barth

On the Distillation Loss Functions of Speech VAE for Unified Reconstruction, Understanding, and Generation

Continuous speech representations based on Variational Autoencoders (VAEs) have emerged as a promising alternative to traditional spectrogram or discrete token based features for speech generation and reconstruction. Recent research has…

Sound · Computer Science 2026-05-26 Changhao Cheng , Wei Wang , Wangyou Zhang , Dongya Jia , Jian Wu , Zhuo Chen , Yanmin Qian

A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings

We propose a new unsupervised model for mapping a variable-duration speech segment to a fixed-dimensional representation. The resulting acoustic word embeddings can form the basis of search, discovery, and indexing systems for low- and…

Audio and Speech Processing · Electrical Eng. & Systems 2020-12-07 Puyuan Peng , Herman Kamper , Karen Livescu

Deep Feature Consistent Variational Autoencoder

We present a novel method for constructing a Variational Autoencoder (VAE). Instead of using pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE, which ensures the VAE's output to preserve the…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 Xianxu Hou , Linlin Shen , Ke Sun , Guoping Qiu

Hyperspectral Variational Autoencoders for Joint Data Compression and Component Extraction

Geostationary hyperspectral satellites generate terabytes of data daily, creating critical challenges for storage, transmission, and distribution to the scientific community. We present a variational autoencoder (VAE) approach that achieves…

Machine Learning · Computer Science 2025-11-25 Core Francisco Park , Manuel Perez-Carrasco , Caroline Nowlan , Cecilia Garraffo

Decoding EEG Speech Perception with Transformers and VAE-based Data Augmentation

Decoding speech from non-invasive brain signals, such as electroencephalography (EEG), has the potential to advance brain-computer interfaces (BCIs), with applications in silent communication and assistive technologies for individuals with…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-30 Terrance Yu-Hao Chen , Yulin Chen , Pontus Soederhaell , Sadrishya Agrawal , Kateryna Shapovalenko

Speech Audio Generation from dynamic MRI via a Knowledge Enhanced Conditional Variational Autoencoder

Dynamic Magnetic Resonance Imaging (MRI) of the vocal tract has become an increasingly adopted imaging modality for speech motor studies. Beyond image signals, systematic data loss, noise pollution, and audio file corruption can occur due…

Sound · Computer Science 2025-12-02 Yaxuan Li , Han Jiang , Yifei Ma , Shihua Qin , Jonghye Woo , Fangxu Xing

Sparse Autoencoders, Again?

Is there really much more to say about sparse autoencoders (SAEs)? Autoencoders in general, and SAEs in particular, represent deep architectures that are capable of modeling low-dimensional latent structure in data. Such structure could…

Machine Learning · Computer Science 2025-06-09 Yin Lu , Xuening Zhu , Tong He , David Wipf