Related papers: A Deep Representation Learning-based Speech Enhanc…

200 papers

This paper focuses on leveraging deep representation learning (DRL) for speech enhancement (SE). In general, the performance of the deep neural network (DNN) is heavily dependent on the learning of data representation. However, the DRL's…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-28 Yang Xiang , Jesper Lisby Højvang , Morten Højfeldt Rasmussen , Mads Græsbøll Christensen

In previous work, we proposed a variational autoencoder-based (VAE) Bayesian permutation training speech enhancement (SE) method (PVAE) which indicated that the SE performance of the traditional deep neural network-based (DNN) method could…

Audio and Speech Processing · Electrical Eng. & Systems 2022-05-12 Yang Xiang , Jesper Lisby Højvang , Morten Højfeldt Rasmussen , Mads Græsbøll Christensen

Recently, variational autoencoder (VAE), a deep representation learning (DRL) model, has been used to perform speech enhancement (SE). However, to the best of our knowledge, current VAE-based SE methods only apply VAE to the model speech…

Audio and Speech Processing · Electrical Eng. & Systems 2022-01-25 Yang Xiang , Jesper Lisby Højvang , Morten Højfeldt Rasmussen , Mads Græsbøll Christensen

Recently, a complex variational autoencoder (VAE)-based single-channel speech enhancement system based on the DCCRN architecture has been proposed. In this system, a noise suppression VAE (NSVAE) learns to extract clean speech…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-03 Jiatong Li , Simon Doclo

As an extension of variational autoencoder (VAE), complex VAE uses complex Gaussian distributions to model latent variables and data. This work proposes a complex recurrent VAE framework, specifically in which complex-valued recurrent…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-28 Yuying Xie , Thomas Arildsen , Zheng-Hua Tan

In this paper we introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function that can generate more meaningful electroencephalography (EEG) features from raw EEG features to…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-05 Gautam Krishna , Co Tran , Mason Carnahan , Ahmed Tewfik

This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE). The deep generative speech model is trained using clean speech signals only, and it is combined with a nonnegative matrix…

Machine Learning · Computer Science 2020-02-11 Simon Leglaive , Xavier Alameda-Pineda , Laurent Girin , Radu Horaud

This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech. A standard approach to speech enhancement is to train a deep neural network…

Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data. VAEs have been successfully used to learn a probabilistic prior over speech signals, which is then…

Sound · Computer Science 2020-12-18 Mostafa Sadeghi , Simon Leglaive , Xavier Alameda-Pineda , Laurent Girin , Radu Horaud

In recent years, applying Deep Learning (DL) techniques emerged as a common practice in the communication system, demonstrating promising results. The present paper proposes a new Convolutional Neural Network (CNN) based Variational…

Signal Processing · Electrical Eng. & Systems 2020-05-20 Raghu Vamshi Hemadri , Akshay Rayaluru , Rahul Jashvantbhai Pandya

An effective approach for voice conversion (VC) is to disentangle linguistic content from other components in the speech signal. The effectiveness of variational autoencoder (VAE) based VC (VAE-VC), for instance, strongly relies on this…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-09 Wen-Chin Huang , Hao Luo , Hsin-Te Hwang , Chen-Chou Lo , Yu-Huai Peng , Yu Tsao , Hsin-Min Wang

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF-masks or speech spectrum, via a naive convolution…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-24 Yanxin Hu , Yun Liu , Shubo Lv , Mengtao Xing , Shimin Zhang , Yihui Fu , Jian Wu , Bihong Zhang , Lei Xie

Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques focus only on addressing audio information. In this work, inspired by multimodal learning, which utilizes data from different modalities, and the recent…

Sound · Computer Science 2022-04-19 Jen-Cheng Hou , Syu-Siang Wang , Ying-Hui Lai , Yu Tsao , Hsiu-Wen Chang , Hsin-Min Wang

Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques focus only on addressing audio information. In this work, inspired by multimodal learning, which utilizes data from different modalities, and the recent…

Sound · Computer Science 2018-01-25 Jen-Cheng Hou , Syu-Siang Wang , Ying-Hui Lai , Yu Tsao , Hsiu-Wen Chang , Hsin-Min Wang

Sequence-to-sequence (Seq2seq) models have played an important role in the recent success of various natural language processing methods, such as machine translation, text summarization, and speech recognition. However, current Seq2seq…

Computation and Language · Computer Science 2018-06-05 Myeongjun Jang , Seungwan Seo , Pilsung Kang

We present a novel method for constructing Variational Autoencoder (VAE). Instead of using pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE, which ensures the VAE's output to preserve the…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 Xianxu Hou , Linlin Shen , Ke Sun , Guoping Qiu

The Variational Autoencoder (VAE) is a powerful deep generative model that is now extensively used to represent high-dimensional complex data via a low-dimensional latent space learned in an unsupervised manner. In the original VAE model,…

Sound · Computer Science 2021-06-15 Xiaoyu Bie , Laurent Girin , Simon Leglaive , Thomas Hueber , Xavier Alameda-Pineda

Embracing the deep learning techniques for representation learning in clustering research has attracted broad attention in recent years, yielding a newly developed clustering paradigm, viz. the deep clustering (DC). Typically, the DC models…

Machine Learning · Computer Science 2022-01-17 Shuai Chang

Recently, a variational autoencoder (VAE)-based single-channel speech enhancement system using Bayesian permutation training has been proposed, which uses two pretrained VAEs to obtain latent representations for speech and noise. Based on…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-03 Jiatong Li , Simon Doclo

Automatic speaker verification (ASV) systems are highly vulnerable to presentation attacks, also called spoofing attacks. Replay is among the simplest attacks to mount - yet difficult to detect reliably. The generalization failure of…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-24 Bhusan Chettri , Tomi Kinnunen , Emmanouil Benetos