Related papers: Unsupervised Noise adaptation using Data Simulatio…

Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation

Domain mismatch between training and testing can lead to significant degradation in performance in many machine learning scenarios. Unfortunately, this is not a rare situation for automatic speech recognition deployments in real-world…

Computation and Language · Computer Science 2017-09-25 Wei-Ning Hsu, Yu Zhang, James Glass

Incorporating Real-world Noisy Speech in Neural-network-based Speech Enhancement Systems

Supervised speech enhancement relies on parallel databases of degraded speech signals and their clean reference signals during training. This setting prohibits the use of real-world degraded speech data that may better represent the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-22 Yangyang Xia, Buye Xu, Anurag Kumar

Unsupervised Speech Enhancement using Data-defined Priors

The majority of deep learning-based speech enhancement methods require paired clean-noisy speech data. Collecting such data at scale in real-world conditions is infeasible, which has led the community to rely on synthetically generated…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Dominik Klement, Matthew Maciejewski, Sanjeev Khudanpur, Jan Černocký, Lukáš Burget

Unsupervised Adaptation with Domain Separation Networks for Robust Speech Recognition

Unsupervised domain adaptation of speech signals aims at adapting a well-trained source-domain acoustic model to the unlabeled data from the target domain. This can be achieved by adversarial training of deep neural network (DNN) acoustic models…

Computation and Language · Computer Science 2019-05-01 Zhong Meng, Zhuo Chen, Vadim Mazalov, Jinyu Li, Yifan Gong

Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation

Cross-domain speech enhancement (SE) is often faced with severe challenges due to the scarcity of noise and background information in an unseen target domain, leading to a mismatch between training and test conditions. This study puts…

Sound · Computer Science 2024-09-04 Chien-Chun Wang, Li-Wei Chen, Hung-Shin Lee, Berlin Chen, Hsin-Min Wang

Unsupervised Domain Adaptation by Adversarial Learning for Robust Speech Recognition

In this paper, we investigate the use of adversarial learning for unsupervised adaptation to unseen recording conditions, more specifically, single microphone far-field speech. We adapt neural networks based acoustic models trained with…

Audio and Speech Processing · Electrical Eng. & Systems 2018-07-31 Pavel Denisov, Ngoc Thang Vu, Marc Ferras Font

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

Speech enhancement deep learning systems usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems into new, low resource environments an important…

Sound · Computer Science 2017-12-19 Santiago Pascual, Maruchan Park, Joan Serrà, Antonio Bonafonte, Kang-Hun Ahn

Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification

Training personalized speech enhancement models is innately a no-shot learning problem due to privacy constraints and limited access to noise-free speech from the target user. If there is an abundance of unlabeled noisy speech from the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-04-06 Aswin Sivaraman, Sunwoo Kim, Minje Kim

Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training

Data efficient voice cloning aims at synthesizing target speaker's voice with only a few enrollment samples at hand. To this end, speaker adaptation and speaker encoding are two typical methods based on base model trained from multiple…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-12 Jian Cong, Shan Yang, Lei Xie, Guoqiao Yu, Guanglu Wan

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on…

Sound · Computer Science 2018-09-24 Zixing Zhang, Jürgen Geiger, Jouni Pohjalainen, Amr El-Desoky Mousa, Wenyu Jin, Björn Schuller

Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition

The current trend in automatic speech recognition is to leverage large amounts of labeled data to train supervised neural network models. Unfortunately, obtaining data for a wide range of domains to train robust models can be costly.…

Computation and Language · Computer Science 2018-06-14 Wei-Ning Hsu, Hao Tang, James Glass

Unpaired Speech Enhancement by Acoustic and Adversarial Supervision for Speech Recognition

Many speech enhancement methods try to learn the relationship between noisy and clean speech, obtained using an acoustic room simulator. We point out several limitations of enhancement methods relying on clean speech targets; the goal of…

Computation and Language · Computer Science 2018-12-26 Geonmin Kim, Hwaran Lee, Bo-Kyeong Kim, Sang-Hoon Oh, Soo-Young Lee

Improving Deep Speech Denoising by Noisy2Noisy Signal Mapping

Existing deep learning-based speech denoising approaches require clean speech signals to be available for training. This paper presents a deep learning-based approach to improve speech denoising in real-world audio environments by not…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-25 Nasim Alamdari, Arian Azarang, Nasser Kehtarnavaz

Noise Adaptive Speech Enhancement using Domain Adversarial Training

In this study, we propose a novel noise adaptive speech enhancement (SE) system, which employs a domain adversarial training (DAT) approach to tackle the issue of a noise type mismatch between the training and testing conditions. Such a…

Sound · Computer Science 2019-07-02 Chien-Feng Liao, Yu Tsao, Hung-Yi Lee, Hsin-Min Wang

Speech Denoising Without Clean Training Data: A Noise2Noise Approach

This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio-denoising methods by showing that it is possible to train deep speech denoising networks using only noisy speech samples.…

Sound · Computer Science 2021-09-21 Madhav Mahesh Kashyap, Anuj Tambwekar, Krishnamoorthy Manohara, S Natarajan

Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement

The intelligibility of speech severely degrades in the presence of environmental noise and reverberation. In this paper, we propose a novel deep learning based system for modifying the speech signal to increase its intelligibility under the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-17 Haoyu Li, Junichi Yamagishi

Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation

It is a strong prerequisite to access source data freely in many existing unsupervised domain adaptation approaches. However, source data is agnostic in many practical scenarios due to the constraints of expensive data transmission and data…

Computer Vision and Pattern Recognition · Computer Science 2021-02-24 Weijie Chen, Luojun Lin, Shicai Yang, Di Xie, Shiliang Pu, Yueting Zhuang, Wenqi Ren

Generative Pseudo-label Refinement for Unsupervised Domain Adaptation

We investigate and characterize the inherent resilience of conditional Generative Adversarial Networks (cGANs) against noise in their conditioning labels, and exploit this fact in the context of Unsupervised Domain Adaptation (UDA). In UDA,…

Computer Vision and Pattern Recognition · Computer Science 2020-01-10 Pietro Morerio, Riccardo Volpi, Ruggero Ragonesi, Vittorio Murino

Unsupervised Speech Domain Adaptation Based on Disentangled Representation Learning for Robust Speech Recognition

In general, the performance of automatic speech recognition (ASR) systems is significantly degraded due to the mismatch between training and test environments. Recently, a deep-learning-based image-to-image translation technique to…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-15 Jong-Hyeon Park, Myungwoo Oh, Hyung-Min Park

Speech Denoising with Auditory Models

Contemporary speech enhancement predominantly relies on audio transforms that are trained to reconstruct a clean speech waveform. The development of high-performing neural network sound recognition systems has raised the possibility of…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-18 Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott