Related papers: Unsupervised Noise adaptation using Data Simulatio…

200 papers

Domain mismatch between training and testing can lead to significant degradation in performance in many machine learning scenarios. Unfortunately, this is not a rare situation for automatic speech recognition deployments in real-world…

Computation and Language · Computer Science 2017-09-25 Wei-Ning Hsu, Yu Zhang, James Glass

Supervised speech enhancement relies on parallel databases of degraded speech signals and their clean reference signals during training. This setting prohibits the use of real-world degraded speech data that may better represent the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-22 Yangyang Xia, Buye Xu, Anurag Kumar

The majority of deep learning-based speech enhancement methods require paired clean-noisy speech data. Collecting such data at scale in real-world conditions is infeasible, which has led the community to rely on synthetically generated…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Dominik Klement, Matthew Maciejewski, Sanjeev Khudanpur, Jan Černocký, Lukáš Burget

Unsupervised domain adaptation of speech signal aims at adapting a well-trained source-domain acoustic model to the unlabeled data from target domain. This can be achieved by adversarial training of deep neural network (DNN) acoustic models…

Computation and Language · Computer Science 2019-05-01 Zhong Meng, Zhuo Chen, Vadim Mazalov, Jinyu Li, Yifan Gong

Cross-domain speech enhancement (SE) is often faced with severe challenges due to the scarcity of noise and background information in an unseen target domain, leading to a mismatch between training and test conditions. This study puts…

Sound · Computer Science 2024-09-04 Chien-Chun Wang, Li-Wei Chen, Hung-Shin Lee, Berlin Chen, Hsin-Min Wang

In this paper, we investigate the use of adversarial learning for unsupervised adaptation to unseen recording conditions, more specifically, single microphone far-field speech. We adapt neural networks based acoustic models trained with…

Audio and Speech Processing · Electrical Eng. & Systems 2018-07-31 Pavel Denisov, Ngoc Thang Vu, Marc Ferras Font

Speech enhancement deep learning systems usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems into new, low resource environments an important…

Sound · Computer Science 2017-12-19 Santiago Pascual, Maruchan Park, Joan Serrà, Antonio Bonafonte, Kang-Hun Ahn

Training personalized speech enhancement models is innately a no-shot learning problem due to privacy constraints and limited access to noise-free speech from the target user. If there is an abundance of unlabeled noisy speech from the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-04-06 Aswin Sivaraman, Sunwoo Kim, Minje Kim

Data efficient voice cloning aims at synthesizing target speaker's voice with only a few enrollment samples at hand. To this end, speaker adaptation and speaker encoding are two typical methods based on base model trained from multiple…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-12 Jian Cong, Shan Yang, Lei Xie, Guoqiao Yu, Guanglu Wan

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on…

The current trend in automatic speech recognition is to leverage large amounts of labeled data to train supervised neural network models. Unfortunately, obtaining data for a wide range of domains to train robust models can be costly.…

Computation and Language · Computer Science 2018-06-14 Wei-Ning Hsu, Hao Tang, James Glass

Many speech enhancement methods try to learn the relationship between noisy and clean speech, obtained using an acoustic room simulator. We point out several limitations of enhancement methods relying on clean speech targets; the goal of…

Computation and Language · Computer Science 2018-12-26 Geonmin Kim, Hwaran Lee, Bo-Kyeong Kim, Sang-Hoon Oh, Soo-Young Lee

Existing deep learning-based speech denoising approaches require clean speech signals to be available for training. This paper presents a deep learning-based approach to improve speech denoising in real-world audio environments by not…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-25 Nasim Alamdari, Arian Azarang, Nasser Kehtarnavaz

In this study, we propose a novel noise adaptive speech enhancement (SE) system, which employs a domain adversarial training (DAT) approach to tackle the issue of a noise type mismatch between the training and testing conditions. Such a…

Sound · Computer Science 2019-07-02 Chien-Feng Liao, Yu Tsao, Hung-Yi Lee, Hsin-Min Wang

This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio-denoising methods by showing that it is possible to train deep speech denoising networks using only noisy speech samples.…

Sound · Computer Science 2021-09-21 Madhav Mahesh Kashyap, Anuj Tambwekar, Krishnamoorthy Manohara, S Natarajan

The intelligibility of speech severely degrades in the presence of environmental noise and reverberation. In this paper, we propose a novel deep learning based system for modifying the speech signal to increase its intelligibility under the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-17 Haoyu Li, Junichi Yamagishi

It is a strong prerequisite to access source data freely in many existing unsupervised domain adaptation approaches. However, source data is agnostic in many practical scenarios due to the constraints of expensive data transmission and data…

Computer Vision and Pattern Recognition · Computer Science 2021-02-24 Weijie Chen, Luojun Lin, Shicai Yang, Di Xie, Shiliang Pu, Yueting Zhuang, Wenqi Ren

We investigate and characterize the inherent resilience of conditional Generative Adversarial Networks (cGANs) against noise in their conditioning labels, and exploit this fact in the context of Unsupervised Domain Adaptation (UDA). In UDA,…

Computer Vision and Pattern Recognition · Computer Science 2020-01-10 Pietro Morerio, Riccardo Volpi, Ruggero Ragonesi, Vittorio Murino

In general, the performance of automatic speech recognition (ASR) systems is significantly degraded due to the mismatch between training and test environments. Recently, a deep-learning-based image-to-image translation technique to…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-15 Jong-Hyeon Park, Myungwoo Oh, Hyung-Min Park

Contemporary speech enhancement predominantly relies on audio transforms that are trained to reconstruct a clean speech waveform. The development of high-performing neural network sound recognition systems has raised the possibility of…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-18 Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott