Related papers: Complex-Cycle-Consistent Diffusion Model for Monau…

200 papers

Recently, self-supervised learning (SSL) techniques have been introduced to solve the monaural speech enhancement problem. Due to the lack of using clean phase information, the enhancement performance is limited in most SSL methods.…

Sound · Computer Science 2021-12-22 Yi Li , Yang Sun , Syed Mohsen Naqvi

With the development of deep learning, speech enhancement has been greatly optimized in terms of speech quality. Previous methods typically focus on the discriminative supervised learning or generative modeling, which tends to introduce…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-31 Nan Xu , Zhaolong Huang , Xiaonan Zhi

Recently, the application of diffusion probabilistic models has advanced speech enhancement through generative approaches. However, existing diffusion-based methods have focused on the generation process in high-dimensional waveform or…

Sound · Computer Science 2025-01-20 Shengkui Zhao , Zexu Pan , Kun Zhou , Yukun Ma , Chong Zhang , Bin Ma

The goal of speech enhancement (SE) is to eliminate the background interference from the noisy speech signal. Generative models such as diffusion models (DM) have been applied to the task of SE because of better generalization in unseen…

Sound · Computer Science 2023-09-06 Wen Wang , Dongchao Yang , Qichen Ye , Bowen Cao , Yuexian Zou

Speech enhancement algorithms based on deep learning have been improved in terms of speech intelligibility and perceptual quality greatly. Many methods focus on enhancing the amplitude spectrum while reconstructing speech using the mixture…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-10 Qinglong Li , Fei Gao , Haixin Guan , Kaichi Ma

Diffusion probabilistic models have demonstrated an outstanding capability to model natural images and raw audio waveforms through a paired diffusion and reverse processes. The unique property of the reverse process (namely, eliminating…

Audio and Speech Processing · Electrical Eng. & Systems 2021-11-23 Yen-Ju Lu , Yu Tsao , Shinji Watanabe

Recently, phase processing is attracting increasing interest in speech enhancement community. Some researchers integrate phase estimations module into speech enhancement models by using complex-valued short-time Fourier transform (STFT)…

Sound · Computer Science 2019-01-03 Xingjian Du , Mengyao Zhu , Xuan Shi , Xinpeng Zhang , Wen Zhang , Jingdong Chen

Diffusion speech enhancement on discrete audio codec features gain immense attention due to their improved speech component reconstruction capability. However, they usually suffer from high inference computational complexity due to multiple…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-30 Yihui Fu , Tim Fingscheidt

With recent advances of diffusion model, generative speech enhancement (SE) has attracted a surge of research interest due to its great potential for unseen testing noises. However, existing efforts mainly focus on inherent properties of…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-05 Yuchen Hu , Chen Chen , Ruizhe Li , Qiushi Zhu , Eng Siong Chng

In speech enhancement (SE), phase estimation is important for perceptual quality, so many methods take clean speech's complex short-time Fourier transform (STFT) spectrum or the complex ideal ratio mask (cIRM) as the learning target. To…

Audio and Speech Processing · Electrical Eng. & Systems 2023-10-12 Yuewei Zhang , Huanbin Zou , Jie Zhu

Deep generative models have demonstrated remarkable success in medical image synthesis. However, ensuring conditioning faithfulness and high-quality synthetic images for direct or counterfactual generation remains a challenge. In this work,…

Computer Vision and Pattern Recognition · Computer Science 2025-10-31 Fangrui Huang , Alan Wang , Binxu Li , Bailey Trang , Ridvan Yesiloglu , Tianyu Hua , Wei Peng , Ehsan Adeli

Diffusion models have recently achieved impressive results in reconstructing images from noisy inputs, and similar ideas have been applied to speech enhancement by treating time-frequency representations as images. With the ubiquity of…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-21 Renana Opochinsky , Sharon Gannot

Feature mapping using deep neural networks is an effective approach for single-channel speech enhancement. Noisy features are transformed to the enhanced ones through a mapping network and the mean square errors between the enhanced and…

Audio and Speech Processing · Electrical Eng. & Systems 2019-05-02 Zhong Meng , Jinyu Li , Yifan Gong , Biing-Hwang Juang

A primary challenge when deploying speaker recognition systems in real-world applications is performance degradation caused by environmental mismatch. We propose a diffusion-based method that takes speaker embeddings extracted from a…

Audio and Speech Processing · Electrical Eng. & Systems 2025-05-23 KiHyun Nam , Jungwoo Heo , Jee-weon Jung , Gangin Park , Chaeyoung Jung , Ha-Jin Yu , Joon Son Chung

There are many deterministic mathematical operations (e.g. compression, clipping, downsampling) that degrade speech quality considerably. In this paper we introduce a neural network architecture, based on a modification of the DiffWave…

Sound · Computer Science 2021-09-03 Jianwei Zhang , Suren Jayasuriya , Visar Berisha

For the lack of adequate paired noisy-clean speech corpus in many real scenarios, non-parallel training is a promising task for DNN-based speech enhancement methods. However, because of the severe mismatch between input and target speeches,…

Sound · Computer Science 2022-02-15 Guochen Yu , Andong Li , Yutian Wang , Yinuo Guo , Hui Wang , Chengshi Zheng

Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs. While generative models have shown strong potential in speech synthesis, they are…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-11 Yen-Ju Lu , Zhong-Qiu Wang , Shinji Watanabe , Alexander Richard , Cheng Yu , Yu Tsao

Diffusion models have shown promising results in speech enhancement, using a task-adapted diffusion process for the conditional generation of clean speech given a noisy mixture. However, at test time, the neural network used for score…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-17 Bunlong Lay , Jean-Marie Lemercier , Julius Richter , Timo Gerkmann

Cycle-consistent generative adversarial networks (CycleGAN) have shown their promising performance for speech enhancement (SE), while one intractable shortcoming of these CycleGAN-based SE systems is that the noise components propagate…

Sound · Computer Science 2021-09-07 Guochen Yu , Yutian Wang , Hui Wang , Qin Zhang , Chengshi Zheng

Most deep learning-based models for speech enhancement have mainly focused on estimating the magnitude of spectrogram while reusing the phase from noisy speech for reconstruction. This is due to the difficulty of estimating the phase of…

Sound · Computer Science 2019-04-03 Hyeong-Seok Choi , Jang-Hyun Kim , Jaesung Huh , Adrian Kim , Jung-Woo Ha , Kyogu Lee