Related papers: Diffiner: A Versatile Diffusion-based Generative R…

200 papers

Although recent speech processing technologies have achieved significant improvements in objective metrics, there still remains a gap in human perceptual quality. This paper proposes Diffiner, a novel solution that utilizes the powerful…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-11 Masato Hirano, Ryosuke Sawata, Naoki Murata, Shusuke Takahashi, Yuki Mitsufuji

With recent advances in diffusion models, generative speech enhancement (SE) has attracted a surge of research interest due to its great potential for unseen testing noises. However, existing efforts mainly focus on inherent properties of…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-05 Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

Diffusion-based generative speech enhancement (SE) has recently received attention, but reverse diffusion remains time-consuming. One solution is to initialize the reverse diffusion process with enhanced features estimated by a predictive…

In this work, we build upon our previous publication and use diffusion-based generative models for speech enhancement. We present a detailed overview of the diffusion process that is based on a stochastic differential equation and delve…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-14 Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann

Diffusion-based generative models have recently gained attention in speech enhancement (SE), providing an alternative to conventional supervised methods. These models transform clean speech training samples into Gaussian noise centered at…

Computer Vision and Pattern Recognition · Computer Science 2023-09-20 Jean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel

Speech-related applications deliver inferior performance in complex noise environments. Therefore, this study primarily addresses this problem by introducing speech-enhancement (SE) systems based on deep neural networks (DNNs) applied to a…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-26 Syu-Siang Wang, Yu-You Liang, Jeih-weih Hung, Yu Tsao, Hsin-Min Wang, Shih-Hau Fang

The goal of speech enhancement (SE) is to eliminate the background interference from the noisy speech signal. Generative models such as diffusion models (DM) have been applied to the task of SE because of better generalization in unseen…

Sound · Computer Science 2023-09-06 Wen Wang, Dongchao Yang, Qichen Ye, Bowen Cao, Yuexian Zou

Speech enhancement (SE) is the foundational task of enhancing the clarity and quality of speech in the presence of non-stationary additive noise. While deterministic deep learning models have been commonly employed for SE, recent research…

Audio and Speech Processing · Electrical Eng. & Systems 2025-03-11 Sonal Kumar, Sreyan Ghosh, Utkarsh Tyagi, Anton Jeran Ratnarajah, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha

Speech pre-processing techniques such as denoising, de-reverberation, and separation, are commonly employed as front-ends for various downstream speech processing tasks. However, these methods can sometimes be inadequate, resulting in…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-17 Sirui Li, Shuai Wang, Zhijun Liu, Zhongjie Jiang, Yannan Wang, Haizhou Li

Diffusion-based generative models have had a high impact on the computer vision and speech processing communities these past years. Besides data generation tasks, they have also been employed for data restoration tasks like speech…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-17 Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann

Diffusion models have shown promising results in speech enhancement, using a task-adapted diffusion process for the conditional generation of clean speech given a noisy mixture. However, at test time, the neural network used for score…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-17 Bunlong Lay, Jean-Marie Lemercier, Julius Richter, Timo Gerkmann

Speech separation, the task of isolating multiple speech sources from a mixed audio signal, remains challenging in noisy environments. In this paper, we propose a generative correction method to enhance the output of a discriminative…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-12 Helin Wang, Jesus Villalba, Laureano Moro-Velazquez, Jiarui Hai, Thomas Thebaud, Najim Dehak

This paper introduces a novel speech enhancement (SE) approach based on a denoising diffusion probabilistic model (DDPM), termed Guided diffusion for speech enhancement (GDiffuSE). In contrast to conventional methods that directly map noisy…

Sound · Computer Science 2026-03-03 Efrayim Yanir, David Burshtein, Sharon Gannot

In this paper, we explore a principal way to enhance the quality of object masks produced by different segmentation models. We propose a model-agnostic solution called SegRefiner, which offers a novel perspective on this problem by…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Mengyu Wang, Henghui Ding, Jun Hao Liew, Jiajun Liu, Yao Zhao, Yunchao Wei

We present DiffusionBERT, a new generative masked language model based on discrete diffusion models. Diffusion models and many pre-trained language models have a shared training objective, i.e., denoising, making it possible to combine the…

Computation and Language · Computer Science 2022-12-02 Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu

This paper addresses unsupervised diffusion-based single-channel speech enhancement (SE). Prior work in this direction combines a score-based diffusion model trained on clean speech with a Gaussian noise model whose covariance is structured…

Sound · Computer Science 2026-05-26 Jean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel, Xavier Alameda-Pineda

Recently, diffusion-based generative models have demonstrated remarkable performance in speech enhancement tasks. However, these methods still encounter challenges, including the lack of structural information and poor performance in low…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-16 Siyi Wang, Siyi Liu, Andrew Harper, Paul Kendrick, Mathieu Salzmann, Milos Cernak

Diffusion-based generative models have recently achieved remarkable results in speech and vocal enhancement due to their ability to model complex speech data distributions. While these models generalize well to unseen acoustic environments,…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-23 Yudong Yang, Zhan Liu, Wenyi Yu, Guangzhi Sun, Qiuqiang Kong, Chao Zhang

While diffusion models have achieved great success in generating continuous signals such as images and audio, it remains elusive for diffusion models in learning discrete sequence data like natural languages. Although recent advances…

Computation and Language · Computer Science 2024-05-02 Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Mingxuan Wang

Unlike discriminative approaches in autonomous driving that predict a fixed set of candidate trajectories of the ego vehicle, generative methods, such as diffusion models, learn the underlying distribution of future motion, enabling more…

Computer Vision and Pattern Recognition · Computer Science 2025-11-24 Liuhan Yin, Runkun Ju, Guodong Guo, Erkang Cheng