Related papers: Diffusion Models for Audio Restoration

Diffusion models for audio semantic communication

Directly sending audio signals from a transmitter to a receiver across a noisy channel may absorb consistent bandwidth and be prone to errors when trying to recover the transmitted bits. On the contrary, the recent semantic communication…

Sound · Computer Science 2023-09-15 Eleonora Grassucci, Christian Marinoni, Andrea Rodriguez, Danilo Comminiello

Diffusion-based Signal Refiner for Speech Enhancement and Separation

Although recent speech processing technologies have achieved significant improvements in objective metrics, there still remains a gap in human perceptual quality. This paper proposes Diffiner, a novel solution that utilizes the powerful…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-11 Masato Hirano, Ryosuke Sawata, Naoki Murata, Shusuke Takahashi, Yuki Mitsufuji

Diffusion Posterior Proximal Sampling for Image Restoration

Diffusion models have demonstrated remarkable efficacy in generating high-quality samples. Existing diffusion-based image restoration algorithms exploit pre-trained diffusion models to leverage data priors, yet they still preserve elements…

Image and Video Processing · Electrical Eng. & Systems 2024-08-07 Hongjie Wu, Linchao He, Mingqin Zhang, Dongdong Chen, Kunming Luo, Mengting Luo, Ji-Zhe Zhou, Hu Chen, Jiancheng Lv

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

Diffusion models, a powerful and universal generative AI technology, have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology. In these applications, diffusion models provide flexible…

Machine Learning · Computer Science 2024-04-12 Minshuo Chen, Song Mei, Jianqing Fan, Mengdi Wang

A Study on Speech Enhancement Based on Diffusion Probabilistic Model

Diffusion probabilistic models have demonstrated an outstanding capability to model natural images and raw audio waveforms through a paired diffusion and reverse processes. The unique property of the reverse process (namely, eliminating…

Audio and Speech Processing · Electrical Eng. & Systems 2021-11-23 Yen-Ju Lu, Yu Tsao, Shinji Watanabe

Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration

Diffusion-based generative models have had a high impact on the computer vision and speech processing communities these past years. Besides data generation tasks, they have also been employed for data restoration tasks like speech…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-17 Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann

Audio Generation Through Score-Based Generative Modeling: Design Principles and Implementation

Diffusion models have emerged as powerful deep generative techniques, producing high-quality and diverse samples in applications in various domains including audio. While existing reviews provide overviews, there remains limited in-depth…

Sound · Computer Science 2026-01-16 Ge Zhu, Yutong Wen, Zhiyao Duan

VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance

Restoring degraded music signals is essential to enhance audio quality for downstream music manipulation. Recent diffusion-based music restoration methods have demonstrated impressive performance, and among them, diffusion posterior…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-14 Carlos Hernandez-Olivan, Koichi Saito, Naoki Murata, Chieh-Hsin Lai, Marco A. Martínez-Ramirez, Wei-Hsiang Liao, Yuki Mitsufuji

Cold Diffusion for Speech Enhancement

Diffusion models have recently shown promising results for difficult enhancement tasks such as the conditional and unconditional restoration of natural images and audio signals. In this work, we explore the possibility of leveraging a…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-24 Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux

Heuristically Adaptive Diffusion-Model Evolutionary Strategy

Diffusion Models represent a significant advancement in generative modeling, employing a dual-phase process that first degrades domain-specific information via Gaussian noise and restores it through a trainable model. This framework enables…

Neural and Evolutionary Computing · Computer Science 2024-11-21 Benedikt Hartl, Yanbo Zhang, Hananel Hazan, Michael Levin

Investigating the Design Space of Diffusion Models for Speech Enhancement

Diffusion models are a new class of generative models that have shown outstanding performance in image generation literature. As a consequence, studies have attempted to apply diffusion models to other tasks, such as speech enhancement. A…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-10 Philippe Gonzalez, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen, Tommy Sonne Alstrøm, Tobias May

DiffLoss: unleashing diffusion model as constraint for training image restoration network

Image restoration aims to enhance low quality images, producing high quality images that exhibit natural visual characteristics and fine semantic attributes. Recently, the diffusion model has emerged as a powerful technique for image…

Computer Vision and Pattern Recognition · Computer Science 2024-07-23 Jiangtong Tan, Feng Zhao

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

Diffusion models have shown a great ability at bridging the performance gap between predictive and generative approaches for speech enhancement. We have shown that they may even outperform their predictive counterparts for non-additive…

Audio and Speech Processing · Electrical Eng. & Systems 2024-03-13 Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann

Diffusion-Based Audio Inpainting

Audio inpainting aims to reconstruct missing segments in corrupted recordings. Most of existing methods produce plausible reconstructions when the gap lengths are short, but struggle to reconstruct gaps larger than about 100 ms. This paper…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-13 Eloi Moliner, Vesa Välimäki

Conditional Diffusion Probabilistic Model for Speech Enhancement

Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs. While generative models have shown strong potential in speech synthesis, they are…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-11 Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao

Removing Structured Noise with Diffusion Models

Solving ill-posed inverse problems requires careful formulation of prior beliefs over the signals of interest and an accurate description of their manifestation into noisy measurements. Handcrafted signal priors based on e.g. sparsity are…

Machine Learning · Computer Science 2025-08-14 Tristan S. W. Stevens, Hans van Gorp, Faik C. Meral, Junseob Shin, Jason Yu, Jean-Luc Robert, Ruud J. G. van Sloun

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

In this work, we build upon our previous publication and use diffusion-based generative models for speech enhancement. We present a detailed overview of the diffusion process that is based on a stochastic differential equation and delve…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-14 Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann

High-Resolution Speech Restoration with Latent Diffusion Model

Traditional speech enhancement methods often oversimplify the task of restoration by focusing on a single type of distortion. Generative models that handle multiple distortions frequently struggle with phone reconstruction and…

Sound · Computer Science 2025-02-11 Tushar Dhyani, Florian Lux, Michele Mancusi, Giorgio Fabbro, Fritz Hohl, Ngoc Thang Vu

A Comprehensive Survey on Diffusion Models and Their Applications

Diffusion Models are probabilistic models that create realistic samples by simulating the diffusion process, gradually adding and removing noise from data. These models have gained popularity in domains such as image processing, speech…

Computer Vision and Pattern Recognition · Computer Science 2024-08-21 Md Manjurul Ahsan, Shivakumar Raman, Yingtao Liu, Zahed Siddique

ArchiSound: Audio Generation with Diffusion

The recent surge in popularity of diffusion models for image generation has brought new attention to the potential of these models in other areas of media generation. One area that has yet to be fully explored is the application of…

Sound · Computer Science 2023-02-01 Flavio Schneider