Related papers: Solving Audio Inverse Problems with a Diffusion Mo…

200 papers

This paper introduces MR-CQTdiff, a novel neural-network architecture for diffusion-based audio generation that leverages a multi-resolution Constant-Q Transform (CQT). The proposed architecture employs an efficient, invertible CQT…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-23 Maurício do V. M. da Costa , Eloi Moliner

Audio inpainting aims to reconstruct missing segments in corrupted recordings. Most existing methods produce plausible reconstructions when the gaps are short, but struggle to reconstruct gaps longer than about 100 ms. This paper…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-13 Eloi Moliner , Vesa Välimäki

This paper introduces UnDiff, a diffusion probabilistic model capable of solving various speech inverse tasks. Once trained for unconditional speech waveform generation, it can be adapted to different tasks including…

We propose self-diffusion, a novel framework for solving inverse problems without relying on pretrained generative models. Traditional diffusion-based approaches require training a model on a clean dataset to learn to reverse the forward…

Machine Learning · Computer Science 2025-12-09 Guanxiong Luo , Shoujin Huang , Yanlong Yang

We introduce Linearly Constrained Diffusion Implicit Models (CDIM), a fast and accurate approach to solving noisy linear inverse problems using diffusion models. Traditional diffusion-based inverse methods rely on numerous projection steps…

Machine Learning · Computer Science 2025-12-01 Vivek Jayaram , Ira Kemelmacher-Shlizerman , Steven M. Seitz , John Thickstun

Diffusion models are powerful tools for sampling from high-dimensional distributions by progressively transforming pure noise into structured data through a denoising process. When equipped with a guidance mechanism, these models can also…

Machine Learning · Computer Science 2026-05-04 Saeed Mohseni-Sehdeh , Walid Saad , Kei Sakaguchi , Tao Yu

In this report we describe an ongoing line of research for solving single-channel source separation problems. Many monaural signal decomposition techniques proposed in the literature operate on a feature space consisting of a time-frequency…

Sound · Computer Science 2015-04-29 Pablo Sprechmann , Joan Bruna , Yann LeCun

Score-based diffusion models have significantly advanced generative deep learning for image processing. Measurement conditioned models have also been applied to inverse problems such as CT reconstruction. However, the conventional approach,…

Medical Physics · Physics 2025-02-24 Matthew Tivnan , Dufan Wu , Quanzheng Li

Recently, the information content (IC) of predictions from a Generative Infinite-Vocabulary Transformer (GIVT) has been used to model musical expectancy and surprisal in audio. We investigate the effectiveness of such modelling using IC…

Sound · Computer Science 2025-08-08 Mathias Rose Bjare , Stefan Lattner , Gerhard Widmer

Diffusion models have recently emerged as powerful priors for solving inverse problems. While computed tomography (CT) is theoretically a linear inverse problem, it poses many practical challenges. These include correlated noise, artifact…

Image and Video Processing · Electrical Eng. & Systems 2026-02-24 Jiayang Shi , Daniel M. Pelt , K. Joost Batenburg

Generative models realized with machine learning techniques are powerful tools to infer complex and unknown data distributions from a finite number of training samples in order to produce new synthetic data. Diffusion models are an emerging…

Quantum Physics · Physics 2024-07-18 Marco Parigi , Stefano Martina , Filippo Caruso

Diffusion models have emerged as powerful deep generative techniques, producing high-quality and diverse samples in applications in various domains including audio. While existing reviews provide overviews, there remains limited in-depth…

Sound · Computer Science 2026-01-16 Ge Zhu , Yutong Wen , Zhiyao Duan

Diffusion models have recently emerged as powerful generative priors for solving inverse problems. However, training diffusion models in the pixel space is both data-intensive and computationally demanding, which restricts their…

Computer Vision and Pattern Recognition · Computer Science 2024-04-17 Bowen Song , Soo Min Kwon , Zecheng Zhang , Xinyu Hu , Qing Qu , Liyue Shen

Channel knowledge map (CKM) is a promising technology to enable environment-aware wireless communications and sensing with greatly enhanced performance, by offering location-specific channel prior information for future wireless networks.…

Signal Processing · Electrical Eng. & Systems 2025-04-25 Shen Fu , Yong Zeng , Zijian Wu , Di Wu , Shi Jin , Cheng-Xiang Wang , Xiqi Gao

The restoration of nonlinearly distorted audio signals, alongside the identification of the applied memoryless nonlinear operation, is studied. The paper focuses on the difficult but practically important case in which both the nonlinearity…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-13 Michal Švento , Eloi Moliner , Lauri Juvela , Alec Wright , Vesa Välimäki

With the development of audio playback devices and fast data transmission, the demand for high sound quality is rising for both entertainment and communications. In this quest for better sound quality, challenges emerge from distortions and…

Audio and Speech Processing · Electrical Eng. & Systems 2024-11-12 Jean-Marie Lemercier , Julius Richter , Simon Welker , Eloi Moliner , Vesa Välimäki , Timo Gerkmann

Diffusion-based generative models have had a high impact on the computer vision and speech processing communities in recent years. Besides data generation tasks, they have also been employed for data restoration tasks like speech…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-17 Jean-Marie Lemercier , Julius Richter , Simon Welker , Timo Gerkmann

Solving ill-posed inverse problems requires careful formulation of prior beliefs over the signals of interest and an accurate description of their manifestation into noisy measurements. Handcrafted signal priors based on e.g. sparsity are…

Machine Learning · Computer Science 2025-08-14 Tristan S. W. Stevens , Hans van Gorp , Faik C. Meral , Junseob Shin , Jason Yu , Jean-Luc Robert , Ruud J. G. van Sloun

Cone-beam computed tomography (CBCT) is widely used for image-guided radiotherapy (IGRT). It provides real time visualization at low cost and dose. However, photon scattering and beam hindrance cause artifacts in CBCT. These include…

Medical Physics · Physics 2025-09-29 Alzahra Altalib , Chunhui Li , Alessandro Perelli

Audio denoising has attracted widespread attention in the deep neural network field. Recently, the audio denoising problem has been converted into an image generation task, and deep learning-based approaches have been applied…

Sound · Computer Science 2024-06-14 Junhui Li , Pu Wang , Jialu Li , Youshan Zhang