Related papers: FastWave: Optimized Diffusion Model for Audio Supe…

FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow Matching

Audio super-resolution is challenging owing to its ill-posed nature. Recently, the application of diffusion models in audio super-resolution has shown promising results in alleviating this challenge. However, diffusion-based models have…

Audio and Speech Processing · Electrical Eng. & Systems 2025-03-12 Jun-Hak Yun, Seung-Bin Kim, Seong-Whan Lee

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling

In this work, we introduce NU-Wave, the first neural audio upsampling model to produce waveforms of sampling rate 48kHz from coarse 16kHz or 24kHz inputs, while prior works could generate only up to 16kHz. NU-Wave is the first diffusion…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-10 Junhyeok Lee, Seungu Han

NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates

Conventionally, audio super-resolution models fixed the initial and the target sampling rates, which necessitates training the model for each pair of sampling rates. We introduce NU-Wave 2, a diffusion model for neural audio upsampling…

Audio and Speech Processing · Electrical Eng. & Systems 2022-09-28 Seungu Han, Junhyeok Lee

Audio Super Resolution using Neural Networks

We introduce a new audio processing technique that increases the sampling rate of signals such as speech or music using deep convolutional neural networks. Our model is trained on pairs of low and high-quality audio examples; at test-time,…

Sound · Computer Science 2017-08-03 Volodymyr Kuleshov, S. Zayd Enam, Stefano Ermon

FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching

This work proposes an efficient method to enhance the quality of corrupted speech signals by leveraging both acoustic and visual cues. While existing diffusion-based approaches have demonstrated remarkable quality, their applicability is…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-14 Chaeyoung Jung, Suyeon Lee, Ji-Hoon Kim, Joon Son Chung

Inference-time Scaling for Diffusion-based Audio Super-resolution

Diffusion models have demonstrated remarkable success in generative tasks, including audio super-resolution (SR). In many applications like movie post-production and album mastering, substantial computational budgets are available for…

Sound · Computer Science 2025-08-05 Yizhu Jin, Zhen Ye, Zeyue Tian, Haohe Liu, Qiuqiang Kong, Yike Guo, Wei Xue

AudioSR: Versatile Audio Super-resolution at Scale

Audio super-resolution is a fundamental task that predicts high-frequency components for low-resolution audio, enhancing audio quality in digital applications. Previous methods have limitations such as the limited scope of audio types…

Sound · Computer Science 2023-09-15 Haohe Liu, Ke Chen, Qiao Tian, Wenwu Wang, Mark D. Plumbley

RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction

Recent advancements in generative modeling have significantly enhanced the reconstruction of audio waveforms from various representations. While diffusion models are adept at this task, they are hindered by latency issues due to their…

Sound · Computer Science 2024-10-08 Peng Liu, Dongyang Dai, Zhiyong Wu

LatentFlowSR: High-Fidelity Audio Super-Resolution via Noise-Robust Latent Flow Matching

Audio super-resolution aims to recover missing high-frequency details from bandwidth-limited low-resolution audio, thereby improving the naturalness and perceptual quality of the reconstructed signal. However, most existing methods directly…

Sound · Computer Science 2026-04-13 Fei Liu, Yang Ai, Hui-Peng Du, Yu-Fei Shi, Zhen-Hua Ling

WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution

Audio super-resolution is the task of constructing a high-resolution (HR) audio from a low-resolution (LR) audio by adding the missing band. Previous methods based on convolutional neural networks and mean squared error training objective…

Sound · Computer Science 2021-06-17 Kexun Zhang, Yi Ren, Changliang Xu, Zhou Zhao

Wavelet Flow: Fast Training of High Resolution Normalizing Flows

Normalizing flows are a class of probabilistic generative models which allow for both fast density computation and efficient sampling and are effective at modelling complex distributions like images. A drawback among current methods is…

Computer Vision and Pattern Recognition · Computer Science 2020-10-28 Jason J. Yu, Konstantinos G. Derpanis, Marcus A. Brubaker

Diffusion Models for Audio Restoration

With the development of audio playback devices and fast data transmission, the demand for high sound quality is rising for both entertainment and communications. In this quest for better sound quality, challenges emerge from distortions and…

Audio and Speech Processing · Electrical Eng. & Systems 2024-11-12 Jean-Marie Lemercier, Julius Richter, Simon Welker, Eloi Moliner, Vesa Välimäki, Timo Gerkmann

DeepInv: A Novel Self-supervised Learning Approach for Fast and Accurate Diffusion Inversion

Diffusion inversion is the task of recovering the noise of an image in a diffusion model, which is vital for controllable diffusion image editing. At present, diffusion inversion remains a challenging task due to the lack of viable…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Ziyue Zhang, Luxi Lin, Xiaolin Hu, Chao Chang, HuaiXi Wang, Yiyi Zhou, Rongrong Ji

AudioTurbo: Fast Text-to-Audio Generation with Rectified Diffusion

Diffusion models have significantly improved the quality and diversity of audio generation but are hindered by slow inference speed. Rectified flow enhances inference speed by learning straight-line ordinary differential equation (ODE)…

Sound · Computer Science 2025-05-29 Junqi Zhao, Jinzheng Zhao, Haohe Liu, Yun Chen, Lu Han, Xubo Liu, Mark Plumbley, Wenwu Wang

Fast Image Super-Resolution via Consistency Rectified Flow

Diffusion models (DMs) have demonstrated remarkable success in real-world image super-resolution (SR), yet their reliance on time-consuming multi-step sampling largely hinders their practical applications. While recent efforts have…

Computer Vision and Pattern Recognition · Computer Science 2026-05-13 Jiaqi Xu, Wenbo Li, Haoze Sun, Fan Li, Zhixin Wang, Long Peng, Jingjing Ren, Haoran Yang, Xiaowei Hu, Renjing Pei, Pheng-Ann Heng

Wavelet Diffusion Models are fast and scalable Image Generators

Diffusion models are rising as a powerful solution for high-fidelity image generation, which exceeds GANs in quality in many circumstances. However, their slow training and inference speed is a huge bottleneck, blocking them from being used…

Computer Vision and Pattern Recognition · Computer Science 2023-03-24 Hao Phung, Quan Dao, Anh Tran

Spectral Progressive Diffusion for Efficient Image and Video Generation

Diffusion models have been shown to implicitly generate visual content autoregressively in the frequency domain, where low-frequency components are generated earlier in the denoising process while high-frequency details emerge only in later…

Computer Vision and Pattern Recognition · Computer Science 2026-05-21 Howard Xiao, Brian Chao, Lior Yariv, Gordon Wetzstein

Self-Guided Diffusion Model for Accelerating Computational Fluid Dynamics

Machine learning methods, such as diffusion models, are widely explored as a promising way to accelerate high-fidelity fluid dynamics computation via a super-resolution process from faster-to-compute low-fidelity input. However, existing…

Computational Engineering, Finance, and Science · Computer Science 2025-12-24 Ruoyan Li, Zijie Huang, Haixin Wang, Guancheng Wan, Yizhou Sun, Wei Wang

Combined Generative and Predictive Modeling for Speech Super-resolution

Speech super-resolution (SR) is the task that restores high-resolution speech from low-resolution input. Existing models employ simulated data and constrained experimental settings, which limit generalization to real-world SR. Predictive…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-26 Heming Wang, Eric W. Healy, DeLiang Wang

QLingNet: An efficient and flexible modeling framework for subsonic airfoils

Artificial intelligence techniques are considered an effective means to accelerate flow field simulations. However, current deep learning methods struggle to achieve generalization to flow field resolutions while ensuring computational…

Fluid Dynamics · Physics 2024-05-15 Kuijun Zuo, Zhengyin Ye, Linyang Zhu, Xianxu Yuan, Weiwei Zhang