Related papers: Learning to Denoise Historical Music

Sound texture synthesis using convolutional neural networks

The following article introduces a new parametric synthesis algorithm for sound textures inspired by existing methods used for visual textures. Using a 2D Convolutional Neural Network (CNN), a sound signal is modified until the temporal…

Sound · Computer Science 2019-05-10 Hugo Caracalla, Axel Roebel

Temporal envelope and fine structure cues for dysarthric speech detection using CNNs

Deep learning-based techniques for automatic dysarthric speech detection have recently attracted interest in the research community. State-of-the-art techniques typically learn neurotypical and dysarthric discriminative representations by…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-04 Ina Kodrasi

A Neural Text-to-Speech Model Utilizing Broadcast Data Mixed with Background Music

Recently, it has become easier to obtain speech data from various media such as the internet or YouTube, but directly utilizing them to train a neural text-to-speech (TTS) model is difficult. The proportion of clean speech is insufficient…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-05 Hanbin Bae, Jae-Sung Bae, Young-Sun Joo, Young-Ik Kim, Hoon-Young Cho

Music Generation with Deep Learning

The use of deep learning to solve problems in literary arts has been a recent trend that has gained a lot of attention and automated generation of music has been an active area. This project deals with the generation of music using raw…

Sound · Computer Science 2016-12-16 Vasanth Kalingeri, Srikanth Grandhe

SyncNet: correlating objective for time delay estimation in audio signals

This study addresses the task of performing robust and reliable time-delay estimation in signals in noisy and reverberating environments. In contrast to the popular signal processing based methods, this paper proposes to transform the input…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-03 Akshay Raina, Vipul Arora

Dilated Deep Residual Network for Image Denoising

Variations of deep neural networks such as convolutional neural network (CNN) have been successfully applied to image denoising. The goal is to automatically learn a mapping from a noisy image to a clean image given training data consisting…

Computer Vision and Pattern Recognition · Computer Science 2017-09-29 Tianyang Wang, Mingxuan Sun, Kaoning Hu

Naturalistic Music Decoding from EEG Data via Latent Diffusion Models

In this article, we explore the potential of using latent diffusion models, a family of powerful generative models, for the task of reconstructing naturalistic music from electroencephalogram (EEG) recordings. Unlike simpler music with…

Sound · Computer Science 2025-01-10 Emilian Postolache, Natalia Polouliakh, Hiroaki Kitano, Akima Connelly, Emanuele Rodolà, Luca Cosmo, Taketo Akama

Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain

Score-based generative models (SGMs) have recently shown impressive results for difficult generative tasks such as the unconditional and conditional generation of natural images and audio signals. In this work, we extend these models to the…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-08 Simon Welker, Julius Richter, Timo Gerkmann

Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from fMRI

Drawing inspiration from the hierarchical processing of the human auditory system, which transforms sound from low-level acoustic features to high-level semantic understanding, we introduce a novel coarse-to-fine audio reconstruction…

Sound · Computer Science 2024-05-30 Che Liu, Changde Du, Xiaoyu Chen, Huiguang He

Fourier Diffusion Models: A Method to Control MTF and NPS in Score-Based Stochastic Image Generation

Score-based stochastic denoising models have recently been demonstrated as powerful machine learning tools for conditional and unconditional image generation. The existing methods are based on a forward stochastic process wherein the…

Medical Physics · Physics 2023-03-24 Matthew Tivnan, Jacopo Teneggi, Tzu-Cheng Lee, Ruoqiao Zhang, Kirsten Boedeker, Liang Cai, Grace J. Gang, Jeremias Sulam, J. Webster Stayman

Exploring Quality and Generalizability in Parameterized Neural Audio Effects

Deep neural networks have shown promise for music audio signal processing applications, often surpassing prior approaches, particularly as end-to-end models in the waveform domain. Yet results to date have tended to be constrained by low…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-11 William Mitchell, Scott H. Hawley

Dense residual Transformer for image denoising

Image denoising is an important low-level computer vision task, which aims to reconstruct a noise-free and high-quality image from a noisy image. With the development of deep learning, convolutional neural network (CNN) has been gradually…

Computer Vision and Pattern Recognition · Computer Science 2022-05-17 Chao Yao, Shuo Jin, Meiqin Liu, Xiaojuan Ban

Audio-to-Score Conversion Model Based on Whisper methodology

This thesis develops a Transformer model based on Whisper, which extracts melodies and chords from music audio and records them into ABC notation. A comprehensive data processing workflow is customized for ABC notation, including data…

Sound · Computer Science 2024-10-23 Hongyao Zhang, Bohang Sun

Learning noise-induced transitions by multi-scaling reservoir computing

Noise is usually regarded as adversarial to extract the effective dynamics from time series, such that the conventional data-driven approaches usually aim at learning the dynamics by mitigating the noisy effect. However, noise can have a…

Adaptation and Self-Organizing Systems · Physics 2023-09-12 Zequn Lin, Zhaofan Lu, Zengru Di, Ying Tang

Dereverberation Using Binary Residual Masking with Time-Domain Consistency

Vocal dereverberation remains a challenging task in audio processing, particularly for real-time applications where both accuracy and efficiency are crucial. Traditional deep learning approaches often struggle to suppress reverberation…

Sound · Computer Science 2025-10-02 Daniel G. Williams

Machine Unlearning for Robust DNNs: Attribution-Guided Partitioning and Neuron Pruning in Noisy Environments

Deep neural networks (DNNs) have achieved remarkable success across diverse domains, but their performance can be severely degraded by noisy or corrupted training data. Conventional noise mitigation methods often rely on explicit…

Machine Learning · Computer Science 2025-06-16 Deliang Jin, Gang Chen, Shuo Feng, Yufeng Ling, Haoran Zhu

Real-Time Target Sound Extraction

We present the first neural network model to achieve real-time and streaming target sound extraction. To accomplish this, we propose Waveformer, an encoder-decoder architecture with a stack of dilated causal convolution layers as the…

Sound · Computer Science 2023-04-20 Bandhav Veluri, Justin Chan, Malek Itani, Tuochao Chen, Takuya Yoshioka, Shyamnath Gollakota

Deep Learning-based Imaging in Radio Interferometry

The sparse layouts of radio interferometers result in an incomplete sampling of the sky in Fourier space which leads to artifacts in the reconstructed images. Cleaning these systematic effects is essential for the scientific use of…

Instrumentation and Methods for Astrophysics · Physics 2023-07-27 Kevin Schmidt, Felix Geyer, Stefan Fröse, Paul-Simon Blomenkamp, Marcus Brüggen, Francesco de Gasperin, Dominik Elsässer, Wolfgang Rhode

Diff-TTS: A Denoising Diffusion Model for Text-to-Speech

Although neural text-to-speech (TTS) models have attracted a lot of attention and succeeded in generating human-like speech, there is still room for improvements to its naturalness and architectural efficiency. In this work, we propose a…

Audio and Speech Processing · Electrical Eng. & Systems 2021-04-06 Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

DDX7: Differentiable FM Synthesis of Musical Instrument Sounds

FM Synthesis is a well-known algorithm used to generate complex timbre from a compact set of design primitives. Typically featuring a MIDI interface, it is usually impractical to control it from an audio source. On the other hand,…

Sound · Computer Science 2022-08-15 Franco Caspe, Andrew McPherson, Mark Sandler