Related papers: Dictionary Update for NMF-based Voice Conversion U…

Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization

Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms…

Sound · Computer Science 2017-09-19 Nasser Mohammadiha, Paris Smaragdis, Arne Leijon

Unsupervised Low Latency Speech Enhancement with RT-GCC-NMF

In this paper, we present RT-GCC-NMF: a real-time (RT), two-channel blind speech enhancement algorithm that combines the non-negative matrix factorization (NMF) dictionary learning algorithm with the generalized cross-correlation (GCC)…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-08 Sean U. N. Wood, Jean Rouat

End-to-end Non-Negative Autoencoders for Sound Source Separation

Discriminative models for source separation have recently been shown to produce impressive results. However, when operating on sources outside of the training set, these models can not perform as well and are cumbersome to update. Classical…

Sound · Computer Science 2019-11-04 Shrikant Venkataramani, Efthymios Tzinis, Paris Smaragdis

Non-negative Matrix Factorization with Linear Constraints for Single-Channel Speech Enhancement

This paper investigates a non-negative matrix factorization (NMF)-based approach to the semi-supervised single-channel speech enhancement problem where only non-stationary additive noise signals are given. The proposed method relies on…

Sound · Computer Science 2013-09-25 Nikolay Lyubimov, Mikhail Kotov

A variance modeling framework based on variational autoencoders for speech enhancement

In this paper we address the problem of enhancing speech signals in noisy mixtures using a source separation approach. We explore the use of neural networks as an alternative to a popular speech variance model based on supervised…

Sound · Computer Science 2019-02-06 Simon Leglaive, Laurent Girin, Radu Horaud

Rethinking Non-Negative Matrix Factorization with Implicit Neural Representations

Non-negative Matrix Factorization (NMF) is a powerful technique for analyzing regularly-sampled data, i.e., data that can be stored in a matrix. For audio, this has led to numerous applications using time-frequency (TF) representations like…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-10 Krishna Subramani, Paris Smaragdis, Takuya Higuchi, Mehrez Souden

Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model

High-quality speech corpora are essential foundations for most speech applications. However, such speech data are expensive and limited since they are collected in professional recording environments. In this work, we propose an…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-11 Haoyu Li, Yang Ai, Junichi Yamagishi

Learning to Refine Source Representations for Neural Machine Translation

Neural machine translation (NMT) models generally adopt an encoder-decoder architecture for modeling the entire translation process. The encoder summarizes the representation of input sentence from scratch, which is potentially a problem if…

Computation and Language · Computer Science 2018-12-27 Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu

Vocoder-free End-to-End Voice Conversion with Transformer Network

Mel-frequency filter bank (MFB) based approaches have an advantage in learning speech compared to the raw spectrum, since MFB has a smaller feature size. However, speech generators with MFB approaches require an additional vocoder that needs a huge…

Audio and Speech Processing · Electrical Eng. & Systems 2020-12-01 June-Woo Kim, Ho-Young Jung, Minho Lee

Wavelet speech enhancement based on nonnegative matrix factorization

For most of the state-of-the-art speech enhancement techniques, a spectrogram is usually preferred over the respective time-domain raw data since it reveals a more compact presentation together with conspicuous temporal information over a…

Sound · Computer Science 2016-08-24 Syu-Siang Wang, Alan Chern, Yu Tsao, Jeih-weih Hung, Xugang Lu, Ying-Hui Lai, Borching Su

DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF

This paper describes a practical dual-process speech enhancement system that adapts environment-sensitive frame-online beamforming (front-end) with help from environment-free block-online source separation (back-end). To use minimum…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-25 Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii

Joint Sound Source Separation and Speaker Recognition

Non-negative Matrix Factorization (NMF) has already been applied to learn speaker characterizations from single or non-simultaneous speech for speaker recognition applications. It is also known for its good performance in (blind) source…

Sound · Computer Science 2016-05-02 Jeroen Zegers, Hugo Van hamme

Complex NMF under phase constraints based on signal modeling: application to audio source separation

Nonnegative Matrix Factorization (NMF) is a powerful tool for decomposing mixtures of audio signals in the Time-Frequency (TF) domain. In the source separation framework, the phase recovery for each extracted component is necessary for…

Sound · Computer Science 2016-11-17 Paul Magron, Roland Badeau, Bertrand David

Improved Factorized Neural Transducer Model For text-only Domain Adaptation

Adapting End-to-End ASR models to out-of-domain datasets with text data is challenging. Factorized neural Transducer (FNT) aims to address this issue by introducing a separate vocabulary decoder to predict the vocabulary. Nonetheless, this…

Computation and Language · Computer Science 2024-06-07 Junzhe Liu, Jianwei Yu, Xie Chen

On the Importance of Word Boundaries in Character-level Neural Machine Translation

Neural Machine Translation (NMT) models generally perform translation using a fixed-size lexical vocabulary, which is an important bottleneck on their generalization capability and overall translation quality. The standard approach to…

Computation and Language · Computer Science 2019-10-22 Duygu Ataman, Orhan Firat, Mattia A. Di Gangi, Marcello Federico, Alexandra Birch

Encoder-decoder multimodal speaker change detection

The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications. Several studies solved the SCD task using audio inputs only and have shown limited performance.…

Sound · Computer Science 2023-06-02 Jee-weon Jung, Soonshin Seo, Hee-Soo Heo, Geonmin Kim, You Jin Kim, Young-ki Kwon, Minjae Lee, Bong-Jin Lee

Speech Dereverberation Using Nonnegative Convolutive Transfer Function and Spectro temporal Modeling

This paper presents two single channel speech dereverberation methods to enhance the quality of speech signals that have been recorded in an enclosed space. For both methods, the room acoustics are modeled using a nonnegative approximation…

Sound · Computer Science 2017-09-19 Nasser Mohammadiha, Simon Doclo

Nonnegative Matrix Factorization applied to reordered pixels of single images based on patches to achieve structured nonnegative dictionaries

Recent improvements in computing allow for the processing and analysis of very large datasets in a variety of fields. Often the analysis requires the creation of low-rank approximations to the datasets leading to efficient storage. This…

Computer Vision and Pattern Recognition · Computer Science 2015-06-29 Richard M. Charles, Kye M. Taylor, James H. Curry

Neural Network Alternatives to Convolutive Audio Models for Source Separation

Convolutive Non-Negative Matrix Factorization model factorizes a given audio spectrogram using frequency templates with a temporal dimension. In this paper, we present a convolutional auto-encoder model that acts as a neural network…

Sound · Computer Science 2017-09-26 Shrikant Venkataramani, Y. Cem Subakan, Paris Smaragdis

GSVD-NMF: Recovering Missing Features in Non-negative Matrix Factorization

Non-negative matrix factorization (NMF) is an important tool in signal processing and widely used to separate mixed sources into their components. Algorithms for NMF require that the user choose the number of components in advance, and if…

Machine Learning · Computer Science 2025-01-10 Youdong Guo, Timothy E. Holy