Related papers

Related papers: Dictionary Update for NMF-based Voice Conversion U…

200 papers

Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms…

Sound · Computer Science 2017-09-19 Nasser Mohammadiha, Paris Smaragdis, Arne Leijon

In this paper, we present RT-GCC-NMF: a real-time (RT), two-channel blind speech enhancement algorithm that combines the non-negative matrix factorization (NMF) dictionary learning algorithm with the generalized cross-correlation (GCC)…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-08 Sean U. N. Wood, Jean Rouat

Discriminative models for source separation have recently been shown to produce impressive results. However, when operating on sources outside of the training set, these models cannot perform as well and are cumbersome to update. Classical…

Sound · Computer Science 2019-11-04 Shrikant Venkataramani, Efthymios Tzinis, Paris Smaragdis

This paper investigates a non-negative matrix factorization (NMF)-based approach to the semi-supervised single-channel speech enhancement problem where only non-stationary additive noise signals are given. The proposed method relies on…

Sound · Computer Science 2013-09-25 Nikolay Lyubimov, Mikhail Kotov

In this paper we address the problem of enhancing speech signals in noisy mixtures using a source separation approach. We explore the use of neural networks as an alternative to a popular speech variance model based on supervised…

Sound · Computer Science 2019-02-06 Simon Leglaive, Laurent Girin, Radu Horaud

Non-negative Matrix Factorization (NMF) is a powerful technique for analyzing regularly-sampled data, i.e., data that can be stored in a matrix. For audio, this has led to numerous applications using time-frequency (TF) representations like…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-10 Krishna Subramani, Paris Smaragdis, Takuya Higuchi, Mehrez Souden

High-quality speech corpora are essential foundations for most speech applications. However, such speech data are expensive and limited since they are collected in professional recording environments. In this work, we propose an…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-11 Haoyu Li, Yang Ai, Junichi Yamagishi

Neural machine translation (NMT) models generally adopt an encoder-decoder architecture for modeling the entire translation process. The encoder summarizes the representation of the input sentence from scratch, which is potentially a problem if…

Computation and Language · Computer Science 2018-12-27 Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu

Mel-frequency filter bank (MFB)-based approaches have an advantage over the raw spectrum when learning speech, since MFB features are smaller. However, speech generators using MFB approaches require an additional vocoder that needs a huge…

Audio and Speech Processing · Electrical Eng. & Systems 2020-12-01 June-Woo Kim, Ho-Young Jung, Minho Lee

For most state-of-the-art speech enhancement techniques, a spectrogram is usually preferred over the respective raw time-domain data, since it reveals a more compact presentation together with conspicuous temporal information over a…

Sound · Computer Science 2016-08-24 Syu-Siang Wang, Alan Chern, Yu Tsao, Jeih-weih Hung, Xugang Lu, Ying-Hui Lai, Borching Su

This paper describes a practical dual-process speech enhancement system that adapts environment-sensitive frame-online beamforming (front-end) with help from environment-free block-online source separation (back-end). To use minimum…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-25 Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii

Non-negative Matrix Factorization (NMF) has already been applied to learn speaker characterizations from single or non-simultaneous speech for speaker recognition applications. It is also known for its good performance in (blind) source…

Sound · Computer Science 2016-05-02 Jeroen Zegers, Hugo Van hamme

Nonnegative Matrix Factorization (NMF) is a powerful tool for decomposing mixtures of audio signals in the Time-Frequency (TF) domain. In the source separation framework, the phase recovery for each extracted component is necessary for…

Sound · Computer Science 2016-11-17 Paul Magron, Roland Badeau, Bertrand David

Adapting End-to-End ASR models to out-of-domain datasets with text data is challenging. Factorized neural Transducer (FNT) aims to address this issue by introducing a separate vocabulary decoder to predict the vocabulary. Nonetheless, this…

Computation and Language · Computer Science 2024-06-07 Junzhe Liu, Jianwei Yu, Xie Chen

Neural Machine Translation (NMT) models generally perform translation using a fixed-size lexical vocabulary, which is an important bottleneck on their generalization capability and overall translation quality. The standard approach to…

Computation and Language · Computer Science 2019-10-22 Duygu Ataman, Orhan Firat, Mattia A. Di Gangi, Marcello Federico, Alexandra Birch

The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications. Several studies solved the SCD task using audio inputs only and have shown limited performance.…

This paper presents two single channel speech dereverberation methods to enhance the quality of speech signals that have been recorded in an enclosed space. For both methods, the room acoustics are modeled using a nonnegative approximation…

Sound · Computer Science 2017-09-19 Nasser Mohammadiha, Simon Doclo

Recent improvements in computing allow for the processing and analysis of very large datasets in a variety of fields. Often the analysis requires the creation of low-rank approximations to the datasets leading to efficient storage. This…

Computer Vision and Pattern Recognition · Computer Science 2015-06-29 Richard M. Charles, Kye M. Taylor, James H. Curry

Convolutive Non-Negative Matrix Factorization model factorizes a given audio spectrogram using frequency templates with a temporal dimension. In this paper, we present a convolutional auto-encoder model that acts as a neural network…

Sound · Computer Science 2017-09-26 Shrikant Venkataramani, Y. Cem Subakan, Paris Smaragdis

Non-negative matrix factorization (NMF) is an important tool in signal processing and widely used to separate mixed sources into their components. Algorithms for NMF require that the user choose the number of components in advance, and if…

Machine Learning · Computer Science 2025-01-10 Youdong Guo, Timothy E. Holy