Related papers: Audio Spotforming Using Nonnegative Tensor Factori…

200 papers

We augment the nonnegative matrix factorization method for audio source separation with cues about directionality of sound propagation. This improves separation quality greatly and removes the need for training data, with only a twofold…

Machine Learning · Statistics 2017-05-30 Noah D. Stein

Beamforming with desired directivity patterns using compact microphone arrays is essential in many audio applications. Directivity patterns achievable using traditional beamformers depend on the number of microphones and the array aperture.…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-24 Weilong Huang, Srikanth Raj Chetupalli, Mhd Modar Halimeh, Oliver Thiergart, Emanuël A. P. Habets

Traditional NMF-based signal decomposition relies on the factorization of spectral data, which is typically computed by means of short-time frequency transform. In this paper we propose to relax the choice of a pre-fixed transform and learn…

Machine Learning · Computer Science 2017-12-18 Dylan Fagot, Cédric Févotte, Herwig Wendt

A novel non-negative matrix factorization (NMF) based subband decomposition in frequency spatial domain for acoustic source localization using a microphone array is introduced. The proposed method decomposes source and noise subband and…

Sound · Computer Science 2016-10-18 Suwon Shon, Seongkyu Mun, David Han, Hanseok Ko

This paper investigates a non-negative matrix factorization (NMF)-based approach to the semi-supervised single-channel speech enhancement problem where only non-stationary additive noise signals are given. The proposed method relies on…

Sound · Computer Science 2013-09-25 Nikolay Lyubimov, Mikhail Kotov

To extract the voice of a target speaker when mixed with a variety of other sounds, such as white and ambient noises or the voices of interfering speakers, we extend the Transformer network to attend the most relevant information with…

In this study, a novel non-negative tensor factorization (NTF)-based method for vibration-based local damage detection in rolling element bearings is proposed. As the diagnostic signal registered from a faulty machine is non-stationary, the…

Signal Processing · Electrical Eng. & Systems 2024-03-20 Mateusz Gabor, Rafal Zdunek, Radoslaw Zimroz, Jacek Wodecki, Agnieszka Wylomanska

We propose a completely unsupervised method to understand audio scenes observed with random microphone arrangements by decomposing the scene into its constituent sources and their relative presence in each microphone. To this end, we…

Sound · Computer Science 2019-09-30 Jonah Casebeer, Michael Colomb, Paris Smaragdis

Conventional NMF methods for source separation factorize the matrix of spectral magnitudes. Spectral phase is not included in the decomposition process of these methods. However, phase of the speech mixture is generally used in…

Sound · Computer Science 2014-11-26 Chaitanya Ahuja, Karan Nathwani, Rajesh M. Hegde

Non-negative Matrix Factorization (NMF) is a powerful technique for analyzing regularly-sampled data, i.e., data that can be stored in a matrix. For audio, this has led to numerous applications using time-frequency (TF) representations like…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-10 Krishna Subramani, Paris Smaragdis, Takuya Higuchi, Mehrez Souden

Target speech extraction is a technique to extract the target speaker's voice from mixture signals using a pre-recorded enrollment utterance that characterizes the voice of the target speaker. One major difficulty of target…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-17 Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Ryo Masumura

Anti-spoofing is the task of speech authentication, that is, distinguishing genuine human speech from spoofed speech. The main focus of this paper is to suggest new representations for genuine and spoofed speech, based on the…

Audio and Speech Processing · Electrical Eng. & Systems 2022-10-28 Matan Karo, Arie Yeredor, Itshak Lapidot

In this work, we address the problem of binaural target-speaker extraction in the presence of multiple simultaneous talkers. We propose a novel approach that leverages the individual listener's Head-Related Transfer Function (HRTF) to…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-25 Yoav Ellinson, Sharon Gannot

Neural front-ends are an appealing alternative to traditional, fixed feature extraction pipelines for automatic speech recognition (ASR) systems since they can be directly trained to fit the acoustic model. However, their performance often…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-01 Peter Vieting, Maximilian Kannen, Benedikt Hilmes, Ralf Schlüter, Hermann Ney

Non-negative matrix factorization (NMF) and non-negative tensor factorization (NTF) decompose non-negative high-dimensional data into non-negative low-rank components. NMF and NTF methods are popular for their intrinsic interpretability and…

Machine Learning · Computer Science 2024-12-02 Alexander Sietsema, Zerrin Vural, James Chapman, Yotam Yaniv, Deanna Needell

Sound source localisation is used in many consumer devices, to isolate audio from individual speakers and reject noise. Localization is frequently accomplished by "beamforming", which combines phase-shifted audio streams to increase power…

Sound · Computer Science 2025-02-13 Saeid Haghighatshoar, Dylan R Muir

Speaker identification is the process of determining which registered speaker provides a given utterance. Speaker identification is required to make a claim on the identity of a speaker from the Ns trained speakers in its user database. In this…

Multimedia · Computer Science 2010-04-27 Ibrahim A. Albidewi, Yap Teck Ann

Speaker Diarization (SD) aims at grouping speech segments that belong to the same speaker. This task is required in many speech-processing applications, such as rich meeting transcription. In this context, distant microphone arrays usually…

Sound · Computer Science 2024-06-06 Theo Mariotte, Anthony Larcher, Silvio Montresor, Jean-Hugh Thomas

We present Vibrato Nonnegative Tensor Factorization, an algorithm for single-channel unsupervised audio source separation with an application to separating instrumental or vocal sources with nonstationary pitch from music recordings. Our…

Sound · Computer Science 2016-06-02 Elliot Creager, Noah D. Stein, Roland Badeau, Philippe Depalle

Recently, attention-based transformers have become a de facto standard in many deep learning applications including natural language processing, computer vision, signal processing, etc. In this paper, we propose a transformer-based…

Sound · Computer Science 2024-09-04 Tathagata Bandyopadhyay