Related papers: Spatial Aware Multi-Task Learning Based Speech Sep…

200 papers

This paper introduces an area-based source separation method designed for virtual meeting scenarios. The aim is to preserve speech signals from an unspecified number of sources within a defined spatial area in front of a linear microphone…

Audio and Speech Processing · Electrical Eng. & Systems 2024-08-20 Martin Strauss, Okan Köpüklü

A three-stage approach is proposed for speaker counting and speech separation in noisy and reverberant environments. In the spatial feature extraction, a spatial coherence matrix (SCM) is computed using whitened relative transfer functions…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-08 Yicheng Hsu, Mingsian Bai

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interfering sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

In this paper, we present a novel multi-channel speech extraction system to simultaneously extract multiple clean individual sources from a mixture in noisy and reverberant environments. The proposed method is built on an improved…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-17 Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

Humans can robustly recognize and localize objects by integrating visual and auditory cues. While machines are able to do the same now with images, less work has been done with sounds. This work develops an approach for dense semantic…

Computer Vision and Pattern Recognition · Computer Science 2020-03-10 Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool

This paper addresses the challenge of audio-visual single-microphone speech separation and enhancement in the presence of real-world environmental noise. Our approach is based on generative inverse sampling, where we model clean speech and…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-03 Yochai Yemini, Yoav Ellinson, Rami Ben-Ari, Sharon Gannot, Ethan Fetaya

We consider the problem of audio voice separation for binaural applications, such as earphones and hearing aids. While today's neural networks perform remarkably well (separating $4+$ sources with 2 microphones) they assume a known or fixed…

Sound · Computer Science 2022-07-18 Zhongweiyang Xu, Romit Roy Choudhury

We consider the task of region-based source separation of reverberant multi-microphone recordings. We assume pre-defined spatial regions with a single active source per region. The objective is to estimate the signals from the individual…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-14 Julian Wechsler, Srikanth Raj Chetupalli, Wolfgang Mack, Emanuël A. P. Habets

Speaker Diarization (SD) aims at grouping speech segments that belong to the same speaker. This task is required in many speech-processing applications, such as rich meeting transcription. In this context, distant microphone arrays usually…

Sound · Computer Science 2024-06-06 Theo Mariotte, Anthony Larcher, Silvio Montresor, Jean-Hugh Thomas

Speech data collected in real-world scenarios often encounters two issues. First, multiple sources may exist simultaneously, and the number of sources may vary with time. Second, the existence of background noise in recording is inevitable.…

Sound · Computer Science 2020-05-21 Yuan-Kuei Wu, Chao-I Tuan, Hung-yi Lee, Yu Tsao

Imagine being able to listen to the birds chirping in a park without hearing the chatter from other hikers, or being able to block out traffic noise on a busy street while still being able to hear emergency sirens and car honks. We…

Sound · Computer Science 2023-11-02 Bandhav Veluri, Malek Itani, Justin Chan, Takuya Yoshioka, Shyamnath Gollakota

This paper proposes a neural network based speech separation method using spatially distributed microphones. Unlike with traditional microphone array settings, neither the number of microphones nor their spatial arrangement is known in…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-01 Dongmei Wang, Zhuo Chen, Takuya Yoshioka

Speech separation seeks to isolate individual speech signals from a multi-talk speech mixture. Despite much progress, a system well-trained on synthetic data often experiences performance degradation on out-of-domain data, such as…

Sound · Computer Science 2025-03-18 Wupeng Wang, Zexu Pan, Jingru Lin, Shuai Wang, Haizhou Li

We introduce the active audio-visual source separation problem, where an agent must move intelligently in order to better isolate the sounds coming from an object of interest in its environment. The agent hears multiple audio sources…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Sagnik Majumder, Ziad Al-Halah, Kristen Grauman

Teleconferencing is becoming essential during the COVID-19 pandemic. However, in real-world applications, speech quality can deteriorate due to, for example, background interference, noise, or reverberation. To solve this problem, target…

Audio and Speech Processing · Electrical Eng. & Systems 2022-05-02 Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

This paper addresses the issue of active speaker detection (ASD) in noisy environments and formulates a robust active speaker detection (rASD) problem. Existing ASD approaches leverage both audio and visual modalities, but non-speech sounds…

Multimedia · Computer Science 2024-04-02 Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian

In this paper we describe a speaker diarization system that enables localization and identification of all speakers present in a conversation or meeting. We propose a novel systematic approach to tackle several long-standing challenges in…

Sound · Computer Science 2021-07-21 Siqi Zheng, Weilong Huang, Xianliang Wang, Hongbin Suo, Jinwei Feng, Zhijie Yan

Speech activity detection (SAD), which often rests on the fact that the noise is "more" stationary than speech, is particularly challenging in non-stationary environments, because the time variance of the acoustic scene makes it difficult…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-29 Jens Heitkaemper, Joerg Schmalenstroeer, Reinhold Haeb-Umbach

Auditory scene analysis (ASA) aims to retrieve information from the acoustic environment, by carrying out three main tasks: sound source location, separation, and classification. These tasks are traditionally executed with a linear data…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-21 Caleb Rascon, Luis Gato-Diaz, Eduardo García-Alarcón

Speech separation involves extracting an individual speaker's voice from a multi-speaker audio signal. The increasing complexity of real-world environments, where multiple speakers might converse simultaneously, underscores the importance…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-09 Renana Opochinsky, Mordehay Moradi, Sharon Gannot