Related papers: BeamTransformer: Microphone Array-based Overlappin…

200 papers

The goal of this work is to develop a meeting transcription system that can recognize speech even when utterances of different speakers are overlapped. While speech overlaps have been regarded as a major obstacle in accurately transcribing…

Audio and Speech Processing · Electrical Eng. & Systems 2018-10-10 Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, Fil Alleva

Beamforming is a signal processing technique. It has been studied in many areas such as radar, sonar, seismology and wireless communications, to name but a few. It can be used for a myriad of purposes, such as detecting the presence of a…

Other Computer Science · Computer Science 2012-12-27 Hidri Adel, Meddeb Souad, Abdulqadir Alaqeeli, Amiri Hamid

Multi-channel acoustic signal processing is a well-established and powerful tool to exploit the spatial diversity between a target signal and non-target or noise sources for signal enhancement. However, the textbook solutions for optimal…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-14 Reinhold Haeb-Umbach, Tomohiro Nakatani, Marc Delcroix, Christoph Boeddeker, Tsubasa Ochiai

Multi-talker overlapped speech recognition remains a significant challenge, requiring not only speech recognition but also speaker diarization tasks to be addressed. In this paper, to better address these tasks, we first introduce speaker…

Sound · Computer Science 2023-12-19 Peng Shen, Xugang Lu, Hisashi Kawai

Recently, stunning improvements on multi-channel speech separation have been achieved by neural beamformers when direction information is available. However, most of them neglect to utilize speaker's 2-dimensional (2D) location cues…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-05 Yanjie Fu, Meng Ge, Honglong Wang, Nan Li, Haoran Yin, Longbiao Wang, Gaoyan Zhang, Jianwu Dang, Chengyun Deng, Fei Wang

This work introduces sequential neural beamforming, which alternates between neural network based spectral separation and beamforming based spatial separation. Our neural networks for separation use an advanced convolutional architecture…

Neural beamformers, which integrate both pre-separation and beamforming modules, have demonstrated impressive effectiveness in target speech extraction. Nevertheless, the performance of these beamformers is inherently limited by the…

Sound · Computer Science 2023-09-08 Aoqi Guo, Sichong Qian, Baoxiang Li, Dazhi Gao

In the analysis of acoustic scenes, often the occurring sounds have to be detected in time, recognized, and localized in space. Usually, each of these tasks is done separately. In this paper, a model-based approach to jointly carry them out…

Sound · Computer Science 2017-12-20 Rupayan Chakraborty, Climent Nadeu

This paper reviews pioneering works in microphone array processing and multichannel speech enhancement, highlighting historical achievements, technological evolution, commercialization aspects, and key challenges. It provides valuable…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-14 Gongping Huang, Jesper R. Jensen, Jingdong Chen, Jacob Benesty, Mads G. Christensen, Akihiko Sugiyama, Gary Elko, Tomas Gaensler

Enhancing the user's own-voice for head-worn microphone arrays is an important task in noisy environments to allow for easier speech communication and user-device interaction. However, a rarely addressed challenge is the change of the…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-15 Wiebke Middelberg, Jung-Suk Lee, Saeed Bagheri Sereshki, Ali Aroudi, Vladimir Tourbabin, Daniel D. E. Wong

With its strong modeling capacity that comes from a multi-head and multi-layer structure, Transformer is a very powerful model for learning a sequential representation and has been successfully applied to speech separation recently.…

Sound · Computer Science 2020-10-26 Sanyuan Chen, Yu Wu, Zhuo Chen, Takuya Yoshioka, Shujie Liu, Jinyu Li

Most deep learning-based multi-channel speech enhancement methods focus on designing a set of beamforming coefficients to directly filter the low signal-to-noise ratio signals received by microphones, which hinders the performance of these…

Sound · Computer Science 2022-02-08 Wenzhe Liu, Andong Li, Chengshi Zheng, Xiaodong Li

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interferent sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Microphone array post-filters have demonstrated their ability to greatly reduce noise at the output of a beamformer. However, current techniques only consider a single source of interest, most of the time assuming stationary background…

Sound · Computer Science 2016-03-11 Jean-Marc Valin, Jean Rouat, François Michaud

Beamformers often trade off white noise gain against the ability to suppress interferers. With distributed microphone arrays, this trade-off becomes crucial as different arrays capture vastly different magnitude and phase differences for…

Sound · Computer Science 2025-07-09 Manan Mittal, Ryan M. Corey, Andrew C. Singer

Speech enhancement is a fundamental challenge in signal processing, particularly when robustness is required across diverse acoustic conditions and microphone setups. Deep learning methods have been successful for speech enhancement, but…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-28 Yuval Bar Ilan, Boaz Rafaely, Vladimir Tourbabin

Far-field speech processing is an important and challenging problem. In this paper, we propose deep ad-hoc beamforming, a deep-learning-based multichannel speech enhancement framework based on ad-hoc microphone arrays, to address…

Sound · Computer Science 2021-02-10 Xiao-Lei Zhang

Microphone arrays are usually assumed to have rigid geometries: the microphones may move with respect to the sound field but remain fixed relative to each other. However, many useful arrays, such as those in wearable devices, have sensors…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-12 Ryan M. Corey, Andrew C. Singer

In this work, we investigate the generalization of a multi-channel learning-based replay speech detector, which employs adaptive beamforming and detection, across different microphone arrays. In general, deep neural network-based microphone…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-09 Michael Neri, Tuomas Virtanen

High quality speech capture has been widely studied for both voice communication and human computer interface reasons. To improve the capture performance, we can often find multi-microphone speech enhancement techniques deployed on various…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-15 Yang Yang, Shao-Fu Shih, Hakan Erdogan, Jamie Menjay Lin, Chehung Lee, Yunpeng Li, George Sung, Matthias Grundmann