Related papers: BeamTransformer: Microphone Array-based Overlappin…

Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks

The goal of this work is to develop a meeting transcription system that can recognize speech even when utterances of different speakers are overlapped. While speech overlaps have been regarded as a major obstacle in accurately transcribing…

Audio and Speech Processing · Electrical Eng. & Systems 2018-10-10 Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, Fil Alleva

Beamforming Techniques for Multichannel audio Signal Separation

Beamforming is a signal processing technique. It has been studied in many areas such as radar, sonar, seismology and wireless communications, to name but a few. It can be used for a myriad of purposes, such as detecting the presence of a…

Other Computer Science · Computer Science 2012-12-27 Hidri Adel, Meddeb Souad, Abdulqadir Alaqeeli, Amiri Hamid

Microphone Array Signal Processing and Deep Learning for Speech Enhancement

Multi-channel acoustic signal processing is a well-established and powerful tool to exploit the spatial diversity between a target signal and non-target or noise sources for signal enhancement. However, the textbook solutions for optimal…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-14 Reinhold Haeb-Umbach, Tomohiro Nakatani, Marc Delcroix, Christoph Boeddeker, Tsubasa Ochiai

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition

Multi-talker overlapped speech recognition remains a significant challenge, requiring not only speech recognition but also speaker diarization tasks to be addressed. In this paper, to better address these tasks, we first introduce speaker…

Sound · Computer Science 2023-12-19 Peng Shen, Xugang Lu, Hisashi Kawai

Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation

Recently, stunning improvements on multi-channel speech separation have been achieved by neural beamformers when direction information is available. However, most of them neglect to utilize speaker's 2-dimensional (2D) location cues…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-05 Yanjie Fu, Meng Ge, Honglong Wang, Nan Li, Haoran Yin, Longbiao Wang, Gaoyan Zhang, Jianwu Dang, Chengyun Deng, Fei Wang

Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement

This work introduces sequential neural beamforming, which alternates between neural network based spectral separation and beamforming based spatial separation. Our neural networks for separation use an advanced convolutional architecture…

Sound · Computer Science 2020-11-05 Zhong-Qiu Wang, Hakan Erdogan, Scott Wisdom, Kevin Wilson, Desh Raj, Shinji Watanabe, Zhuo Chen, John R. Hershey

Dual-path Transformer Based Neural Beamformer for Target Speech Extraction

Neural beamformers, which integrate both pre-separation and beamforming modules, have demonstrated impressive effectiveness in target speech extraction. Nevertheless, the performance of these beamformers is inherently limited by the…

Sound · Computer Science 2023-09-08 Aoqi Guo, Sichong Qian, Baoxiang Li, Dazhi Gao

Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays

In the analysis of acoustic scenes, often the occurring sounds have to be detected in time, recognized, and localized in space. Usually, each of these tasks is done separately. In this paper, a model-based approach to jointly carry them out…

Sound · Computer Science 2017-12-20 Rupayan Chakraborty, Climent Nadeu

Advances in Microphone Array Processing and Multichannel Speech Enhancement

This paper reviews pioneering works in microphone array processing and multichannel speech enhancement, highlighting historical achievements, technological evolution, commercialization aspects, and key challenges. It provides valuable…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-14 Gongping Huang, Jesper R. Jensen, Jingdong Chen, Jacob Benesty, Mads G. Christensen, Akihiko Sugiyama, Gary Elko, Tomas Gaensler

Microphone Occlusion Mitigation for Own-Voice Enhancement in Head-Worn Microphone Arrays Using Switching-Adaptive Beamforming

Enhancing the user's own-voice for head-worn microphone arrays is an important task in noisy environments to allow for easier speech communication and user-device interaction. However, a rarely addressed challenge is the change of the…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-15 Wiebke Middelberg, Jung-Suk Lee, Saeed Bagheri Sereshki, Ali Aroudi, Vladimir Tourbabin, Daniel D. E. Wong

Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer

With its strong modeling capacity that comes from a multi-head and multi-layer structure, Transformer is a very powerful model for learning a sequential representation and has been successfully applied to speech separation recently.…

Sound · Computer Science 2020-10-26 Sanyuan Chen, Yu Wu, Zhuo Chen, Takuya Yoshioka, Shujie Liu, Jinyu Li

A Neural Beam Filter for Real-time Multi-channel Speech Enhancement

Most deep learning-based multi-channel speech enhancement methods focus on designing a set of beamforming coefficients to directly filter the low signal-to-noise ratio signals received by microphones, which hinders the performance of these…

Sound · Computer Science 2022-02-08 Wenzhe Liu, Andong Li, Chengshi Zheng, Xiaodong Li

Distributed speech separation in spatially unconstrained microphone arrays

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interferent sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Microphone array post-filter for separation of simultaneous non-stationary sources

Microphone array post-filters have demonstrated their ability to greatly reduce noise at the output of a beamformer. However, current techniques only consider a single source of interest, most of the time assuming stationary background…

Sound · Computer Science 2016-03-11 Jean-Marc Valin, Jean Rouat, François Michaud

Beamforming with Random Projections: Upper and Lower Bounds

Beamformers often trade off white noise gain against the ability to suppress interferers. With distributed microphone arrays, this trade-off becomes crucial as different arrays capture vastly different magnitude and phase differences for…

Sound · Computer Science 2025-07-09 Manan Mittal, Ryan M. Corey, Andrew C. Singer

HyBeam: Hybrid Microphone-Beamforming Array-Agnostic Speech Enhancement for Wearables

Speech enhancement is a fundamental challenge in signal processing, particularly when robustness is required across diverse acoustic conditions and microphone setups. Deep learning methods have been successful for speech enhancement, but…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-28 Yuval Bar Ilan, Boaz Rafaely, Vladimir Tourbabin

Deep Ad-hoc Beamforming

Far-field speech processing is an important and challenging problem. In this paper, we propose deep ad-hoc beamforming, a deep-learning-based multichannel speech enhancement framework based on ad-hoc microphone arrays, to address…

Sound · Computer Science 2021-02-10 Xiao-Lei Zhang

Motion-Tolerant Beamforming with Deformable Microphone Arrays

Microphone arrays are usually assumed to have rigid geometries: the microphones may move with respect to the sound field but remain fixed relative to each other. However, many useful arrays, such as those in wearable devices, have sensors…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-12 Ryan M. Corey, Andrew C. Singer

Impact of Microphone Array Mismatches to Learning-based Replay Speech Detection

In this work, we investigate the generalization of a multi-channel learning-based replay speech detector, which employs adaptive beamforming and detection, across different microphone arrays. In general, deep neural network-based microphone…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-09 Michael Neri, Tuomas Virtanen

Guided Speech Enhancement Network

High quality speech capture has been widely studied for both voice communication and human computer interface reasons. To improve the capture performance, we can often find multi-microphone speech enhancement techniques deployed on various…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-15 Yang Yang, Shao-Fu Shih, Hakan Erdogan, Jamie Menjay Lin, Chehung Lee, Yunpeng Li, George Sung, Matthias Grundmann