Related papers: Automatic context window composition for distant s…

200 papers

Despite the significant progress made in recent years, state-of-the-art speech recognition technologies provide satisfactory performance only in the close-talking condition. Robustness of distant speech recognition in adverse acoustic…

Audio and Speech Processing · Electrical Eng. & Systems · 2017-10-11 · Mirco Ravanelli, Maurizio Omologo

Distant speech recognition is a challenge, particularly due to the corruption of speech signals by reverberation caused by large distances between the speaker and microphone. In order to cope with a wide range of reverberations in…

Computation and Language · Computer Science · 2016-08-18 · Jeehye Lee, Myungin Lee, Joon-Hyuk Chang

Despite the remarkable progress recently made in distant speech recognition, state-of-the-art technology still suffers from a lack of robustness, especially when adverse acoustic conditions characterized by non-stationary noises and…

Computation and Language · Computer Science · 2017-03-24 · Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

We present a systematic analysis of the performance of a phonetic recogniser when the window of input features is not symmetric with respect to the current frame. The recogniser is based on Context Dependent Deep Neural Networks (CD-DNNs)…

Computation and Language · Computer Science · 2016-06-30 · Akash Kumar Dhaka, Giampiero Salvi

Time-frequency masking or spectrum prediction computed via short symmetric windows are commonly used in low-latency deep neural network (DNN) based source separation. In this paper, we propose the usage of an asymmetric analysis-synthesis…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-06-23 · Shanshan Wang, Gaurav Naithani, Archontis Politis, Tuomas Virtanen

We propose a spatial diffuseness feature for deep neural network (DNN)-based automatic speech recognition to improve recognition accuracy in reverberant and noisy environments. The feature is computed in real-time from multiple microphone…

Computation and Language · Computer Science · 2015-09-02 · Andreas Schwarz, Christian Huemmer, Roland Maas, Walter Kellermann

Deep learning is an emerging technology that is considered one of the most promising directions for reaching higher levels of artificial intelligence. Among the other achievements, building computers that understand speech represents a…

Computation and Language · Computer Science · 2017-12-19 · Mirco Ravanelli

We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding.…

Computation and Language · Computer Science · 2017-09-07 · Pranay Dighe, Gil Luyet, Afsaneh Asaei, Herve Bourlard

Deep neural network (DNN)-based speech enhancement algorithms in microphone arrays have now proven to be efficient solutions to speech understanding and speech recognition in noisy environments. However, in the context of ad-hoc microphone…

Signal Processing · Electrical Eng. & Systems · 2020-11-04 · Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Establishing semantic correspondence is a core problem in computer vision and remains challenging due to large intra-class variations and lack of annotated data. In this paper, we aim to incorporate global semantic context in a flexible…

Computer Vision and Pattern Recognition · Computer Science · 2019-09-10 · Shuaiyi Huang, Qiuyue Wang, Songyang Zhang, Shipeng Yan, Xuming He

In the sentence classification task, context formed from sentences adjacent to the sentence being classified can provide important information for classification. This context is, however, often ignored. Where methods do make use of…

Information Retrieval · Computer Science · 2018-09-05 · Xingyi Song, Johann Petrak, Angus Roberts

This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for robust binaural localisation of multiple sources in reverberant environments. DNNs are used to learn the relationship…

Audio and Speech Processing · Electrical Eng. & Systems · 2019-04-08 · Ning Ma, Tobias May, Guy J. Brown

Frequency modulation features capture the fine structure of speech formants and are beneficial and supplementary to the traditional energy-based cepstral features. Improvements have been demonstrated mainly in GMM-HMM systems for…

Sound · Computer Science · 2019-09-04 · Isidoros Rodomagoulakis, Petros Maragos

This paper presents our latest investigations on dialog act (DA) classification on automatically generated transcriptions. We propose a novel approach that combines convolutional neural networks (CNNs) and conditional random fields (CRFs)…

Computation and Language · Computer Science · 2019-03-01 · Daniel Ortega, Chia-Yu Li, Gisela Vallejo, Pavel Denisov, Ngoc Thang Vu

Neural networks (NNs) have been widely applied in speech processing tasks, and, in particular, those employing microphone arrays. Nevertheless, most existing NN architectures can only deal with fixed and position-specific microphone arrays.…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-06-14 · Yochai Yemini, Ethan Fetaya, Haggai Maron, Sharon Gannot

Speech enhancement models for hearing assistive devices should meet very low latency requirements, typically below 5 ms. While various low-latency techniques have been proposed, comparing these methods in a controlled setup using DNNs…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-09-17 · Haibin Wu, Sebastian Braun

We propose and analyze the use of an explicit time-context window for neural network-based spectral masking speech enhancement to leverage signal context dependencies between neighboring frames. In particular, we concentrate on soft masking…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-08-29 · Luan Vinícius Fiorio, Boris Karanov, Bruno Defraene, Johan David, Wim van Houtum, Frans Widdershoven, Ronald M. Aarts

Recurrent neural networks (RNNs) have shown significant improvements in recent years for speech enhancement. However, the model complexity and inference time cost of RNNs are much higher than deep feed-forward neural networks (DNNs).…

Sound · Computer Science · 2020-11-12 · Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Leichao Song

Despite significant efforts over the last few years to build a robust automatic speech recognition (ASR) system for different acoustic settings, the performance of the current state-of-the-art technologies significantly degrades in noisy…

Audio and Speech Processing · Electrical Eng. & Systems · 2019-10-17 · Salar Jafarlou, Soheil Khorram, Vinay Kothapally, John H. L. Hansen

We propose a novel method for generating scene-aware training data for far-field automatic speech recognition. We use a deep learning-based estimator to non-intrusively compute the sub-band reverberation time of an environment from its…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-04-23 · Zhenyu Tang, Dinesh Manocha