Related papers: Feature Normalisation for Robust Speech Recognitio…

Modified SPLICE and its Extension to Non-Stereo Data for Noise Robust Speech Recognition

In this paper, a modification to the training process of the popular SPLICE algorithm has been proposed for noise robust speech recognition. The modification is based on feature correlations, and enables this stereo-based algorithm to…

Machine Learning · Computer Science 2014-02-12 D. S. Pavan Kumar, N. Vishnu Prasad, Vikas Joshi, S. Umesh

Normalized Features for Improving the Generalization of DNN Based Speech Enhancement

Enhancing noisy speech is an important task to restore its quality and to improve its intelligibility. In traditional non-machine-learning (ML) based approaches the parameters required for noise reduction are estimated blindly from the…

Sound · Computer Science 2018-01-16 Robert Rehr, Timo Gerkmann

Speech Recognition Front End Without Information Loss

Speech representation and modelling in high-dimensional spaces of acoustic waveforms, or a linear transformation thereof, is investigated with the aim of improving the robustness of automatic speech recognition to additive noise. The…

Computation and Language · Computer Science 2015-03-31 Matthew Ager, Zoran Cvetkovic, Peter Sollich

Robust Speaker Recognition Using Speech Enhancement And Attention Model

In this paper, a novel architecture for speaker recognition is proposed by cascading speech enhancement and speaker processing. Its aim is to improve speaker recognition performance when speech signals are corrupted by noise. Instead of…

Computation and Language · Computer Science 2020-05-25 Yanpei Shi, Qiang Huang, Thomas Hain

Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations

Speech recognition in noisy and channel distorted scenarios is often challenging as the current acoustic modeling schemes are not adaptive to the changes in the signal distribution in the presence of noise. In this work, we develop a novel…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-03 Purvi Agrawal, Sriram Ganapathy

Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features

We propose an algorithm to extract noise-robust acoustic features from noisy speech. We use Total Variability Modeling in combination with Non-negative Matrix Factorization (NMF) to learn a total variability subspace and adapt NMF…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-17 Kunal Dhawan, Colin Vaz, Ruchir Travadi, Shrikanth Narayanan

Noise-Conditioned Mixture-of-Experts Framework for Robust Speaker Verification

Robust speaker verification under noisy conditions remains an open challenge. Conventional deep learning methods learn a robust unified speaker representation space against diverse background noise and achieve significant improvement. In…

Sound · Computer Science 2026-03-11 Bin Gu, Haitao Zhao, Jibo Wei

Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization

Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms…

Sound · Computer Science 2017-09-19 Nasser Mohammadiha, Paris Smaragdis, Arne Leijon

Enhancing Automatic Speech Recognition Through Integrated Noise Detection Architecture

This research presents a novel approach to enhancing automatic speech recognition systems by integrating noise detection capabilities directly into the recognition architecture. Building upon the wav2vec2 framework, the proposed method…

Sound · Computer Science 2025-12-11 Karamvir Singh

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Supervised fine-tuning (SFT) plays a crucial role in adapting large language models (LLMs) to specific domains or tasks. However, as demonstrated by empirical experiments, the collected data inevitably contains noise in practical…

Computation and Language · Computer Science 2024-12-20 Junyu Luo, Xiao Luo, Kaize Ding, Jingyang Yuan, Zhiping Xiao, Ming Zhang

Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement

Large, pre-trained representation models trained using self-supervised learning have gained popularity in various fields of machine learning because they are able to extract high-quality salient features from input data. As such, they have…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-16 Hejung Yang, Hong-Goo Kang

Speech Denoising with Auditory Models

Contemporary speech enhancement predominantly relies on audio transforms that are trained to reconstruct a clean speech waveform. The development of high-performing neural network sound recognition systems has raised the possibility of…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-18 Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

Compensation for channel mismatch and noise interference is essential for robust automatic speech recognition. Enhanced speech has been introduced into the multi-condition training of acoustic models to improve their generalization ability.…

Sound · Computer Science 2022-11-24 Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao, Hsin-Min Wang

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on…

Sound · Computer Science 2018-09-24 Zixing Zhang, Jürgen Geiger, Jouni Pohjalainen, Amr El-Desoky Mousa, Wenyu Jin, Björn Schuller

Robust Speech Representation Learning via Flow-based Embedding Regularization

In recent years, various deep learning-based methods have been proposed for extracting a fixed-dimensional embedding vector from speech signals. Although deep learning-based embedding extraction methods have shown good performance in…

Audio and Speech Processing · Electrical Eng. & Systems 2021-12-08 Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan

Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction

Noise robustness is essential for deploying automatic speech recognition (ASR) systems in real-world environments. One way to reduce the effect of noise interference is to employ a preprocessing module that conducts speech enhancement, and…

Sound · Computer Science 2021-11-01 Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang

Adaptive Weighted Nonnegative Matrix Factorization for Robust Feature Representation

Nonnegative matrix factorization (NMF) has been widely used for dimensionality reduction in machine learning. However, traditional NMF does not properly handle outliers, so it is sensitive to noise. In order to improve the…

Machine Learning · Computer Science 2022-06-08 Tingting Shen, Junhang Li, Can Tong, Qiang He, Chen Li, Yudong Yao, Yueyang Teng

Self-supervised Speaker Recognition Training Using Human-Machine Dialogues

Speaker recognition, recognizing speaker identities based on voice alone, enables important downstream applications, such as personalization and authentication. Learning speaker representations, in the context of supervised learning,…

Machine Learning · Computer Science 2022-07-13 Metehan Cekic, Ruirui Li, Zeya Chen, Yuguang Yang, Andreas Stolcke, Upamanyu Madhow

Assessing the Generalization Gap of Learning-Based Speech Enhancement Systems in Noisy and Reverberant Environments

The acoustic variability of noisy and reverberant speech mixtures is influenced by multiple factors, such as the spectro-temporal characteristics of the target speaker and the interfering noise, the signal-to-noise ratio (SNR) and the room…

Audio and Speech Processing · Electrical Eng. & Systems 2023-11-09 Philippe Gonzalez, Tommy Sonne Alstrøm, Tobias May

Denoising GER: A Noise-Robust Generative Error Correction with LLM for Speech Recognition

In recent years, large language models (LLMs) have made significant progress in the task of generation error correction (GER) for automatic speech recognition (ASR) post-processing. However, in complex noisy environments, they still face…

Sound · Computer Science 2025-09-05 Yanyan Liu, Minqiang Xu, Yihao Chen, Liang He, Lei Fang, Sian Fang, Lin Liu