Related papers: TR01: Time-continuous Sparse Imputation

Enhancing Automatic Speech Recognition Through Integrated Noise Detection Architecture

This research presents a novel approach to enhancing automatic speech recognition systems by integrating noise detection capabilities directly into the recognition architecture. Building upon the wav2vec2 framework, the proposed method…

Sound · Computer Science 2025-12-11 Karamvir Singh

An Adaptive Psychoacoustic Model for Automatic Speech Recognition

Compared with automatic speech recognition (ASR), the human auditory system is more adept at handling noise-adverse situations, including environmental noise and channel distortion. To mimic this adeptness, auditory models have been widely…

Computation and Language · Computer Science 2016-09-16 Peng Dai, Xue Teng, Frank Rudzicz, Ing Yann Soon

Text-Independent Speaker Recognition for Low SNR Environments with Encryption

Recognition systems are commonly designed to authenticate users at the access control levels of a system. A number of voice recognition methods have been developed using a pitch estimation process which are very vulnerable in low Signal to…

Sound · Computer Science 2020-09-08 Aman Chadha, Divya Jyoti, M. Mani Roja

Reliability analysis for data-driven noisy models using active learning

Reliability analysis aims at estimating the failure probability of an engineering system. It often requires multiple runs of a limit-state function, which usually relies on computationally intensive simulations. Traditionally, these…

Computation · Statistics 2024-01-22 Anderson V. Pires, Maliki Moustapha, Stefano Marelli, Bruno Sudret

Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems

Automatic speech recognition systems are part of people's daily lives, embedded in personal assistants and mobile phones, helping as a facilitator for human-machine interaction while allowing access to information in a practically intuitive…

Sound · Computer Science 2021-10-05 Julio Cesar Duarte, Sérgio Colcher

Speech Denoising with Auditory Models

Contemporary speech enhancement predominantly relies on audio transforms that are trained to reconstruct a clean speech waveform. The development of high-performing neural network sound recognition systems has raised the possibility of…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-18 Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott

Noisy Speech Based Temporal Decomposition to Improve Fundamental Frequency Estimation

This paper introduces a novel method to separate noisy speech into low or high frequency frames, in order to improve fundamental frequency (F0) estimation accuracy. In this proposal, the target signal is analyzed by means of the ensemble…

Audio and Speech Processing · Electrical Eng. & Systems 2021-12-21 A. Queiroz, R. Coelho

Careful Whisper -- leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification

This paper presents a fully automated approach for identifying speech anomalies from voice recordings to aid in the assessment of speech impairments. By combining Connectionist Temporal Classification (CTC) and encoder-decoder-based…

Sound · Computer Science 2023-08-04 Laurin Wagner, Mario Zusag, Theresa Bloder

Speech Denoising Without Clean Training Data: A Noise2Noise Approach

This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio-denoising methods by showing that it is possible to train deep speech denoising networks using only noisy speech samples.…

Sound · Computer Science 2021-09-21 Madhav Mahesh Kashyap, Anuj Tambwekar, Krishnamoorthy Manohara, S Natarajan

Non-intrusive speech intelligibility prediction using automatic speech recognition derived measures

The estimation of speech intelligibility is still far from being a solved problem. Especially one aspect is problematic: most of the standard models require a clean reference signal in order to estimate intelligibility. This is an issue of…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-29 Mahdie Karbasi, Stefan Bleeck, Dorothea Kolossa

Sparse Auto-Regressive: Robust Estimation of AR Parameters

In this paper I present a new approach for regression of time series using their own samples. This is a celebrated problem known as Auto-Regression. Dealing with outlier or missed samples in a time series makes the problem of estimation…

Artificial Intelligence · Computer Science 2015-08-19 Mohsen Joneidi

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition

Recent advances in deep neural networks have achieved unprecedented success in visual speech recognition. However, there remains substantial disparity between current methods and their deployment in resource-constrained devices. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-07-11 Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Alexandros Haliassos, Stavros Petridis, Maja Pantic

A modeling and algorithmic framework for (non)social (co)sparse audio restoration

We propose a unified modeling and algorithmic framework for audio restoration problem. It encompasses analysis sparse priors as well as more classical synthesis sparse priors, and regular sparsity as well as various forms of structured…

Sound · Computer Science 2017-12-01 Clément Gaultier, Nancy Bertin, Srđan Kitić, Rémi Gribonval

Learning Noise-Invariant Representations for Robust Speech Recognition

Despite rapid advances in speech recognition, current models remain brittle to superficial perturbations to their inputs. Small amounts of noise can destroy the performance of an otherwise state-of-the-art model. To harden models against…

Audio and Speech Processing · Electrical Eng. & Systems 2018-07-19 Davis Liang, Zhiheng Huang, Zachary C. Lipton

Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations

Speech recognition in noisy and channel distorted scenarios is often challenging as the current acoustic modeling schemes are not adaptive to the changes in the signal distribution in the presence of noise. In this work, we develop a novel…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-03 Purvi Agrawal, Sriram Ganapathy

Sequential Randomized Smoothing for Adversarially Robust Speech Recognition

While Automatic Speech Recognition has been shown to be vulnerable to adversarial attacks, defenses against these attacks are still lagging. Existing, naive defenses can be partially broken with an adaptive attack. In classification tasks,…

Computation and Language · Computer Science 2022-01-12 Raphael Olivier, Bhiksha Raj

Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models

Continuous speech can be converted into a discrete sequence by deriving discrete units from the hidden features of self-supervised learned (SSL) speech models. Although SSL models are becoming larger and trained on more data, they are often…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-06 Jakob Poncelet, Yujun Wang, Hugo Van hamme

Diffusion Buffer: Online Diffusion-based Speech Enhancement with Sub-Second Latency

Diffusion models are a class of generative models that have been recently used for speech enhancement with remarkable success but are computationally expensive at inference time. Therefore, these models are impractical for processing…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-15 Bunlong Lay, Rostislav Makarov, Timo Gerkmann

Spectral Masking with Explicit Time-Context Windowing for Neural Network-Based Monaural Speech Enhancement

We propose and analyze the use of an explicit time-context window for neural network-based spectral masking speech enhancement to leverage signal context dependencies between neighboring frames. In particular, we concentrate on soft masking…

Audio and Speech Processing · Electrical Eng. & Systems 2024-08-29 Luan Vinícius Fiorio, Boris Karanov, Bruno Defraene, Johan David, Wim van Houtum, Frans Widdershoven, Ronald M. Aarts

Listening to Sounds of Silence for Speech Denoising

We introduce a deep learning model for speech denoising, a long-standing challenge in audio analysis arising in numerous applications. Our approach is based on a key observation about human speech: there is often a short pause between each…

Sound · Computer Science 2020-10-26 Ruilin Xu, Rundi Wu, Yuko Ishiwaka, Carl Vondrick, Changxi Zheng