Related papers: Explanations for Automatic Speech Recognition

Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme Recognition

Explainable AI (XAI) techniques have been widely used to help explain and understand the output of deep learning models in fields such as image classification and Natural Language Processing. Interest in using XAI techniques to explain deep…

Computation and Language · Computer Science 2023-05-30 Xiaoliang Wu, Peter Bell, Ajitha Rajan

Measuring the Accuracy of Automatic Speech Recognition Solutions

For d/Deaf and hard of hearing (DHH) people, captioning is an essential accessibility tool. Significant developments in artificial intelligence (AI) mean that Automatic Speech Recognition (ASR) is now a part of many popular applications.…

Computation and Language · Computer Science 2024-08-30 Korbinian Kuhn, Verena Kersken, Benedikt Reuter, Niklas Egger, Gottfried Zimmermann

Unsupervised Automatic Speech Recognition: A Review

Automatic Speech Recognition (ASR) systems can be trained to achieve remarkable performance given large amounts of manually transcribed speech, but large labeled data sets can be difficult or expensive to acquire for all languages of…

Computation and Language · Computer Science 2022-03-22 Hanan Aldarmaki, Asad Ullah, Nazar Zaki

Visualizing Automatic Speech Recognition -- Means for a Better Understanding?

Automatic speech recognition (ASR) is improving ever more at mimicking human speech processing. The functioning of ASR, however, remains to a large extent obfuscated by the complex structure of the deep neural networks (DNNs) they are based…

Machine Learning · Computer Science 2022-02-03 Karla Markert, Romain Parracone, Mykhailo Kulakov, Philip Sperl, Ching-Yu Kao, Konstantin Böttinger

Speech Enhancement Modeling Towards Robust Speech Recognition System

For about four decades, human beings have been dreaming of an intelligent machine that can master natural speech. In its simplest form, this machine should consist of two subsystems, namely automatic speech recognition (ASR) and speech…

Sound · Computer Science 2013-05-08 Urmila Shrawankar, V. M. Thakare

AudioMNIST: Exploring Explainable Artificial Intelligence for Audio Analysis on a Simple Benchmark

Explainable Artificial Intelligence (XAI) is targeted at understanding how models perform feature selection and derive their classification decisions. This paper explores post-hoc explanations for deep neural networks in the audio domain.…

Sound · Computer Science 2023-11-28 Sören Becker, Johanna Vielhaben, Marcel Ackermann, Klaus-Robert Müller, Sebastian Lapuschkin, Wojciech Samek

Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model

In this work, we introduce a simple yet efficient post-processing model for automatic speech recognition (ASR). Our model has Transformer-based encoder-decoder architecture which "translates" ASR model output into grammatically and…

Computation and Language · Computer Science 2019-10-24 Oleksii Hrinchuk, Mariya Popova, Boris Ginsburg

Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models

This paper explores speculative speech recognition (SSR), where we empower conventional automatic speech recognition (ASR) with speculation capabilities, allowing the recognizer to run ahead of audio. We introduce a metric for measuring SSR…

Audio and Speech Processing · Electrical Eng. & Systems 2024-07-08 Bolaji Yusuf, Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran

Ada-SISE: Adaptive Semantic Input Sampling for Efficient Explanation of Convolutional Neural Networks

Explainable AI (XAI) is an active research area to interpret a neural network's decision by ensuring transparency and trust in the task-specified learned models. Recently, perturbation-based model analysis has shown better interpretation,…

Computer Vision and Pattern Recognition · Computer Science 2021-02-17 Mahesh Sudhakar, Sam Sattarzadeh, Konstantinos N. Plataniotis, Jongseong Jang, Yeonjeong Jeong, Hyunwoo Kim

Non-Intrusive Automatic Speech Recognition Refinement: A Survey

Automatic Speech Recognition (ASR) is an integral component of modern technology, powering applications such as voice-activated assistants, transcription services, and accessibility tools. Yet ASR systems continue to struggle with the…

Audio and Speech Processing · Electrical Eng. & Systems 2026-05-20 Mohammad Reza Peyghan, Saman Soleimani Roudi, Saeedreza Zouashkiani, Sajjad Amini, Fatemeh Rajabi, Shahrokh Ghaemmaghami

Breaking the Transcription Bottleneck: Fine-tuning ASR Models for Extremely Low-Resource Fieldwork Languages

Automatic Speech Recognition (ASR) has reached impressive accuracy for high-resource languages, yet its utility in linguistic fieldwork remains limited. Recordings collected in fieldwork contexts present unique challenges, including…

Computation and Language · Computer Science 2025-06-25 Siyu Liang, Gina-Anne Levow

Automatic Speech Recognition And Limited Vocabulary: A Survey

Automatic Speech Recognition (ASR) is an active field of research due to its large number of applications and the proliferation of interfaces or computing devices that can support speech processing. However, the bulk of applications are…

Artificial Intelligence · Computer Science 2022-03-03 Jean Louis K. E. Fendji, Diane C. M. Tala, Blaise O. Yenke, Marcellin Atemkeng

Adapting Foundation Speech Recognition Models to Impaired Speech: A Semantic Re-chaining Approach for Personalization of German Speech

Speech impairments caused by conditions such as cerebral palsy or genetic disorders pose significant challenges for automatic speech recognition (ASR) systems. Despite recent advances, ASR models like Whisper struggle with non-normative…

Computation and Language · Computer Science 2025-06-30 Niclas Pokel, Pehuén Moure, Roman Boehringer, Yingqiang Gao

ASR Error Correction and Domain Adaptation Using Machine Translation

Off-the-shelf pre-trained Automatic Speech Recognition (ASR) systems are an increasingly viable service for companies of any size building speech-based products. While these ASR systems are trained on large amounts of data, domain mismatch…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-18 Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze

Adapting End-to-End Speech Recognition for Readable Subtitles

Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time.…

Computation and Language · Computer Science 2020-05-26 Danni Liu, Jan Niehues, Gerasimos Spanakis

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In…

Computation and Language · Computer Science 2024-01-08 Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-yi Lee, Ariya Rastrow, Andreas Stolcke

Talking to Robots: A Practical Examination of Speech Foundation Models for HRI Applications

Automatic Speech Recognition (ASR) systems in real-world settings need to handle imperfect audio, often degraded by hardware limitations or environmental noise, while accommodating diverse user groups. In human-robot interaction (HRI),…

Robotics · Computer Science 2025-08-26 Theresa Pekarek Rosin, Julia Gachot, Henri-Leon Kordt, Matthias Kerzel, Stefan Wermter

Better Transcription of UK Supreme Court Hearings

Transcription of legal proceedings is very important to enable access to justice. However, speech transcription is an expensive and slow process. In this paper we describe part of a combined research and industrial project for building an…

Audio and Speech Processing · Electrical Eng. & Systems 2022-12-23 Hadeel Saadany, Catherine Breslin, Constantin Orăsan, Sophie Walker

Argumentative Explanations for Pattern-Based Text Classifiers

Recent works in Explainable AI mostly address the transparency issue of black-box models or create explanations for any kind of models (i.e., they are model-agnostic), while leaving explanations of interpretable models largely…

Artificial Intelligence · Computer Science 2022-05-24 Piyawat Lertvittayakumjorn, Francesca Toni

Learning ASR-Robust Contextualized Embeddings for Spoken Language Understanding

Employing pre-trained language models (LM) to extract contextualized word representations has achieved state-of-the-art performance on various NLP tasks. However, applying this technique to noisy transcripts generated by automatic speech…

Computation and Language · Computer Science 2020-11-03 Chao-Wei Huang, Yun-Nung Chen