Related papers: Optimizing Contextual Speech Recognition Using Vec…

Context Biasing for Pronunciation-Orthography Mismatch in Automatic Speech Recognition

Neural sequence-to-sequence systems deliver state-of-the-art performance for automatic speech recognition. When using appropriate modeling units, e.g., byte-pair encoding, these systems are in principle open vocabulary systems. In practice,…

Computation and Language · Computer Science 2026-03-05 Christian Huber , Alexander Waibel

A Neural Model for Contextual Biasing Score Learning and Filtering

Contextual biasing improves automatic speech recognition (ASR) by integrating external knowledge, such as user-specific phrases or entities, during decoding. In this work, we use an attention-based biasing decoder to produce scores for…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-29 Wanting Huang , Weiran Wang

Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm

Contextual biasing refers to the problem of biasing the automatic speech recognition (ASR) systems towards rare entities that are relevant to the specific user or application scenarios. We propose algorithms for contextual biasing based on…

Computation and Language · Computer Science 2023-10-03 Weiran Wang , Zelin Wu , Diamantino Caseiro , Tsendsuren Munkhdalai , Khe Chai Sim , Pat Rondon , Golan Pundak , Gan Song , Rohit Prabhavalkar , Zhong Meng , Ding Zhao , Tara Sainath , Pedro Moreno Mengibar

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Contextual biasing is an important and challenging task for end-to-end automatic speech recognition (ASR) systems, which aims to achieve better recognition performance by biasing the ASR system to particular context phrases such as person…

Computation and Language · Computer Science 2022-09-08 Xiaoqiang Wang , Yanqing Liu , Jinyu Li , Veljko Miljanic , Sheng Zhao , Hosam Khalil

Robust Acoustic and Semantic Contextual Biasing in Neural Transducers for Speech Recognition

Attention-based contextual biasing approaches have shown significant improvements in the recognition of generic and/or personal rare-words in End-to-End Automatic Speech Recognition (E2E ASR) systems like neural transducers. These…

Computation and Language · Computer Science 2023-05-10 Xuandi Fu , Kanthashree Mysore Sathyendra , Ankur Gandhe , Jing Liu , Grant P. Strimel , Ross McGowan , Athanasios Mouchtaris

CB-Conformer: Contextual biasing Conformer for biased word recognition

Due to the mismatch between the source and target domains, how to better utilize the biased word information to improve the performance of the automatic speech recognition model in the target domain becomes a hot research topic. Previous…

Sound · Computer Science 2023-04-26 Yaoxun Xu , Baiji Liu , Qiaochu Huang , Xingchen Song , Zhiyong Wu , Shiyin Kang , Helen Meng

Attention-based Natural Language Person Retrieval

Following the recent progress in image classification and captioning using deep learning, we develop a novel natural language person retrieval system based on an attention mechanism. More specifically, given the description of a person, the…

Computer Vision and Pattern Recognition · Computer Science 2017-05-26 Tao Zhou , Muhao Chen , Jie Yu , Demetri Terzopoulos

Retrieve and Copy: Scaling ASR Personalization to Large Catalogs

Personalization of automatic speech recognition (ASR) models is a widely studied topic because of its many practical applications. Most recently, attention-based contextual biasing techniques are used to improve the recognition of rare…

Computation and Language · Computer Science 2023-11-15 Sai Muralidhar Jayanthi , Devang Kulshreshtha , Saket Dingliwal , Srikanth Ronanki , Sravan Bodapati

Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation

Existing research suggests that automatic speech recognition (ASR) models can benefit from additional contexts (e.g., contact lists, user specified vocabulary). Rare words and named entities can be better recognized with contexts. In this…

Audio and Speech Processing · Electrical Eng. & Systems 2024-07-16 Ruizhe Huang , Mahsa Yarmohammadi , Sanjeev Khudanpur , Daniel Povey

Multi-Modal Retrieval For Large Language Model Based Speech Recognition

Retrieval is a widely adopted approach for improving language models leveraging external information. As the field moves towards multi-modal large language models, it is important to extend the pure text based methods to incorporate other…

Computation and Language · Computer Science 2024-06-17 Jari Kolehmainen , Aditya Gourav , Prashanth Gurunath Shivakumar , Yile Gu , Ankur Gandhe , Ariya Rastrow , Grant Strimel , Ivan Bulyko

Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition

By incorporating additional contextual information, deep biasing methods have emerged as a promising solution for speech recognition of personalized words. However, for real-world voice assistants, always biasing on such personalized words…

Sound · Computer Science 2023-08-16 Tianyi Xu , Zhanheng Yang , Kaixun Huang , Pengcheng Guo , Ao Zhang , Biao Li , Changru Chen , Chao Li , Lei Xie

Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

Contextual information plays a crucial role in speech recognition technologies and incorporating it into the end-to-end speech recognition models has drawn immense interest recently. However, previous deep bias methods lacked explicit…

Audio and Speech Processing · Electrical Eng. & Systems 2023-07-13 Kaixun Huang , Ao Zhang , Zhanheng Yang , Pengcheng Guo , Bingshen Mu , Tianyi Xu , Lei Xie

Deep CLAS: Deep Contextual Listen, Attend and Spell

Contextual-LAS (CLAS) has been shown effective in improving Automatic Speech Recognition (ASR) of rare words. It relies on phrase-level contextual modeling and attention-based relevance scoring without explicit contextual constraint which…

Computation and Language · Computer Science 2024-12-20 Mengzhi Wang , Shifu Xiong , Genshun Wan , Hang Chen , Jianqing Gao , Lirong Dai

Fast and Robust Unsupervised Contextual Biasing for Speech Recognition

Automatic speech recognition (ASR) system is becoming a ubiquitous technology. Although its accuracy is closing the gap with that of human level under certain settings, one area that can further improve is to incorporate user-specific…

Computation and Language · Computer Science 2020-05-05 Young Mo Kang , Yingbo Zhou

Self-Attentional Acoustic Models

Self-attention is a method of encoding sequences of vectors by relating these vectors to each-other based on pairwise similarities. These models have recently shown promising results for modeling discrete sequences, but they are non-trivial…

Computation and Language · Computer Science 2018-06-19 Matthias Sperber , Jan Niehues , Graham Neubig , Sebastian Stüker , Alex Waibel

Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model

Contextual ASR, which takes a list of bias terms as input along with audio, has drawn recent interest as ASR use becomes more widespread. We are releasing contextual biasing lists to accompany the Earnings21 dataset, creating a public…

Computation and Language · Computer Science 2022-09-07 Jennifer Drexler Fox , Natalie Delworth

Moving Towards Centers: Re-ranking with Attention and Memory for Re-identification

Re-ranking utilizes contextual information to optimize the initial ranking list of person or vehicle re-identification (re-ID), which boosts the retrieval performance at post-processing steps. This paper proposes a re-ranking network to…

Computer Vision and Pattern Recognition · Computer Science 2022-03-22 Yunhao Zhou , Yi Wang , Lap-Pui Chau

Supervised Metric Learning to Rank for Retrieval via Contextual Similarity Optimization

There is extensive interest in metric learning methods for image retrieval. Many metric learning loss functions focus on learning a correct ranking of training samples, but strongly overfit semantically inconsistent labels and require a…

Machine Learning · Computer Science 2023-06-05 Christopher Liao , Theodoros Tsiligkaridis , Brian Kulis

Sequence Shortening for Context-Aware Machine Translation

Context-aware Machine Translation aims to improve translations of sentences by incorporating surrounding sentences as context. Towards this task, two main architectures have been applied, namely single-encoder (based on concatenation) and…

Computation and Language · Computer Science 2024-02-05 Paweł Mąka , Yusuf Can Semerci , Jan Scholtes , Gerasimos Spanakis

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Transformer-based Large Language Models (LLMs) have become increasingly important. However, due to the quadratic time complexity of attention computation, scaling LLMs to longer contexts incurs extremely slow inference speed and high GPU…

Machine Learning · Computer Science 2025-01-03 Di Liu , Meng Chen , Baotong Lu , Huiqiang Jiang , Zhenhua Han , Qianxi Zhang , Qi Chen , Chengruidong Zhang , Bailu Ding , Kai Zhang , Chen Chen , Fan Yang , Yuqing Yang , Lili Qiu