Related papers: Optimizing Contextual Speech Recognition Using Vec…

200 papers

Neural sequence-to-sequence systems deliver state-of-the-art performance for automatic speech recognition. When using appropriate modeling units, e.g., byte-pair encoding, these systems are in principle open vocabulary systems. In practice,…

Computation and Language · Computer Science 2026-03-05 Christian Huber , Alexander Waibel

Contextual biasing improves automatic speech recognition (ASR) by integrating external knowledge, such as user-specific phrases or entities, during decoding. In this work, we use an attention-based biasing decoder to produce scores for…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-29 Wanting Huang , Weiran Wang

Contextual biasing refers to the problem of biasing the automatic speech recognition (ASR) systems towards rare entities that are relevant to the specific user or application scenarios. We propose algorithms for contextual biasing based on…

Contextual biasing is an important and challenging task for end-to-end automatic speech recognition (ASR) systems, which aims to achieve better recognition performance by biasing the ASR system to particular context phrases such as person…

Computation and Language · Computer Science 2022-09-08 Xiaoqiang Wang , Yanqing Liu , Jinyu Li , Veljko Miljanic , Sheng Zhao , Hosam Khalil

Attention-based contextual biasing approaches have shown significant improvements in the recognition of generic and/or personal rare-words in End-to-End Automatic Speech Recognition (E2E ASR) systems like neural transducers. These…

Computation and Language · Computer Science 2023-05-10 Xuandi Fu , Kanthashree Mysore Sathyendra , Ankur Gandhe , Jing Liu , Grant P. Strimel , Ross McGowan , Athanasios Mouchtaris

Due to the mismatch between the source and target domains, how to better utilize the biased word information to improve the performance of the automatic speech recognition model in the target domain becomes a hot research topic. Previous…

Sound · Computer Science 2023-04-26 Yaoxun Xu , Baiji Liu , Qiaochu Huang , Xingchen Song , Zhiyong Wu , Shiyin Kang , Helen Meng

Following the recent progress in image classification and captioning using deep learning, we develop a novel natural language person retrieval system based on an attention mechanism. More specifically, given the description of a person, the…

Computer Vision and Pattern Recognition · Computer Science 2017-05-26 Tao Zhou , Muhao Chen , Jie Yu , Demetri Terzopoulos

Personalization of automatic speech recognition (ASR) models is a widely studied topic because of its many practical applications. Most recently, attention-based contextual biasing techniques are used to improve the recognition of rare…

Computation and Language · Computer Science 2023-11-15 Sai Muralidhar Jayanthi , Devang Kulshreshtha , Saket Dingliwal , Srikanth Ronanki , Sravan Bodapati

Existing research suggests that automatic speech recognition (ASR) models can benefit from additional contexts (e.g., contact lists, user specified vocabulary). Rare words and named entities can be better recognized with contexts. In this…

Audio and Speech Processing · Electrical Eng. & Systems 2024-07-16 Ruizhe Huang , Mahsa Yarmohammadi , Sanjeev Khudanpur , Daniel Povey

Retrieval is a widely adopted approach for improving language models leveraging external information. As the field moves towards multi-modal large language models, it is important to extend the pure text based methods to incorporate other…

Computation and Language · Computer Science 2024-06-17 Jari Kolehmainen , Aditya Gourav , Prashanth Gurunath Shivakumar , Yile Gu , Ankur Gandhe , Ariya Rastrow , Grant Strimel , Ivan Bulyko

By incorporating additional contextual information, deep biasing methods have emerged as a promising solution for speech recognition of personalized words. However, for real-world voice assistants, always biasing on such personalized words…

Sound · Computer Science 2023-08-16 Tianyi Xu , Zhanheng Yang , Kaixun Huang , Pengcheng Guo , Ao Zhang , Biao Li , Changru Chen , Chao Li , Lei Xie

Contextual information plays a crucial role in speech recognition technologies and incorporating it into the end-to-end speech recognition models has drawn immense interest recently. However, previous deep bias methods lacked explicit…

Audio and Speech Processing · Electrical Eng. & Systems 2023-07-13 Kaixun Huang , Ao Zhang , Zhanheng Yang , Pengcheng Guo , Bingshen Mu , Tianyi Xu , Lei Xie

Contextual-LAS (CLAS) has been shown effective in improving Automatic Speech Recognition (ASR) of rare words. It relies on phrase-level contextual modeling and attention-based relevance scoring without explicit contextual constraint which…

Computation and Language · Computer Science 2024-12-20 Mengzhi Wang , Shifu Xiong , Genshun Wan , Hang Chen , Jianqing Gao , Lirong Dai

Automatic speech recognition (ASR) system is becoming a ubiquitous technology. Although its accuracy is closing the gap with that of human level under certain settings, one area that can further improve is to incorporate user-specific…

Computation and Language · Computer Science 2020-05-05 Young Mo Kang , Yingbo Zhou

Self-attention is a method of encoding sequences of vectors by relating these vectors to each-other based on pairwise similarities. These models have recently shown promising results for modeling discrete sequences, but they are non-trivial…

Computation and Language · Computer Science 2018-06-19 Matthias Sperber , Jan Niehues , Graham Neubig , Sebastian Stüker , Alex Waibel

Contextual ASR, which takes a list of bias terms as input along with audio, has drawn recent interest as ASR use becomes more widespread. We are releasing contextual biasing lists to accompany the Earnings21 dataset, creating a public…

Computation and Language · Computer Science 2022-09-07 Jennifer Drexler Fox , Natalie Delworth

Re-ranking utilizes contextual information to optimize the initial ranking list of person or vehicle re-identification (re-ID), which boosts the retrieval performance at post-processing steps. This paper proposes a re-ranking network to…

Computer Vision and Pattern Recognition · Computer Science 2022-03-22 Yunhao Zhou , Yi Wang , Lap-Pui Chau

There is extensive interest in metric learning methods for image retrieval. Many metric learning loss functions focus on learning a correct ranking of training samples, but strongly overfit semantically inconsistent labels and require a…

Machine Learning · Computer Science 2023-06-05 Christopher Liao , Theodoros Tsiligkaridis , Brian Kulis

Context-aware Machine Translation aims to improve translations of sentences by incorporating surrounding sentences as context. Towards this task, two main architectures have been applied, namely single-encoder (based on concatenation) and…

Computation and Language · Computer Science 2024-02-05 Paweł Mąka , Yusuf Can Semerci , Jan Scholtes , Gerasimos Spanakis

Transformer-based Large Language Models (LLMs) have become increasingly important. However, due to the quadratic time complexity of attention computation, scaling LLMs to longer contexts incurs extremely slow inference speed and high GPU…
