Related papers: Learning Decoupling Features Through Orthogonality…

Discriminatory and orthogonal feature learning for noise robust keyword spotting

Keyword Spotting (KWS) is an essential component in a smart device for alerting the system when a user prompts it with a command. As these devices are typically constrained by computational and energy resources, the KWS model should be…

Audio and Speech Processing · Electrical Eng. & Systems 2022-10-24 Donghyeon Kim, Kyungdeuk Ko, David K. Han, Hanseok Ko

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

Keyword spotting (KWS) and speaker verification (SV) have been studied independently although it is known that acoustic and speaker domains are complementary. In this paper, we propose a multi-task network that performs KWS and SV…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-10 Myunghun Jung, Youngmoon Jung, Jahyun Goo, Hoirin Kim

Keyword Spotting for Hearing Assistive Devices Robust to External Speakers

Keyword spotting (KWS) is experiencing an upswing due to the pervasiveness of small electronic devices that allow interaction with them via speech. Often, KWS systems are speaker-independent, which means that any person (user or not)…

Sound · Computer Science 2019-06-27 Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen

Contrastive Augmentation: An Unsupervised Learning Approach for Keyword Spotting in Speech Technology

This paper addresses the persistent challenge in Keyword Spotting (KWS), a fundamental component in speech technology, regarding the acquisition of substantial labeled data for training. Given the difficulty in obtaining large quantities of…

Sound · Computer Science 2024-09-04 Weinan Dai, Yifeng Jiang, Yuanjing Liu, Jinkun Chen, Xin Sun, Jinglei Tao

Small-Footprint Keyword Spotting on Raw Audio Data with Sinc-Convolutions

Keyword Spotting (KWS) enables speech-based user interaction on smart devices. Always-on and battery-powered application scenarios for smart devices put constraints on hardware resources and power consumption, while also demanding high…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-05 Simon Mittermaier, Ludwig Kürzinger, Bernd Waschneck, Gerhard Rigoll

A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting

The recognition of rare named entities, such as personal names and terminologies, is challenging for automatic speech recognition (ASR) systems, especially when they are not frequently observed in the training data. In this paper, we…

Artificial Intelligence · Computer Science 2024-06-07 Yuang Li, Min Zhang, Chang Su, Yinglu Li, Xiaosong Qiao, Mengxin Ren, Miaomiao Ma, Daimeng Wei, Shimin Tao, Hao Yang

PCOV-KWS: Multi-task Learning for Personalized Customizable Open Vocabulary Keyword Spotting

As advancements in technologies like Internet of Things (IoT), Automatic Speech Recognition (ASR), Speaker Verification (SV), and Text-to-Speech (TTS) lead to increased usage of intelligent voice assistants, the demand for privacy and…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-20 Jianan Pan, Kejie Huang

Multi-task Learning with Cross Attention for Keyword Spotting

Keyword spotting (KWS) is an important technique for speech applications, which enables users to activate devices by speaking a keyword phrase. Although a phoneme classifier can be used for KWS, exploiting a large amount of transcribed data…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-23 Takuya Higuchi, Anmol Gupta, Chandra Dhir

Sequence Discriminative Training for Deep Learning based Acoustic Keyword Spotting

Speech recognition is a sequence prediction problem. Besides employing various deep learning approaches for frame-level classification, sequence-level discriminative training has been proved to be indispensable to achieve the…

Computation and Language · Computer Science 2018-08-20 Zhehuai Chen, Yanmin Qian, Kai Yu

A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting

Robustness against noise is critical for keyword spotting (KWS) in real-world environments. To improve the robustness, a speech enhancement front-end is involved. Instead of treating the speech enhancement as a separated preprocessing…

Sound · Computer Science 2019-06-21 Yue Gu, Zhihao Du, Hui Zhang, Xueliang Zhang

Deep Spoken Keyword Spotting: An Overview

Spoken keyword spotting (KWS) deals with the identification of keywords in audio streams and has become a fast-growing technology thanks to the paradigm shift introduced by deep learning a few years ago. This has allowed the rapid embedding…

Sound · Computer Science 2021-11-23 Iván López-Espejo, Zheng-Hua Tan, John Hansen, Jesper Jensen

Keyword Spotting with Hyper-Matched Filters for Small Footprint Devices

Open-vocabulary keyword spotting (KWS) refers to the task of detecting words or terms within speech recordings, regardless of whether they were included in the training data. This paper introduces an open-vocabulary keyword spotting model…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-08 Yael Segal-Feldman, Ann R. Bradlow, Matthew Goldrick, Joseph Keshet

Understanding Audio Features via Trainable Basis Functions

In this paper we explore the possibility of maximizing the information represented in spectrograms by making the spectrogram basis functions trainable. We experiment with two different tasks, namely keyword spotting (KWS) and automatic…

Sound · Computer Science 2022-04-26 Kwan Yee Heung, Kin Wai Cheuk, Dorien Herremans

Speech Augmentation Based Unsupervised Learning for Keyword Spotting

In this paper, we investigated a speech augmentation based unsupervised learning approach for keyword spotting (KWS) task. KWS is a useful speech application, yet also heavily depends on the labeled data. We designed a CNN-Attention…

Sound · Computer Science 2022-05-31 Jian Luo, Jianzong Wang, Ning Cheng, Haobin Tang, Jing Xiao

U2-KWS: Unified Two-pass Open-vocabulary Keyword Spotting with Keyword Bias

Open-vocabulary keyword spotting (KWS), which allows users to customize keywords, has attracted increasingly more interest. However, existing methods based on acoustic models and post-processing train the acoustic model with ASR training…

Audio and Speech Processing · Electrical Eng. & Systems 2023-12-18 Ao Zhang, Pan Zhou, Kaixun Huang, Yong Zou, Ming Liu, Lei Xie

Effective User-defined Keyword Spotting with Dual-stage Matching, Multi-modal Enrollment, and Continual Adaptation

User-defined keyword spotting (KWS) is crucial for personalized voice interaction, yet existing methods face several challenges: (1) insufficient discriminability among confusable words, (2) performance inconsistency across speakers with…

Audio and Speech Processing · Electrical Eng. & Systems 2026-05-22 Zhiqi Ai, Han Cheng, Shiyi Mu, Xinnuo Li, Yongjin Zhou, Shugong Xu

Avoid Overfitting User Specific Information in Federated Keyword Spotting

Keyword spotting (KWS) aims to discriminate a specific wake-up word from other signals precisely and efficiently for different users. Recent works utilize various deep networks to train KWS models with all users' speech data centralized…

Machine Learning · Computer Science 2022-06-20 Xin-Chun Li, Jin-Lin Tang, Shaoming Song, Bingshuai Li, Yinchuan Li, Yunfeng Shao, Le Gan, De-Chuan Zhan

Does Single-channel Speech Enhancement Improve Keyword Spotting Accuracy? A Case Study

Noise robustness is a key aspect of successful speech applications. Speech enhancement (SE) has been investigated to improve automatic speech recognition accuracy; however, its effectiveness for keyword spotting (KWS) is still…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-23 Avamarie Brueggeman, Takuya Higuchi, Masood Delfarah, Stephen Shum, Vineet Garg

Domain-Incremental Continual Learning for Robust and Efficient Keyword Spotting in Resource Constrained Systems

Keyword Spotting (KWS) systems with small footprint models deployed on edge devices face significant accuracy and robustness challenges due to domain shifts caused by varying noise and recording conditions. To address this, we propose a…

Sound · Computer Science 2026-01-23 Prakash Dhungana, Sayed Ahmad Salehi

Multitaper mel-spectrograms for keyword spotting

Keyword spotting (KWS) is one of the speech recognition tasks most sensitive to the quality of the feature representation. However, the research on KWS has traditionally focused on new model topologies, putting little emphasis on other…

Audio and Speech Processing · Electrical Eng. & Systems 2024-07-08 Douglas Baptista de Souza, Khaled Jamal Bakri, Fernanda Ferreira, Juliana Inacio