Related papers: Spoken Language Intent Detection using Confusion2V…

200 papers

Word vector representations enable machines to encode human language for spoken language understanding and processing. Confusion2vec, motivated from human speech production and perception, is a word vector representation which encodes…

Computation and Language · Computer Science 2022-05-04 Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan

In this paper, we perform an exhaustive evaluation of different representations to address the intent classification problem in a Spoken Language Understanding (SLU) setup. We benchmark three types of systems to perform the SLU intent…

In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In…

Employing pre-trained language models (LM) to extract contextualized word representations has achieved state-of-the-art performance on various NLP tasks. However, applying this technique to noisy transcripts generated by automatic speech…

Computation and Language · Computer Science 2020-11-03 Chao-Wei Huang, Yun-Nung Chen

Word vector representations are a crucial part of Natural Language Processing (NLP) and Human Computer Interaction. In this paper, we propose a novel word vector representation, Confusion2Vec, motivated from the human speech production and…

Computation and Language · Computer Science 2019-07-01 Prashanth Gurunath Shivakumar, Panayiotis Georgiou

This paper addresses the problem of automatic speech recognition (ASR) error detection and their use for improving spoken language understanding (SLU) systems. In this study, the SLU task consists in automatically extracting, from ASR…

Computation and Language · Computer Science 2017-05-29 Edwin Simonnet, Sahar Ghannay, Nathalie Camelin, Yannick Estève, Renato De Mori

Recently, deep end-to-end learning has been studied for intent classification in Spoken Language Understanding (SLU). However, end-to-end models require a large amount of speech data with intent labels, and highly optimized models are…

Computation and Language · Computer Science 2024-05-27 Suyoung Kim, Jiyeon Hwang, Ho-Young Jung

Comprehending the overall intent of an utterance helps a listener recognize the individual words spoken. Inspired by this fact, we perform a novel study of the impact of explicitly incorporating intent representations as additional…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-22 Swayambhu Nath Ray, Minhua Wu, Anirudh Raju, Pegah Ghahremani, Raghavendra Bilgi, Milind Rao, Harish Arsikere, Ariya Rastrow, Andreas Stolcke, Jasha Droppo

Spoken language understanding (SLU) tasks are usually solved by first transcribing an utterance with automatic speech recognition (ASR) and then feeding the output to a text-based model. Recent advances in self-supervised representation…

Audio and Speech Processing · Electrical Eng. & Systems 2021-12-01 Lasse Borgholt, Jakob Drachmann Havtorn, Mostafa Abdou, Joakim Edin, Lars Maaløe, Anders Søgaard, Christian Igel

Spoken Language Understanding (SLU) is the problem of extracting the meaning from speech utterances. It is typically addressed as a two-step problem, where an Automatic Speech Recognition (ASR) model is employed to convert speech into text,…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-04 Elisavet Palogiannidi, Ioannis Gkinis, George Mastrapas, Petr Mizera, Themos Stafylakis

Accurate prediction of the user intent to interact with a voice assistant (VA) on a device (e.g. on the phone) is critical for achieving naturalistic, engaging, and privacy-centric interactions with the VA. To this end, we present a novel…

Computation and Language · Computer Science 2022-10-24 Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik

Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the…

A major focus of recent research in spoken language understanding (SLU) has been on the end-to-end approach where a single model can predict intents directly from speech inputs without intermediate transcripts. However, this approach…

Computation and Language · Computer Science 2021-06-15 Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang Kuo, Samuel Thomas, Edmilson Morais

This research presents a novel approach to enhancing automatic speech recognition systems by integrating noise detection capabilities directly into the recognition architecture. Building upon the wav2vec2 framework, the proposed method…

Sound · Computer Science 2025-12-11 Karamvir Singh

New-age conversational agent systems perform both speech emotion recognition (SER) and automatic speech recognition (ASR) using two separate and often independent approaches for real-world application in noisy environments. In this paper,…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-29 Lokesh Bansal, S. Pavankumar Dubagunta, Malolan Chetlur, Pushpak Jagtap, Aravind Ganapathiraju

Spoken Language Understanding (SLU) typically comprises an automatic speech recognition (ASR) module followed by a natural language understanding (NLU) module. The two modules process signals in a blocking sequential fashion, i.e., the NLU…

Computation and Language · Computer Science 2020-12-01 Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan

The goal of self-supervised learning (SSL) for automatic speech recognition (ASR) is to learn good speech representations from a large amount of unlabeled speech for the downstream ASR task. However, most SSL frameworks do not consider…

Computation and Language · Computer Science 2022-01-27 Yiming Wang, Jinyu Li, Heming Wang, Yao Qian, Chengyi Wang, Yu Wu

Automatic speech recognition (ASR) has been an essential component of computer assisted language learning (CALL) and computer assisted language testing (CALT) for many years. As this technology continues to develop rapidly, it is important…

Computation and Language · Computer Science 2025-04-01 Michael McGuire

Despite recent advancements in deep learning technologies, Child Speech Recognition remains a challenging task. Current Automatic Speech Recognition (ASR) models require substantial amounts of annotated data for training, which is scarce.…

Audio and Speech Processing · Electrical Eng. & Systems 2023-02-14 Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu

Wav2vec2.0 is a popular self-supervised pre-training framework for learning speech representations in the context of automatic speech recognition (ASR). It was shown that wav2vec2.0 has a good robustness against the domain shift, while the…

Audio and Speech Processing · Electrical Eng. & Systems 2022-05-10 Qiu-Shi Zhu, Jie Zhang, Zi-Qiang Zhang, Ming-Hui Wu, Xin Fang, Li-Rong Dai