Related papers: EEG based Continuous Speech Recognition using Tran…

Continuous Speech Recognition using EEG and Video

In this paper we investigate whether electroencephalography (EEG) features can be used to improve the performance of continuous visual speech recognition systems. We implemented a connectionist temporal classification (CTC) based end-to-end…

Machine Learning · Computer Science 2020-01-01 Gautam Krishna, Mason Carnahan, Co Tran, Ahmed H Tewfik

State-of-the-art Speech Recognition using EEG and Towards Decoding of Speech Spectrum From EEG

In this paper we first demonstrate continuous noisy speech recognition using electroencephalography (EEG) signals on English vocabulary using different types of state-of-the-art end-to-end automatic speech recognition (ASR) models, we…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-06 Gautam Krishna, Yan Han, Co Tran, Mason Carnahan, Ahmed H Tewfik

Improving EEG based Continuous Speech Recognition

In this paper we introduce various techniques to improve the performance of electroencephalography (EEG) features based continuous speech recognition (CSR) systems. A connectionist temporal classification (CTC) based automatic speech…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-25 Gautam Krishna, Co Tran, Mason Carnahan, Yan Han, Ahmed H Tewfik

Advancing Speech Recognition With No Speech Or With Noisy Speech

In this paper we demonstrate end-to-end continuous speech recognition (CSR) using electroencephalography (EEG) signals with no speech signal as input. An attention model based automatic speech recognition (ASR) and connectionist temporal…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-17 Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik

Continuous Silent Speech Recognition using EEG

In this paper we explore continuous silent speech recognition using electroencephalography (EEG) signals. We implemented a connectionist temporal classification (CTC) automatic speech recognition (ASR) model to translate EEG signals…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-06 Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik

Improving EEG based continuous speech recognition using GAN

In this paper we demonstrate that it is possible to generate more meaningful electroencephalography (EEG) features from raw EEG features using generative adversarial networks (GAN) to improve the performance of EEG based continuous speech…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-03 Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik

A comparison of end-to-end models for long-form speech recognition

End-to-end automatic speech recognition (ASR) models, including both attention-based models and the recurrent neural network transducer (RNN-T), have shown superior performance compared to conventional systems. However, previous studies…

Audio and Speech Processing · Electrical Eng. & Systems 2019-11-07 Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

Constrained Variational Autoencoder for improving EEG based Speech Recognition Systems

In this paper we introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function that can generate more meaningful electroencephalography (EEG) features from raw EEG features to…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-05 Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik

Speech Recognition with no speech or with noisy speech

The performance of automatic speech recognition (ASR) systems degrades in the presence of noisy speech. This paper demonstrates that using electroencephalography (EEG) can help automatic speech recognition systems overcome performance loss…

Machine Learning · Computer Science 2019-03-05 Gautam Krishna, Co Tran, Jianguo Yu, Ahmed H Tewfik

Semantic Mask for Transformer based End-to-End Speech Recognition

The attention-based encoder-decoder model has achieved impressive results for both automatic speech recognition (ASR) and text-to-speech (TTS) tasks. This approach takes advantage of the memorization capacity of neural networks to learn the…

Computation and Language · Computer Science 2020-03-17 Chengyi Wang, Yu Wu, Yujiao Du, Jinyu Li, Shujie Liu, Liang Lu, Shuo Ren, Guoli Ye, Sheng Zhao, Ming Zhou

Echo State Speech Recognition

We propose automatic speech recognition (ASR) models inspired by echo state network (ESN), in which a subset of recurrent neural networks (RNN) layers in the models are randomly initialized and untrained. Our study focuses on RNN-T and…

Computation and Language · Computer Science 2021-02-19 Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara Sainath

Attention-based ASR with Lightweight and Dynamic Convolutions

End-to-end (E2E) automatic speech recognition (ASR) with sequence-to-sequence models has gained attention because of its simple model training compared with conventional hidden Markov model based ASR. Recently, several studies report the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-21 Yuya Fujita, Aswin Shanmugam Subramanian, Motoi Omachi, Shinji Watanabe

Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers

This paper addresses end-to-end automatic speech recognition (ASR) for long audio recordings such as lecture and conversational speeches. Most end-to-end ASR models are designed to recognize independent utterances, but contextual…

Computation and Language · Computer Science 2021-04-20 Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux

End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning

This paper presents our latest investigation on end-to-end automatic speech recognition (ASR) for overlapped speech. We propose to train an end-to-end system conditioned on speaker embeddings and further improved by transfer learning from…

Audio and Speech Processing · Electrical Eng. & Systems 2019-08-14 Pavel Denisov, Ngoc Thang Vu

End-to-End Speech Recognition: A Survey

In the last decade of automatic speech recognition (ASR) research, the introduction of deep learning brought considerable reductions in word error rate of more than 50% relative, compared to modeling without deep learning. In the wake of…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-07 Rohit Prabhavalkar, Takaaki Hori, Tara N. Sainath, Ralf Schlüter, Shinji Watanabe

Enhancing CTC-based speech recognition with diverse modeling units

In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning architectures like transformer. On top of E2E systems, researchers have achieved…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-12 Shiyi Han, Zhihong Lei, Mingbin Xu, Xingyu Na, Zhen Huang

Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation

End-to-end transformer-based automatic speech recognition (ASR) systems often capture multiple speech traits in their learned representations that are highly entangled, leading to a lack of interpretability. In this study, we propose the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-11-28 Pu Wang, Hugo Van hamme

A Simplified Fully Quantized Transformer for End-to-end Speech Recognition

While significant improvements have been made in recent years in terms of end-to-end automatic speech recognition (ASR) performance, such improvements were obtained through the use of very large neural networks, unfit for embedded use on…

Computation and Language · Computer Science 2020-03-25 Alex Bie, Bharat Venkitesh, Joao Monteiro, Md. Akmal Haidar, Mehdi Rezagholizadeh

Speech Synthesis using EEG

In this paper we demonstrate speech synthesis using different electroencephalography (EEG) feature sets recently introduced in [1]. We make use of a recurrent neural network (RNN) regression model to predict acoustic features directly from…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-05 Gautam Krishna, Co Tran, Yan Han, Mason Carnahan

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems

Automatic speech recognition (ASR) systems typically rely on an external endpointer (EP) model to identify speech boundaries. In this work, we propose a method to jointly train the ASR and EP tasks in a single end-to-end (E2E) multitask…

Sound · Computer Science 2023-02-16 Shaan Bijwadia, Shuo-yiin Chang, Bo Li, Tara Sainath, Chao Zhang, Yanzhang He