Related papers: Phoneme recognition in TIMIT with BLSTM-CTC

Speech Recognition with Deep Recurrent Neural Networks

Recurrent neural networks (RNNs) are a powerful model for sequential data. End-to-end training methods such as Connectionist Temporal Classification make it possible to train RNNs for sequence labelling problems where the input-output…

Neural and Evolutionary Computing · Computer Science 2013-03-26 Alex Graves, Abdel-rahman Mohamed, Geoffrey Hinton

Semi-Supervised Phoneme Recognition with Recurrent Ladder Networks

Ladder networks are a notable new concept in the field of semi-supervised learning by showing state-of-the-art results in image recognition tasks while being compatible with many existing neural architectures. We present the recurrent…

Computation and Language · Computer Science 2017-09-20 Marian Tietz, Tayfun Alpay, Johannes Twiefel, Stefan Wermter

Online Sequence Training of Recurrent Neural Networks with Connectionist Temporal Classification

Connectionist temporal classification (CTC) based supervised sequence training of recurrent neural networks (RNNs) has shown great success in many machine learning areas including end-to-end speech and handwritten character recognition. For…

Machine Learning · Computer Science 2017-02-03 Kyuyeon Hwang, Wonyong Sung

Recurrent DNNs and its Ensembles on the TIMIT Phone Recognition Task

In this paper, we have investigated recurrent deep neural networks (DNNs) in combination with regularization techniques such as dropout, zoneout, and regularization post-layer. As a benchmark, we chose the TIMIT phone recognition task due to its…

Computation and Language · Computer Science 2018-06-20 Jan Vanek, Josef Michalek, Josef Psutka

Study of Phonemes Confusions in Hierarchical Automatic Phoneme Recognition System

In this paper, we have analyzed the impact of confusions on the robustness of a phoneme recognition system. The confusions are detected at the pronunciation and the confusion matrices of the phoneme recognizer. The confusions show that some…

Computation and Language · Computer Science 2015-08-10 Rimah Amami, Noureddine Ellouze

Enhanced LSTM for Natural Language Inference

Reasoning and inference are central to human and artificial intelligence. Modeling inference in human language is very challenging. With the availability of large annotated data (Bowman et al., 2015), it has recently become feasible to…

Computation and Language · Computer Science 2020-03-04 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen

Attention-Based Models for Speech Recognition

Recurrent sequence generators conditioned on input data through an attention mechanism have recently shown very good performance on a range of tasks including machine translation, handwriting synthesis and image caption generation. We…

Computation and Language · Computer Science 2015-06-25 Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, Yoshua Bengio

Sequence Transduction with Recurrent Neural Networks

Many machine learning tasks can be expressed as the transformation, or transduction, of input sequences into output sequences: speech recognition, machine translation, protein secondary structure prediction and text-to-speech to…

Neural and Evolutionary Computing · Computer Science 2012-11-16 Alex Graves

A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding

Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for modeling and predicting sequential data, e.g. speech utterances or handwritten documents. In this study, we propose to use…

Computation and Language · Computer Science 2015-11-03 Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao

Bi-Directional Lattice Recurrent Neural Networks for Confidence Estimation

The standard approach to mitigate errors made by an automatic speech recognition system is to use confidence scores associated with each predicted word. In the simplest case, these scores are word posterior probabilities whilst more complex…

Audio and Speech Processing · Electrical Eng. & Systems 2019-02-19 Qiujia Li, Preben Ness, Anton Ragni, Mark Gales

Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition

A deep learning approach has been widely applied in sequence modeling problems. In terms of automatic speech recognition (ASR), its performance has significantly been improved by increasing large speech corpus and deeper neural network.…

Computation and Language · Computer Science 2016-12-28 Zewang Zhang, Zheng Sun, Jiaqi Liu, Jingwen Chen, Zhao Huo, Xiao Zhang

Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings

In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single multilabel BLSTM RNN is trained to map…

Sound · Computer Science 2016-11-17 Giambattista Parascandolo, Heikki Huttunen, Tuomas Virtanen

Recurrent Memory Networks for Language Modeling

Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose Recurrent…

Computation and Language · Computer Science 2016-04-25 Ke Tran, Arianna Bisazza, Christof Monz

A C-LSTM Neural Network for Text Classification

Neural network models have been demonstrated to be capable of achieving remarkable performance in sentence and document modeling. Convolutional neural network (CNN) and recurrent neural network (RNN) are two mainstream architectures for…

Computation and Language · Computer Science 2015-12-01 Chunting Zhou, Chonglin Sun, Zhiyuan Liu, Francis C. M. Lau

A Token-Wise Beam Search Algorithm for RNN-T

Standard Recurrent Neural Network Transducers (RNN-T) decoding algorithms for speech recognition are iterating over the time axis, such that one time step is decoded before moving on to the next time step. Those algorithms result in a large…

Machine Learning · Computer Science 2023-10-09 Gil Keren

Image Classification using Sequence of Pixels

This study compares sequential image classification methods based on recurrent neural networks. We describe methods based on recurrent neural networks such as Long-Short-Term memory (LSTM), bidirectional Long-Short-Term memory (BiLSTM)…

Image and Video Processing · Electrical Eng. & Systems 2022-09-26 Gajraj Kuldeep

Bifocal Neural ASR: Exploiting Keyword Spotting for Inference Optimization

We present Bifocal RNN-T, a new variant of the Recurrent Neural Network Transducer (RNN-T) architecture designed for improved inference time latency on speech recognition tasks. The architecture enables a dynamic pivot for its runtime…

Audio and Speech Processing · Electrical Eng. & Systems 2021-08-05 Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer

Recurrent Neural Network Transducer (RNN-T), like most end-to-end speech recognition model architectures, has an implicit neural network language model (NNLM) and cannot easily leverage unpaired text data during training. Previous work has…

Computation and Language · Computer Science 2020-10-28 Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le

English Broadcast News Speech Recognition by Humans and Machines

With recent advances in deep learning, considerable attention has been given to achieving automatic speech recognition performance close to human performance on tasks like conversational telephone speech (CTS) recognition. In this paper we…

Computation and Language · Computer Science 2019-05-01 Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltan Tuske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko

A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property

We present an architecture of a recurrent neural network (RNN) with a fully-connected deep neural network (DNN) as its feature extractor. The RNN is equipped with both causal temporal prediction and non-causal look-ahead, via…

Machine Learning · Computer Science 2014-03-07 Jianshu Chen, Li Deng