Related papers: Phoneme recognition in TIMIT with BLSTM-CTC

200 papers

Recurrent neural networks (RNNs) are a powerful model for sequential data. End-to-end training methods such as Connectionist Temporal Classification make it possible to train RNNs for sequence labelling problems where the input-output…

Neural and Evolutionary Computing · Computer Science 2013-03-26 Alex Graves, Abdel-rahman Mohamed, Geoffrey Hinton

Ladder networks are a notable new concept in the field of semi-supervised learning by showing state-of-the-art results in image recognition tasks while being compatible with many existing neural architectures. We present the recurrent…

Computation and Language · Computer Science 2017-09-20 Marian Tietz, Tayfun Alpay, Johannes Twiefel, Stefan Wermter

Connectionist temporal classification (CTC) based supervised sequence training of recurrent neural networks (RNNs) has shown great success in many machine learning areas including end-to-end speech and handwritten character recognition. For…

Machine Learning · Computer Science 2017-02-03 Kyuyeon Hwang, Wonyong Sung

In this paper, we have investigated recurrent deep neural networks (DNNs) in combination with regularization techniques such as dropout, zoneout, and regularization post-layer. As a benchmark, we chose the TIMIT phone recognition task due to its…

Computation and Language · Computer Science 2018-06-20 Jan Vanek, Josef Michalek, Josef Psutka

In this paper, we have analyzed the impact of confusions on the robustness of phoneme recognition systems. The confusions are detected in the pronunciation and in the confusion matrices of the phoneme recognizer. The confusions show that some…

Computation and Language · Computer Science 2015-08-10 Rimah Amami, Noureddine Ellouze

Reasoning and inference are central to human and artificial intelligence. Modeling inference in human language is very challenging. With the availability of large annotated data (Bowman et al., 2015), it has recently become feasible to…

Computation and Language · Computer Science 2020-03-04 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen

Recurrent sequence generators conditioned on input data through an attention mechanism have recently shown very good performance on a range of tasks including machine translation, handwriting synthesis and image caption generation. We…

Computation and Language · Computer Science 2015-06-25 Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, Yoshua Bengio

Many machine learning tasks can be expressed as the transformation, or transduction, of input sequences into output sequences: speech recognition, machine translation, protein secondary structure prediction and text-to-speech to…

Neural and Evolutionary Computing · Computer Science 2012-11-16 Alex Graves

Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for modeling and predicting sequential data, e.g. speech utterances or handwritten documents. In this study, we propose to use…

Computation and Language · Computer Science 2015-11-03 Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao

The standard approach to mitigate errors made by an automatic speech recognition system is to use confidence scores associated with each predicted word. In the simplest case, these scores are word posterior probabilities whilst more complex…

Audio and Speech Processing · Electrical Eng. & Systems 2019-02-19 Qiujia Li, Preben Ness, Anton Ragni, Mark Gales

Deep learning approaches have been widely applied to sequence modeling problems. In automatic speech recognition (ASR), performance has been significantly improved by larger speech corpora and deeper neural networks.…

Computation and Language · Computer Science 2016-12-28 Zewang Zhang, Zheng Sun, Jiaqi Liu, Jingwen Chen, Zhao Huo, Xiao Zhang

In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single multilabel BLSTM RNN is trained to map…

Sound · Computer Science 2016-11-17 Giambattista Parascandolo, Heikki Huttunen, Tuomas Virtanen

Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose Recurrent…

Computation and Language · Computer Science 2016-04-25 Ke Tran, Arianna Bisazza, Christof Monz

Neural network models have been demonstrated to be capable of achieving remarkable performance in sentence and document modeling. Convolutional neural network (CNN) and recurrent neural network (RNN) are two mainstream architectures for…

Computation and Language · Computer Science 2015-12-01 Chunting Zhou, Chonglin Sun, Zhiyuan Liu, Francis C. M. Lau

Standard Recurrent Neural Network Transducer (RNN-T) decoding algorithms for speech recognition iterate over the time axis, such that one time step is decoded before moving on to the next time step. Those algorithms result in a large…

Machine Learning · Computer Science 2023-10-09 Gil Keren

This study compares sequential image classification methods based on recurrent neural networks. We describe methods based on recurrent neural networks such as Long Short-Term Memory (LSTM), bidirectional Long Short-Term Memory (BiLSTM)…

Image and Video Processing · Electrical Eng. & Systems 2022-09-26 Gajraj Kuldeep

We present Bifocal RNN-T, a new variant of the Recurrent Neural Network Transducer (RNN-T) architecture designed for improved inference time latency on speech recognition tasks. The architecture enables a dynamic pivot for its runtime…

Audio and Speech Processing · Electrical Eng. & Systems 2021-08-05 Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

Recurrent Neural Network Transducer (RNN-T), like most end-to-end speech recognition model architectures, has an implicit neural network language model (NNLM) and cannot easily leverage unpaired text data during training. Previous work has…

Computation and Language · Computer Science 2020-10-28 Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le

With recent advances in deep learning, considerable attention has been given to achieving automatic speech recognition performance close to human performance on tasks like conversational telephone speech (CTS) recognition. In this paper we…

We present an architecture of a recurrent neural network (RNN) with a fully-connected deep neural network (DNN) as its feature extractor. The RNN is equipped with both causal temporal prediction and non-causal look-ahead, via…

Machine Learning · Computer Science 2014-03-07 Jianshu Chen, Li Deng