Related papers: Regularizing and Optimizing LSTM Language Models

200 papers

In this paper, we consider several compression techniques for the language modeling problem based on recurrent neural networks (RNNs). It is known that conventional RNNs, e.g., LSTM-based networks in language modeling, are characterized with…

Machine Learning · Statistics 2019-04-09 Artem M. Grachev, Dmitry I. Ignatov, Andrey V. Savchenko

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In…

Neural and Evolutionary Computing · Computer Science 2015-02-20 Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals

Recurrent neural networks (RNNs) serve as a fundamental building block for many sequence tasks across natural language processing. Recent research has focused on recurrent dropout techniques or custom RNN cells in order to improve…

Computation and Language · Computer Science 2017-08-04 Stephen Merity, Bryan McCann, Richard Socher

Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that has been designed to address the vanishing and exploding gradient problems of conventional RNNs. Unlike feedforward neural networks, RNNs have cyclic…

Neural and Evolutionary Computing · Computer Science 2014-02-06 Haşim Sak, Andrew Senior, Françoise Beaufays

Highly regularized LSTMs achieve impressive results on several benchmark datasets in language modeling. We propose a new regularization method based on decoding the last token in the context using the predicted distribution of the next…

Computation and Language · Computer Science 2019-01-25 Siddhartha Brahma

Recurrent neural networks (RNNs), including long short-term memory (LSTM) RNNs, have produced state-of-the-art results on a variety of speech recognition tasks. However, these models are often too large in size for deployment on mobile…

Machine Learning · Computer Science 2016-04-12 Zhiyun Lu, Vikas Sindhwani, Tara N. Sainath

We study the problem of compressing recurrent neural networks (RNNs). In particular, we focus on the compression of RNN acoustic models, which are motivated by the goal of building compact and accurate speech recognition systems which can…

Computation and Language · Computer Science 2016-05-03 Rohit Prabhavalkar, Ouais Alsharif, Antoine Bruguier, Ian McGraw

Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Memory Networks which contain memory are popularly used to learn patterns in sequential data. Sequential data has long sequences that hold relationships. RNN can…

Computation and Language · Computer Science 2019-04-22 Anupiya Nugaliyadde, Kok Wai Wong, Ferdous Sohel, Hong Xie

Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks. However, these models are computationally…

Machine Learning · Statistics 2017-11-16 Shankar Kumar, Michael Nirschl, Daniel Holtmann-Rice, Hank Liao, Ananda Theertha Suresh, Felix Yu

The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network, which views words as nodes and performs layer-wise recurrent steps between them simultaneously. Despite its successes on text representations, the…

Computation and Language · Computer Science 2020-03-03 Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

Language models, being at the heart of many NLP problems, are always of great interest to researchers. Neural language models come with the advantage of distributed representations and long range contexts. With its particular dynamics that…

Neural and Evolutionary Computing · Computer Science 2018-11-19 Thomas Cherian, Akshay Badola, Vineet Padmanabhan

This study presents a novel model for invertible sentence embeddings using a residual recurrent network trained on an unsupervised encoding task. Rather than the probabilistic outputs common to neural machine translation models, our…

Computation and Language · Computer Science 2023-04-07 Jeremy Wilkerson

Recursive neural networks (RNN) and their recently proposed extension recursive long short term memory networks (RLSTM) are models that compute representations for sentences, by recursively combining word embeddings according to an…

Artificial Intelligence · Computer Science 2016-03-02 Phong Le, Willem Zuidema

This paper develops a model that addresses sentence embedding, a hot topic in current natural language processing research, using recurrent neural networks with Long Short-Term Memory (LSTM) cells. Due to its ability to capture long term…

Computation and Language · Computer Science 2016-11-18 Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab Ward

We investigate the effective memory depth of RNN models by using them for $n$-gram language model (LM) smoothing. Experiments on a small corpus (UPenn Treebank, one million words of training data and 10k vocabulary) have found the LSTM cell…

Computation and Language · Computer Science 2017-06-21 Ciprian Chelba, Mohammad Norouzi, Samy Bengio

Recurrent Neural Network (RNN) and its variations such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), have become standard building blocks for learning online data of sequential nature in many research areas, including…

Computation and Language · Computer Science 2020-05-12 Enmao Diao, Jie Ding, Vahid Tarokh

Recurrent Neural Networks (RNNs) and their variants, such as Long-Short Term Memory (LSTM) networks, and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling. The hidden layers in RNNs can be…

Computer Vision and Pattern Recognition · Computer Science 2018-11-20 Yu Pan, Jing Xu, Maolin Wang, Jinmian Ye, Fei Wang, Kun Bai, Zenglin Xu

Vanishing long-term gradients are a major issue in training standard recurrent neural networks (RNNs), which can be alleviated by long short-term memory (LSTM) models with memory cells. However, the extra parameters associated with the…

Computation and Language · Computer Science 2018-02-26 Chao Zhang, Philip Woodland

We introduce multiplicative LSTM (mLSTM), a recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures. mLSTM is characterised by…

Neural and Evolutionary Computing · Computer Science 2017-10-13 Ben Krause, Liang Lu, Iain Murray, Steve Renals

Recurrent Neural Network Transducer (RNN-T), like most end-to-end speech recognition model architectures, has an implicit neural network language model (NNLM) and cannot easily leverage unpaired text data during training. Previous work has…

Computation and Language · Computer Science 2020-10-28 Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le