Related papers: Recurrent Neural Network Regularization

Revisiting Activation Regularization for Language RNNs

Recurrent neural networks (RNNs) serve as a fundamental building block for many sequence tasks across natural language processing. Recent research has focused on recurrent dropout techniques or custom RNN cells in order to improve…

Computation and Language · Computer Science 2017-08-04 Stephen Merity, Bryan McCann, Richard Socher

Recurrent Dropout without Memory Loss

This paper presents a novel approach to recurrent neural network (RNN) regularization. Unlike the widely adopted dropout method, which is applied to forward connections of feed-forward architectures or RNNs, we propose to…

Computation and Language · Computer Science 2016-08-08 Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth

Structured in Space, Randomized in Time: Leveraging Dropout in RNNs for Efficient Training

Recurrent Neural Networks (RNNs), more specifically their Long Short-Term Memory (LSTM) variants, have been widely used as a deep learning tool for tackling sequence-based learning tasks in text and speech. Training of such LSTM…

Machine Learning · Computer Science 2021-06-24 Anup Sarma, Sonali Singh, Huaipan Jiang, Rui Zhang, Mahmut T Kandemir, Chita R Das

Dropout improves Recurrent Neural Networks for Handwriting Recognition

Recurrent neural networks (RNNs) with Long Short-Term Memory cells currently hold the best known results in unconstrained handwriting recognition. We show that their performance can be greatly improved using dropout - a recently proposed…

Computer Vision and Pattern Recognition · Computer Science 2014-03-11 Vu Pham, Théodore Bluche, Christopher Kermorvant, Jérôme Louradour

Fraternal Dropout

Recurrent neural networks (RNNs) are an important class of neural network architectures useful for language modeling and sequential prediction. However, optimizing RNNs is known to be harder than optimizing feed-forward neural networks. A…

Machine Learning · Statistics 2018-03-29 Konrad Zolna, Devansh Arpit, Dendi Suhubdy, Yoshua Bengio

Fine-tuning Handwriting Recognition systems with Temporal Dropout

This paper introduces a novel method to fine-tune handwriting recognition systems based on Recurrent Neural Networks (RNN). Long Short-Term Memory (LSTM) networks are good at modeling long sequences but they tend to overfit over time. To…

Computer Vision and Pattern Recognition · Computer Science 2021-02-02 Edgard Chammas, Chafic Mokbel

Tikhonov Regularization for Long Short-Term Memory Networks

It is well known that adding noise to the input data often improves network performance. While the dropout technique may cause memory loss when it is applied to recurrent connections, Tikhonov regularization, which can be…

Machine Learning · Computer Science 2017-08-11 Andrei Turkin

Regularizing and Optimizing LSTM Language Models

Recurrent neural networks (RNNs), such as long short-term memory networks (LSTMs), serve as a fundamental building block for many sequence learning tasks, including machine translation, language modeling, and question answering. In this…

Computation and Language · Computer Science 2017-08-09 Stephen Merity, Nitish Shirish Keskar, Richard Socher

Regularizing Recurrent Networks - On Injected Noise and Norm-based Methods

Advancements in parallel processing have led to a surge in applications of multilayer perceptrons (MLPs) and deep learning in the past decades. Recurrent Neural Networks (RNNs) give additional representational power to feedforward MLPs by…

Machine Learning · Statistics 2014-10-22 Saahil Ognawala, Justin Bayer

Macro-block dropout for improved regularization in training end-to-end speech recognition models

This paper proposes a new regularization algorithm referred to as macro-block dropout. The overfitting issue has been a difficult problem in training large neural network models. The dropout technique has proven to be simple yet very…

Machine Learning · Computer Science 2023-01-02 Chanwoo Kim, Sathish Indurti, Jinhwan Park, Wonyong Sung

State-Regularized Recurrent Neural Networks

Recurrent neural networks are a widely used class of neural architectures. They have, however, two shortcomings. First, it is difficult to understand what exactly they learn. Second, they tend to work poorly on sequences requiring long-term…

Machine Learning · Computer Science 2019-05-08 Cheng Wang, Mathias Niepert

Recurrent DNNs and its Ensembles on the TIMIT Phone Recognition Task

In this paper, we investigate recurrent deep neural networks (DNNs) in combination with regularization techniques such as dropout, zoneout, and regularization post-layer. As a benchmark, we chose the TIMIT phone recognition task due to its…

Computation and Language · Computer Science 2018-06-20 Jan Vanek, Josef Michalek, Josef Psutka

Adversarial Dropout for Recurrent Neural Networks

Successful applications that process sequential data, such as text and speech, require improved generalization performance from recurrent neural networks (RNNs). Dropout techniques for RNNs were introduced to respond to these demands, but we…

Machine Learning · Computer Science 2019-04-23 Sungrae Park, Kyungwoo Song, Mingi Ji, Wonsung Lee, Il-Chul Moon

Learning to Execute

Recurrent Neural Networks (RNNs) with Long Short-Term Memory units (LSTM) are widely used because they are expressive and are easy to train. Our interest lies in empirically evaluating the expressiveness and the learnability of LSTMs in the…

Neural and Evolutionary Computing · Computer Science 2015-11-24 Wojciech Zaremba, Ilya Sutskever

Deep Learning with Kernel Flow Regularization for Time Series Forecasting

Long Short-Term Memory (LSTM) neural networks have been widely used for time series forecasting problems. However, LSTMs are prone to overfitting and performance reduction during test phases. Several different regularization techniques have…

Machine Learning · Computer Science 2021-09-27 Mahdy Shirdel, Reza Asadi, Duc Do, Micheal Hintlian

Curriculum Dropout

Dropout is a very effective way of regularizing neural networks. Stochastically "dropping out" units with a certain probability discourages over-specific co-adaptations of feature detectors, preventing overfitting and improving network…

Neural and Evolutionary Computing · Computer Science 2017-08-04 Pietro Morerio, Jacopo Cavazza, Riccardo Volpi, Rene Vidal, Vittorio Murino

Regularization techniques for fine-tuning in neural machine translation

We investigate techniques for supervised domain adaptation for neural machine translation where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset. In this scenario, overfitting is a major…

Computation and Language · Computer Science 2017-08-01 Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, Rico Sennrich

Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition

Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that has been designed to address the vanishing and exploding gradient problems of conventional RNNs. Unlike feedforward neural networks, RNNs have cyclic…

Neural and Evolutionary Computing · Computer Science 2014-02-06 Haşim Sak, Andrew Senior, Françoise Beaufays

R-Drop: Regularized Dropout for Neural Networks

Dropout is a powerful and widely used technique to regularize the training of deep neural networks. In this paper, we introduce a simple regularization strategy upon dropout in model training, namely R-Drop, which forces the output…

Machine Learning · Computer Science 2021-11-01 Xiaobo Liang, Lijun Wu, Juntao Li, Yue Wang, Qi Meng, Tao Qin, Wei Chen, Min Zhang, Tie-Yan Liu

SoftTarget Regularization: An Effective Technique to Reduce Over-Fitting in Neural Networks

Deep neural networks are learning models with a very high capacity and are therefore prone to over-fitting. Many regularization techniques, such as Dropout, DropConnect, and weight decay, attempt to solve the problem of over-fitting by…

Machine Learning · Computer Science 2016-12-06 Armen Aghajanyan