Related papers: Accelerating recurrent neural network training usi…

Single stream parallelization of generalized LSTM-like RNNs on a GPU

Recurrent neural networks (RNNs) have shown outstanding performance on processing sequence data. However, they suffer from long training time, which demands parallel implementations of the training procedure. Parallelization of the training…

Neural and Evolutionary Computing · Computer Science 2015-11-25 Kyuyeon Hwang , Wonyong Sung

Curriculum Learning for Handwritten Text Line Recognition

Recurrent Neural Networks (RNN) have recently achieved the best performance in off-line Handwriting Text Recognition. At the same time, learning RNN by gradient descent leads to slow convergence, and training times are particularly long…

Machine Learning · Computer Science 2013-12-09 Jérôme Louradour , Christopher Kermorvant

An Optimized and Energy-Efficient Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks

Recurrent neural networks (RNN) have been successfully applied to various sequential decision-making tasks, natural language processing applications, and time-series predictions. Such networks are usually trained through back-propagation…

Machine Learning · Computer Science 2019-12-02 Julia El Zini , Yara Rizk , Mariette Awad

Learning Recurrent Binary/Ternary Weights

Recurrent neural networks (RNNs) have shown excellent performance in processing sequence data. However, they are both complex and memory intensive due to their recursive nature. These limitations make RNNs difficult to embed on mobile…

Machine Learning · Computer Science 2019-01-28 Arash Ardakani , Zhengyun Ji , Sean C. Smithson , Brett H. Meyer , Warren J. Gross

Parallelizing Legendre Memory Unit Training

Recently, a new recurrent neural network (RNN) named the Legendre Memory Unit (LMU) was proposed and shown to achieve state-of-the-art performance on several benchmark datasets. Here we leverage the linear time-invariant (LTI) memory…

Machine Learning · Computer Science 2021-05-12 Narsimha Chilkuri , Chris Eliasmith

Accelerating recurrent neural network language model based online speech recognition system

This paper presents methods to accelerate recurrent neural network based language models (RNNLMs) for online speech recognition systems. Firstly, a lossy compression of the past hidden layer outputs (history vector) with caching is…

Computation and Language · Computer Science 2018-01-31 Kyungmin Lee , Chiyoun Park , Namhoon Kim , Jaewon Lee

Parallelizing Linear Recurrent Neural Nets Over Sequence Length

Recurrent neural networks (RNNs) are widely used to model sequential data but their non-linear dependencies between sequence elements prevent parallelizing training over sequence length. We show the training of RNNs with only linear…

Neural and Evolutionary Computing · Computer Science 2018-02-23 Eric Martin , Chris Cundy

A comprehensive study of batch construction strategies for recurrent neural networks in MXNet

In this work we compare different batch construction methods for mini-batch training of recurrent neural networks. While popular implementations like TensorFlow and MXNet suggest a bucketing approach to improve the parallelization…

Machine Learning · Computer Science 2017-05-09 Patrick Doetsch , Pavel Golik , Hermann Ney

Hybrid Data-Model Parallel Training for Sequence-to-Sequence Recurrent Neural Network Machine Translation

Reduction of training time is an important issue in many tasks like patent translation involving neural networks. Data parallelism and model parallelism are two common approaches for reducing training time using multiple graphics processing…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-09-10 Junya Ono , Masao Utiyama , Eiichiro Sumita

Parallelizing non-linear sequential models over the sequence length

Sequential models, such as Recurrent Neural Networks and Neural Ordinary Differential Equations, have long suffered from slow training due to their inherent sequential nature. For many years this bottleneck has persisted, as many thought…

Machine Learning · Computer Science 2024-01-17 Yi Heng Lim , Qi Zhu , Joshua Selfridge , Muhammad Firmansyah Kasim

Single Stream Parallelization of Recurrent Neural Networks for Low Power and Fast Inference

As neural network algorithms show high performance in many applications, their efficient inference on mobile and embedded systems are of great interests. When a single stream recurrent neural network (RNN) is executed for a personal user in…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-04-02 Wonyong Sung , Jinhwan Park

Parallel training of linear models without compromising convergence

In this paper we analyze, evaluate, and improve the performance of training generalized linear models on modern CPUs. We start with a state-of-the-art asynchronous parallel training algorithm, identify system-level performance bottlenecks,…

Machine Learning · Computer Science 2018-12-20 Nikolas Ioannou , Celestine Dünner , Kornilios Kourtis , Thomas Parnell

Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks

Recurrent neural networks (RNNs) in the brain and in silico excel at solving tasks with intricate temporal dependencies. Long timescales required for solving such tasks can arise from properties of individual neurons (single-neuron…

Neural and Evolutionary Computing · Computer Science 2024-10-31 Sina Khajehabdollahi , Roxana Zeraati , Emmanouil Giannakakis , Tim Jakob Schäfer , Georg Martius , Anna Levina

Learning to Remember More with Less Memorization

Memory-augmented neural networks consisting of a neural controller and an external memory have shown potentials in long-term sequential learning. Current RAM-like memory models maintain memory accessing every timesteps, thus they do not…

Machine Learning · Computer Science 2019-03-21 Hung Le , Truyen Tran , Svetha Venkatesh

Convolutional Sequence to Sequence Learning

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to…

Computation and Language · Computer Science 2017-07-26 Jonas Gehring , Michael Auli , David Grangier , Denis Yarats , Yann N. Dauphin

Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences

Recurrent Neural Networks (RNNs) have become the state-of-the-art choice for extracting patterns from temporal sequences. However, current RNN models are ill-suited to process irregularly sampled data triggered by events generated in…

Machine Learning · Computer Science 2016-11-01 Daniel Neil , Michael Pfeiffer , Shih-Chii Liu

Optimizing Performance of Recurrent Neural Networks on GPUs

As recurrent neural networks become larger and deeper, training times for single networks are rising into weeks or even months. As such there is a significant incentive to improve the performance and scalability of these networks. While…

Machine Learning · Computer Science 2016-04-08 Jeremy Appleyard , Tomas Kocisky , Phil Blunsom

Sliced Recurrent Neural Networks

Recurrent neural networks have achieved great success in many NLP tasks. However, they have difficulty in parallelization because of the recurrent structure, so it takes much time to train RNNs. In this paper, we introduce sliced recurrent…

Computation and Language · Computer Science 2018-07-09 Zeping Yu , Gongshen Liu

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the…

Machine Learning · Computer Science 2015-09-24 Samy Bengio , Oriol Vinyals , Navdeep Jaitly , Noam Shazeer

Image Classification using Sequence of Pixels

This study compares sequential image classification methods based on recurrent neural networks. We describe methods based on recurrent neural networks such as Long-Short-Term memory(LSTM), bidirectional Long-Short-Term memory(BiLSTM)…

Image and Video Processing · Electrical Eng. & Systems 2022-09-26 Gajraj Kuldeep