Related papers: Parallelizable memory recurrent units

Improving the Performance and Learning Stability of Parallelizable RNNs Designed for Ultra-Low Power Applications

Sequence learning is dominated by Transformers and parallelizable recurrent neural networks (RNNs) such as state-space models, yet learning long-term dependencies remains challenging, and state-of-the-art designs trade power consumption for…

Machine Learning · Computer Science 2026-05-13 Julien Brandoit, Arthur Fyon, Damien Ernst, Guillaume Drion

Simple Recurrent Units for Highly Parallelizable Recurrence

Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU), a light recurrent unit that balances model capacity and…

Computation and Language · Computer Science 2018-09-10 Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, Yoav Artzi

Single Stream Parallelization of Recurrent Neural Networks for Low Power and Fast Inference

As neural network algorithms show high performance in many applications, their efficient inference on mobile and embedded systems are of great interests. When a single stream recurrent neural network (RNN) is executed for a personal user in…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-04-02 Wonyong Sung, Jinhwan Park

Parallelizing Legendre Memory Unit Training

Recently, a new recurrent neural network (RNN) named the Legendre Memory Unit (LMU) was proposed and shown to achieve state-of-the-art performance on several benchmark datasets. Here we leverage the linear time-invariant (LTI) memory…

Machine Learning · Computer Science 2021-05-12 Narsimha Chilkuri, Chris Eliasmith

Resurrecting Recurrent Neural Networks for Long Sequences

Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are hard to optimize and slow to train. Deep state-space models (SSMs) have recently been shown to perform remarkably well on long sequence modeling tasks, and have…

Machine Learning · Computer Science 2023-03-14 Antonio Orvieto, Samuel L Smith, Albert Gu, Anushan Fernando, Caglar Gulcehre, Razvan Pascanu, Soham De

ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models

Recurrent Neural Networks (RNNs) laid the foundation for sequence modeling, but their intrinsic sequential nature restricts parallel computation, creating a fundamental barrier to scaling. This has led to the dominance of parallelizable…

Machine Learning · Computer Science 2025-11-04 Federico Danieli, Pau Rodriguez, Miguel Sarabia, Xavier Suau, Luca Zappella

Improved state mixing in higher-order and block diagonal linear recurrent networks

Linear recurrent networks (LRNNs) and linear state space models (SSMs) promise computational and memory efficiency on long-sequence modeling tasks, yet their diagonal state transitions limit expressivity. Dense and nonlinear architectures…

Machine Learning · Computer Science 2026-03-03 Igor Dubinin, Antonio Orvieto, Felix Effenberger

RotRNN: Modelling Long Sequences with Rotations

Linear recurrent neural networks, such as State Space Models (SSMs) and Linear Recurrent Units (LRUs), have recently shown state-of-the-art performance on long sequence modelling benchmarks. Despite their success, their empirical…

Machine Learning · Computer Science 2024-10-08 Kai Biegun, Rares Dolga, Jake Cunningham, David Barber

Were RNNs All We Needed?

The introduction of Transformers in 2017 reshaped the landscape of deep learning. Originally proposed for sequence modelling, Transformers have since achieved widespread success across various domains. However, the scalability limitations…

Machine Learning · Computer Science 2024-12-02 Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio, Hossein Hajimirsadeghi

Spatially-Enhanced Recurrent Memory for Long-Range Mapless Navigation via End-to-End Reinforcement Learning

Recent advancements in robot navigation, particularly with end-to-end learning approaches such as reinforcement learning (RL), have demonstrated strong performance. However, successful navigation still depends on two key capabilities:…

Robotics · Computer Science 2025-09-05 Fan Yang, Per Frivik, David Hoeller, Chen Wang, Cesar Cadena, Marco Hutter

Bidirectional Linear Recurrent Models for Sequence-Level Multisource Fusion

Sequence modeling is a critical yet challenging task with wide-ranging applications, especially in time series forecasting for domains like weather prediction, temperature monitoring, and energy load forecasting. Transformers, with their…

Machine Learning · Computer Science 2025-04-15 Qisai Liu, Zhanhong Jiang, Joshua R. Waite, Chao Liu, Aditya Balu, Soumik Sarkar

Recurrent neural networks: vanishing and exploding gradients are not the end of the story

Recurrent neural networks (RNNs) notoriously struggle to learn long-term memories, primarily due to vanishing and exploding gradients. The recent success of state-space models (SSMs), a subclass of RNNs, to overcome such difficulties…

Machine Learning · Computer Science 2024-11-06 Nicolas Zucchet, Antonio Orvieto

Prototypical Recurrent Unit

Despite the great successes of deep learning, the effectiveness of deep neural networks has not been understood at any theoretical depth. This work is motivated by the thrust of developing a deeper understanding of recurrent neural…

Machine Learning · Computer Science 2018-02-12 Dingkun Long, Richong Zhang, Yongyi Mao

Accuracy, Memory Efficiency and Generalization: A Comparative Study on Liquid Neural Networks and Recurrent Neural Networks

This review aims to conduct a comparative analysis of liquid neural networks (LNNs) and traditional recurrent neural networks (RNNs) and their variants, such as long short-term memory networks (LSTMs) and gated recurrent units (GRUs). The…

Machine Learning · Computer Science 2025-10-10 Shilong Zong, Alex Bierly, Almuatazbellah Boker, Hoda Eldardiry

Parallel Spiking Unit for Efficient Training of Spiking Neural Networks

Efficient parallel computing has become a pivotal element in advancing artificial intelligence. Yet, the deployment of Spiking Neural Networks (SNNs) in this domain is hampered by their inherent sequential computational dependency. This…

Neural and Evolutionary Computing · Computer Science 2024-06-11 Yang Li, Yinqian Sun, Xiang He, Yiting Dong, Dongcheng Zhao, Yi Zeng

Learning Long Sequences in Spiking Neural Networks

Spiking neural networks (SNNs) take inspiration from the brain to enable energy-efficient computations. Since the advent of Transformers, SNNs have struggled to compete with artificial networks on modern sequential tasks, as they inherit…

Neural and Evolutionary Computing · Computer Science 2024-01-03 Matei Ioan Stan, Oliver Rhodes

Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

Always-on AI applications, from environmental sensors to biomedical implants, require ultra-low power consumption. Analog circuits offer a path to sub-microwatt inference, yet existing analog implementations are limited to feedforward…

Hardware Architecture · Computer Science 2026-05-27 Arthur Fyon, Julien Brandoit, Loris Mendolia, Damien Ernst, Jean-Michel Redouté, Guillaume Drion

Memory Visualization for Gated Recurrent Neural Networks in Speech Recognition

Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly the ones with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU). However, the dynamic properties behind the…

Machine Learning · Computer Science 2017-02-28 Zhiyuan Tang, Ying Shi, Dong Wang, Yang Feng, Shiyue Zhang

Modeling Irregular Time Series with Continuous Recurrent Units

Recurrent neural networks (RNNs) are a popular choice for modeling sequential data. Modern RNN architectures assume constant time-intervals between observations. However, in many datasets (e.g. medical records) observation times are…

Machine Learning · Computer Science 2022-07-27 Mona Schirmer, Mazin Eltayeb, Stefan Lessmann, Maja Rudolph

MCRM: Mother Compact Recurrent Memory

LSTMs and GRUs are the most common recurrent neural network architectures used to solve temporal sequence problems. The two architectures have differing data flows dealing with a common component called the cell state (also referred to as…

Neural and Evolutionary Computing · Computer Science 2019-08-08 Abduallah A. Mohamed, Christian Claudel