Related papers: Recurrence-Complete Frame-based Action Models

200 papers

The recurrent network architecture is a widely used model in sequence modeling, but its serial dependency hinders the computation parallelization, which makes the operation inefficient. The same problem was encountered in serial adder at…

Machine Learning · Computer Science 2021-08-25 Haowei Jiang, Feiwei Qin, Jin Cao, Yong Peng, Yanli Shao

Recurrent neural networks (RNNs) are well suited for solving sequence tasks in resource-constrained systems due to their expressivity and low computational requirements. However, there is still a need to bridge the gap between what RNNs are…

Machine Learning · Computer Science 2023-03-13 Anand Subramoney, Khaleelulla Khan Nazeer, Mark Schöne, Christian Mayr, David Kappel

Recurrent neural networks (RNNs) have long been an architecture of interest for computational models of human sentence processing. The recently introduced Transformer architecture outperforms RNNs on many natural language processing tasks…

Computation and Language · Computer Science 2022-03-31 Danny Merkx, Stefan L. Frank

We present a novel architecture, residual attention net (RAN), which merges a sequence architecture, universal transformer, and a computer vision architecture, residual net, with a high-way architecture for cross-domain sequence modeling.…

Machine Learning · Computer Science 2020-01-14 Seth H. Huang, Xu Lingjie, Jiang Congwei

Recurrent neural networks (RNNs) provide state-of-the-art performance in processing sequential data but are memory intensive to train, limiting the flexibility of RNN models which can be trained. Reversible RNNs---RNNs for which the…

Machine Learning · Computer Science 2018-10-26 Matthew MacKay, Paul Vicol, Jimmy Ba, Roger Grosse

Recurrent neural networks (RNNs) are omnipresent in sequence modeling tasks. Practical models usually consist of several layers of hundreds or thousands of neurons which are fully connected. This places a heavy computational and memory…

Machine Learning · Computer Science 2019-05-30 Matthijs Van Keirsbilck, Alexander Keller, Xiaodong Yang

Linear Recurrence has proven to be a powerful tool for modeling long sequences efficiently. In this work, we show that existing models fail to take full advantage of its potential. Motivated by this finding, we develop GateLoop, a…

Machine Learning · Computer Science 2024-01-30 Tobias Katsch

The advent of Transformers marked a significant breakthrough in sequence modelling, providing a highly performant architecture capable of leveraging GPU parallelism. However, Transformers are computationally expensive at inference time,…

Machine Learning · Computer Science 2024-05-29 Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Mohamed Osama Ahmed, Yoshua Bengio, Greg Mori

Neural networks using transformer-based architectures have recently demonstrated great power and flexibility in modeling sequences of many types. One of the core components of transformer networks is the attention layer, which allows…

Machine Learning · Computer Science 2019-07-16 Matthew Spellings

This work introduces a novel Retention Layer mechanism for Transformer based architectures, addressing their inherent lack of intrinsic retention capabilities. Unlike human cognition, which can encode and dynamically recall symbolic…

Machine Learning · Computer Science 2025-01-17 M. Murat Yaslioglu

The key to a Transformer model is the self-attention mechanism, which allows the model to analyze an entire sequence in a computationally efficient manner. Recent work has suggested the possibility that general attention mechanisms used by…

Machine Learning · Computer Science 2020-01-01 Thomas Dowdell, Hongyu Zhang

With widespread adoption of electronic health records, there is an increased emphasis for predictive models that can effectively deal with clinical time-series data. Powered by Recurrent Neural Network (RNN) architectures with Long…

Machine Learning · Statistics 2018-07-17 Huan Song, Deepta Rajan, Jayaraman J. Thiagarajan, Andreas Spanias

Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and…

Computation and Language · Computer Science 2018-08-29 Ke Tran, Arianna Bisazza, Christof Monz

A longstanding challenge for the Machine Learning community is the one of developing models that are capable of processing and learning from very long sequences of data. The outstanding results of Transformers-based networks (e.g., Large…

Machine Learning · Computer Science 2024-02-15 Matteo Tiezzi, Michele Casoni, Alessandro Betti, Tommaso Guidi, Marco Gori, Stefano Melacci

In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance. We theoretically derive the connection…

Computation and Language · Computer Science 2023-08-10 Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei

Recent architectural developments have enabled recurrent neural networks (RNNs) to reach and even surpass the performance of Transformers on certain sequence modeling tasks. These modern RNNs feature a prominent design pattern: linear…

Countless learning tasks require dealing with sequential data. Image captioning, speech synthesis, and music generation all require that a model produce outputs that are sequences. In other domains, such as time series prediction, video…

Machine Learning · Computer Science 2015-10-20 Zachary C. Lipton, John Berkowitz, Charles Elkan

Building and maintaining state to learn policies and value functions is critical for deploying reinforcement learning (RL) agents in the real world. Recurrent neural networks (RNNs) have become a key point of interest for the state-building…

Machine Learning · Computer Science 2026-05-19 Matthew Schlegel, Volodymyr Tkachuk, Adam White, Martha White

Recurrent Neural Networks have long been the dominating choice for sequence modeling. However, it severely suffers from two issues: impotent in capturing very long-term dependencies and unable to parallelize the sequential computation…

Machine Learning · Computer Science 2019-07-15 Zhiwei Wang, Yao Ma, Zitao Liu, Jiliang Tang

Recurrent neural networks (RNNs) have been used extensively and with increasing success to model various types of sequential data. Much of this progress has been achieved through devising recurrent units and architectures with the…

Machine Learning · Statistics 2017-03-06 Yacine Jernite, Edouard Grave, Armand Joulin, Tomas Mikolov