Related papers: Robustifying Sequential Neural Processes

Recurrent Attentive Neural Process for Sequential Data

Neural processes (NPs) learn stochastic processes and predict the distribution of target output adaptively conditioned on a context set of observed input-output pairs. Furthermore, Attentive Neural Process (ANP) improved the prediction…

Machine Learning · Computer Science 2019-10-22 Shenghao Qin, Jiacheng Zhu, Jimmy Qin, Wenshuo Wang, Ding Zhao

Transformers are Meta-Reinforcement Learners

The transformer architecture and variants presented remarkable success across many machine learning tasks in recent years. This success is intrinsically related to the capability of handling long sequences and the presence of…

Machine Learning · Computer Science 2022-06-15 Luckeciano C. Melo

Self-Attention Meta-Learner for Continual Learning

Continual learning aims to provide intelligent agents capable of learning multiple tasks sequentially with neural networks. One of its main challenges, catastrophic forgetting, is caused by the neural networks' non-optimal ability to learn…

Machine Learning · Computer Science 2021-01-29 Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy

Revisiting associative recall in modern recurrent models

Despite the advantageous subquadratic complexity of modern recurrent deep learning models -- such as state-space models (SSMs) -- recent studies have highlighted their potential shortcomings compared to transformers on reasoning and…

Machine Learning · Computer Science 2025-10-13 Destiny Okpekpe, Antonio Orvieto

Recurrent Memory Transformer

Transformer-based models show their effectiveness across multiple domains and tasks. The self-attention allows to combine information from all sequence elements into context-aware representations. However, global and local information has…

Computation and Language · Computer Science 2022-12-09 Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev

Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition

Attention-based sequence-to-sequence automatic speech recognition (ASR) requires a significant delay to recognize long utterances because the output is generated after receiving entire input sequences. Although several studies recently…

Computation and Language · Computer Science 2020-11-05 Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Multi-Level Recurrent Residual Networks for Action Recognition

Most existing Convolutional Neural Networks (CNNs) used for action recognition are either difficult to optimize or underuse crucial temporal information. Inspired by the fact that the recurrent model consistently makes breakthroughs in the…

Computer Vision and Pattern Recognition · Computer Science 2018-01-04 Zhenxing Zheng, Gaoyun An, Qiuqi Ruan

Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search

Past work has long recognized the important role of context in guiding how humans search their memory. While context-based memory models can explain many memory phenomena, it remains unclear why humans develop such architectures over…

Neurons and Cognition · Quantitative Biology 2025-06-24 Nikolaus Salvatore, Qiong Zhang

RRA: Recurrent Residual Attention for Sequence Learning

In this paper, we propose a recurrent neural network (RNN) with residual attention (RRA) to learn long-range dependencies from sequential data. We propose to add residual connections across timesteps to RNN, which explicitly enhances the…

Machine Learning · Computer Science 2017-09-19 Cheng Wang

R-Transformer: Recurrent Neural Network Enhanced Transformer

Recurrent Neural Networks have long been the dominating choice for sequence modeling. However, they suffer severely from two issues: they are impotent in capturing very long-term dependencies and unable to parallelize the sequential computation…

Machine Learning · Computer Science 2019-07-15 Zhiwei Wang, Yao Ma, Zitao Liu, Jiliang Tang

Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition

Self-attention models have been successfully applied in end-to-end speech recognition systems, which greatly improve the performance of recognition accuracy. However, such attention-based models cannot be used in online speech recognition,…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-24 Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao

How to Retrain Recommender System? A Sequential Meta-Learning Method

Practical recommender systems need be periodically retrained to refresh the model with new interaction data. To pursue high model fidelity, it is usually desirable to retrain the model on both historical and new data, since it can account…

Information Retrieval · Computer Science 2021-01-15 Yang Zhang, Fuli Feng, Chenxu Wang, Xiangnan He, Meng Wang, Yan Li, Yongdong Zhang

Segmented Recurrent Transformer: An Efficient Sequence-to-Sequence Model

Transformers have shown dominant performance across a range of domains including language and vision. However, their computational cost grows quadratically with the sequence length, making their usage prohibitive for resource-constrained…

Computation and Language · Computer Science 2023-10-24 Yinghan Long, Sayeed Shafayet Chowdhury, Kaushik Roy

A Novel Framework for Recurrent Neural Networks with Enhancing Information Processing and Transmission between Units

This paper proposes a novel framework for recurrent neural networks (RNNs) inspired by the human memory models in the field of cognitive neuroscience to enhance information processing and transmission between adjacent RNNs' units. The…

Neural and Evolutionary Computing · Computer Science 2018-06-05 Xi Chen, Zhihong Deng, Gehui Shen, Ting Huang

Attention-based Memory Selection Recurrent Network for Language Modeling

Recurrent neural networks (RNNs) have achieved great success in language modeling. However, since RNNs have a fixed memory size, their memory cannot store all the information about the words they have seen before in the sentence, and…

Computation and Language · Computer Science 2016-11-29 Da-Rong Liu, Shun-Po Chuang, Hung-yi Lee

Self-Attentive Sequential Recommendation

Sequential dynamics are a key feature of many modern recommender systems, which seek to capture the 'context' of users' activities on the basis of actions they have performed recently. To capture such patterns, two approaches have…

Information Retrieval · Computer Science 2018-08-30 Wang-Cheng Kang, Julian McAuley

Learning to Adaptively Scale Recurrent Neural Networks

Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series. Currently, most of multiscale RNNs use fixed scales,…

Machine Learning · Computer Science 2019-02-18 Hao Hu, Liqiang Wang, Guo-Jun Qi

Parallelizable memory recurrent units

With the emergence of massively parallel processing units, parallelization has become a desirable property for new sequence models. The ability to parallelize the processing of sequences with respect to the sequence length during training…

Machine Learning · Computer Science 2026-05-19 Florent De Geeter, Gaspard Lambrechts, Damien Ernst, Guillaume Drion

Toward Lifelong Learning in Equilibrium Propagation: Sleep-like and Awake Rehearsal for Enhanced Stability

Recurrent neural networks (RNNs) trained using Equilibrium Propagation (EP), a biologically plausible training algorithm, have demonstrated strong performance in various tasks such as image classification and reinforcement learning.…

Machine Learning · Computer Science 2025-08-21 Yoshimasa Kubo, Jean Erik Delanois, Maxim Bazhenov

Multimodal Meta-Learning for Time Series Regression

Recent work has shown the efficiency of deep learning models such as Fully Convolutional Networks (FCN) or Recurrent Neural Networks (RNN) to deal with Time Series Regression (TSR) problems. These models sometimes need a lot of data to be…

Machine Learning · Computer Science 2021-11-03 Sebastian Pineda Arango, Felix Heinrich, Kiran Madhusudhanan, Lars Schmidt-Thieme