Related papers: Robustifying Sequential Neural Processes
Neural processes (NPs) learn stochastic processes and predict the distribution of target output adaptively conditioned on a context set of observed input-output pairs. Furthermore, Attentive Neural Process (ANP) improved the prediction…
The transformer architecture and variants presented remarkable success across many machine learning tasks in recent years. This success is intrinsically related to the capability of handling long sequences and the presence of…
Continual learning aims to provide intelligent agents capable of learning multiple tasks sequentially with neural networks. One of its main challenging, catastrophic forgetting, is caused by the neural networks non-optimal ability to learn…
Despite the advantageous subquadratic complexity of modern recurrent deep learning models -- such as state-space models (SSMs) -- recent studies have highlighted their potential shortcomings compared to transformers on reasoning and…
Transformer-based models show their effectiveness across multiple domains and tasks. The self-attention allows to combine information from all sequence elements into context-aware representations. However, global and local information has…
Attention-based sequence-to-sequence automatic speech recognition (ASR) requires a significant delay to recognize long utterances because the output is generated after receiving entire input sequences. Although several studies recently…
Most existing Convolutional Neural Networks(CNNs) used for action recognition are either difficult to optimize or underuse crucial temporal information. Inspired by the fact that the recurrent model consistently makes breakthroughs in the…
Past work has long recognized the important role of context in guiding how humans search their memory. While context-based memory models can explain many memory phenomena, it remains unclear why humans develop such architectures over…
In this paper, we propose a recurrent neural network (RNN) with residual attention (RRA) to learn long-range dependencies from sequential data. We propose to add residual connections across timesteps to RNN, which explicitly enhances the…
Recurrent Neural Networks have long been the dominating choice for sequence modeling. However, it severely suffers from two issues: impotent in capturing very long-term dependencies and unable to parallelize the sequential computation…
Self-attention models have been successfully applied in end-to-end speech recognition systems, which greatly improve the performance of recognition accuracy. However, such attention-based models cannot be used in online speech recognition,…
Practical recommender systems need be periodically retrained to refresh the model with new interaction data. To pursue high model fidelity, it is usually desirable to retrain the model on both historical and new data, since it can account…
Transformers have shown dominant performance across a range of domains including language and vision. However, their computational cost grows quadratically with the sequence length, making their usage prohibitive for resource-constrained…
This paper proposes a novel framework for recurrent neural networks (RNNs) inspired by the human memory models in the field of cognitive neuroscience to enhance information processing and transmission between adjacent RNNs' units. The…
Recurrent neural networks (RNNs) have achieved great success in language modeling. However, since the RNNs have fixed size of memory, their memory cannot store all the information about the words it have seen before in the sentence, and…
Sequential dynamics are a key feature of many modern recommender systems, which seek to capture the `context' of users' activities on the basis of actions they have performed recently. To capture such patterns, two approaches have…
Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series. Currently, most of multiscale RNNs use fixed scales,…
With the emergence of massively parallel processing units, parallelization has become a desirable property for new sequence models. The ability to parallelize the processing of sequences with respect to the sequence length during training…
Recurrent neural networks (RNNs) trained using Equilibrium Propagation (EP), a biologically plausible training algorithm, have demonstrated strong performance in various tasks such as image classification and reinforcement learning.…
Recent work has shown the efficiency of deep learning models such as Fully Convolutional Networks (FCN) or Recurrent Neural Networks (RNN) to deal with Time Series Regression (TSR) problems. These models sometimes need a lot of data to be…