Related papers: Multigrid Neural Memory
Intelligence necessitates memory. Without memory, humans fail to perform various nontrivial tasks such as reading novels, playing games or solving maths. As the ultimate goal of machine learning is to derive intelligent systems that learn…
We propose a multigrid extension of convolutional neural networks (CNNs). Rather than manipulating representations living on a single spatial grid, our network layers operate across scale space, on a pyramid of grids. They consume multigrid…
Memory is fundamental to intelligence, enabling learning, reasoning, and adaptability across biological and artificial systems. While Transformer architectures excel at sequence modeling, they face critical limitations in long-range context…
A remarkable capacity of the brain is its ability to autonomously reorganize memories during offline periods. Memory replay, a mechanism hypothesized to underlie biological offline learning, has inspired offline methods for reducing…
Seeking effective neural networks is a critical and practical field in deep learning. Besides designing the depth, type of convolution, normalization, and nonlinearities, the topological connectivity of neural networks is also important.…
Recurrent neural networks can learn complex transduction problems that require maintaining and actively exploiting a memory of their inputs. Such models traditionally consider memory and input-output functionalities indissolubly entangled.…
The essence of multivariate sequential learning is all about how to extract dependencies in data. These data sets, such as hourly medical records in intensive care units and multi-frequency phonetic time series, often time exhibit not only…
Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficulty of training memory models remains a problem obstructing the widespread use of such models. In this…
Memory units have been widely used to enrich the capabilities of deep networks on capturing long-term dependencies in reasoning and prediction tasks, but little investigation exists on deep generative models (DGMs) which are good at…
In this paper, we propose a novel ensembling technique for deep neural networks, which is able to drastically reduce the required memory compared to alternative approaches. In particular, we propose to extract multiple sub-networks from a…
Continual acquisition of novel experience without interfering previously learned knowledge, i.e. continual learning, is critical for artificial neural networks, but limited by catastrophic forgetting. A neural network adjusts its parameters…
The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be…
Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through…
Memory networks are neural networks with an explicit memory component that can be both read and written to by the network. The memory is often addressed in a soft way using a softmax function, making end-to-end training with backpropagation…
Memory is an important cognitive function for humans. How a brain with such a small power can complete such a complex memory function, the working mechanism behind this is undoubtedly fascinating. Engram theory views memory as the…
Temporal networks are widely used models for describing the architecture of complex systems. Network memory -- that is the dependence of a temporal network's structure on its past -- has been shown to play a prominent role in diffusion,…
Many scientific areas, from computer science to the environmental sciences and finance, give rise to multivariate time series which exhibit long memory, or loosely put, a slow decay in their autocorrelation structure. Efficient modelling…
Traditional neural networks require enormous amounts of data to build their complex mappings during a slow training procedure that hinders their abilities for relearning and adapting to new data. Memory-augmented neural networks enhance…
Memories in the brain are separated in two categories: short-term and long-term memories. Long-term memories remain for a lifetime, while short-term ones exist from a few milliseconds to a few minutes. Within short-term memory studies,…
We present a new distributed representation in deep neural nets wherein the information is represented in native form as a matrix. This differs from current neural architectures that rely on vector representations. We consider matrices as…