Related papers: Maelstrom Networks

Recurrent Memory Transformer

Transformer-based models show their effectiveness across multiple domains and tasks. Self-attention makes it possible to combine information from all sequence elements into context-aware representations. However, global and local information has…

Computation and Language · Computer Science 2022-12-09 Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev

Neural network design for J function approximation in dynamic programming

This paper shows that a new type of artificial neural network (ANN) -- the Simultaneous Recurrent Network (SRN) -- can, if properly trained, solve a difficult function approximation problem which conventional ANNs -- either feedforward or…

adap-org · Physics 2007-05-23 X. Pang, P. Werbos

Adaptive Compression-based Lifelong Learning

The problem of a deep learning model losing performance on a previously learned task when fine-tuned to a new one is a phenomenon known as Catastrophic forgetting. There are two major ways to mitigate this problem: either preserving…

Computer Vision and Pattern Recognition · Computer Science 2019-07-24 Shivangi Srivastava, Maxim Berman, Matthew B. Blaschko, Devis Tuia

Combining Recurrent Neural Networks and Adversarial Training for Human Motion Synthesis and Control

This paper introduces a new generative deep learning network for human motion synthesis and control. Our key idea is to combine recurrent neural networks (RNNs) and adversarial training for human motion modeling. We first describe an…

Graphics · Computer Science 2018-06-25 Zhiyong Wang, Jinxiang Chai, Shihong Xia

Long Timescale Credit Assignment in Neural Networks with External Memory

Credit assignment in traditional recurrent neural networks usually involves back-propagating through a long chain of tied weight matrices. The length of this chain scales linearly with the number of time-steps as the same network is run at…

Artificial Intelligence · Computer Science 2017-01-17 Steven Stenberg Hansen

Working Memory for Online Memory Binding Tasks: A Hybrid Model

Working Memory is the brain module that holds and manipulates information online. In this work, we design a hybrid model in which a simple feed-forward network is coupled to a balanced random network via a read-write vector called the…

Neural and Evolutionary Computing · Computer Science 2020-12-15 Seyed Mohammad Mahdi Heidarpoor Yazdi, Abdolhossein Abbassian

A neural network model of perception and reasoning

How perception and reasoning arise from neuronal network activity is poorly understood. This is reflected in the fundamental limitations of connectionist artificial intelligence, typified by deep neural networks trained via gradient-based…

Artificial Intelligence · Computer Science 2020-02-27 Paul J. Blazek, Milo M. Lin

Pseudo-Rehearsal: Achieving Deep Reinforcement Learning without Catastrophic Forgetting

Neural networks can achieve excellent results in a wide variety of applications. However, when they attempt to sequentially learn, they tend to learn the new task while catastrophically forgetting previous ones. We propose a model that…

Machine Learning · Computer Science 2020-12-18 Craig Atkinson, Brendan McCane, Lech Szymanski, Anthony Robins

Schematic Memory Persistence and Transience for Efficient and Robust Continual Learning

Continual learning is considered a promising step towards next-generation Artificial Intelligence (AI), where deep neural networks (DNNs) make decisions by continuously learning a sequence of different tasks akin to human learning…

Machine Learning · Computer Science 2021-05-06 Yuyang Gao, Giorgio A. Ascoli, Liang Zhao

Towards Training Recurrent Neural Networks for Lifelong Learning

Catastrophic forgetting and capacity saturation are the central challenges of any parametric lifelong learning system. In this work, we study these challenges in the context of sequential supervised learning with an emphasis on recurrent…

Machine Learning · Computer Science 2019-09-10 Shagun Sodhani, Sarath Chandar, Yoshua Bengio

Recurrent neural networks: vanishing and exploding gradients are not the end of the story

Recurrent neural networks (RNNs) notoriously struggle to learn long-term memories, primarily due to vanishing and exploding gradients. The recent success of state-space models (SSMs), a subclass of RNNs, to overcome such difficulties…

Machine Learning · Computer Science 2024-11-06 Nicolas Zucchet, Antonio Orvieto

Meta-Learning Neural Bloom Filters

There has been a recent trend in training neural networks to replace data structures that have been crafted by hand, with an aim for faster execution, better accuracy, or greater compression. In this setting, a neural data structure is…

Machine Learning · Computer Science 2019-06-12 Jack W Rae, Sergey Bartunov, Timothy P Lillicrap

A Comprehensive Study of Class Incremental Learning Algorithms for Visual Tasks

The ability of artificial agents to increment their capabilities when confronted with new data is an open challenge in artificial intelligence. The main challenge faced in such cases is catastrophic forgetting, i.e., the tendency of neural…

Machine Learning · Computer Science 2020-12-16 Eden Belouadah, Adrian Popescu, Ioannis Kanellos

Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting

Deep learning is often criticized for two serious issues which rarely exist in natural nervous systems: overfitting and catastrophic forgetting. It can even memorize randomly labelled data, which has little knowledge behind the…

Machine Learning · Computer Science 2021-05-11 Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, Dacheng Tao, Masashi Sugiyama

Learning to Remember, Forget and Ignore using Attention Control in Memory

Typical neural networks with external memory do not effectively separate capacity for episodic and working memory as is required for reasoning in humans. Applying knowledge gained from psychological studies, we designed a new model called…

Machine Learning · Computer Science 2018-10-01 T. S. Jayram, Younes Bouhadjar, Ryan L. McAvoy, Tomasz Kornuta, Alexis Asseman, Kamil Rocki, Ahmet S. Ozcan

Overcome Anterograde Forgetting with Cycled Memory Networks

Learning from a sequence of tasks for a lifetime is essential for an agent towards artificial general intelligence. This requires the agent to continuously learn and memorize new knowledge without interference. This paper first demonstrates…

Machine Learning · Computer Science 2021-12-07 Jian Peng, Dingqi Ye, Bo Tang, Yinjie Lei, Yu Liu, Haifeng Li

HyperNetworks

This work explores hypernetworks: an approach of using one network, also known as a hypernetwork, to generate the weights for another network. Hypernetworks provide an abstraction that is similar to what is found in nature: the…

Machine Learning · Computer Science 2016-12-02 David Ha, Andrew Dai, Quoc V. Le

Relational recurrent neural networks

Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods. It is unclear, however, whether they also have an ability to perform complex relational reasoning with the information they…

Machine Learning · Computer Science 2018-06-29 Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap

Which Neural Network Architecture matches Human Behavior in Artificial Grammar Learning?

In recent years artificial neural networks achieved performance close to or better than humans in several domains: tasks that were previously human prerogatives, such as language processing, have witnessed remarkable improvements in state…

Neurons and Cognition · Quantitative Biology 2019-02-14 Andrea Alamia, Victor Gauducheau, Dimitri Paisios, Rufin VanRullen

Convolutional Neural Associative Memories: Massive Capacity with Noise Tolerance

The task of a neural associative memory is to retrieve a set of previously memorized patterns from their noisy versions using a network of neurons. An ideal network should have the ability to 1) learn a set of patterns as they arrive, 2)…

Neural and Evolutionary Computing · Computer Science 2014-07-25 Amin Karbasi, Amir Hesam Salavati, Amin Shokrollahi