Related papers: Maelstrom Networks

DecisiveNets: Training Deep Associative Memories to Solve Complex Machine Learning Problems

Learning deep representations to solve complex machine learning tasks has become the prominent trend in the past few years. Indeed, Deep Neural Networks are now the golden standard in domains as various as computer vision, natural language…

Machine Learning · Computer Science 2020-12-04 Vincent Gripon, Carlos Lassance, Ghouthi Boukli Hacene

Drift to Remember

Lifelong learning in artificial intelligence (AI) aims to mimic the biological brain's ability to continuously learn and retain knowledge, yet it faces challenges such as catastrophic forgetting. Recent neuroscience research suggests that…

Artificial Intelligence · Computer Science 2024-09-24 Jin Du, Xinhe Zhang, Hao Shen, Xun Xian, Ganghua Wang, Jiawei Zhang, Yuhong Yang, Na Li, Jia Liu, Jie Ding

Recent Advances in Recurrent Neural Networks

Recurrent neural networks (RNNs) are capable of learning features and long term dependencies from sequential and time-series data. The RNNs have a stack of non-linear units where at least one connection between units forms a directed cycle.…

Neural and Evolutionary Computing · Computer Science 2018-02-26 Hojjat Salehinejad, Sharan Sankar, Joseph Barfett, Errol Colak, Shahrokh Valaee

Human-like Working Memory Interference in Large Language Models

Intelligent systems must maintain and manipulate task-relevant information online to adapt to dynamic environments and changing goals. This capacity, known as working memory, is fundamental to human reasoning and intelligence. Despite…

Machine Learning · Computer Science 2026-04-14 Hua-Dong Xiong, Li Ji-An, Jiaqi Huang, Robert C. Wilson, Kwonjoon Lee, Xue-Xin Wei

Auto-Compressing Networks

Deep neural networks with short residual connections have demonstrated remarkable success across domains, but increasing depth often introduces computational redundancy without corresponding improvements in representation quality. We…

Machine Learning · Computer Science 2025-11-10 Vaggelis Dorovatas, Georgios Paraskevopoulos, Alexandros Potamianos

One-shot Learning with Memory-Augmented Neural Networks

Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through…

Machine Learning · Computer Science 2016-05-20 Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

EWGN: Elastic Weight Generation and Context Switching in Deep Learning

The ability to learn and retain a wide variety of tasks is a hallmark of human intelligence that has inspired research in artificial general intelligence. Continual learning approaches provide a significant step towards achieving this goal.…

Machine Learning · Computer Science 2025-06-04 Shriraj P. Sawant, Krishna P. Miyapuram

A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning

Current deep learning methods are regarded as favorable if they empirically perform well on dedicated test sets. This mentality is seamlessly reflected in the resurfacing area of continual learning, where consecutively arriving data is…

Machine Learning · Computer Science 2023-01-25 Martin Mundt, Yongwon Hong, Iuliia Pliushch, Visvanathan Ramesh

An introduction to artificial neural networks

Artificial neural networks are algorithms which have been developed to tackle a range of computational problems. These range from modelling brain function to making predictions of time-dependent phenomena to solving hard (NP-complete)…

Astrophysics · Physics 2007-05-23 C. A. L. Bailer-Jones, R. Gupta, H. P. Singh

Recurrent Memory Networks for Language Modeling

Recurrent Neural Networks (RNN) have obtained excellent result in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose Recurrent…

Computation and Language · Computer Science 2016-04-25 Ke Tran, Arianna Bisazza, Christof Monz

Incremental Concept Learning via Online Generative Memory Recall

The ability to learn more and more concepts over time from incrementally arriving data is essential for the development of a life-long learning system. However, deep neural networks often suffer from forgetting previously learned concepts…

Machine Learning · Computer Science 2019-07-08 Huaiyu Li, Weiming Dong, Bao-Gang Hu

Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy

The online learning of deep neural networks is an interesting problem of machine learning because, for example, major IT companies want to manage the information of the massive data uploaded on the web daily, and this technology can…

Machine Learning · Computer Science 2015-06-16 Sang-Woo Lee, Min-Oh Heo, Jiwon Kim, Jeonghee Kim, Byoung-Tak Zhang

Slow manifolds in recurrent networks encode working memory efficiently and robustly

Working memory is a cognitive function involving the storage and manipulation of latent information over brief intervals of time, thus making it crucial for context-dependent computation. Here, we use a top-down modeling approach to examine…

Neurons and Cognition · Quantitative Biology 2021-11-17 Elham Ghazizadeh, ShiNung Ching

Recurrent Connectivity Aids Recognition of Partly Occluded Objects

Feedforward convolutional neural networks are the prevalent model of core object recognition. For challenging conditions, such as occlusion, neuroscientists believe that the recurrent connectivity in the visual cortex aids object…

Computer Vision and Pattern Recognition · Computer Science 2019-09-16 Markus Roland Ernst, Jochen Triesch, Thomas Burwick

Depth Enables Long-Term Memory for Recurrent Neural Networks

A key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks which involve sequential data, is their ability to model intricate long-term temporal dependencies. However, a well…

Machine Learning · Computer Science 2020-03-24 Alon Ziv

Learning Various Length Dependence by Dual Recurrent Neural Networks

Recurrent neural networks (RNNs) are widely used as a memory model for sequence-related problems. Many variants of RNN have been proposed to solve the gradient problems of training RNNs and process long sequences. Although some classical…

Neural and Evolutionary Computing · Computer Science 2020-05-29 Chenpeng Zhang, Shuai Li, Mao Ye, Ce Zhu, Xue Li

Forward Thinking: Building Deep Random Forests

The success of deep neural networks has inspired many to wonder whether other learners could benefit from deep, layered architectures. We present a general framework called forward thinking for deep learning that generalizes the…

Machine Learning · Statistics 2017-05-23 Kevin Miller, Chris Hettinger, Jeffrey Humpherys, Tyler Jarvis, David Kartchner

Learning to Continually Learn Rapidly from Few and Noisy Data

Neural networks suffer from catastrophic forgetting and are unable to sequentially learn new tasks without guaranteed stationarity in data distribution. Continual learning could be achieved via replay -- by concurrently training externally…

Machine Learning · Computer Science 2021-03-09 Nicholas I-Hsien Kuo, Mehrtash Harandi, Nicolas Fourrier, Christian Walder, Gabriela Ferraro, Hanna Suominen

Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans

Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with…

Computation and Language · Computer Science 2021-05-04 Yair Lakretz, Dieuwke Hupkes, Alessandra Vergallito, Marco Marelli, Marco Baroni, Stanislas Dehaene

Neural Associative Memory for Dual-Sequence Modeling

Many important NLP problems can be posed as dual-sequence or sequence-to-sequence modeling tasks. Recent advances in building end-to-end neural architectures have been highly successful in solving such tasks. In this work we propose a new…

Neural and Evolutionary Computing · Computer Science 2016-06-15 Dirk Weissenborn