Related papers: Long Expressive Memory for Sequence Modeling

Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes

In many sequential tasks, a model needs to remember relevant events from the distant past to make correct predictions. Unfortunately, a straightforward application of gradient based training requires intermediate computations to be stored…

Machine Learning · Computer Science 2023-08-14 Artyom Sorokin, Nazar Buzun, Leonid Pugachev, Mikhail Burtsev

Language Modeling through Long Term Memory Network

Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Memory Networks which contain memory are popularly used to learn patterns in sequential data. Sequential data has long sequences that hold relationships. RNN can…

Computation and Language · Computer Science 2019-04-22 Anupiya Nugaliyadde, Kok Wai Wong, Ferdous Sohel, Hong Xie

Gradient Episodic Memory for Continual Learning

One major obstacle towards AI is the poor ability of models to solve new problems quicker, and without forgetting previously acquired knowledge. To better understand this issue, we study the problem of continual learning, where the model…

Machine Learning · Computer Science 2022-09-14 David Lopez-Paz, Marc'Aurelio Ranzato

Extending Memory for Language Modelling

Breakthroughs in deep learning and memory networks have made major advances in natural language understanding. Language is sequential and information carried through the sequence can be captured through memory networks. Learning the…

Computation and Language · Computer Science 2023-05-22 Anupiya Nugaliyadde

Learning Sequence Representations by Non-local Recurrent Neural Memory

The key challenge of sequence representation learning is to capture the long-range temporal dependencies. Typical methods for supervised sequence representation learning are built upon recurrent neural networks to capture temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-07-21 Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu, Yu-Wing Tai

Learning Longer Memory in Recurrent Neural Networks

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

Learning What to Remember: Long-term Episodic Memory Networks for Learning from Streaming Data

Current generation of memory-augmented neural networks has limited scalability as they cannot efficiently process data that are too large to fit in the external memory storage. One example of this is lifelong learning scenario where the…

Machine Learning · Computer Science 2018-12-12 Hyunwoo Jung, Moonsu Han, Minki Kang, Sungju Hwang

Non-local Recurrent Neural Memory for Supervised Sequence Modeling

Typical methods for supervised sequence modeling are built upon the recurrent neural networks to capture temporal dependencies. One potential limitation of these methods is that they only model explicitly information interactions between…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Canmiao Fu, Wenjie Pei, Qiong Cao, Chaopeng Zhang, Yong Zhao, Xiaoyong Shen, Yu-Wing Tai

Neural Latent Dependency Model for Sequence Labeling

Sequence labeling is a fundamental problem in machine learning, natural language processing and many other fields. A classic approach to sequence labeling is linear chain conditional random fields (CRFs). When combined with neural network…

Machine Learning · Computer Science 2020-11-11 Yang Zhou, Yong Jiang, Zechuan Hu, Kewei Tu

MLEM: Generative and Contrastive Learning as Distinct Modalities for Event Sequences

This study explores the application of self-supervised learning techniques for event sequences. It is a key modality in various applications such as banking, e-commerce, and healthcare. However, there is limited research on self-supervised…

Machine Learning · Computer Science 2024-07-04 Viktor Moskvoretskii, Dmitry Osin, Egor Shvetsov, Igor Udovichenko, Maxim Zhelnin, Andrey Dukhovny, Anna Zhimerikina, Evgeny Burnaev

Learning with Memory Embeddings

Embedding learning, a.k.a. representation learning, has been shown to be able to model large-scale semantic knowledge graphs. A key concept is a mapping of the knowledge graph to a tensor representation whose entries are predicted by models…

Artificial Intelligence · Computer Science 2016-05-10 Volker Tresp, Cristóbal Esteban, Yinchong Yang, Stephan Baier, Denis Krompaß

Augmenting Language Models with Long-Term Memory

Existing large language models (LLMs) can only afford fix-sized inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models…

Computation and Language · Computer Science 2023-06-13 Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks

Biological cortical neurons are remarkably sophisticated computational devices, temporally integrating their vast synaptic input over an intricate dendritic tree, subject to complex, nonlinearly interacting internal biological processes. A…

Neural and Evolutionary Computing · Computer Science 2024-03-19 Aaron Spieler, Nasim Rahaman, Georg Martius, Bernhard Schölkopf, Anna Levina

Semiparametric Language Models Are Scalable Continual Learners

Semiparametric language models (LMs) have shown promise in continuously learning from new text data by combining a parameterized neural LM with a growable non-parametric memory for memorizing new content. However, conventional…

Computation and Language · Computer Science 2023-03-03 Guangyue Peng, Tao Ge, Si-Qing Chen, Furu Wei, Houfeng Wang

Constrained Extreme Learning Machines: A Study on Classification Cases

Extreme learning machine (ELM) is an extremely fast learning method that delivers strong performance on pattern recognition tasks, as demonstrated by numerous researchers and engineers. However, its good generalization ability is built on large numbers…

Machine Learning · Computer Science 2015-02-05 Wentao Zhu, Jun Miao, Laiyun Qing

Latent Regression Bayesian Network for Data Representation

Deep directed generative models have attracted much attention recently due to their expressive representation power and the ability of ancestral sampling. One major difficulty of learning directed models with many latent variables is the…

Machine Learning · Computer Science 2015-06-16 Siqi Nie, Qiang Ji

Is One Score Enough? Rethinking the Evaluation of Sequentially Evolving LLM Memory

Memory plays a central role in enabling large language models (LLMs) to operate over sequential tasks by accumulating and reusing experience over time. However, existing evaluations of LLM memory mostly rely on aggregate metrics such as…

Machine Learning · Computer Science 2026-05-18 Songwei Dong, Zihan Chen, Chengshuai Shi, Peng Wang, Jundong Li, Cong Shen

Towards a Vision-Language Episodic Memory Framework: Large-scale Pretrained Model-Augmented Hippocampal Attractor Dynamics

Modeling episodic memory (EM) remains a significant challenge in both neuroscience and AI, with existing models either lacking interpretability or struggling with practical applications. This paper proposes the Vision-Language Episodic…

Neurons and Cognition · Quantitative Biology 2025-05-09 Chong Li, Taiping Zeng, Xiangyang Xue, Jianfeng Feng

HyperMem: Hypergraph Memory for Long-Term Conversations

Long-term memory is essential for conversational agents to maintain coherence, track persistent tasks, and provide personalized interactions across extended dialogues. However, existing approaches such as Retrieval-Augmented Generation (RAG) and…

Computation and Language · Computer Science 2026-04-13 Juwei Yue, Chuanrui Hu, Jiawei Sheng, Zuyi Zhou, Wenyuan Zhang, Tingwen Liu, Li Guo, Yafeng Deng

Evaluating Long-Term Memory for Long-Context Question Answering

In order for large language models to achieve true conversational continuity and benefit from experiential learning, they need memory. While research has focused on the development of complex memory systems, it remains unclear which types…

Computation and Language · Computer Science 2025-12-09 Alessandra Terranova, Björn Ross, Alexandra Birch