Related papers: Long Expressive Memory for Sequence Modeling

200 papers

In many sequential tasks, a model needs to remember relevant events from the distant past to make correct predictions. Unfortunately, a straightforward application of gradient based training requires intermediate computations to be stored…

Machine Learning · Computer Science 2023-08-14 Artyom Sorokin, Nazar Buzun, Leonid Pugachev, Mikhail Burtsev

Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Memory Networks which contain memory are popularly used to learn patterns in sequential data. Sequential data has long sequences that hold relationships. RNN can…

Computation and Language · Computer Science 2019-04-22 Anupiya Nugaliyadde, Kok Wai Wong, Ferdous Sohel, Hong Xie

One major obstacle towards AI is the poor ability of models to solve new problems quicker, and without forgetting previously acquired knowledge. To better understand this issue, we study the problem of continual learning, where the model…

Machine Learning · Computer Science 2022-09-14 David Lopez-Paz, Marc'Aurelio Ranzato

Breakthroughs in deep learning and memory networks have made major advances in natural language understanding. Language is sequential and information carried through the sequence can be captured through memory networks. Learning the…

Computation and Language · Computer Science 2023-05-22 Anupiya Nugaliyadde

The key challenge of sequence representation learning is to capture the long-range temporal dependencies. Typical methods for supervised sequence representation learning are built upon recurrent neural networks to capture temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-07-21 Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu, Yu-Wing Tai

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

Current generation of memory-augmented neural networks has limited scalability as they cannot efficiently process data that are too large to fit in the external memory storage. One example of this is lifelong learning scenario where the…

Machine Learning · Computer Science 2018-12-12 Hyunwoo Jung, Moonsu Han, Minki Kang, Sungju Hwang

Typical methods for supervised sequence modeling are built upon the recurrent neural networks to capture temporal dependencies. One potential limitation of these methods is that they only model explicitly information interactions between…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Canmiao Fu, Wenjie Pei, Qiong Cao, Chaopeng Zhang, Yong Zhao, Xiaoyong Shen, Yu-Wing Tai

Sequence labeling is a fundamental problem in machine learning, natural language processing and many other fields. A classic approach to sequence labeling is linear chain conditional random fields (CRFs). When combined with neural network…

Machine Learning · Computer Science 2020-11-11 Yang Zhou, Yong Jiang, Zechuan Hu, Kewei Tu

This study explores the application of self-supervised learning techniques for event sequences. It is a key modality in various applications such as banking, e-commerce, and healthcare. However, there is limited research on self-supervised…

Embedding learning, a.k.a. representation learning, has been shown to be able to model large-scale semantic knowledge graphs. A key concept is a mapping of the knowledge graph to a tensor representation whose entries are predicted by models…

Artificial Intelligence · Computer Science 2016-05-10 Volker Tresp, Cristóbal Esteban, Yinchong Yang, Stephan Baier, Denis Krompaß

Existing large language models (LLMs) can only afford fix-sized inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models…

Computation and Language · Computer Science 2023-06-13 Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

Biological cortical neurons are remarkably sophisticated computational devices, temporally integrating their vast synaptic input over an intricate dendritic tree, subject to complex, nonlinearly interacting internal biological processes. A…

Neural and Evolutionary Computing · Computer Science 2024-03-19 Aaron Spieler, Nasim Rahaman, Georg Martius, Bernhard Schölkopf, Anna Levina

Semiparametric language models (LMs) have shown promise in continuously learning from new text data by combining a parameterized neural LM with a growable non-parametric memory for memorizing new content. However, conventional…

Computation and Language · Computer Science 2023-03-03 Guangyue Peng, Tao Ge, Si-Qing Chen, Furu Wei, Houfeng Wang

Extreme learning machine (ELM) is an extremely fast learning method and has a powerful performance for pattern recognition tasks proven by enormous researches and engineers. However, its good generalization ability is built on large numbers…

Machine Learning · Computer Science 2015-02-05 Wentao Zhu, Jun Miao, Laiyun Qing

Deep directed generative models have attracted much attention recently due to their expressive representation power and the ability of ancestral sampling. One major difficulty of learning directed models with many latent variables is the…

Machine Learning · Computer Science 2015-06-16 Siqi Nie, Qiang Ji

Memory plays a central role in enabling large language models (LLMs) to operate over sequential tasks by accumulating and reusing experience over time. However, existing evaluations of LLM memory mostly rely on aggregate metrics such as…

Machine Learning · Computer Science 2026-05-18 Songwei Dong, Zihan Chen, Chengshuai Shi, Peng Wang, Jundong Li, Cong Shen

Modeling episodic memory (EM) remains a significant challenge in both neuroscience and AI, with existing models either lacking interpretability or struggling with practical applications. This paper proposes the Vision-Language Episodic…

Neurons and Cognition · Quantitative Biology 2025-05-09 Chong Li, Taiping Zeng, Xiangyang Xue, Jianfeng Feng

Long-term memory is essential for conversational agents to maintain coherence, track persistent tasks, and provide personalized interactions across extended dialogues. However, existing approaches as Retrieval-Augmented Generation (RAG) and…

Computation and Language · Computer Science 2026-04-13 Juwei Yue, Chuanrui Hu, Jiawei Sheng, Zuyi Zhou, Wenyuan Zhang, Tingwen Liu, Li Guo, Yafeng Deng

In order for large language models to achieve true conversational continuity and benefit from experiential learning, they need memory. While research has focused on the development of complex memory systems, it remains unclear which types…

Computation and Language · Computer Science 2025-12-09 Alessandra Terranova, Björn Ross, Alexandra Birch