Related papers: Extending Memory for Language Modelling

Language Modeling through Long Term Memory Network

Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Memory Networks which contain memory are popularly used to learn patterns in sequential data. Sequential data has long sequences that hold relationships. RNN can…

Computation and Language · Computer Science 2019-04-22 Anupiya Nugaliyadde, Kok Wai Wong, Ferdous Sohel, Hong Xie

Augmenting Language Models with Long-Term Memory

Existing large language models (LLMs) can only afford fix-sized inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models…

Computation and Language · Computer Science 2023-06-13 Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

Aspects of human memory and Large Language Models

Large Language Models (LLMs) are huge artificial neural networks which primarily serve to generate text, but also provide a very sophisticated probabilistic model of language use. Since generating a semantically consistent text requires a…

Computation and Language · Computer Science 2024-04-09 Romuald A. Janik

Recurrent Memory Networks for Language Modeling

Recurrent Neural Networks (RNN) have obtained excellent result in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose Recurrent…

Computation and Language · Computer Science 2016-04-25 Ke Tran, Arianna Bisazza, Christof Monz

Long-Term Memory Networks for Question Answering

Question answering is an important and difficult task in the natural language processing domain, because many basic natural language processing tasks can be cast into a question answering task. Several deep neural network architectures have…

Computation and Language · Computer Science 2017-07-10 Fenglong Ma, Radha Chitta, Saurabh Kataria, Jing Zhou, Palghat Ramesh, Tong Sun, Jing Gao

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of…

Computation and Language · Computer Science 2015-06-02 Kai Sheng Tai, Richard Socher, Christopher D. Manning

LSTMs Exploit Linguistic Attributes of Data

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM's ability to…

Computation and Language · Computer Science 2019-04-09 Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith

Learning Longer Memory in Recurrent Neural Networks

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

Evaluating the Ability of LSTMs to Learn Context-Free Grammars

While long short-term memory (LSTM) neural net architectures are designed to capture sequence information, human language is generally composed of hierarchical structures. This raises the question as to whether LSTMs can learn hierarchical…

Computation and Language · Computer Science 2018-11-08 Luzi Sennhauser, Robert C. Berwick

Long Term Memory: The Foundation of AI Self-Evolution

Large language models (LLMs) like GPTs, trained on vast datasets, have demonstrated impressive capabilities in language understanding, reasoning, and planning, achieving human-level performance in various tasks. Most studies focus on…

Artificial Intelligence · Computer Science 2025-05-13 Xun Jiang, Feng Li, Han Zhao, Jiahao Qiu, Jiaying Wang, Jun Shao, Shihao Xu, Shu Zhang, Weiling Chen, Xavier Tang, Yize Chen, Mengyue Wu, Weizhi Ma, Mengdi Wang, Tianqiao Chen

Context based Text-generation using LSTM networks

Long short-term memory (LSTM) units on sequence-based models are being used in translation, question-answering systems, classification tasks due to their capability of learning long-term dependencies. In Natural language generation, LSTM…

Computation and Language · Computer Science 2020-05-04 Sivasurya Santhanam

Long Short-Term Memory Over Tree Structures

The chain-structured long short-term memory (LSTM) has showed to be effective in a wide range of problems such as speech recognition and machine translation. In this paper, we propose to extend it to tree structures, in which a memory cell…

Computation and Language · Computer Science 2015-03-18 Xiaodan Zhu, Parinaz Sobhani, Hongyu Guo

Persistence pays off: Paying Attention to What the LSTM Gating Mechanism Persists

Language Models (LMs) are important components in several Natural Language Processing systems. Recurrent Neural Network LMs composed of LSTM units, especially those augmented with an external memory, have achieved state-of-the-art results.…

Machine Learning · Computer Science 2018-10-11 Giancarlo D. Salton, John D. Kelleher

Visualizing and Understanding Recurrent Networks

Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data.…

Machine Learning · Computer Science 2015-11-18 Andrej Karpathy, Justin Johnson, Li Fei-Fei

StoryBench: A Dynamic Benchmark for Evaluating Long-Term Memory with Multi Turns

Long-term memory (LTM) is essential for large language models (LLMs) to achieve autonomous intelligence in complex, evolving environments. Despite increasing efforts in memory-augmented and retrieval-based architectures, there remains a…

Computation and Language · Computer Science 2025-06-17 Luanbo Wan, Weizhi Ma

Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition

Long short-term memory (LSTM) based acoustic modeling methods have recently been shown to give state-of-the-art performance on some speech recognition tasks. To achieve a further performance improvement, in this research, deep extensions on…

Computation and Language · Computer Science 2015-05-12 Xiangang Li, Xihong Wu

Large-scale study of human memory for meaningful narratives

The statistical study of human memory requires large-scale experiments, involving many stimuli conditions and test subjects. While this approach has proven to be quite fruitful for meaningless material such as random lists of words,…

Computation and Language · Computer Science 2024-11-26 Antonios Georgiou, Tankut Can, Mikhail Katkov, Misha Tsodyks

Automatic Rule Extraction from Long Short Term Memory Networks

Although deep learning models have proven effective at solving problems in natural language processing, the mechanism by which they come to their conclusions is often unclear. As a result, these models are generally treated as black boxes,…

Computation and Language · Computer Science 2017-02-28 W. James Murdoch, Arthur Szlam

Learning Over Long Time Lags

The advantage of recurrent neural networks (RNNs) in learning dependencies between time-series data has distinguished RNNs from other deep learning models. Recently, many advances are proposed in this emerging field. However, there is a…

Neural and Evolutionary Computing · Computer Science 2016-02-16 Hojjat Salehinejad

A Study on Neural Network Language Modeling

An exhaustive study on neural network language modeling (NNLM) is performed in this paper. Different architectures of basic neural network language models are described and examined. A number of different improvements over basic neural…

Computation and Language · Computer Science 2017-08-25 Dengliang Shi