Related papers: LM2: Large Memory Models

LDM$^2$: A Large Decision Model Imitating Human Cognition with Dynamic Memory Enhancement

With the rapid development of large language models (LLMs), it is highly demanded that LLMs can be adopted to make decisions to enable the artificial general intelligence. Most approaches leverage manually crafted examples to prompt the…

Machine Learning · Computer Science 2023-12-15 Xingjin Wang, Linjing Li, Daniel Zeng

M$^2$: Dual-Memory Augmentation for Long-Horizon Web Agents via Trajectory Summarization and Insight Retrieval

Multimodal Large Language Models (MLLMs) based agents have demonstrated remarkable potential in autonomous web navigation. However, handling long-horizon tasks remains a critical bottleneck. Prevailing strategies often rely heavily on…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Dawei Yan, Haokui Zhang, Guangda Huzhang, Yang Li, Yibo Wang, Qing-Guo Chen, Zhao Xu, Weihua Luo, Ying Li, Wei Dong, Chunhua Shen

HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing

Transformer-based large language models (LLM) have been widely used in language processing applications. However, due to the memory constraints of the devices, most of them restrict the context window. Even though recurrent models in…

Computation and Language · Computer Science 2025-02-07 Zifan He, Yingqi Cao, Zongyue Qin, Neha Prakriya, Yizhou Sun, Jason Cong

LLM2: Let Large Language Models Harness System 2 Reasoning

Large language models (LLMs) have exhibited impressive capabilities across a myriad of tasks, yet they occasionally yield undesirable outputs. We posit that these limitations are rooted in the foundational autoregressive architecture of…

Computation and Language · Computer Science 2025-03-03 Cheng Yang, Chufan Shi, Siheng Li, Bo Shui, Yujiu Yang, Wai Lam

Hardware-based Heterogeneous Memory Management for Large Language Model Inference

A large language model (LLM) is one of the most important emerging machine learning applications nowadays. However, due to its huge model size and runtime increase of the memory footprint, LLM inferences suffer from the lack of memory…

Hardware Architecture · Computer Science 2025-04-22 Soojin Hwang, Jungwoo Kim, Sanghyeon Lee, Hongbeen Kim, Jaehyuk Huh

Augmenting Language Models with Long-Term Memory

Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models…

Computation and Language · Computer Science 2023-06-13 Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models

Transformer-based language models (LMs) track contextual information through large, hard-coded input windows. We introduce MemoryPrompt, a leaner approach in which the LM is complemented by a small auxiliary recurrent network that passes…

Computation and Language · Computer Science 2024-02-26 Nathanaël Carraz Rakotonirina, Marco Baroni

Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents

Large Language Models (LLMs) represent a landmark achievement in Artificial Intelligence (AI), demonstrating unprecedented proficiency in procedural tasks such as text generation, code completion, and conversational coherence. These…

Artificial Intelligence · Computer Science 2025-05-07 Schaun Wheeler, Olivier Jeunen

Can Memory-Augmented Language Models Generalize on Reasoning-in-a-Haystack Tasks?

Large language models often expose their brittleness in reasoning tasks, especially while executing long chains of reasoning over context. We propose MemReasoner, a new and simple memory-augmented LLM architecture, in which the memory…

Computation and Language · Computer Science 2025-03-12 Payel Das, Ching-Yun Ko, Sihui Dai, Georgios Kollias, Subhajit Chaudhury, Aurelie Lozano

L2MAC: Large Language Model Automatic Computer for Extensive Code Generation

Transformer-based large language models (LLMs) are constrained by the fixed context window of the underlying transformer architecture, hindering their ability to produce long and coherent outputs. Memory-augmented LLMs are a promising…

Software Engineering · Computer Science 2025-06-30 Samuel Holt, Max Ruiz Luyten, Mihaela van der Schaar

Empowering Working Memory for Large Language Model Agents

Large language models (LLMs) have achieved impressive linguistic capabilities. However, a key limitation persists in their lack of human-like memory faculties. LLMs exhibit constrained memory retention across sequential interactions,…

Computation and Language · Computer Science 2024-05-29 Jing Guo, Nan Li, Jianchuan Qi, Hang Yang, Ruiqiao Li, Yuzhen Feng, Si Zhang, Ming Xu

H$^{2}$MT: Semantic Hierarchy-Aware Hierarchical Memory Transformer

Transformer-based LLMs achieve strong results on many language tasks; however, long inputs remain challenging because context windows are finite, and prefill latency and memory grow rapidly with prompt length. Flat token-stream processing…

Computation and Language · Computer Science 2026-05-26 Maryam Haghifam, Zifan He, Jason Cong, Yizhou Sun

$\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning

Despite demonstrating emergent reasoning abilities, Large Language Models (LLMs) often lose track of complex, multi-step reasoning. Existing studies show that providing guidance via decomposing the original question into multiple…

Computation and Language · Computer Science 2024-04-04 Gurusha Juneja, Subhabrata Dutta, Tanmoy Chakraborty

Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Enhanced Model Architectures

Memory is fundamental to intelligence, enabling learning, reasoning, and adaptability across biological and artificial systems. While Transformer architectures excel at sequence modeling, they face critical limitations in long-range context…

Machine Learning · Computer Science 2025-08-19 Parsa Omidi, Xingshuai Huang, Axel Laborieux, Bahareh Nikpour, Tianyu Shi, Armaghan Eshaghi

ALR$^2$: A Retrieve-then-Reason Framework for Long-context Question Answering

The context window of large language models (LLMs) has been extended significantly in recent years. However, while the context length that the LLM can process has grown, the capability of the model to accurately reason over that context…

Computation and Language · Computer Science 2024-10-07 Huayang Li, Pat Verga, Priyanka Sen, Bowen Yang, Vijay Viswanathan, Patrick Lewis, Taro Watanabe, Yixuan Su

Human-like Working Memory Interference in Large Language Models

Intelligent systems must maintain and manipulate task-relevant information online to adapt to dynamic environments and changing goals. This capacity, known as working memory, is fundamental to human reasoning and intelligence. Despite…

Machine Learning · Computer Science 2026-04-14 Hua-Dong Xiong, Li Ji-An, Jiaqi Huang, Robert C. Wilson, Kwonjoon Lee, Xue-Xin Wei

Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models

Recurrent LLM architectures have emerged as a promising approach for improving reasoning, as they enable multi-step computation in the embedding space without generating intermediate tokens. Models such as Ouro perform reasoning by…

Computation and Language · Computer Science 2026-05-20 Victor Conchello Vendrell, Arnau Padres Masdemont, Niccolò Grillo, Jordi Ros-Giralt, Arash Behboodi, Fabio Valerio Massoli

Structured Memory Mechanisms for Stable Context Representation in Large Language Models

This paper addresses the limitations of large language models in understanding long-term context. It proposes a model architecture equipped with a long-term memory mechanism to improve the retention and retrieval of semantic information…

Computation and Language · Computer Science 2025-05-30 Yue Xing, Tao Yang, Yijiashun Qi, Minggu Wei, Yu Cheng, Honghui Xin

Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless, constrained by limited context windows that hinder long-horizon reasoning. Recent efforts to…

Computation and Language · Computer Science 2026-01-15 Sikuan Yan, Xiufeng Yang, Zuchao Huang, Ercong Nie, Zifeng Ding, Zonggen Li, Xiaowen Ma, Jinhe Bi, Kristian Kersting, Jeff Z. Pan, Hinrich Schütze, Volker Tresp, Yunpu Ma

Evaluating Long-Term Memory for Long-Context Question Answering

In order for large language models to achieve true conversational continuity and benefit from experiential learning, they need memory. While research has focused on the development of complex memory systems, it remains unclear which types…

Computation and Language · Computer Science 2025-12-09 Alessandra Terranova, Björn Ross, Alexandra Birch