Related papers: MEME: Multi-entity & Evolving Memory Evaluation

Task Memory Engine: Spatial Memory for Robust Multi-Step LLM Agents

Large Language Models (LLMs) falter in multi-step interactions -- often hallucinating, repeating actions, or misinterpreting user corrections -- due to reliance on linear, unstructured context. This fragility stems from the lack of…

Artificial Intelligence · Computer Science 2025-05-27 Ye Ye

E-mem: Multi-agent based Episodic Context Reconstruction for LLM Agent Memory

The evolution of Large Language Model (LLM) agents towards System 2 reasoning, characterized by deliberative, high-precision problem-solving, requires maintaining rigorous logical integrity over extended horizons. However, prevalent memory…

Artificial Intelligence · Computer Science 2026-05-15 Kaixiang Wang, Yidan Lin, Jiong Lou, Zhaojiacheng Zhou, Bunyod Suvonov, Jie Li

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

Long-term agent memory is increasingly multimodal, yet existing evaluations rarely test whether agents preserve the visual evidence needed for later reasoning. In prior work, many visually grounded questions can be answered using only…

Computer Vision and Pattern Recognition · Computer Science 2026-05-15 Minghao Guo, Qingyue Jiao, Zeru Shi, Yihao Quan, Boxuan Zhang, Danrui Li, Liwei Che, Wujiang Xu, Shilong Liu, Zirui Liu, Mubbasir Kapadia, Vladimir Pavlovic, Jiang Liu, Mengdi Wang, Yiyu Shi, Dimitris N. Metaxas, Ruixiang Tang

AEL: Agent Evolving Learning for Open-Ended Environments

LLM agents increasingly operate in open-ended environments spanning hundreds of sequential episodes, yet they remain largely stateless: each task is solved from scratch without converting past experience into better future behavior. The…

Computation and Language · Computer Science 2026-04-24 Wujiang Xu, Jiaojiao Han, Minghao Guo, Kai Mei, Xi Zhu, Han Zhang, Dimitris N. Metaxas

EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

Recent benchmarks for Large Language Model (LLM) agents mainly evaluate reasoning, planning, and execution. However, memory is also essential for agents, as it enables them to store, update, and retrieve information over time. This ability…

Computation and Language · Computer Science 2026-05-19 Yuyao Wang, Zhongjian Zhang, Mo Chi, Kaichi Yu, Yuhan Li, Miao Peng, Bing Tong, Chen Zhang, Yan Zhou, Jia Li

STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?

Large Language Model (LLM) agents are increasingly expected to maintain coherent, long-term personalized memory, yet current benchmarks primarily measure static fact retrieval, overlooking the ability to revise stored beliefs when new…

Computation and Language · Computer Science 2026-05-08 Hanxiang Chao, Yihan Bai, Rui Sheng, Tianle Li, Yushi Sun

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents

Modern language agents must operate over long-horizon, multi-turn interactions, where they retrieve external information, adapt to observations, and answer interdependent queries. Yet, most LLM systems rely on full-context prompting,…

Computation and Language · Computer Science 2025-07-18 Zijian Zhou, Ao Qu, Zhaoxuan Wu, Sunghwan Kim, Alok Prakash, Daniela Rus, Jinhua Zhao, Bryan Kian Hsiang Low, Paul Pu Liang

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

Statefulness is essential for large language model (LLM) agents to perform long-term planning and problem-solving. This makes memory a critical component, yet its management and evolution remain largely underexplored. Existing evaluations…

Computation and Language · Computer Science 2026-05-19 Tianxin Wei, Noveen Sachdeva, Benjamin Coleman, Zhankui He, Yuanchen Bei, Xuying Ning, Mengting Ai, Yunzhe Li, Jingrui He, Ed H. Chi, Chi Wang, Shuo Chen, Fernando Pereira, Wang-Cheng Kang, Derek Zhiyuan Cheng

EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents

Long-term memory is essential for LLM agents that operate across multiple sessions, yet existing memory systems treat retrieval infrastructure as fixed: stored content evolves while scoring functions, fusion strategies, and…

Machine Learning · Computer Science 2026-05-15 Jiaqi Liu, Xinyu Ye, Peng Xia, Zeyu Zheng, Cihang Xie, Mingyu Ding, Huaxiu Yao

Task Memory Engine (TME): Enhancing State Awareness for Multi-Step LLM Agent Tasks

Large Language Models (LLMs) are increasingly used as autonomous agents for multi-step tasks. However, most existing frameworks fail to maintain a structured understanding of the task state, often relying on linear prompt concatenation or…

Artificial Intelligence · Computer Science 2025-08-26 Ye Ye

Self-Evolving Multi-Agent Systems via Decentralized Memory

Self-evolving multi-agent systems (MAS) have emerged as a promising route to LLM agents that continually improve from experience, with persistent memory at their foundation. However, existing designs almost exclusively adopt a centralized…

Multiagent Systems · Computer Science 2026-05-22 Guangya Hao, Yunbo Long, Zhuokai Zhao

Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution

Procedural memory enables large language model (LLM) agents to internalize "how-to" knowledge, theoretically reducing redundant trial-and-error. However, existing frameworks predominantly suffer from a "passive accumulation" paradigm,…

Artificial Intelligence · Computer Science 2026-04-16 Zouying Cao, Jiaji Deng, Li Yu, Weikang Zhou, Zhaoyang Liu, Bolin Ding, Hai Zhao

GroupMemBench: Benchmarking LLM Agent Memory in Multi-Party Conversations

Large Language Model (LLM) agents increasingly serve as personal assistants and workplace collaborators, where their utility depends on memory systems that extract, retrieve, and apply information across long-running conversations. However,…

Computation and Language · Computer Science 2026-05-19 Jingbo Yang, Kwei-Herng Lai, Xiaowen Wang, Shiyu Chang, Yaar Harari, Evgeniy Gabrilovich

Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers

Large language model (LLM) agents increasingly operate in settings where a single context window is far too small to capture what has happened, what was learned, and what should not be repeated. Memory -- the ability to persist, organize,…

Artificial Intelligence · Computer Science 2026-03-10 Pengfei Du

MemEmo: Evaluating Emotion in Memory Systems of Agents

Memory systems address the challenge of context loss in Large Language Models during prolonged interactions. However, compared to human cognition, the efficacy of these systems in processing emotion-related information remains inconclusive.…

Computation and Language · Computer Science 2026-03-02 Peng Liu, Zhen Tao, Jihao Zhao, Ding Chen, Yansong Zhang, Cuiping Li, Zhiyu Li, Hong Chen

Is One Score Enough? Rethinking the Evaluation of Sequentially Evolving LLM Memory

Memory plays a central role in enabling large language models (LLMs) to operate over sequential tasks by accumulating and reusing experience over time. However, existing evaluations of LLM memory mostly rely on aggregate metrics such as…

Machine Learning · Computer Science 2026-05-18 Songwei Dong, Zihan Chen, Chengshuai Shi, Peng Wang, Jundong Li, Cong Shen

MemEvolve: Meta-Evolution of Agent Memory Systems

Self-evolving memory systems are unprecedentedly reshaping the evolutionary paradigm of large language model (LLM)-based agents. Prior work has predominantly relied on manually engineered memory architectures to store trajectories, distill…

Computation and Language · Computer Science 2025-12-23 Guibin Zhang, Haotian Ren, Chong Zhan, Zhenhong Zhou, Junhao Wang, He Zhu, Wangchunshu Zhou, Shuicheng Yan

Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions

Recent benchmarks for Large Language Model (LLM) agents primarily focus on evaluating reasoning, planning, and execution capabilities, while another critical component -- memory, encompassing how agents memorize, update, and retrieve long-term…

Computation and Language · Computer Science 2026-03-19 Yuanzhe Hu, Yu Wang, Julian McAuley

Memp: Exploring Agent Procedural Memory

Large Language Model (LLM)-based agents excel at diverse tasks, yet they suffer from brittle procedural memory that is manually engineered or entangled in static parameters. In this work, we investigate strategies to endow agents with a…

Computation and Language · Computer Science 2026-04-16 Runnan Fang, Yuan Liang, Xiaobin Wang, Jialong Wu, Shuofei Qiao, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

EvoMem: Improving Multi-Agent Planning with Dual-Evolving Memory

Planning has been a cornerstone of artificial intelligence for solving complex problems, and recent progress in LLM-based multi-agent frameworks has begun to extend this capability. However, the role of human-like memory within these…

Multiagent Systems · Computer Science 2025-12-09 Wenzhe Fan, Ning Yan, Masood Mortazavi