Related papers: LM2: Large Memory Models

200 papers

With the rapid development of large language models (LLMs), it is highly demanded that LLMs can be adopted to make decisions to enable the artificial general intelligence. Most approaches leverage manually crafted examples to prompt the…

Machine Learning · Computer Science 2023-12-15 Xingjin Wang, Linjing Li, Daniel Zeng

Multimodal Large Language Models (MLLMs) based agents have demonstrated remarkable potential in autonomous web navigation. However, handling long-horizon tasks remains a critical bottleneck. Prevailing strategies often rely heavily on…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Dawei Yan, Haokui Zhang, Guangda Huzhang, Yang Li, Yibo Wang, Qing-Guo Chen, Zhao Xu, Weihua Luo, Ying Li, Wei Dong, Chunhua Shen

Transformer-based large language models (LLM) have been widely used in language processing applications. However, due to the memory constraints of the devices, most of them restrict the context window. Even though recurrent models in…

Computation and Language · Computer Science 2025-02-07 Zifan He, Yingqi Cao, Zongyue Qin, Neha Prakriya, Yizhou Sun, Jason Cong

Large language models (LLMs) have exhibited impressive capabilities across a myriad of tasks, yet they occasionally yield undesirable outputs. We posit that these limitations are rooted in the foundational autoregressive architecture of…

Computation and Language · Computer Science 2025-03-03 Cheng Yang, Chufan Shi, Siheng Li, Bo Shui, Yujiu Yang, Wai Lam

A large language model (LLM) is one of the most important emerging machine learning applications nowadays. However, due to its huge model size and runtime increase of the memory footprint, LLM inferences suffer from the lack of memory…

Hardware Architecture · Computer Science 2025-04-22 Soojin Hwang, Jungwoo Kim, Sanghyeon Lee, Hongbeen Kim, Jaehyuk Huh

Existing large language models (LLMs) can only afford fix-sized inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models…

Computation and Language · Computer Science 2023-06-13 Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

Transformer-based language models (LMs) track contextual information through large, hard-coded input windows. We introduce MemoryPrompt, a leaner approach in which the LM is complemented by a small auxiliary recurrent network that passes…

Computation and Language · Computer Science 2024-02-26 Nathanaël Carraz Rakotonirina, Marco Baroni

Large Language Models (LLMs) represent a landmark achievement in Artificial Intelligence (AI), demonstrating unprecedented proficiency in procedural tasks such as text generation, code completion, and conversational coherence. These…

Artificial Intelligence · Computer Science 2025-05-07 Schaun Wheeler, Olivier Jeunen

Large language models often expose their brittleness in reasoning tasks, especially while executing long chains of reasoning over context. We propose MemReasoner, a new and simple memory-augmented LLM architecture, in which the memory…

Computation and Language · Computer Science 2025-03-12 Payel Das, Ching-Yun Ko, Sihui Dai, Georgios Kollias, Subhajit Chaudhury, Aurelie Lozano

Transformer-based large language models (LLMs) are constrained by the fixed context window of the underlying transformer architecture, hindering their ability to produce long and coherent outputs. Memory-augmented LLMs are a promising…

Software Engineering · Computer Science 2025-06-30 Samuel Holt, Max Ruiz Luyten, Mihaela van der Schaar

Large language models (LLMs) have achieved impressive linguistic capabilities. However, a key limitation persists in their lack of human-like memory faculties. LLMs exhibit constrained memory retention across sequential interactions,…

Computation and Language · Computer Science 2024-05-29 Jing Guo, Nan Li, Jianchuan Qi, Hang Yang, Ruiqiao Li, Yuzhen Feng, Si Zhang, Ming Xu

Transformer-based LLMs achieve strong results on many language tasks; however, long inputs remain challenging because context windows are finite, and prefill latency and memory grow rapidly with prompt length. Flat token-stream processing…

Computation and Language · Computer Science 2026-05-26 Maryam Haghifam, Zifan He, Jason Cong, Yizhou Sun

Despite demonstrating emergent reasoning abilities, Large Language Models (LLMs) often lose track of complex, multi-step reasoning. Existing studies show that providing guidance via decomposing the original question into multiple…

Computation and Language · Computer Science 2024-04-04 Gurusha Juneja, Subhabrata Dutta, Tanmoy Chakraborty

Memory is fundamental to intelligence, enabling learning, reasoning, and adaptability across biological and artificial systems. While Transformer architectures excel at sequence modeling, they face critical limitations in long-range context…

Machine Learning · Computer Science 2025-08-19 Parsa Omidi, Xingshuai Huang, Axel Laborieux, Bahareh Nikpour, Tianyu Shi, Armaghan Eshaghi

The context window of large language models (LLMs) has been extended significantly in recent years. However, while the context length that the LLM can process has grown, the capability of the model to accurately reason over that context…

Computation and Language · Computer Science 2024-10-07 Huayang Li, Pat Verga, Priyanka Sen, Bowen Yang, Vijay Viswanathan, Patrick Lewis, Taro Watanabe, Yixuan Su

Intelligent systems must maintain and manipulate task-relevant information online to adapt to dynamic environments and changing goals. This capacity, known as working memory, is fundamental to human reasoning and intelligence. Despite…

Machine Learning · Computer Science 2026-04-14 Hua-Dong Xiong, Li Ji-An, Jiaqi Huang, Robert C. Wilson, Kwonjoon Lee, Xue-Xin Wei

Recurrent LLM architectures have emerged as a promising approach for improving reasoning, as they enable multi-step computation in the embedding space without generating intermediate tokens. Models such as Ouro perform reasoning by…

This paper addresses the limitations of large language models in understanding long-term context. It proposes a model architecture equipped with a long-term memory mechanism to improve the retention and retrieval of semantic information…

Computation and Language · Computer Science 2025-05-30 Yue Xing, Tao Yang, Yijiashun Qi, Minggu Wei, Yu Cheng, Honghui Xin

Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless, constrained by limited context windows that hinder long-horizon reasoning. Recent efforts to…

In order for large language models to achieve true conversational continuity and benefit from experiential learning, they need memory. While research has focused on the development of complex memory systems, it remains unclear which types…

Computation and Language · Computer Science 2025-12-09 Alessandra Terranova, Björn Ross, Alexandra Birch