Related papers: Memory Augmented Large Language Models are Computa…

Autoregressive Large Language Models are Computationally Universal

We show that autoregressive decoding of a transformer-based language model can realize universal computation, without external intervention or modification of the model's weights. Establishing this result requires understanding how a…

Computation and Language · Computer Science 2024-10-07 Dale Schuurmans, Hanjun Dai, Francesco Zanini

Augmenting Interpretable Models with LLMs during Training

Recent large language models (LLMs) have demonstrated remarkable prediction performance for a growing array of tasks. However, their proliferation into high-stakes domains (e.g. medicine) and compute-limited settings has created a…

Artificial Intelligence · Computer Science 2023-12-05 Chandan Singh, Armin Askari, Rich Caruana, Jianfeng Gao

Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents

Large Language Models (LLMs) represent a landmark achievement in Artificial Intelligence (AI), demonstrating unprecedented proficiency in procedural tasks such as text generation, code completion, and conversational coherence. These…

Artificial Intelligence · Computer Science 2025-05-07 Schaun Wheeler, Olivier Jeunen

Neural Stored-program Memory

Neural networks powered with external memory simulate computer behaviors. These models, which use the memory to store data for a neural controller, can learn algorithms and other complex tasks. In this paper, we introduce a new memory to…

Neural and Evolutionary Computing · Computer Science 2019-12-30 Hung Le, Truyen Tran, Svetha Venkatesh

Scalable MatMul-free Language Modeling

Large Language Models (LLMs) have fundamentally altered how we approach scaling in machine learning. However, these models pose substantial computational and memory challenges, primarily due to the reliance on matrix multiplication (MatMul)…

Computation and Language · Computer Science 2025-07-29 Rui-Jie Zhu, Yu Zhang, Steven Abreu, Ethan Sifferman, Tyler Sheaves, Yiqiao Wang, Dustin Richmond, Sumit Bam Shrestha, Peng Zhou, Jason K. Eshraghian

Universal computation is intrinsic to language model decoding

Language models now provide an interface to express and often solve general problems in natural language, yet their ultimate computational capabilities remain a major topic of scientific debate. Unlike a formal computer, a language model is…

Computation and Language · Computer Science 2026-02-11 Alex Lewandowski, Marlos C. Machado, Dale Schuurmans

Augmented Language Models: a Survey

This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in…

Computation and Language · Computer Science 2023-02-16 Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, Edouard Grave, Yann LeCun, Thomas Scialom

Schrodinger's Memory: Large Language Models

Memory is the foundation of all human activities; without memory, it would be nearly impossible for people to perform any task in daily life. With the development of Large Language Models (LLMs), their language capabilities are becoming…

Computation and Language · Computer Science 2024-09-30 Wei Wang, Qing Li

Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers

Recent studies have demonstrated that the performance of transformers on the task of language modeling obeys a power-law relationship with model size over six orders of magnitude. While transformers exhibit impressive scaling, their…

Machine Learning · Computer Science 2021-10-07 Narsimha Chilkuri, Eric Hunsberger, Aaron Voelker, Gurshaant Malik, Chris Eliasmith

Augmenting Large Language Model Translators via Translation Memories

Using translation memories (TMs) as prompts is a promising approach to in-context learning of machine translation models. In this work, we take a step towards prompting large language models (LLMs) with TMs and making them better…

Computation and Language · Computer Science 2023-05-30 Yongyu Mu, Abudurexiti Reheman, Zhiquan Cao, Yuchun Fan, Bei Li, Yinqiao Li, Tong Xiao, Chunliang Zhang, Jingbo Zhu

On the Universality of Memcomputing Machines

Universal memcomputing machines (UMMs) [IEEE Trans. Neural Netw. Learn. Syst. 26, 2702 (2015)] represent a novel computational model in which memory (time non-locality) accomplishes both tasks of storing and processing of information. UMMs…

Neural and Evolutionary Computing · Computer Science 2019-05-29 Yan Ru Pei, Fabio L. Traversa, Massimiliano Di Ventra

Transcending Scaling Laws with 0.1% Extra Compute

Scaling language models improves performance but comes with significant computational costs. This paper proposes UL2R, a method that substantially improves existing language models and their scaling curves with a relatively tiny amount of…

Computation and Language · Computer Science 2022-11-17 Yi Tay, Jason Wei, Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani

Large Language Models and the Extended Church-Turing Thesis

The Extended Church-Turing Thesis (ECTT) posits that all effective information processing, including unbounded and non-uniform interactive computations, can be described in terms of interactive Turing machines with advice. Does this…

Formal Languages and Automata Theory · Computer Science 2024-09-12 Jiří Wiedermann, Jan van Leeuwen

Algorithmic Language Models with Neurally Compiled Libraries

Important tasks such as reasoning and planning are fundamentally algorithmic, meaning that solving them robustly requires acquiring true reasoning or planning algorithms, rather than shortcuts. Large Language Models lack true algorithmic…

Artificial Intelligence · Computer Science 2025-05-27 Lucas Saldyt, Subbarao Kambhampati

TALM: Tool Augmented Language Models

Transformer based language models (LMs) demonstrate increasing performance with scale across a wide variety of tasks. Scale alone however cannot enable models to solve tasks that require access to ephemeral, changing, or private data that…

Computation and Language · Computer Science 2022-05-25 Aaron Parisi, Yao Zhao, Noah Fiedel

Large Memory Layers with Product Keys

This paper introduces a structured memory which can be easily integrated into a neural network. The memory is very large by design and significantly increases the capacity of the architecture, by up to a billion parameters with a negligible…

Computation and Language · Computer Science 2019-12-17 Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou

Relational Database Augmented Large Language Model

Large language models (LLMs) excel in many natural language processing (NLP) tasks. However, since LLMs can only incorporate new knowledge through training or supervised fine-tuning processes, they are unsuitable for applications that…

Databases · Computer Science 2024-07-23 Zongyue Qin, Chen Luo, Zhengyang Wang, Haoming Jiang, Yizhou Sun

Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference

Large language models (LLMs) have recently transformed natural language processing, enabling machines to generate human-like text and engage in meaningful conversations. This development necessitates speed, efficiency, and accessibility in…

Hardware Architecture · Computer Science 2024-06-13 Christopher Wolters, Xiaoxuan Yang, Ulf Schlichtmann, Toyotaro Suzumura

Universal Memcomputing Machines

We introduce the notion of universal memcomputing machines (UMMs): a class of brain-inspired general-purpose computing machines based on systems with memory, whereby processing and storing of information occur on the same physical location.…

Neural and Evolutionary Computing · Computer Science 2015-12-17 Fabio L. Traversa, Massimiliano Di Ventra

Transformation of Turing Machines into Context-Dependent Fusion Grammars

Context-dependent fusion grammars were recently introduced as devices for the generation of hypergraph languages. In this paper, we show that this new type of hypergraph grammars, where the application of fusion rules is restricted by…

Formal Languages and Automata Theory · Computer Science 2019-12-23 Aaron Lye