Memory Augmented Large Language Models are Computationally Universal

Dale Schuurmans

Subjects: Computation and Language; Formal Languages and Automata Theory

Date: 2023-01-12 (v1)

Authors: Dale Schuurmans

Abstract

We show that transformer-based large language models are computationally universal when augmented with an external memory. Any deterministic language model that conditions on strings of bounded length is equivalent to a finite automaton, hence computationally limited. However, augmenting such models with a read-write memory creates the possibility of processing arbitrarily large inputs and, potentially, simulating any algorithm. We establish that an existing large language model, Flan-U-PaLM 540B, can be combined with an associative read-write memory to exactly simulate the execution of a universal Turing machine, $U_{15,2}$. A key aspect of the finding is that it does not require any modification of the language model weights. Instead, the construction relies solely on designing a form of stored instruction computer that can subsequently be programmed with a specific set of prompts.
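The idea in the abstract can be illustrated with a minimal sketch, which is not the paper's construction. A fixed lookup table stands in for the frozen language model: it is a deterministic map on bounded-length strings, so on its own it behaves like a finite automaton. An unbounded dictionary plays the role of the associative read-write memory. The names `RULES`, `controller`, and `run` are hypothetical, and the transition table is a toy binary incrementer chosen for brevity rather than the universal machine $U_{15,2}$.

```python
from collections import defaultdict

BLANK = "_"

# Hypothetical toy rule table (binary increment), NOT the U_{15,2} table.
# Each bounded-length prompt "state symbol" maps to a response
# "next_state written_symbol move".
RULES = {
    "carry 1": "carry 0 L",  # 1 + carry = 0, keep carrying to the left
    "carry 0": "halt 1 N",   # 0 + carry = 1, done
    "carry _": "halt 1 N",   # ran off the left edge: write a new leading 1
}

MOVES = {"L": -1, "R": 1, "N": 0}


def controller(prompt: str) -> str:
    """Stand-in for a frozen language model: a fixed map on short strings."""
    return RULES[prompt]


def run(tape_str: str, max_steps: int = 10_000) -> str:
    """Increment a binary number by looping controller calls over external memory."""
    # Associative read-write memory: address -> symbol, unbounded in both directions.
    tape = defaultdict(lambda: BLANK)
    for address, symbol in enumerate(tape_str):
        tape[address] = symbol

    state = "carry"
    head = len(tape_str) - 1  # start on the least significant bit
    steps = 0

    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before halting")
        # Read from memory, build a bounded-length prompt, query the controller.
        prompt = f"{state} {tape[head]}"
        next_state, written, move = controller(prompt).split()
        # Write the result back to memory and update the head and state.
        tape[head] = written
        head += MOVES[move]
        state = next_state
        steps += 1

    lo, hi = min(tape), max(tape)
    return "".join(tape[i] for i in range(lo, hi + 1)).strip(BLANK)


if __name__ == "__main__":
    print(run("1011"))  # 1100
    print(run("111"))   # 1000
```

The controller never sees more than one state and one symbol at a time, yet the loop can process inputs of any length because all growing state lives in the external memory. Replacing `RULES` with a different table changes the simulated machine without touching the loop, which mirrors the abstract's point that the model weights stay fixed while the prompts carry the program.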

Keywords

large language model, language modeling, automata theory

Cite

@article{arxiv.2301.04589,
  title  = {Memory Augmented Large Language Models are Computationally Universal},
  author = {Dale Schuurmans},
  journal= {arXiv preprint arXiv:2301.04589},
  year   = {2023}
}

Comments

23 pages, 0 figures
