Related papers: Efficient Turing Machine Simulation with Transform…

Constant Bit-size Transformers Are Turing Complete

We prove that any Turing machine running on inputs of arbitrary length can be simulated by a constant bit-size transformer, as long as the context window is sufficiently long. This improves previous works, which require scaling up either…

Computational Complexity · Computer Science 2025-09-30 Qian Li, Yuyi Wang

Two Heads Are Better than One: Simulating Large Transformers with Small Ones

The quadratic complexity of self-attention prevents transformers from scaling effectively to long input sequences. On the other hand, modern GPUs and other specialized hardware accelerators are well-optimized for processing small input…

Machine Learning · Computer Science 2025-06-23 Hantao Yu, Josh Alman

Barriers to Universal Reasoning With Transformers (And How to Overcome Them)

Chain-of-Thought (CoT) has been shown to empirically improve Transformers' performance, and theoretically increase their expressivity to Turing completeness. However, whether Transformers can learn to generalize to CoT traces longer than…

Machine Learning · Computer Science 2026-04-29 Oliver Kraus, Yash Sarrof, Yuekun Yao, Alexander Koller, Michael Hahn

Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Instructing the model to generate a sequence of intermediate steps, a.k.a., a chain of thought (CoT), is a highly effective method to improve the accuracy of large language models (LLMs) on arithmetics and symbolic reasoning tasks. However,…

Machine Learning · Computer Science 2024-09-24 Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

Simulating Time With Square-Root Space

We show that for all functions $t(n) \geq n$, every multitape Turing machine running in time $t$ can be simulated in space only $O(\sqrt{t \log t})$. This is a substantial improvement over Hopcroft, Paul, and Valiant's simulation of time…

Computational Complexity · Computer Science 2025-02-26 R. Ryan Williams

A reliable Turing machine

We consider computations of a Turing machine subjected to noise. In every step, the action (the new state and the new content of the observed cell, the direction of the head movement) can differ from that prescribed by the transition…

Computational Complexity · Computer Science 2021-12-07 Ilir Çapuni, Peter Gács

Token Turing Machines

We propose Token Turing Machines (TTM), a sequential, autoregressive Transformer model with memory for real-world sequential visual understanding. Our model is inspired by the seminal Neural Turing Machine, and has an external memory…

Machine Learning · Computer Science 2023-04-14 Michael S. Ryoo, Keerthana Gopalakrishnan, Kumara Kahatapitiya, Ted Xiao, Kanishka Rao, Austin Stone, Yao Lu, Julian Ibarz, Anurag Arnab

Transformers Learn Shortcuts to Automata

Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine. However, Transformer models, while lacking recurrence, are able to perform such reasoning…

Machine Learning · Computer Science 2023-05-03 Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

Simulations of Quantum Turing Machines by Quantum Multi-Stack Machines

As was well known, in classical computation, Turing machines, circuits, multi-stack machines, and multi-counter machines are equivalent, that is, they can simulate each other in polynomial time. In quantum computation, Yao [11] first proved…

Quantum Physics · Physics 2007-05-23 Daowen Qiu

The Expressive Power of Low Precision Softmax Transformers with (Summarized) Chain-of-Thought

Existing expressivity results for transformers typically rely on hardmax attention, high precision, and other architectural modifications that disconnect them from the models used in practice. We bridge this gap by analyzing standard…

Machine Learning · Computer Science 2026-05-19 Moritz Brösamle, Stephan Eckstein

Verifying Time Complexity of Deterministic Turing Machines

We show that, for all reasonable functions $T(n)=o(n\log n)$, we can algorithmically verify whether a given one-tape Turing machine runs in time at most $T(n)$. This is a tight bound on the order of growth for the function $T$ because we…

Logic in Computer Science · Computer Science 2019-01-15 David Gajser

Revisiting the simulation of quantum Turing machines by quantum circuits

Yao (1993) proved that quantum Turing machines and uniformly generated quantum circuits are polynomially equivalent computational models: $t \geq n$ steps of a quantum Turing machine running on an input of length $n$ can be simulated by a…

Computational Complexity · Computer Science 2019-09-11 Abel Molina, John Watrous

Improved Bounds on the Space Complexity of Circuit Evaluation

Williams (STOC 2025) recently proved that time-$t$ multitape Turing machines can be simulated using $O(\sqrt{t \log t})$ space using the Cook-Mertz (STOC 2024) tree evaluation procedure. As Williams notes, applying this result to fast…

Computational Complexity · Computer Science 2025-06-23 Yakov Shalunov

A computing machinery using a continuous memory tape

By considering a discrete tape where each cell corresponds to an integer, thus to a possible sum, a pseudo-polynomial solution can be given to subset sum problem, which is an NP-complete problem and a cornerstone application for this study,…

Computational Complexity · Computer Science 2024-01-08 Yigit Oktar

Multiway Turing Machines

Multiway Turing machines (also known as nondeterministic Turing machines or NDTMs) with explicit, simple rules are studied. Even very simple rules are found to generate complex behavior, characterized by complex multiway graphs, that can be…

Logic in Computer Science · Computer Science 2021-03-09 Stephen Wolfram

Turing machines on represented sets, a model of computation for Analysis

We introduce a new type of generalized Turing machines (GTMs), which are intended as a tool for the mathematician who studies computability in Analysis. In a single tape cell a GTM can store a symbol, a real number, a continuous real…

Logic · Mathematics 2015-07-01 Nazanin Tavana, Klaus Weihrauch

Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought

Chain of Thought (CoT) prompting has been shown to significantly improve the performance of large language models (LLMs), particularly in arithmetic and reasoning tasks, by instructing the model to produce intermediate reasoning steps.…

Machine Learning · Computer Science 2025-03-03 Jianhao Huang, Zixuan Wang, Jason D. Lee

Accelerating OTA Circuit Design: Transistor Sizing Based on a Transformer Model and Precomputed Lookup Tables

Device sizing is crucial for meeting performance specifications in operational transconductance amplifiers (OTAs), and this work proposes an automated sizing framework based on a transformer model. The approach first leverages the…

Hardware Architecture · Computer Science 2025-02-07 Subhadip Ghosh, Endalk Y. Gebru, Chandramouli V. Kashyap, Ramesh Harjani, Sachin S. Sapatnekar

Learning Circuits with Infinite Tensor Networks

Hamiltonian simulation on quantum computers is strongly constrained by gate counts, motivating techniques to reduce circuit depths. While tensor networks are natural competitors to quantum computers, we instead leverage them to support…

Quantum Physics · Physics 2025-06-04 Joe Gibbs, Lukasz Cincio

Trainable Transformer in Transformer

Recent works attribute the capability of in-context learning (ICL) in large pre-trained language models to implicitly simulating and fine-tuning an internal model (e.g., linear or 2-layer MLP) during inference. However, such constructions…

Computation and Language · Computer Science 2024-02-09 Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora