Related papers: Structure Development in List-Sorting Transformers

Talking Heads: Understanding Inter-layer Communication in Transformer Language Models

Although it is known that transformer language models (LMs) pass features from early layers to later layers, it is not well understood how this information is represented and routed by the model. We analyze a mechanism used in two LMs to…

Computation and Language · Computer Science 2025-05-12 Jack Merullo, Carsten Eickhoff, Ellie Pavlick

How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding

While the successes of transformers across many domains are indisputable, accurate understanding of the learning mechanics is still largely lacking. Their capabilities have been probed on benchmarks which include a variety of structured and…

Machine Learning · Computer Science 2023-07-25 Yuchen Li, Yuanzhi Li, Andrej Risteski

Analyzing the Structure of Attention in a Transformer Language Model

The Transformer is a fully attention-based alternative to recurrent networks that has achieved state-of-the-art results across a range of NLP tasks. In this paper, we analyze the structure of attention in a Transformer language model, the…

Computation and Language · Computer Science 2019-06-20 Jesse Vig, Yonatan Belinkov

Attention-Only Transformers via Unrolled Subspace Denoising

Despite the popularity of transformers in practice, their architectures are empirically designed and neither mathematically justified nor interpretable. Moreover, as indicated by many empirical studies, some components of transformer…

Machine Learning · Computer Science 2025-06-05 Peng Wang, Yifu Lu, Yaodong Yu, Druv Pai, Qing Qu, Yi Ma

Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting

Spatial functional organization is a hallmark of biological brains: neurons are arranged topographically according to their response properties, at multiple scales. In contrast, representations within most machine learning models lack…

Computation and Language · Computer Science 2025-10-22 Taha Binhuraib, Greta Tuckute, Nicholas Blauch

Transformer Learns Optimal Variable Selection in Group-Sparse Classification

Transformers have demonstrated remarkable success across various applications. However, the success of transformers has not been understood in theory. In this work, we give a case study of how transformers can be trained to learn a classic…

Machine Learning · Statistics 2025-04-14 Chenyang Zhang, Xuran Meng, Yuan Cao

Transformers Pretrained on Procedural Data Contain Modular Structures for Algorithmic Reasoning

Pretraining on large, semantically rich datasets is key for developing language models. Surprisingly, recent studies have shown that even synthetic data, generated procedurally through simple semantic-free algorithms, can yield some of the…

Machine Learning · Computer Science 2025-05-29 Zachary Shinnick, Liangze Jiang, Hemanth Saratchandran, Anton van den Hengel, Damien Teney

Characterizing Intrinsic Compositionality in Transformers with Tree Projections

When trained on language data, do transformers learn some arbitrary computation that utilizes the full capacity of the architecture or do they learn a simpler, tree-like computation, hypothesized to underlie compositional meaning systems…

Computation and Language · Computer Science 2022-11-07 Shikhar Murty, Pratyusha Sharma, Jacob Andreas, Christopher D. Manning

On the Ability and Limitations of Transformers to Recognize Formal Languages

Transformers have supplanted recurrent models in a large number of NLP tasks. However, the differences in their abilities to model different syntactic properties remain largely unknown. Past works suggest that LSTMs generalize very well on…

Computation and Language · Computer Science 2020-10-09 Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Attention-based clustering

Transformers have emerged as a powerful neural network architecture capable of tackling a wide range of learning tasks. In this work, we provide a theoretical analysis of their ability to automatically extract structure from data in an…

Machine Learning · Statistics 2025-10-29 Rodrigo Maulen-Soto, Pierre Marion, Claire Boyer

Humans and transformer LMs: Abstraction drives language learning

Categorization is a core component of human linguistic competence. We investigate how a transformer-based language model (LM) learns linguistic categories by comparing its behaviour over the course of training to behaviours which…

Computation and Language · Computer Science 2026-03-19 Jasper Jian, Christopher D. Manning

Unveiling Transformers with LEGO: a synthetic reasoning task

We propose a synthetic reasoning task, LEGO (Learning Equality and Group Operations), that encapsulates the problem of following a chain of reasoning, and we study how the Transformer architectures learn this task. We pay special attention…

Machine Learning · Computer Science 2023-02-21 Yi Zhang, Arturs Backurs, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Tal Wagner

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks

Transformer networks have seen great success in natural language processing and machine vision, where task objectives such as next word prediction and image classification benefit from nuanced context sensitivity across high-dimensional…

Machine Learning · Computer Science 2022-12-13 Yuxuan Li, James L. McClelland

Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis

Understanding the training dynamics of transformers is important to explain the impressive capabilities behind large language models. In this work, we study the dynamics of training a shallow transformer on a task of recognizing…

Machine Learning · Computer Science 2024-10-15 Hongru Yang, Bhavya Kailkhura, Zhangyang Wang, Yingbin Liang

How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias

Language recognition tasks are fundamental in natural language processing (NLP) and have been widely used to benchmark the performance of large language models (LLMs). These tasks also play a crucial role in explaining the working…

Machine Learning · Computer Science 2025-05-30 Ruiquan Huang, Yingbin Liang, Jing Yang

StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training

Most state-of-the-art techniques for Language Models (LMs) today rely on transformer-based architectures and their ubiquitous attention mechanism. However, the exponential growth in computational requirements with longer input sequences…

Computation and Language · Computer Science 2024-11-26 Kaustubh Ponkshe, Venkatapathy Subramanian, Natwar Modani, Ganesh Ramakrishnan

Mechanistic Interpretability of GPT-like Models on Summarization Tasks

Mechanistic interpretability research seeks to reveal the inner workings of large language models, yet most work focuses on classification or generative tasks rather than summarization. This paper presents an interpretability framework for…

Computation and Language · Computer Science 2025-05-26 Anurag Mishra

Is Random Attention Sufficient for Sequence Modeling? Disentangling Trainable Components in the Transformer

The transformer architecture is central to the success of modern Large Language Models (LLMs), in part due to its surprising ability to perform a wide range of tasks - including mathematical reasoning, memorization, and retrieval - using…

Machine Learning · Computer Science 2025-09-05 Yihe Dong, Lorenzo Noci, Mikhail Khodak, Mufan Li

Single-Head Attention in High Dimensions: A Theory of Generalization, Weights Spectra, and Scaling Laws

Trained attention layers exhibit striking and reproducible spectral structure of the weights, including low-rank collapse, bulk deformation, and isolated spectral outliers, yet the origin of these phenomena and their implications for…

Machine Learning · Statistics 2026-02-03 Fabrizio Boncoraglio, Vittorio Erba, Emanuele Troiani, Yizhou Xu, Florent Krzakala, Lenka Zdeborová

Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient

We introduce refined variants of the Local Learning Coefficient (LLC), a measure of model complexity grounded in singular learning theory, to study the development of internal structure in transformer language models during training. By…

Machine Learning · Computer Science 2024-10-07 George Wang, Jesse Hoogland, Stan van Wingerden, Zach Furman, Daniel Murfet