Related papers: Complexity Control Facilitates Reasoning-Based Com…

Analyzing the Inner Workings of Transformers in Compositional Generalization

The compositional generalization abilities of neural models have been sought after for human-like linguistic competence. The popular method to evaluate such abilities is to assess the models' input-output behavior. However, that does not…

Computation and Language · Computer Science 2025-02-24 Ryoma Kumon, Hitomi Yanaka

Critical Windows of Complexity Control: When Transformers Decide to Reason or Memorize

Recent work has shown that Transformers' compositional generalization is governed by complexity control (initialization scale and weight decay), which steers training toward low-complexity reasoning solutions rather than…

Machine Learning · Computer Science 2026-05-07 Sarwan Ali

An explainable transformer circuit for compositional generalization

Compositional generalization, the systematic combination of known components into novel structures, remains a core challenge in cognitive science and machine learning. Although transformer-based large language models can exhibit strong…

Machine Learning · Computer Science 2025-02-25 Cheng Tang, Brenden Lake, Mehrdad Jazayeri

Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing

Transformers have shown impressive capabilities across various tasks, but their performance on compositional problems remains a topic of debate. In this work, we investigate the mechanisms of how transformers behave on unseen compositional…

Machine Learning · Computer Science 2025-01-14 Zhongwang Zhang, Pengxiao Lin, Zhiwei Wang, Yaoyu Zhang, Zhi-Qin John Xu

Compositional Generalization and Decomposition in Neural Program Synthesis

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, what…

Machine Learning · Computer Science 2023-10-31 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

Towards Understanding the Relationship between In-context Learning and Compositional Generalization

According to the principle of compositional generalization, the meaning of a complex expression can be understood as a function of the meaning of its parts and of how they are combined. This principle is crucial for human language…

Computation and Language · Computer Science 2024-03-19 Sungjun Han, Sebastian Padó

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks

Transformer networks have seen great success in natural language processing and machine vision, where task objectives such as next word prediction and image classification benefit from nuanced context sensitivity across high-dimensional…

Machine Learning · Computer Science 2022-12-13 Yuxuan Li, James L. McClelland

Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers

We study implicit reasoning, i.e. the ability to combine knowledge or rules within a single forward pass. While transformer-based large language models store substantial factual knowledge and rules, they often fail to compose this knowledge…

Computation and Language · Computer Science 2026-04-10 Harsh Kohli, Srinivasan Parthasarathy, Huan Sun, Yuekun Yao

Characterizing Intrinsic Compositionality in Transformers with Tree Projections

When trained on language data, do transformers learn some arbitrary computation that utilizes the full capacity of the architecture or do they learn a simpler, tree-like computation, hypothesized to underlie compositional meaning systems…

Computation and Language · Computer Science 2022-11-07 Shikhar Murty, Pratyusha Sharma, Jacob Andreas, Christopher D. Manning

Propositional Logic for Probing Generalization in Neural Networks

The extent to which neural networks are able to acquire and represent symbolic rules remains a key topic of research and debate. Much current work focuses on the impressive capabilities of large language models, as well as their often…

Machine Learning · Computer Science 2025-06-11 Anna Langedijk, Jaap Jumelet, Willem Zuidema

Layer Specialization Underlying Compositional Reasoning in Transformers

Transformers exhibit compositional reasoning on sequences not observed during training, a capability often attributed to in-context learning (ICL) and skill composition. We investigate this phenomenon using the Random Hierarchy Model (RHM),…

Machine Learning · Computer Science 2025-10-21 Jing Liu

Policy Architectures for Compositional Generalization in Control

Many tasks in control, robotics, and planning can be specified using desired goal configurations for various entities in the environment. Learning goal-conditioned policies is a natural paradigm to solve such tasks. However, current…

Machine Learning · Computer Science 2022-03-14 Allan Zhou, Vikash Kumar, Chelsea Finn, Aravind Rajeswaran

When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks

Humans can reason compositionally whilst grounding language utterances to the real world. Recent benchmarks like ReaSCAN use navigation tasks grounded in a grid world to assess whether neural models exhibit similar capabilities. In this…

Computation and Language · Computer Science 2022-11-01 Ankur Sikarwar, Arkil Patel, Navin Goyal

A Compositional Neuro-Controller for Advanced Motor Control Tasks

Humans and animals developed a sophisticated motor control apparatus and there is much evidence that it has a modular structure. The modularity offers a range of benefits, e.g. ability to learn dissociable motion styles without interference…

Robotics · Computer Science 2016-05-20 Kirill Makukhin

Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?

Humans exhibit remarkable compositional reasoning by integrating knowledge from various sources. For example, if someone learns B = f(A) from one source and C = g(B) from another, they can deduce C = g(B) = g(f(A)) even without…

Artificial Intelligence · Computer Science 2025-10-14 Yutong Yin, Zhaoran Wang

Routing Networks and the Challenges of Modular and Compositional Computation

Compositionality is a key strategy for addressing combinatorial complexity and the curse of dimensionality. Recent work has shown that compositional solutions can be learned and offer substantial gains across a variety of domains, including…

Machine Learning · Computer Science 2019-04-30 Clemens Rosenbaum, Ignacio Cases, Matthew Riemer, Tim Klinger

Making Transformers Solve Compositional Tasks

Several studies have reported the inability of Transformer models to generalize compositionally, a key type of generalization in many NLP tasks such as semantic parsing. In this paper we explore the design space of Transformer models…

Artificial Intelligence · Computer Science 2022-03-04 Santiago Ontañón, Joshua Ainslie, Vaclav Cvicek, Zachary Fisher

Faith and Fate: Limits of Transformers on Compositionality

Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This…

Computation and Language · Computer Science 2023-11-01 Nouha Dziri, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang, Bill Yuchen Lin, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena D. Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, Yejin Choi

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing basic arithmetic. Given the inherent compositional nature of language, one can expect the model to learn to compose these capabilities,…

Machine Learning · Computer Science 2024-02-07 Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka

Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization

The ability to reason lies at the core of artificial intelligence (AI), and challenging problems usually call for deeper and longer reasoning to tackle. A crucial question about AI reasoning is whether models can extrapolate learned…

Machine Learning · Computer Science 2025-11-11 Yu Huang, Zixin Wen, Aarti Singh, Yuejie Chi, Yuxin Chen