Related papers: Task Agnostic Architecture for Algorithm Induction…

Task-Driven Modular Networks for Zero-Shot Compositional Learning

One of the hallmarks of human intelligence is the ability to compose learned knowledge into novel concepts which can be recognized without a single training example. In contrast, current state-of-the-art methods require hundreds of training…

Computer Vision and Pattern Recognition · Computer Science 2019-05-16 Senthil Purushwalkam, Maximilian Nickel, Abhinav Gupta, Marc'Aurelio Ranzato

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing basic arithmetic. Given the inherent compositional nature of language, one can expect the model to learn to compose these capabilities,…

Machine Learning · Computer Science 2024-02-07 Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka

Task Generalization With AutoRegressive Compositional Structure: Can Learning From $D$ Tasks Generalize to $D^{T}$ Tasks?

Large language models (LLMs) exhibit remarkable task generalization, solving tasks they were never explicitly trained on with only a few demonstrations. This raises a fundamental question: When can learning from a small set of tasks…

Machine Learning · Computer Science 2025-06-10 Amirhesam Abedsoltan, Huaqing Zhang, Kaiyue Wen, Hongzhou Lin, Jingzhao Zhang, Mikhail Belkin

Attention as a Hypernetwork

Transformers can under some circumstances generalize to novel problem instances whose constituent parts might have been encountered during training, but whose compositions have not. What mechanisms underlie this ability for compositional…

Machine Learning · Computer Science 2025-02-18 Simon Schug, Seijin Kobayashi, Yassir Akram, João Sacramento, Razvan Pascanu

Composing Task-Agnostic Policies with Deep Reinforcement Learning

The composition of elementary behaviors to solve challenging transfer learning problems is one of the key elements in building intelligent machines. To date, there has been plenty of work on learning task-specific policies or skills but…

Machine Learning · Computer Science 2020-01-01 Ahmed H. Qureshi, Jacob J. Johnson, Yuzhe Qin, Taylor Henderson, Byron Boots, Michael C. Yip

Compositional Networks Enable Systematic Generalization for Grounded Language Understanding

Humans are remarkably flexible when understanding new sentences that include combinations of concepts they have never encountered before. Recent work has shown that while deep networks can mimic some human language abilities when presented…

Computation and Language · Computer Science 2021-10-20 Yen-Ling Kuo, Boris Katz, Andrei Barbu

Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks

Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions. However, existing neural models have been shown to lack this basic ability in learning symbolic…

Computation and Language · Computer Science 2021-10-01 Yichen Jiang, Mohit Bansal

Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers

We study implicit reasoning, i.e. the ability to combine knowledge or rules within a single forward pass. While transformer-based large language models store substantial factual knowledge and rules, they often fail to compose this knowledge…

Computation and Language · Computer Science 2026-04-10 Harsh Kohli, Srinivasan Parthasarathy, Huan Sun, Yuekun Yao

Policy Architectures for Compositional Generalization in Control

Many tasks in control, robotics, and planning can be specified using desired goal configurations for various entities in the environment. Learning goal-conditioned policies is a natural paradigm to solve such tasks. However, current…

Machine Learning · Computer Science 2022-03-14 Allan Zhou, Vikash Kumar, Chelsea Finn, Aravind Rajeswaran

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks

Transformer networks have seen great success in natural language processing and machine vision, where task objectives such as next word prediction and image classification benefit from nuanced context sensitivity across high-dimensional…

Machine Learning · Computer Science 2022-12-13 Yuxuan Li, James L. McClelland

A Generalist Neural Algorithmic Learner

The cornerstone of neural algorithmic reasoning is the ability to solve algorithmic tasks, especially in a way that generalises out of distribution. While recent years have seen a surge in methodological improvements in this area, they…

Machine Learning · Computer Science 2022-12-06 Borja Ibarz, Vitaly Kurin, George Papamakarios, Kyriacos Nikiforou, Mehdi Bennani, Róbert Csordás, Andrew Dudzik, Matko Bošnjak, Alex Vitvitskyi, Yulia Rubanova, Andreea Deac, Beatrice Bevilacqua, Yaroslav Ganin, Charles Blundell, Petar Veličković

Automatically Composing Representation Transformations as a Means for Generalization

A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all…

Machine Learning · Computer Science 2019-05-09 Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths

MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning

We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL). Existing NAS methods typically define different search spaces according to different tasks. In order to adapt to different task…

Machine Learning · Computer Science 2020-04-01 Yuan Gao, Haoping Bai, Zequn Jie, Jiayi Ma, Kui Jia, Wei Liu

Adaptation-Agnostic Meta-Training

Many meta-learning algorithms can be formulated into an interleaved process, in the sense that task-specific predictors are learned during inner-task adaptation and meta-parameters are updated during meta-update. The normal meta-training…

Machine Learning · Computer Science 2021-08-25 Jiaxin Chen, Li-Ming Zhan, Xiao-Ming Wu, Fu-Lai Chung

Propositional Logic for Probing Generalization in Neural Networks

The extent to which neural networks are able to acquire and represent symbolic rules remains a key topic of research and debate. Much current work focuses on the impressive capabilities of large language models, as well as their often…

Machine Learning · Computer Science 2025-06-11 Anna Langedijk, Jaap Jumelet, Willem Zuidema

Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning

Systematic generalization refers to the capacity to understand and generate novel combinations from known components. Despite recent progress by large language models (LLMs) across various domains, these models often fail to extend their…

Artificial Intelligence · Computer Science 2026-02-27 Philipp Mondorf, Shijia Zhou, Monica Riedler, Barbara Plank

Compositionality as Lexical Symmetry

In tasks like semantic parsing, instruction following, and question answering, standard deep networks fail to generalize compositionally from small datasets. Many existing approaches overcome this limitation with model architectures that…

Computation and Language · Computer Science 2023-07-06 Ekin Akyürek, Jacob Andreas

On Task-Level Dialogue Composition of Generative Transformer Model

Task-oriented dialogue systems help users accomplish tasks such as booking a movie ticket and ordering food via conversation. Generative models parameterized by a deep neural network are widely used for next turn response generation in such…

Computation and Language · Computer Science 2020-10-13 Prasanna Parthasarathi, Arvind Neelakantan, Sharan Narang

Modular Task Decomposition and Dynamic Collaboration in Multi-Agent Systems Driven by Large Language Models

This paper addresses the limitations of a single agent in task decomposition and collaboration during complex task execution, and proposes a multi-agent architecture for modular task decomposition and dynamic collaboration based on large…

Artificial Intelligence · Computer Science 2025-11-04 Shuaidong Pan, Di Wu

Towards Generalized Models for Task-oriented Dialogue Modeling on Spoken Conversations

Building robust and general dialogue models for spoken conversations is challenging due to the gap in distributions of spoken and written data. This paper presents our approach to build generalized models for the Knowledge-grounded…

Computation and Language · Computer Science 2022-03-09 Ruijie Yan, Shuang Peng, Haitao Mi, Liang Jiang, Shihui Yang, Yuchi Zhang, Jiajun Li, Liangrui Peng, Yongliang Wang, Zujie Wen