Related papers: Can Models Learn Skill Composition from Examples?

Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models

With LLMs shifting their role from statistical modeling of language to serving as general-purpose AI agents, how should LLM evaluations change? Arguably, a key ability of an AI agent is to flexibly combine, as needed, the basic skills it…

Computation and Language · Computer Science 2023-10-27 Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora

Can Language Models Compose Skills In-Context?

Composing basic skills from simple tasks to accomplish composite tasks is crucial for modern intelligent systems. We investigate the in-context composition ability of language models to perform composite tasks that combine basic skills…

Machine Learning · Computer Science 2025-10-28 Zidong Liu, Zhuoyan Xu, Zhenmei Shi, Yingyu Liang

Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability

Large language models (LLMs) have emerged as powerful tools for many AI problems and exhibit remarkable in-context learning (ICL) capabilities. Compositional ability, solving unseen complex tasks that combine two or more simple tasks, is an…

Computation and Language · Computer Science 2024-08-13 Zhuoyan Xu, Zhenmei Shi, Yingyu Liang

Multimodal LLMs Do Not Compose Skills Optimally Across Modalities

Skill composition is the ability to combine previously learned skills to solve new tasks. As neural networks acquire increasingly complex skills during their pretraining, it is not clear how successfully they can compose them. In this…

Computation and Language · Computer Science 2026-03-10 Paula Ontalvilla, Aitor Ormazabal, Gorka Azkune

How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

Large language models (LLMs) with enormous pre-training tokens and parameters emerge diverse abilities, including math reasoning, code generation, and instruction following. These abilities are further enhanced by supervised fine-tuning…

Computation and Language · Computer Science 2024-06-10 Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, Jingren Zhou

Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

We investigate how to elicit compositional generalization capabilities in large language models (LLMs). Compositional generalization empowers LLMs to solve complex problems by combining foundational skills, a critical reasoning ability akin…

Computation and Language · Computer Science 2024-07-18 Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen

Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning

Systematic generalization refers to the capacity to understand and generate novel combinations from known components. Despite recent progress by large language models (LLMs) across various domains, these models often fail to extend their…

Artificial Intelligence · Computer Science 2026-02-27 Philipp Mondorf, Shijia Zhou, Monica Riedler, Barbara Plank

Revisiting the Compositional Generalization Abilities of Neural Sequence Models

Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionally…

Computation and Language · Computer Science 2022-03-16 Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal

Compositional Generalization from Learned Skills via CoT Training: A Theoretical and Structural Analysis for Reasoning

Chain-of-Thought (CoT) training has markedly advanced the reasoning capabilities of large language models (LLMs), yet the mechanisms by which CoT training enhances generalization remain inadequately understood. In this work, we demonstrate…

Machine Learning · Computer Science 2026-02-13 Xinhao Yao, Ruifeng Ren, Yun Liao, Lizhong Ding, Yong Liu

Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model

While large language models (LLMs) demonstrate strong reasoning capabilities utilizing reinforcement learning (RL) with verifiable reward, whether large vision-language models (VLMs) can directly inherit such capabilities through similar…

Artificial Intelligence · Computer Science 2025-05-27 Tianle Li, Jihai Zhang, Yongming Rao, Yu Cheng

How Does RL Post-training Induce Skill Composition? A Case Study on Countdown

While reinforcement learning (RL) successfully enhances reasoning in large language models, its role in fostering compositional generalization (the ability to synthesize novel skills from known components) is often conflated with mere…

Machine Learning · Computer Science 2025-12-02 Simon Park, Simran Kaur, Sanjeev Arora

From Words to Worlds: Compositionality for Cognitive Architectures

Large language models (LLMs) are very performant connectionist systems, but do they exhibit more compositionality? More importantly, is that part of why they perform so well? We present empirical analyses across four LLM families (12…

Computation and Language · Computer Science 2025-05-21 Ruchira Dhar, Anders Søgaard

Evaluating Morphological Compositional Generalization in Large Language Models

Large language models (LLMs) have demonstrated significant progress in various natural language generation and understanding tasks. However, their linguistic generalization capabilities remain questionable, raising doubts about whether…

Computation and Language · Computer Science 2025-06-06 Mete Ismayilzada, Defne Circi, Jonne Sälevä, Hale Sirin, Abdullatif Köksal, Bhuwan Dhingra, Antoine Bosselut, Duygu Ataman, Lonneke van der Plas

Improving Compositional Generalization in Math Word Problem Solving

Compositional generalization refers to a model's capability to generalize to newly composed input data based on the data components observed during training. It has triggered a series of compositional generalization analysis on different…

Computation and Language · Computer Science 2022-09-07 Yunshi Lan, Lei Wang, Jing Jiang, Ee-Peng Lim

Compositional Generalization and Decomposition in Neural Program Synthesis

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, what…

Machine Learning · Computer Science 2023-10-31 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

Revisiting the Superficial Alignment Hypothesis

The Superficial Alignment Hypothesis posits that almost all of a language model's abilities and knowledge are learned during pre-training, while post-training is about giving a model the right style and format. We re-examine these claims by…

Computation and Language · Computer Science 2024-10-08 Mohit Raghavendra, Vaskar Nath, Sean Hendryx

Measuring and Improving Compositional Generalization in Text-to-SQL via Component Alignment

In text-to-SQL tasks -- as in much of NLP -- compositional generalization is a major challenge: neural networks struggle with compositional generalization where training and test distributions differ. However, most recent attempts to…

Computation and Language · Computer Science 2022-05-05 Yujian Gan, Xinyun Chen, Qiuping Huang, Matthew Purver

LLM Augmented LLMs: Expanding Capabilities through Composition

Foundational models with billions of parameters which have been trained on large corpora of data have demonstrated non-trivial skills in a variety of domains. However, due to their monolithic structure, it is challenging and expensive to…

Machine Learning · Computer Science 2024-01-05 Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar

Towards Compositionally Generalizable Semantic Parsing in Large Language Models: A Survey

Compositional generalization is the ability of a model to generalize to complex, previously unseen types of combinations of entities from just having seen the primitives. This type of generalization is particularly relevant to the semantic…

Computation and Language · Computer Science 2024-04-23 Amogh Mannekote

Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing

Despite their strong performance on many tasks, pre-trained language models have been shown to struggle on out-of-distribution compositional generalization. Meanwhile, recent work has shown considerable improvements on many NLP tasks from…

Computation and Language · Computer Science 2022-10-26 Linlu Qiu, Peter Shaw, Panupong Pasupat, Tianze Shi, Jonathan Herzig, Emily Pitler, Fei Sha, Kristina Toutanova