Related papers: Block-Operations: Using Modular Routing to Improve…

Generalization in Multimodal Language Learning from Simulation

Neural networks can be powerful function approximators, which are able to model high-dimensional feature distributions from a subset of examples drawn from the target distribution. Naturally, they perform well at generalizing within the…

Machine Learning · Computer Science 2021-08-06 Aaron Eisermann , Jae Hee Lee , Cornelius Weber , Stefan Wermter

Analyzing the Inner Workings of Transformers in Compositional Generalization

The compositional generalization abilities of neural models have been sought after for human-like linguistic competence. The popular method to evaluate such abilities is to assess the models' input-output behavior. However, that does not…

Computation and Language · Computer Science 2025-02-24 Ryoma Kumon , Hitomi Yanaka

Scaling can lead to compositional generalization

Can neural networks systematically capture discrete, compositional task structure despite their continuous, distributed nature? The impressive capabilities of large-scale neural networks suggest that the answer to this question is yes.…

Machine Learning · Computer Science 2025-10-27 Florian Redhardt , Yassir Akram , Simon Schug

Routing Networks and the Challenges of Modular and Compositional Computation

Compositionality is a key strategy for addressing combinatorial complexity and the curse of dimensionality. Recent work has shown that compositional solutions can be learned and offer substantial gains across a variety of domains, including…

Machine Learning · Computer Science 2019-04-30 Clemens Rosenbaum , Ignacio Cases , Matthew Riemer , Tim Klinger

Knowledge transfer in deep block-modular neural networks

Although deep neural networks (DNNs) have demonstrated impressive results during the last decade, they remain highly specialized tools, which are trained -- often from scratch -- to solve each particular task. The human brain, in contrast,…

Neural and Evolutionary Computing · Computer Science 2019-08-22 Alexander V. Terekhov , Guglielmo Montone , J. Kevin O'Regan

Generalizing Pooling Functions in Convolutional Neural Networks: Mixed, Gated, and Tree

We seek to improve deep neural networks by generalizing the pooling operations that play a central role in current architectures. We pursue a careful exploration of approaches to allow pooling to learn and to adapt to complex and variable…

Machine Learning · Statistics 2015-10-13 Chen-Yu Lee , Patrick W. Gallagher , Zhuowen Tu

Memorize or generalize? Searching for a compositional RNN in a haystack

Neural networks are very powerful learning systems, but they do not readily generalize from one task to the other. This is partly due to the fact that they do not learn in a compositional way, that is, by discovering skills that are shared…

Artificial Intelligence · Computer Science 2018-07-27 Adam Liška , Germán Kruszewski , Marco Baroni

Learning to Generalize Compositionally by Transferring Across Semantic Parsing Tasks

Neural network models often generalize poorly to mismatched domains or distributions. In NLP, this issue arises in particular when models are expected to generalize compositionally, that is, to novel combinations of familiar words and…

Computation and Language · Computer Science 2021-11-10 Wang Zhu , Peter Shaw , Tal Linzen , Fei Sha

Breaking Neural Network Scaling Laws with Modularity

Modular neural networks outperform nonmodular neural networks on tasks ranging from visual question answering to robotics. These performance improvements are thought to be due to modular networks' superior ability to model the compositional…

Machine Learning · Computer Science 2025-03-12 Akhilan Boopathy , Sunshine Jiang , William Yue , Jaedong Hwang , Abhiram Iyer , Ila Fiete

Making Neural Programming Architectures Generalize via Recursion

Empirically, neural networks that attempt to learn programs from data have exhibited poor generalizability. Moreover, it has traditionally been difficult to reason about the behavior of these models beyond a certain level of input…

Machine Learning · Computer Science 2017-04-24 Jonathon Cai , Richard Shin , Dawn Song

Towards Understanding the Relationship between In-context Learning and Compositional Generalization

According to the principle of compositional generalization, the meaning of a complex expression can be understood as a function of the meaning of its parts and of how they are combined. This principle is crucial for human language…

Computation and Language · Computer Science 2024-03-19 Sungjun Han , Sebastian Padó

Making Transformers Solve Compositional Tasks

Several studies have reported the inability of Transformer models to generalize compositionally, a key type of generalization in many NLP tasks such as semantic parsing. In this paper we explore the design space of Transformer models…

Artificial Intelligence · Computer Science 2022-03-04 Santiago Ontañón , Joshua Ainslie , Vaclav Cvicek , Zachary Fisher

Modular meta-learning

Many prediction problems, such as those that arise in the context of robotics, have a simplifying underlying structure that, if known, could accelerate learning. In this paper, we present a strategy for learning a set of neural network…

Machine Learning · Computer Science 2019-05-06 Ferran Alet , Tomás Lozano-Pérez , Leslie P. Kaelbling

Break It Down: Evidence for Structural Compositionality in Neural Networks

Though modern neural networks have achieved impressive performance in both vision and language tasks, we know little about the functions that they implement. One possibility is that neural networks implicitly break down complex tasks into…

Computation and Language · Computer Science 2023-11-08 Michael A. Lepori , Thomas Serre , Ellie Pavlick

Propositional Logic for Probing Generalization in Neural Networks

The extent to which neural networks are able to acquire and represent symbolic rules remains a key topic of research and debate. Much current work focuses on the impressive capabilities of large language models, as well as their often…

Machine Learning · Computer Science 2025-06-11 Anna Langedijk , Jaap Jumelet , Willem Zuidema

Transformer Module Networks for Systematic Generalization in Visual Question Answering

Transformers achieve great performance on Visual Question Answering (VQA). However, their systematic generalization capabilities, i.e., handling novel combinations of known concepts, is unclear. We reveal that Neural Module Networks (NMNs),…

Computer Vision and Pattern Recognition · Computer Science 2023-03-20 Moyuru Yamada , Vanessa D'Amario , Kentaro Takemoto , Xavier Boix , Tomotake Sasaki

Compositional Generalization by Learning Analytical Expressions

Compositional generalization is a basic and essential intellective capability of human beings, which allows us to recombine known parts readily. However, existing neural network based models have been proven to be extremely deficient in…

Artificial Intelligence · Computer Science 2020-10-27 Qian Liu , Shengnan An , Jian-Guang Lou , Bei Chen , Zeqi Lin , Yan Gao , Bin Zhou , Nanning Zheng , Dongmei Zhang

Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks

Conditional computation and modular networks have been recently proposed for multitask learning and other problems as a way to decompose problem solving into multiple reusable computational blocks. We propose a new approach for learning…

Machine Learning · Computer Science 2021-07-26 Andrey Zhmoginov , Dina Bashkirova , Mark Sandler

Compositional Generalization and Decomposition in Neural Program Synthesis

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, what…

Machine Learning · Computer Science 2023-10-31 Kensen Shi , Joey Hong , Manzil Zaheer , Pengcheng Yin , Charles Sutton

Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings

Compositional generalization, the ability of an agent to generalize to unseen combinations of latent factors, is easy for humans but hard for deep neural networks. A line of research in cognitive science has hypothesized a process,…

Machine Learning · Computer Science 2023-10-31 Yi Ren , Samuel Lavoie , Mikhail Galkin , Danica J. Sutherland , Aaron Courville