Related papers: Iterative Decoding for Compositional Generalizatio…

Unlocking Compositional Generalization in Pre-trained Models Using Intermediate Representations

Sequence-to-sequence (seq2seq) models are prevalent in semantic parsing, but have been found to struggle at out-of-distribution compositional generalization. While specialized model architectures and pre-training of seq2seq models have been…

Computation and Language · Computer Science 2021-04-16 Jonathan Herzig, Peter Shaw, Ming-Wei Chang, Kelvin Guu, Panupong Pasupat, Yuan Zhang

Revisiting Iterative Back-Translation from the Perspective of Compositional Generalization

Human intelligence exhibits compositional generalization (i.e., the capacity to understand and produce unseen combinations of seen components), but current neural seq2seq models lack such ability. In this paper, we revisit iterative…

Computation and Language · Computer Science 2020-12-09 Yinuo Guo, Hualei Zhu, Zeqi Lin, Bei Chen, Jian-Guang Lou, Dongmei Zhang

Disentangled Sequence to Sequence Learning for Compositional Generalization

There is mounting evidence that existing neural network models, in particular the very popular sequence-to-sequence architecture, struggle to systematically generalize to unseen compositions of seen components. We demonstrate that one of…

Computation and Language · Computer Science 2022-03-23 Hao Zheng, Mirella Lapata

Compositional Generalization without Trees using Multiset Tagging and Latent Permutations

Seq2seq models have been shown to struggle with compositional generalization in semantic parsing, i.e. generalizing to unseen compositions of phenomena that the model handles correctly in isolation. We phrase semantic parsing as a two-step…

Computation and Language · Computer Science 2023-05-29 Matthias Lindemann, Alexander Koller, Ivan Titov

Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization

Recent studies have shown that sequence-to-sequence (seq2seq) models struggle with compositional generalization (CG), i.e., the ability to systematically generalize to unseen compositions of seen components. There is mounting evidence that…

Computation and Language · Computer Science 2023-10-19 Lei Lin, Shuangtao Li, Yafang Zheng, Biao Fu, Shan Liu, Yidong Chen, Xiaodong Shi

Compositional generalization through meta sequence-to-sequence learning

People can learn a new concept and use it compositionally, understanding how to "blicket twice" after learning how to "blicket." In contrast, powerful sequence-to-sequence (seq2seq) neural networks fail such tests of compositionality,…

Computation and Language · Computer Science 2019-10-10 Brenden M. Lake

Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings

Compositional generalization, the ability of an agent to generalize to unseen combinations of latent factors, is easy for humans but hard for deep neural networks. A line of research in cognitive science has hypothesized a process,…

Machine Learning · Computer Science 2023-10-31 Yi Ren, Samuel Lavoie, Mikhail Galkin, Danica J. Sutherland, Aaron Courville

Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks

Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions. However, existing neural models have been shown to lack this basic ability in learning symbolic…

Computation and Language · Computer Science 2021-10-01 Yichen Jiang, Mohit Bansal

Transcoding compositionally: using attention to find more generalizable solutions

While sequence-to-sequence models have shown remarkable generalization power across several natural language tasks, their construct of solutions are argued to be less compositional than human-like generalization. In this paper, we present…

Computation and Language · Computer Science 2019-06-07 Kris Korrel, Dieuwke Hupkes, Verna Dankers, Elia Bruni

Randomized Positional Encodings Boost Length Generalization of Transformers

Transformers have impressive generalization capabilities on tasks with a fixed context length. However, they fail to generalize to sequences of arbitrary length, even for seemingly simple tasks such as duplicating a string. Moreover, simply…

Machine Learning · Computer Science 2023-05-29 Anian Ruoss, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Róbert Csordás, Mehdi Bennani, Shane Legg, Joel Veness

Compositional Generalization via Semantic Tagging

Although neural sequence-to-sequence models have been successfully applied to semantic parsing, they fail at compositional generalization, i.e., they are unable to systematically generalize to unseen compositions of seen components.…

Computation and Language · Computer Science 2021-09-10 Hao Zheng, Mirella Lapata

Compositional Generalisation with Structured Reordering and Fertility Layers

Seq2seq models have been shown to struggle with compositional generalisation, i.e. generalising to new and potentially more complex structures than seen during training. Taking inspiration from grammar-based models that excel at…

Computation and Language · Computer Science 2023-02-16 Matthias Lindemann, Alexander Koller, Ivan Titov

Recursive Decoding: A Situated Cognition Approach to Compositional Generation in Grounded Language Understanding

Compositional generalization is a troubling blind spot for neural language models. Recent efforts have presented techniques for improving a model's ability to encode novel combinations of known inputs, but less work has focused on…

Computation and Language · Computer Science 2022-02-21 Matthew Setzler, Scott Howland, Lauren Phillips

Automatically Composing Representation Transformations as a Means for Generalization

A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all…

Machine Learning · Computer Science 2019-05-09 Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths

Making Transformers Solve Compositional Tasks

Several studies have reported the inability of Transformer models to generalize compositionally, a key type of generalization in many NLP tasks such as semantic parsing. In this paper we explore the design space of Transformer models…

Artificial Intelligence · Computer Science 2022-03-04 Santiago Ontañón, Joshua Ainslie, Vaclav Cvicek, Zachary Fisher

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing basic arithmetic. Given the inherent compositional nature of language, one can expect the model to learn to compose these capabilities,…

Machine Learning · Computer Science 2024-02-07 Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka

Grounded Graph Decoding Improves Compositional Generalization in Question Answering

Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures. Current end-to-end models learn a flat input embedding which can lose input syntax…

Computation and Language · Computer Science 2021-11-08 Yu Gai, Paras Jain, Wendi Zhang, Joseph E. Gonzalez, Dawn Song, Ion Stoica

Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning

Compositional generalization is a basic mechanism in human language learning, which current neural networks struggle with. A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability by…

Computation and Language · Computer Science 2022-12-13 Hao Zheng, Mirella Lapata

Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations

Models need appropriate inductive biases to effectively learn from small amounts of data and generalize systematically outside of the training distribution. While Transformers are highly versatile and powerful, they can still benefit from…

Computation and Language · Computer Science 2024-07-08 Matthias Lindemann, Alexander Koller, Ivan Titov

Compositional Generalization and Decomposition in Neural Program Synthesis

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, what…

Machine Learning · Computer Science 2023-10-31 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton