Related papers: Learning to Compose: Improving Object Centric Lear…

Provable Compositional Generalization for Object-Centric Learning

Learning representations that generalize to novel compositions of known concepts is crucial for bridging the gap between human and machine perception. One prominent effort is learning object-centric representations, which are widely…

Machine Learning · Computer Science 2024-11-13 Thaddäus Wiedemer, Jack Brady, Alexander Panfilov, Attila Juhos, Matthias Bethge, Wieland Brendel

Self-supervised Visual Reinforcement Learning with Object-centric Representations

Autonomous agents need large repertoires of skills to act reasonably on new tasks that they have not seen before. However, acquiring these skills using only a stream of high-dimensional, unstructured, and unlabeled observations is a tricky…

Machine Learning · Computer Science 2021-02-09 Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius

Provably Learning Object-Centric Representations

Learning structured representations of the visual world in terms of objects promises to significantly improve the generalization abilities of current machine learning models. While recent efforts to this end have shown promising empirical…

Machine Learning · Computer Science 2023-05-24 Jack Brady, Roland S. Zimmermann, Yash Sharma, Bernhard Schölkopf, Julius von Kügelgen, Wieland Brendel

Compositional Video Synthesis by Temporal Object-Centric Learning

We present a novel framework for compositional video synthesis that leverages temporally consistent object-centric representations, extending our previous work, SlotAdapt, from images to video. While existing object-centric approaches…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Adil Kaan Akan, Yucel Yemez

Unlocking Compositional Generalization in Continual Few-Shot Learning

Object-centric representations promise a key property for few-shot learning: Rather than treating a scene as a single unit, a model can decompose it into individual object-level parts that can be matched and compared across different…

Machine Learning · Computer Science 2026-05-19 Phu-Quy Nguyen-Lam, Phu-Hoa Pham, Dao Sy Duy Minh, Chi-Nguyen Tran, Huynh Trung Kiet, Long Tran-Thanh

Object-Centric Learning with Slot Attention

Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not…

Machine Learning · Computer Science 2020-10-15 Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Are Object-Centric Representations Better At Compositional Generalization?

Compositional generalization, the ability to reason about novel combinations of familiar concepts, is fundamental to human cognition and a critical challenge for machine learning. Object-centric (OC) representations, which encode a scene as…

Computer Vision and Pattern Recognition · Computer Science 2026-02-19 Ferdinand Kapl, Amir Mohammad Karimi Mamaghan, Maximilian Seitzer, Karl Henrik Johansson, Carsten Marr, Stefan Bauer, Andrea Dittadi

Object Pursuit: Building a Space of Objects via Discriminative Weight Generation

We propose a framework to continuously learn object-centric representations for visual learning and understanding. Existing object-centric representations either rely on supervisions that individualize objects in the scene, or perform…

Computer Vision and Pattern Recognition · Computer Science 2022-04-05 Chuanyu Pan, Yanchao Yang, Kaichun Mo, Yueqi Duan, Leonidas Guibas

Spotlight Attention: Robust Object-Centric Learning With a Spatial Locality Prior

The aim of object-centric vision is to construct an explicit representation of the objects in a scene. This representation is obtained via a set of interchangeable modules called slots or object files that compete for local…

Computer Vision and Pattern Recognition · Computer Science 2023-06-06 Ayush Chakravarthy, Trang Nguyen, Anirudh Goyal, Yoshua Bengio, Michael C. Mozer

Toward Compositional Generalization in Object-Oriented World Modeling

Compositional generalization is a critical ability in learning and decision-making. We focus on the setting of reinforcement learning in object-oriented environments to study compositional generalization in world modeling. We (1) formalize…

Machine Learning · Computer Science 2022-06-20 Linfeng Zhao, Lingzhi Kong, Robin Walters, Lawson L. S. Wong

Cycle Consistency Driven Object Discovery

Developing deep learning models that effectively learn object-centric representations, akin to human cognition, remains a challenging task. Existing approaches facilitate object discovery by representing objects as fixed-size vectors,…

Computer Vision and Pattern Recognition · Computer Science 2023-12-11 Aniket Didolkar, Anirudh Goyal, Yoshua Bengio

Compositional Scene Modeling with Global Object-Centric Representations

The appearance of the same object may vary in different scene images due to perspectives and occlusions between objects. Humans can easily identify the same object, even if occlusions exist, by completing the occluded parts based on its…

Computer Vision and Pattern Recognition · Computer Science 2022-11-28 Tonglin Chen, Bin Li, Zhimeng Shen, Xiangyang Xue

Measuring Compositionality in Representation Learning

Many machine learning algorithms represent input data with vector embeddings or discrete codes. When inputs exhibit compositional structure (e.g. objects built from parts or procedures from subroutines), it is natural to ask whether this…

Machine Learning · Computer Science 2019-04-09 Jacob Andreas

Successes and Limitations of Object-centric Models at Compositional Generalisation

In recent years, it has been shown empirically that standard disentangled latent variable models do not support robust compositional learning in the visual domain. Indeed, in spite of being designed with the goal of factorising datasets…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Milton L. Montero, Jeffrey S. Bowers, Gaurav Malhotra

Object-centric architectures enable efficient causal representation learning

Causal representation learning has shown a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches is the assumption that…

Machine Learning · Computer Science 2023-10-31 Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio

Learning Object-Centric Video Models by Contrasting Sets

Contrastive, self-supervised learning of object representations recently emerged as an attractive alternative to reconstruction-based training. Prior approaches focus on contrasting individual object representations (slots) against one…

Computer Vision and Pattern Recognition · Computer Science 2020-11-23 Sindy Löwe, Klaus Greff, Rico Jonschkowski, Alexey Dosovitskiy, Thomas Kipf

Disentangled Representation Learning via Modular Compositional Bias

Recent disentangled representation learning (DRL) methods heavily rely on factor-specific strategies (either learning objectives for attributes or model architectures for objects) to embed inductive biases. Such divergent approaches result in…

Machine Learning · Computer Science 2025-11-12 Whie Jung, Dong Hoon Lee, Seunghoon Hong

Conditional Object-Centric Learning from Video

Object-centric representations are a promising path toward more systematic generalization by providing flexible abstractions upon which compositional world models can be built. Recent work on simple 2D and 3D datasets has shown that models…

Computer Vision and Pattern Recognition · Computer Science 2022-03-16 Thomas Kipf, Gamaleldin F. Elsayed, Aravindh Mahendran, Austin Stone, Sara Sabour, Georg Heigold, Rico Jonschkowski, Alexey Dosovitskiy, Klaus Greff

Teaching Compositionality to CNNs

Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions. In this paper, we propose a method for augmenting…

Computer Vision and Pattern Recognition · Computer Science 2017-06-15 Austin Stone, Huayan Wang, Michael Stark, Yi Liu, D. Scott Phoenix, Dileep George

Improving Object-centric Learning with Query Optimization

The ability to decompose complex natural scenes into meaningful object-centric abstractions lies at the core of human perception and reasoning. In the recent culmination of unsupervised object-centric learning, the Slot-Attention module has…

Computer Vision and Pattern Recognition · Computer Science 2023-02-13 Baoxiong Jia, Yu Liu, Siyuan Huang