Related papers: Cycle Consistency Driven Object Discovery

Temporally Consistent Object-Centric Learning by Contrasting Slots

Unsupervised object-centric learning from videos is a promising approach to extract structured representations from large, unlabeled collections of videos. To support downstream tasks like autonomous control, these representations must be…

Computer Vision and Pattern Recognition · Computer Science 2025-03-19 Anna Manasyan, Maximilian Seitzer, Filip Radovic, Georg Martius, Andrii Zadaianchuk

Cycle Consistency in Video Object-Centric Learning

Self-supervised video Object-Centric Learning (OCL) aims to discover distinct objects and associate them across time, whereas self-supervised Multi-Object Tracking (MOT) focuses on associating pre-defined object detections or segmentations.…

Computer Vision and Pattern Recognition · Computer Science 2026-05-29 Rongzhen Zhao, Zhiyuan Li, Ruonan Wei, Juho Kannala, Joni Pajarinen

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Object-centric representation learning aims to decompose visual scenes into fixed-size vectors called "slots" or "object files", where each slot captures a distinct object. Current state-of-the-art object-centric models have shown…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Aniket Didolkar, Andrii Zadaianchuk, Rabiul Awal, Maximilian Seitzer, Efstratios Gavves, Aishwarya Agrawal

Object Pursuit: Building a Space of Objects via Discriminative Weight Generation

We propose a framework to continuously learn object-centric representations for visual learning and understanding. Existing object-centric representations either rely on supervisions that individualize objects in the scene, or perform…

Computer Vision and Pattern Recognition · Computer Science 2022-04-05 Chuanyu Pan, Yanchao Yang, Kaichun Mo, Yueqi Duan, Leonidas Guibas

Oh-A-DINO: Understanding and Enhancing Attribute-Level Information in Self-Supervised Object-Centric Representations

Object-centric understanding is fundamental to human vision and required for complex reasoning. Traditional methods define slot-based bottlenecks to learn object properties explicitly, while recent self-supervised vision models like DINO…

Computer Vision and Pattern Recognition · Computer Science 2025-10-03 Stefan Sylvius Wagner, Stefan Harmeling

Learning Object-Centric Representations Based on Slots in Real World Scenarios

A central goal in AI is to represent scenes as compositions of discrete objects, enabling fine-grained, controllable image and video generation. Yet leading diffusion models treat images holistically and rely on text conditioning, creating…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Adil Kaan Akan

Boosting Object Representation Learning via Motion and Object Continuity

Recent unsupervised multi-object detection models have shown impressive performance improvements, largely attributed to novel architectural inductive biases. Unfortunately, they may produce suboptimal object encodings for downstream tasks.…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Quentin Delfosse, Wolfgang Stammer, Thomas Rothenbacher, Dwarak Vittal, Kristian Kersting

Grounded Object Centric Learning

The extraction of modular object-centric representations for downstream tasks is an emerging area of research. Learning grounded representations of objects that are guaranteed to be stable and invariant promises robust performance across…

Machine Learning · Computer Science 2024-01-26 Avinash Kori, Francesco Locatello, Fabio De Sousa Ribeiro, Francesca Toni, Ben Glocker

MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning

Learning object-level, structured representations is widely regarded as a key to better generalization in vision and underpins the design of next-generation Pre-trained Vision Models (PVMs). Mainstream Object-Centric Learning (OCL) methods…

Computer Vision and Pattern Recognition · Computer Science 2025-10-09 Hongjia Liu, Rongzhen Zhao, Haohan Chen, Joni Pajarinen

Adaptive Slot Attention: Object Discovery with Dynamic Slot Number

Object-centric learning (OCL) extracts the representation of objects with slots, offering an exceptional blend of flexibility and interpretability for abstracting low-level perceptual features. A widely adopted method within OCL is slot…

Computer Vision and Pattern Recognition · Computer Science 2024-06-14 Ke Fan, Zechen Bai, Tianjun Xiao, Tong He, Max Horn, Yanwei Fu, Francesco Locatello, Zheng Zhang

Efficient Object-centric Representation Learning with Pre-trained Geometric Prior

This paper addresses key challenges in object-centric representation learning of video. While existing approaches struggle with complex scenes, we propose a novel weakly-supervised framework that emphasises geometric understanding and…

Computer Vision and Pattern Recognition · Computer Science 2024-12-18 Phúc H. Le Khac, Graham Healy, Alan F. Smeaton

Learning to Compose: Improving Object Centric Learning by Injecting Compositionality

Learning compositional representation is a key aspect of object-centric learning as it enables flexible systematic generalization and supports complex visual reasoning. However, most of the existing approaches rely on auto-encoding…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Whie Jung, Jaehoon Yoo, Sungjin Ahn, Seunghoon Hong

Contrastive Training of Complex-Valued Autoencoders for Object Discovery

Current state-of-the-art object-centric models use slots and attention-based routing for binding. However, this class of models has several conceptual limitations: the number of slots is hardwired; all slots have equal capacity; training…

Machine Learning · Computer Science 2023-11-10 Aleksandar Stanić, Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber

Learning Disentangled Representation in Object-Centric Models for Visual Dynamics Prediction via Transformers

Recent work has shown that object-centric representations can greatly improve the accuracy of learning dynamics while also bringing interpretability. In this work, we take this idea one step further and ask the following question: "can…

Computer Vision and Pattern Recognition · Computer Science 2024-07-04 Sanket Gandhi, Atul, Samanyu Mahajan, Vishal Sharma, Rushil Gupta, Arnab Kumar Mondal, Parag Singla

Object-centric Learning with Cyclic Walks between Parts and Whole

Learning object-centric representations from complex natural environments equips both humans and machines with the ability to reason from low-level perceptual features. To capture compositional entities of the scene, we proposed cyclic walks…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Ziyu Wang, Mike Zheng Shou, Mengmi Zhang

Learning Object-Centric Video Models by Contrasting Sets

Contrastive, self-supervised learning of object representations recently emerged as an attractive alternative to reconstruction-based training. Prior approaches focus on contrasting individual object representations (slots) against one…

Computer Vision and Pattern Recognition · Computer Science 2020-11-23 Sindy Löwe, Klaus Greff, Rico Jonschkowski, Alexey Dosovitskiy, Thomas Kipf

Generalization and Robustness Implications in Object-Centric Learning

The idea behind object-centric representation learning is that natural scenes can better be modeled as compositions of objects and their relations as opposed to distributed representations. This inductive bias can be injected into neural…

Machine Learning · Computer Science 2022-06-10 Andrea Dittadi, Samuele Papa, Michele De Vita, Bernhard Schölkopf, Ole Winther, Francesco Locatello

Multi-Part Object Representations via Graph Structures and Co-Part Discovery

Discovering object-centric representations from images can significantly enhance the robustness, sample efficiency and generalizability of vision models. Works on images with multi-part objects typically follow an implicit object…

Computer Vision and Pattern Recognition · Computer Science 2025-12-29 Alex Foo, Wynne Hsu, Mong Li Lee

Object-centric architectures enable efficient causal representation learning

Causal representation learning has shown a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches is the assumption that…

Machine Learning · Computer Science 2023-10-31 Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio

GLASS: Guided Latent Slot Diffusion for Object-Centric Learning

Object-centric learning aims to decompose an input image into a set of meaningful object files (slots). These latent object representations enable a variety of downstream tasks. Yet, object-centric learning struggles on real-world datasets,…

Computer Vision and Pattern Recognition · Computer Science 2025-06-10 Krishnakant Singh, Simone Schaub-Meyer, Stefan Roth