Related papers: Evaluating Object-Centric Models beyond Object Dis…

Are We Done with Object-Centric Learning?

Object-centric learning (OCL) seeks to learn representations that only encode an object, isolated from other objects or background cues in a scene. This approach underpins various aims, including out-of-distribution (OOD) generalization,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-14 Alexander Rubinstein, Ameya Prabhu, Matthias Bethge, Seong Joon Oh

Vector-Quantized Vision Foundation Models for Object-Centric Learning

Object-Centric Learning (OCL) aggregates image or video feature maps into object-level feature vectors, termed slots. Its self-supervision of reconstructing the input from slots struggles with complex object textures, thus Vision…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Rongzhen Zhao, Vivienne Wang, Juho Kannala, Joni Pajarinen

Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models

Object-centric (OC) representations, which model visual scenes as compositions of discrete objects, have the potential to be used in various downstream tasks to achieve systematic compositional generalization and facilitate reasoning.…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Amir Mohammad Karimi Mamaghan, Samuele Papa, Karl Henrik Johansson, Stefan Bauer, Andrea Dittadi

Beyond Object Recognition: A New Benchmark towards Object Concept Learning

Understanding objects is a central building block of artificial intelligence, especially for embodied AI. Even though object recognition excels with deep learning, current machines still struggle to learn higher-level knowledge, e.g., what…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Yuan Yao, Siqi Liu, Cewu Lu

An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning

Unsupervised object-centric representation (OCR) learning has recently drawn attention as a new paradigm of visual representation. This is because of its potential of being an effective pre-training technique for various downstream tasks in…

Machine Learning · Computer Science 2024-02-27 Jaesik Yoon, Yi-Fu Wu, Heechul Bae, Sungjin Ahn

Zero-Shot Object-Centric Representation Learning

The goal of object-centric representation learning is to decompose visual scenes into a structured representation that isolates the entities. Recent successes have shown that object-centric representation learning can be scaled to…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Aniket Didolkar, Andrii Zadaianchuk, Anirudh Goyal, Mike Mozer, Yoshua Bengio, Georg Martius, Maximilian Seitzer

When Object-Centric World Models Meet Policy Learning: From Pixels to Policies, and Where It Breaks

Object-centric world models (OCWM) aim to decompose visual scenes into object-level representations, providing structured abstractions that could improve compositional generalization and data efficiency in reinforcement learning. We…

Artificial Intelligence · Computer Science 2025-11-12 Stefano Ferraro, Akihiro Nakano, Masahiro Suzuki, Yutaka Matsuo

Is an object-centric representation beneficial for robotic manipulation?

Object-centric representation (OCR) has recently become a subject of interest in the computer vision community for learning a structured representation of images and videos. It has been presented several times as a potential way to improve…

Artificial Intelligence · Computer Science 2025-06-25 Alexandre Chapin, Emmanuel Dellandrea, Liming Chen

Are Object-Centric Representations Better At Compositional Generalization?

Compositional generalization, the ability to reason about novel combinations of familiar concepts, is fundamental to human cognition and a critical challenge for machine learning. Object-centric (OC) representations, which encode a scene as…

Computer Vision and Pattern Recognition · Computer Science 2026-02-19 Ferdinand Kapl, Amir Mohammad Karimi Mamaghan, Maximilian Seitzer, Karl Henrik Johansson, Carsten Marr, Stefan Bauer, Andrea Dittadi

Evaluating Online Continual Learning with CALM

Online Continual Learning (OCL) studies learning over a continuous data stream without observing any single example more than once, a setting that is closer to the experience of humans and systems that must learn "on-the-wild". Yet,…

Computation and Language · Computer Science 2021-02-02 Germán Kruszewski, Ionut-Teodor Sorodoc, Tomas Mikolov

VinVL: Revisiting Visual Representations in Vision-Language Models

This paper presents a detailed study of improving visual representations for vision language (VL) tasks and develops an improved object detection model to provide object-centric representations of images. Compared to the most widely used…

Computer Vision and Pattern Recognition · Computer Science 2021-03-11 Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, Jianfeng Gao

Successes and Limitations of Object-centric Models at Compositional Generalisation

In recent years, it has been shown empirically that standard disentangled latent variable models do not support robust compositional learning in the visual domain. Indeed, in spite of being designed with the goal of factorising datasets…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Milton L. Montero, Jeffrey S. Bowers, Gaurav Malhotra

Top-Down Guidance for Learning Object-Centric Representations

Humans' innate ability to decompose scenes into objects allows for efficient understanding, predicting, and planning. In light of this, Object-Centric Learning (OCL) attempts to endow networks with similar capabilities, learning to…

Computer Vision and Pattern Recognition · Computer Science 2025-08-26 Junhong Zou, Xiangyu Zhu, Zhaoxiang Zhang, Zhen Lei

Bootstrapping Top-down Information for Self-modulating Slot Attention

Object-centric learning (OCL) aims to learn representations of individual objects within visual scenes without manual supervision, facilitating efficient and effective visual reasoning. Traditional OCL methods primarily employ bottom-up…

Computer Vision and Pattern Recognition · Computer Science 2024-11-11 Dongwon Kim, Seoyeon Kim, Suha Kwak

MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning

Learning object-level, structured representations is widely regarded as a key to better generalization in vision and underpins the design of next-generation Pre-trained Vision Models (PVMs). Mainstream Object-Centric Learning (OCL) methods…

Computer Vision and Pattern Recognition · Computer Science 2025-10-09 Hongjia Liu, Rongzhen Zhao, Haohan Chen, Joni Pajarinen

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Object-centric representation learning aims to decompose visual scenes into fixed-size vectors called "slots" or "object files", where each slot captures a distinct object. Current state-of-the-art object-centric models have shown…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Aniket Didolkar, Andrii Zadaianchuk, Rabiul Awal, Maximilian Seitzer, Efstratios Gavves, Aishwarya Agrawal

Language-Mediated, Object-Centric Representation Learning

We present Language-mediated, Object-centric Representation Learning (LORL), a paradigm for learning disentangled, object-centric scene representations from vision and language. LORL builds upon recent advances in unsupervised object…

Machine Learning · Computer Science 2021-06-09 Ruocheng Wang, Jiayuan Mao, Samuel J. Gershman, Jiajun Wu

Learning Disentangled Representation in Object-Centric Models for Visual Dynamics Prediction via Transformers

Recent work has shown that object-centric representations can greatly help improve the accuracy of learning dynamics while also bringing interpretability. In this work, we take this idea one step further and ask the following question: "can…

Computer Vision and Pattern Recognition · Computer Science 2024-07-04 Sanket Gandhi, Atul, Samanyu Mahajan, Vishal Sharma, Rushil Gupta, Arnab Kumar Mondal, Parag Singla

Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation

Vision-Language Models (VLMs) have gained widespread adoption in Open-Vocabulary (OV) object detection and segmentation tasks. Although they have shown promise on OV-related tasks, their effectiveness in conventional vision tasks has thus far…

Computer Vision and Pattern Recognition · Computer Science 2025-04-15 Yongchao Feng, Yajie Liu, Shuai Yang, Wenrui Cai, Jinqing Zhang, Qiqi Zhan, Ziyue Huang, Hongxi Yan, Qiao Wan, Chenguang Liu, Junzhe Wang, Jiahui Lv, Ziqi Liu, Tengyuan Shi, Qingjie Liu, Yunhong Wang

One-Class Classification: A Survey

One-Class Classification (OCC) is a special case of multi-class classification, where data observed during training is from a single positive class. The goal of OCC is to learn a representation and/or a classifier that enables recognition…

Computer Vision and Pattern Recognition · Computer Science 2021-01-11 Pramuditha Perera, Poojan Oza, Vishal M. Patel