Related papers: A Study on Self-Supervised Object Detection Pretra…
Prior research on self-supervised learning has led to considerable progress on image classification, but often with degraded transfer performance on object detection. The objective of this paper is to advance self-supervised pretrained…
Representation learning approaches typically rely on images of objects captured from a single perspective that are transformed using affine transformations. Additionally, self-supervised learning, a successful paradigm of representation…
Despite their irresistible success, deep learning algorithms still heavily rely on annotated data. On the other hand, unsupervised settings pose many challenges, especially about determining the right inductive bias in diverse scenarios.…
It is known that representations from self-supervised pre-training can perform on par, and often better, on various downstream tasks than representations from fully-supervised pre-training. This has been shown in a host of settings such as…
Progress in self-supervised learning has brought strong general image representation learning methods. Yet so far, it has mostly focused on image-level learning. In turn, tasks such as unsupervised image segmentation have not benefited from…
We study how to leverage Web images to augment human-curated object detection datasets. Our approach is two-pronged. On the one hand, we retrieve Web images by image-to-image search, which incurs less domain shift from the curated data than…
We present a new self-supervised pre-training of Vision Transformers for dense prediction tasks. It is based on a contrastive loss across views that compares pixel-level representations to global image representations. This strategy…
Self-supervised learning (SSL) has emerged as a powerful technique for learning visual representations. While recent SSL approaches achieve strong results in global image understanding, they are limited in capturing the structured…
The unsupervised pretraining of object detectors has recently become a key component of object detector training, as it leads to improved performance and faster convergence during the supervised fine-tuning stage. Existing unsupervised…
We address the problem of inferring self-supervised dense semantic correspondences between objects in multi-object scenes. The method introduces learning of class-aware dense object descriptors by providing either unsupervised discrete…
In self-supervised learning, a model is trained to solve a pretext task, using a data set whose annotations are created by a machine. The objective is to transfer the trained weights to perform a downstream task in the target domain. We…
Self-supervision can dramatically cut back the amount of manually-labelled data required to train deep neural networks. While self-supervision has usually been considered for tasks such as image classification, in this paper we aim at…
In this paper, we are interested in understanding self-supervised pretraining through studying the capability that self-supervised representation pretraining methods learn part-aware representations. The study is mainly motivated by that…
While supervised object detection methods achieve impressive accuracy, they generalize poorly to images whose appearance significantly differs from the data they have been trained on. To address this in scenarios where annotating data is…
We propose a novel unsupervised object localization method that allows us to explain the predictions of the model by utilizing self-supervised pre-trained models without additional finetuning. Existing unsupervised and self-supervised…
Research in scene graph generation has quickly gained traction in the past few years because of its potential to help in downstream tasks like visual question answering, image captioning, etc. Many interesting approaches have been proposed…
In the fast-evolving field of artificial intelligence, where models are increasingly growing in complexity and size, the availability of labeled data for training deep learning models has become a significant challenge. Addressing complex…
Progress has been achieved recently in object detection given advancements in deep learning. Nevertheless, such tools typically require a large amount of training data and significant manual effort to label objects. This limits their…
We propose a framework to continuously learn object-centric representations for visual learning and understanding. Existing object-centric representations either rely on supervisions that individualize objects in the scene, or perform…
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. Such generality for transfer learning, however, sacrifices specificity if we are interested in a certain downstream…