Related papers: Unsupervised Keypoint Learning for Guiding Class-C…

Stochastic Video Generation with a Learned Prior

Generating video frames that accurately predict future world states is challenging. Existing approaches either fail to capture the full distribution of outcomes, or yield blurry generations, or both. In this paper we introduce an…

Computer Vision and Pattern Recognition · Computer Science 2024-03-14 Remi Denton, Rob Fergus

Unsupervised Learning of Important Objects from First-Person Videos

A first-person camera, placed at a person's head, captures which objects are important to the camera wearer. Most prior methods for this task learn to detect such important objects from the manually labeled first-person data in a…

Computer Vision and Pattern Recognition · Computer Science 2017-08-03 Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi

Semi-supervised Keypoint Localization

Knowledge about the locations of keypoints of an object in an image can assist in fine-grained classification and identification tasks, particularly for the case of objects that exhibit large variations in poses that greatly influence their…

Computer Vision and Pattern Recognition · Computer Science 2021-01-21 Olga Moskvyak, Frederic Maire, Feras Dayoub, Mahsa Baktashmotlagh

Aligned Unsupervised Pretraining of Object Detectors with Self-training

The unsupervised pretraining of object detectors has recently become a key component of object detector training, as it leads to improved performance and faster convergence during the supervised fine-tuning stage. Existing unsupervised…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 Ioannis Maniadis Metaxas, Adrian Bulat, Ioannis Patras, Brais Martinez, Georgios Tzimiropoulos

Self-supervised Video Representation Learning by Context and Motion Decoupling

A key challenge in self-supervised video representation learning is how to effectively capture motion information besides context bias. While most existing works implicitly achieve this with video-specific pretext tasks (e.g., predicting…

Computer Vision and Pattern Recognition · Computer Science 2021-04-05 Lianghua Huang, Yu Liu, Bin Wang, Pan Pan, Yinghui Xu, Rong Jin

A Self-supervised Learning System for Object Detection in Videos Using Random Walks on Graphs

This paper presents a new self-supervised system for learning to detect novel and previously unseen categories of objects in images. The proposed system receives as input several unlabeled videos of scenes containing various objects. The…

Computer Vision and Pattern Recognition · Computer Science 2021-08-25 Juntao Tan, Changkyu Song, Abdeslam Boularias

Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning

Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, it has been challenging to develop and apply such models to practical robotic manipulation due to high-dimensional sensory…

Robotics · Computer Science 2020-09-14 Lucas Manuelli, Yunzhu Li, Pete Florence, Russ Tedrake

Unsupervised Learning of Object Landmarks through Conditional Image Generation

We propose a method for learning landmark detectors for visual objects (such as the eyes and the nose in a face) without any manual supervision. We cast this as the problem of generating images that combine the appearance of the object as…

Computer Vision and Pattern Recognition · Computer Science 2018-12-17 Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi

AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic Points

Current methods for trajectory prediction operate in a supervised manner, and therefore require vast quantities of corresponding ground truth data for training. In this paper, we present a novel, label-free algorithm, AutoTrajectory, for…

Computer Vision and Pattern Recognition · Computer Science 2020-07-14 Yuexin Ma, Xinge Zhu, Xinjing Cheng, Ruigang Yang, Jiming Liu, Dinesh Manocha

Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion

Motion, measured via optical flow, provides a powerful cue to discover and learn objects in images and videos. However, compared to using appearance, it has some blind spots, such as the fact that objects become invisible if they do not…

Computer Vision and Pattern Recognition · Computer Science 2022-10-17 Subhabrata Choudhury, Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht

Refining Pre-Trained Motion Models

Given the difficulty of manually annotating motion in video, the current best motion estimation methods are trained with synthetic data, and therefore struggle somewhat due to a train/test gap. Self-supervised methods hold the promise of…

Computer Vision and Pattern Recognition · Computer Science 2024-02-20 Xinglong Sun, Adam W. Harley, Leonidas J. Guibas

Unsupervised Monocular 3D Keypoint Discovery from Multi-View Diffusion Priors

This paper introduces KeyDiff3D, a framework for unsupervised monocular 3D keypoint estimation that accurately predicts 3D keypoints from a single image. While previous methods rely on manual annotations or calibrated multi-view images,…

Computer Vision and Pattern Recognition · Computer Science 2025-07-17 Subin Jeon, In Cho, Junyoung Hong, Seon Joo Kim

ViewNet: Unsupervised Viewpoint Estimation from Conditional Generation

Understanding the 3D world without supervision is currently a major challenge in computer vision as the annotations required to supervise deep networks for tasks in this domain are expensive to obtain on a large scale. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

Unsupervised Deep Learning by Neighbourhood Discovery

Deep convolutional neural networks (CNNs) have demonstrated remarkable success in computer vision through supervised learning of strong visual feature representations. However, training CNNs relies heavily on the availability of exhaustive…

Computer Vision and Pattern Recognition · Computer Science 2019-05-31 Jiabo Huang, Qi Dong, Shaogang Gong, Xiatian Zhu

Self-supervised Learning of Interpretable Keypoints from Unlabelled Videos

We propose KeypointGAN, a new method for recognizing the pose of objects from a single image; for learning, it uses only unlabelled videos and a weak empirical prior on the object poses. Video frames differ primarily in the pose of the…

Computer Vision and Pattern Recognition · Computer Science 2020-12-24 Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi

Disentangling Motion, Foreground and Background Features in Videos

This paper introduces an unsupervised framework to extract semantically rich features for video representation. Inspired by how the human visual system groups objects based on motion cues, we propose a deep convolutional neural network that…

Computer Vision and Pattern Recognition · Computer Science 2017-07-18 Xunyu Lin, Victor Campos, Xavier Giro-i-Nieto, Jordi Torres, Cristian Canton Ferrer

Unsupervised learning based object detection using Contrastive Learning

Training image-based object detectors presents formidable challenges, as it entails not only the complexities of object detection but also the added intricacies of precisely localizing objects within potentially diverse and noisy…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Chandan Kumar, Jansel Herrera-Gerena, John Just, Matthew Darr, Ali Jannesari

Spatio-Temporal Action Localization in a Weakly Supervised Setting

Enabling computational systems with the ability to localize actions in video-based content has manifold applications. Traditionally, such a problem is approached in a fully-supervised setting where video-clips with complete frame-by-frame…

Computer Vision and Pattern Recognition · Computer Science 2019-05-07 Kurt Degiorgio, Fabio Cuzzolin

Track, Check, Repeat: An EM Approach to Unsupervised Tracking

We propose an unsupervised method for detecting and tracking moving objects in 3D, in unlabelled RGB-D videos. The method begins with classic handcrafted techniques for segmenting objects using motion cues: we estimate optical flow and…

Computer Vision and Pattern Recognition · Computer Science 2021-04-09 Adam W. Harley, Yiming Zuo, Jing Wen, Ayush Mangal, Shubhankar Potdar, Ritwick Chaudhry, Katerina Fragkiadaki

Self-supervised object detection from audio-visual correspondence

We tackle the problem of learning object detectors without supervision. Unlike weakly-supervised object detection, we do not assume image-level class labels. Instead, we extract a supervisory signal from audio-visual data, using…

Computer Vision and Pattern Recognition · Computer Science 2022-07-12 Triantafyllos Afouras, Yuki M. Asano, Francois Fagan, Andrea Vedaldi, Florian Metze