Related papers: Unsupervised Keypoint Learning for Guiding Class-C…

Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos

Category-level 3D pose estimation is a fundamentally important problem in computer vision and robotics, e.g. for embodied agents or to train 3D generative models. However, so far methods that estimate the category-level object pose require…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-08 · Leonhard Sommer, Artur Jesslen, Eddy Ilg, Adam Kortylewski

Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints

Visual scenes are extremely diverse, not only because there are infinite possible combinations of objects and backgrounds but also because the observations of the same scene may vary greatly with the change of viewpoints. When observing a…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-05 · Jinyang Yuan, Tonglin Chen, Zhimeng Shen, Bin Li, Xiangyang Xue

Object-Centric Learning for Real-World Videos by Predicting Temporal Feature Similarities

Unsupervised video-based object-centric learning is a promising avenue to learn structured representations from large, unlabeled video collections, but previous approaches have only managed to scale to real-world datasets in restricted…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-18 · Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius

Unsupervised Feature Learning from Temporal Data

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled…

Computer Vision and Pattern Recognition · Computer Science · 2015-04-17 · Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Conditional Object-Centric Learning from Video

Object-centric representations are a promising path toward more systematic generalization by providing flexible abstractions upon which compositional world models can be built. Recent work on simple 2D and 3D datasets has shown that models…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-16 · Thomas Kipf, Gamaleldin F. Elsayed, Aravindh Mahendran, Austin Stone, Sara Sabour, Georg Heigold, Rico Jonschkowski, Alexey Dosovitskiy, Klaus Greff

Unsupervised Object Discovery and Tracking in Video Collections

This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision. We formulate the problem as a combination of two complementary…

Computer Vision and Pattern Recognition · Computer Science · 2015-05-15 · Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid

Video alignment using unsupervised learning of local and global features

In this paper, we tackle the problem of video alignment, the process of matching the frames of a pair of videos containing similar actions. The main challenge in video alignment is that accurate correspondence should be established despite…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-09 · Niloufar Fakhfour, Mohammad ShahverdiKondori, Sajjad Hashembeiki, Mohammadjavad Norouzi, Hoda Mohammadzade

Cross Pixel Optical Flow Similarity for Self-Supervised Learning

We propose a novel method for learning convolutional neural image representations without manual supervision. We use motion cues in the form of optical flow, to supervise representations of static images. The obvious approach of training a…

Computer Vision and Pattern Recognition · Computer Science · 2018-07-17 · Aravindh Mahendran, James Thewlis, Andrea Vedaldi

Semi-supervised Viewpoint Estimation with Geometry-aware Conditional Generation

There is a growing interest in developing computer vision methods that can learn from limited supervision. In this paper, we consider the problem of learning to predict camera viewpoints, where obtaining ground-truth annotations are…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-05 · Octave Mariotti, Hakan Bilen

Unsupervised Learning of Visual 3D Keypoints for Control

Learning sensorimotor control policies from high-dimensional images crucially relies on the quality of the underlying visual representations. Prior works show that structured latent space such as visual keypoints often outperforms…

Machine Learning · Computer Science · 2021-06-15 · Boyuan Chen, Pieter Abbeel, Deepak Pathak

Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos

Despite the recent advances in video classification, progress in spatio-temporal action recognition has lagged behind. A major contributing factor has been the prohibitive cost of annotating videos frame-by-frame. In this paper, we present…

Computer Vision and Pattern Recognition · Computer Science · 2020-07-22 · Anurag Arnab, Chen Sun, Arsha Nagrani, Cordelia Schmid

Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints

Visual scenes are extremely rich in diversity, not only because there are infinite combinations of objects and background, but also because the observations of the same scene may vary greatly with the change of viewpoints. When observing a…

Computer Vision and Pattern Recognition · Computer Science · 2021-12-14 · Jinyang Yuan, Bin Li, Xiangyang Xue

Self-Supervised Keypoint Discovery in Behavioral Videos

We propose a method for learning the posture and structure of agents from unlabelled behavioral videos. Starting from the observation that behaving agents are generally the main sources of movement in behavioral videos, our method,…

Computer Vision and Pattern Recognition · Computer Science · 2022-04-28 · Jennifer J. Sun, Serim Ryou, Roni Goldshmid, Brandon Weissbourd, John Dabiri, David J. Anderson, Ann Kennedy, Yisong Yue, Pietro Perona

Learning To Segment Dominant Object Motion From Watching Videos

Existing deep learning based unsupervised video object segmentation methods still rely on ground-truth segmentation masks to train. Unsupervised in this context only means that no annotated frames are used during inference. As obtaining…

Computer Vision and Pattern Recognition · Computer Science · 2021-11-30 · Sahir Shrestha, Mohammad Ali Armin, Hongdong Li, Nick Barnes

SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM

Event-based keypoint detection and matching holds significant potential, enabling the integration of event sensors into highly optimized Visual SLAM systems developed for frame cameras over decades of research. Unfortunately, existing…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-01 · Yannick Burkhardt, Simon Schaefer, Stefan Leutenegger

Learning image representations tied to ego-motion

Understanding how images of objects and scenes behave in response to specific ego-motions is a crucial aspect of proper visual development, yet existing visual learning methods are conspicuously disconnected from the physical source of…

Computer Vision and Pattern Recognition · Computer Science · 2016-03-30 · Dinesh Jayaraman, Kristen Grauman

Entropy-driven Unsupervised Keypoint Representation Learning in Videos

Extracting informative representations from videos is fundamental for effectively learning various downstream tasks. We present a novel approach for unsupervised learning of meaningful representations from videos, leveraging the concept of…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-07 · Ali Younes, Simone Schaub-Meyer, Georgia Chalvatzaki

Unsupervised learning of object landmarks by factorized spatial embeddings

Learning automatically the structure of object categories remains an important open problem in computer vision. In this paper, we propose a novel unsupervised approach that can discover and learn landmarks in object categories, thus…

Computer Vision and Pattern Recognition · Computer Science · 2017-08-08 · James Thewlis, Hakan Bilen, Andrea Vedaldi

Using a Supervised Method without supervision for foreground segmentation

Neural networks are a powerful framework for foreground segmentation in video acquired by static cameras, segmenting moving objects from the background in a robust way in various challenging scenarios. The premier methods are those based on…

Computer Vision and Pattern Recognition · Computer Science · 2021-06-22 · Levi Kassel, Michael Werman

Evolving Losses for Unsupervised Video Representation Learning

We present a new method to learn video representations from large-scale unlabeled video data. Ideally, this representation will be generic and transferable, directly usable for new tasks such as action recognition and zero or few-shot…

Computer Vision and Pattern Recognition · Computer Science · 2020-02-28 · AJ Piergiovanni, Anelia Angelova, Michael S. Ryoo