Related papers: Unsupervised Keypoint Learning for Guiding Class-C…

Pose from Action: Unsupervised Learning of Pose Features based on Motion

Human actions are comprised of a sequence of poses. This makes videos of humans a rich and dense source of human poses. We propose an unsupervised method to learn pose features from videos that exploits a signal which is complementary to…

Computer Vision and Pattern Recognition · Computer Science 2016-09-20 Senthil Purushwalkam, Abhinav Gupta

Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos

This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot. A…

Computer Vision and Pattern Recognition · Computer Science 2023-04-11 Shiyang Lu, Yunfu Deng, Abdeslam Boularias, Kostas Bekris

Fourier-based Video Prediction through Relational Object Motion

The ability to predict future outcomes conditioned on observed video frames is crucial for intelligent decision-making in autonomous systems. Recently, deep recurrent architectures have been applied to the task of video prediction. However,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-13 Malte Mosbach, Sven Behnke

Transformation-Based Models of Video Sequences

In this work we propose a simple unsupervised approach for next frame prediction in video. Instead of directly predicting the pixels in a frame given past frames, we predict the transformations needed for generating the next frame in a…

Machine Learning · Computer Science 2023-02-07 Joost van Amersfoort, Anitha Kannan, Marc'Aurelio Ranzato, Arthur Szlam, Du Tran, Soumith Chintala

A Review on Deep Learning Techniques for Video Prediction

The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a…

Computer Vision and Pattern Recognition · Computer Science 2021-04-22 Sergiu Oprea, Pablo Martinez-Gonzalez, Alberto Garcia-Garcia, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Antonis Argyros

A Study on Self-Supervised Object Detection Pretraining

In this work, we study different approaches to self-supervised pretraining of object detection models. We first design a general framework to learn a spatially consistent dense representation from an image, by randomly sampling and…

Computer Vision and Pattern Recognition · Computer Science 2022-08-12 Trung Dang, Simon Kornblith, Huy Thong Nguyen, Peter Chin, Maryam Khademi

Keypoint-Based Category-Level Object Pose Tracking from an RGB Sequence with Uncertainty Estimation

We propose a single-stage, category-level 6-DoF pose estimation algorithm that simultaneously detects and tracks instances of objects within a known category. Our method takes as input the previous and current frame from a monocular RGB…

Computer Vision and Pattern Recognition · Computer Science 2022-05-24 Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

Self-Supervised Viewpoint Learning From Image Collections

Training deep neural networks to estimate the viewpoint of objects requires large labeled training datasets. However, manually labeling viewpoints is notoriously hard, error-prone, and time-consuming. On the other hand, it is relatively…

Computer Vision and Pattern Recognition · Computer Science 2020-04-07 Siva Karthik Mustikovela, Varun Jampani, Shalini De Mello, Sifei Liu, Umar Iqbal, Carsten Rother, Jan Kautz

Self-supervised Object-Centric Learning for Videos

Unsupervised multi-object segmentation has shown impressive results on images by utilizing powerful semantics learned from self-supervised pretraining. An additional modality such as depth or motion is often used to facilitate the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Görkay Aydemir, Weidi Xie, Fatma Güney

Unsupervised Object Learning via Common Fate

Learning generative object models from unlabelled videos is a long-standing problem and is required for causal scene modeling. We decompose this problem into three easier subtasks, and provide candidate solutions for each of them. Inspired by…

Computer Vision and Pattern Recognition · Computer Science 2023-05-16 Matthias Tangemann, Steffen Schneider, Julius von Kügelgen, Francesco Locatello, Peter Gehler, Thomas Brox, Matthias Kümmerer, Matthias Bethge, Bernhard Schölkopf

Learning Motion Patterns in Videos

The problem of determining whether an object is in motion, irrespective of camera motion, is far from being solved. We address this challenging task by learning motion patterns in videos. The core of our approach is a fully convolutional…

Computer Vision and Pattern Recognition · Computer Science 2017-04-11 Pavel Tokmakov, Karteek Alahari, Cordelia Schmid

Learning Goals from Failure

We introduce a framework that predicts the goals behind observable human action in video. Motivated by evidence in developmental psychology, we leverage video of unintentional action to learn video representations of goals without direct…

Computer Vision and Pattern Recognition · Computer Science 2020-12-17 Dave Epstein, Carl Vondrick

Zero-Shot Learning via Class-Conditioned Deep Generative Models

We present a deep generative model for learning to predict classes not seen at training time. Unlike most existing methods for this problem, which represent each class as a point (via a semantic embedding), we represent each seen/unseen…

Machine Learning · Computer Science 2017-11-21 Wenlin Wang, Yunchen Pu, Vinay Kumar Verma, Kai Fan, Yizhe Zhang, Changyou Chen, Piyush Rai, Lawrence Carin

Self-Supervised Moving Vehicle Detection from Audio-Visual Cues

Robust detection of moving vehicles is a critical task for any autonomously operating outdoor robot or self-driving vehicle. Most modern approaches for solving this task rely on training image-based detectors using large-scale vehicle…

Computer Vision and Pattern Recognition · Computer Science 2022-06-14 Jannik Zürn, Wolfram Burgard

Unsupervised Learning of Spatiotemporally Coherent Metrics

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled…

Computer Vision and Pattern Recognition · Computer Science 2015-09-09 Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Self-Supervised Goal-Conditioned Pick and Place

Robots have the capability to collect large amounts of data autonomously by interacting with objects in the world. However, it is often not obvious how to learn from autonomously collected data without human-labeled supervision.…

Robotics · Computer Science 2020-08-27 Coline Devin, Payam Rowghanian, Chris Vigorito, Will Richards, Khashayar Rohanimanesh

PlaySlot: Learning Inverse Latent Dynamics for Controllable Object-Centric Video Prediction and Planning

Predicting future scene representations is a crucial task for enabling robots to understand and interact with the environment. However, most existing methods rely on videos and simulations with precise action annotations, limiting their…

Computer Vision and Pattern Recognition · Computer Science 2025-05-22 Angel Villar-Corrales, Sven Behnke

Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection

Accurate 3D object detection in LiDAR point clouds is crucial for autonomous driving systems. To achieve state-of-the-art performance, the supervised training of detectors requires large amounts of human-annotated data, which is expensive…

Computer Vision and Pattern Recognition · Computer Science 2024-08-08 Christian Fruhwirth-Reisinger, Wei Lin, Dušan Malić, Horst Bischof, Horst Possegger

Learning to Detect and Retrieve Objects from Unlabeled Videos

Learning an object detector or retrieval model requires a large data set with manual annotations. Such data sets are expensive and time-consuming to create and therefore difficult to obtain on a large scale. In this work, we propose to exploit…

Computer Vision and Pattern Recognition · Computer Science 2019-10-22 Elad Amrani, Rami Ben-Ari, Tal Hakim, Alex Bronstein

Weakly Supervised Learning of Keypoints for 6D Object Pose Estimation

State-of-the-art approaches for 6D object pose estimation require large amounts of labeled data to train the deep networks. However, acquiring 6D object pose annotations in large quantities is tedious and labor-intensive. To…

Computer Vision and Pattern Recognition · Computer Science 2022-03-08 Meng Tian, Gim Hee Lee