Related papers: Unsupervised Keypoint Learning for Guiding Class-C…

200 papers

Human actions are comprised of a sequence of poses. This makes videos of humans a rich and dense source of human poses. We propose an unsupervised method to learn pose features from videos that exploits a signal which is complementary to…

Computer Vision and Pattern Recognition · Computer Science 2016-09-20 Senthil Purushwalkam, Abhinav Gupta

This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot. A…

Computer Vision and Pattern Recognition · Computer Science 2023-04-11 Shiyang Lu, Yunfu Deng, Abdeslam Boularias, Kostas Bekris

The ability to predict future outcomes conditioned on observed video frames is crucial for intelligent decision-making in autonomous systems. Recently, deep recurrent architectures have been applied to the task of video prediction. However,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-13 Malte Mosbach, Sven Behnke

In this work we propose a simple unsupervised approach for next frame prediction in video. Instead of directly predicting the pixels in a frame given past frames, we predict the transformations needed for generating the next frame in a…

Machine Learning · Computer Science 2023-02-07 Joost van Amersfoort, Anitha Kannan, Marc'Aurelio Ranzato, Arthur Szlam, Du Tran, Soumith Chintala

The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a…

In this work, we study different approaches to self-supervised pretraining of object detection models. We first design a general framework to learn a spatially consistent dense representation from an image, by randomly sampling and…

Computer Vision and Pattern Recognition · Computer Science 2022-08-12 Trung Dang, Simon Kornblith, Huy Thong Nguyen, Peter Chin, Maryam Khademi

We propose a single-stage, category-level 6-DoF pose estimation algorithm that simultaneously detects and tracks instances of objects within a known category. Our method takes as input the previous and current frame from a monocular RGB…

Computer Vision and Pattern Recognition · Computer Science 2022-05-24 Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

Training deep neural networks to estimate the viewpoint of objects requires large labeled training datasets. However, manually labeling viewpoints is notoriously hard, error-prone, and time-consuming. On the other hand, it is relatively…

Computer Vision and Pattern Recognition · Computer Science 2020-04-07 Siva Karthik Mustikovela, Varun Jampani, Shalini De Mello, Sifei Liu, Umar Iqbal, Carsten Rother, Jan Kautz

Unsupervised multi-object segmentation has shown impressive results on images by utilizing powerful semantics learned from self-supervised pretraining. An additional modality such as depth or motion is often used to facilitate the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Görkay Aydemir, Weidi Xie, Fatma Güney

Learning generative object models from unlabelled videos is a long-standing problem and is required for causal scene modeling. We decompose this problem into three easier subtasks, and provide candidate solutions for each of them. Inspired by…

The problem of determining whether an object is in motion, irrespective of camera motion, is far from being solved. We address this challenging task by learning motion patterns in videos. The core of our approach is a fully convolutional…

Computer Vision and Pattern Recognition · Computer Science 2017-04-11 Pavel Tokmakov, Karteek Alahari, Cordelia Schmid

We introduce a framework that predicts the goals behind observable human action in video. Motivated by evidence in developmental psychology, we leverage video of unintentional action to learn video representations of goals without direct…

Computer Vision and Pattern Recognition · Computer Science 2020-12-17 Dave Epstein, Carl Vondrick

We present a deep generative model for learning to predict classes not seen at training time. Unlike most existing methods for this problem, that represent each class as a point (via a semantic embedding), we represent each seen/unseen…

Machine Learning · Computer Science 2017-11-21 Wenlin Wang, Yunchen Pu, Vinay Kumar Verma, Kai Fan, Yizhe Zhang, Changyou Chen, Piyush Rai, Lawrence Carin

Robust detection of moving vehicles is a critical task for any autonomously operating outdoor robot or self-driving vehicle. Most modern approaches for solving this task rely on training image-based detectors using large-scale vehicle…

Computer Vision and Pattern Recognition · Computer Science 2022-06-14 Jannik Zürn, Wolfram Burgard

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled…

Computer Vision and Pattern Recognition · Computer Science 2015-09-09 Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Robots have the capability to collect large amounts of data autonomously by interacting with objects in the world. However, it is often not obvious how to learn from autonomously collected data without human-labeled supervision.…

Robotics · Computer Science 2020-08-27 Coline Devin, Payam Rowghanian, Chris Vigorito, Will Richards, Khashayar Rohanimanesh

Predicting future scene representations is a crucial task for enabling robots to understand and interact with the environment. However, most existing methods rely on videos and simulations with precise action annotations, limiting their…

Computer Vision and Pattern Recognition · Computer Science 2025-05-22 Angel Villar-Corrales, Sven Behnke

Accurate 3D object detection in LiDAR point clouds is crucial for autonomous driving systems. To achieve state-of-the-art performance, the supervised training of detectors requires large amounts of human-annotated data, which is expensive…

Computer Vision and Pattern Recognition · Computer Science 2024-08-08 Christian Fruhwirth-Reisinger, Wei Lin, Dušan Malić, Horst Bischof, Horst Possegger

Learning an object detector or retrieval requires a large data set with manual annotations. Such data sets are expensive and time consuming to create and therefore difficult to obtain on a large scale. In this work, we propose to exploit…

Computer Vision and Pattern Recognition · Computer Science 2019-10-22 Elad Amrani, Rami Ben-Ari, Tal Hakim, Alex Bronstein

State-of-the-art approaches for 6D object pose estimation require large amounts of labeled data to train the deep networks. However, acquiring 6D object pose annotations in large quantities is tedious and labor-intensive. To…

Computer Vision and Pattern Recognition · Computer Science 2022-03-08 Meng Tian, Gim Hee Lee