Related papers: Unsupervised Keypoint Learning for Guiding Class-C…

Towards Keypoint Guided Self-Supervised Depth Estimation

This paper proposes to use keypoints as a self-supervision clue for learning depth map estimation from a collection of input images. As ground truth depth from real images is difficult to obtain, there are many unsupervised and…

Computer Vision and Pattern Recognition · Computer Science 2020-11-09 Kristijan Bartol, David Bojanic, Tomislav Petkovic, Tomislav Pribanic, Yago Diez Donoso

Animating Arbitrary Objects via Deep Motion Transfer

This paper introduces a novel deep learning framework for image animation. Given an input image with a target object and a driving video sequence depicting a moving object, our framework generates a video in which the target object is…

Graphics · Computer Science 2019-09-04 Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

Towards Object Detection from Motion

We present a novel approach to weakly supervised object detection. Instead of annotated images, our method only requires two short videos to learn to detect a new object: 1) a video of a moving object and 2) one or more "negative" videos of…

Computer Vision and Pattern Recognition · Computer Science 2019-10-01 Rico Jonschkowski, Austin Stone

Unsupervised Bi-directional Flow-based Video Generation from one Snapshot

Imagining multiple consecutive frames given one single snapshot is challenging, since it is difficult to simultaneously predict diverse motions from a single image and faithfully generate novel frames without visual distortions. In this…

Computer Vision and Pattern Recognition · Computer Science 2019-03-05 Lu Sheng, Junting Pan, Jiaming Guo, Jing Shao, Xiaogang Wang, Chen Change Loy

Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction

The emerging field of action prediction plays a vital role in various computer vision applications such as autonomous driving, activity analysis and human-computer interaction. Despite significant advancements, accurately predicting future…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Izzeddin Teeti, Rongali Sai Bhargav, Vivek Singh, Andrew Bradley, Biplab Banerjee, Fabio Cuzzolin

Symbolic Pregression: Discovering Physical Laws from Distorted Video

We present a method for unsupervised learning of equations of motion for objects in raw and optionally distorted unlabeled video. We first train an autoencoder that maps each video frame into a low-dimensional latent space where the laws of…

Computer Vision and Pattern Recognition · Computer Science 2021-04-28 Silviu-Marian Udrescu, Max Tegmark

Unsupervised Representation Learning by Sorting Sequences

We present an unsupervised representation learning approach using videos without semantic labels. We leverage the temporal coherence as a supervisory signal by formulating representation learning as a sequence sorting task. We take…

Computer Vision and Pattern Recognition · Computer Science 2017-08-04 Hsin-Ying Lee, Jia-Bin Huang, Maneesh Singh, Ming-Hsuan Yang

Learning and Predicting Multimodal Vehicle Action Distributions in a Unified Probabilistic Model Without Labels

We present a unified probabilistic model that learns a representative set of discrete vehicle actions and predicts the probability of each action given a particular scenario. Our model also enables us to estimate the distribution over…

Robotics · Computer Science 2022-12-15 Charles Richter, Patrick R. Barragán, Sertac Karaman

Unsupervised learning from videos using temporal coherency deep networks

In this work we address the challenging problem of unsupervised learning from videos. Existing methods utilize the spatio-temporal continuity in contiguous video frames as regularization for the learning process. Typically, this temporal…

Computer Vision and Pattern Recognition · Computer Science 2018-10-12 Carolina Redondo-Cabrera, Roberto J. López-Sastre

STEPs: Self-Supervised Key Step Extraction and Localization from Unlabeled Procedural Videos

We address the problem of extracting key steps from unlabeled procedural videos, motivated by the potential of Augmented Reality (AR) headsets to revolutionize job training and performance. We decompose the problem into two steps:…

Computer Vision and Pattern Recognition · Computer Science 2023-09-12 Anshul Shah, Benjamin Lundell, Harpreet Sawhney, Rama Chellappa

Self-Supervised Learning of Audio-Visual Objects from Video

Our objective is to transform a video into a set of discrete audio-visual objects using self-supervised learning. To this end, we introduce a model that uses attention to localize and group sound sources, and optical flow to aggregate…

Computer Vision and Pattern Recognition · Computer Science 2020-08-11 Triantafyllos Afouras, Andrew Owens, Joon Son Chung, Andrew Zisserman

Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras

We present a novel method for simultaneous learning of depth, egomotion, object motion, and camera intrinsics from monocular videos, using only consistency across neighboring video frames as supervision signal. Similarly to prior work, our…

Computer Vision and Pattern Recognition · Computer Science 2019-10-31 Ariel Gordon, Hanhan Li, Rico Jonschkowski, Anelia Angelova

Unsupervised Learning of Edges

Data-driven approaches for edge detection have proven effective and achieve top results on modern benchmarks. However, all current data-driven edge detectors require manual supervision for training in the form of hand-labeled region…

Computer Vision and Pattern Recognition · Computer Science 2016-04-12 Yin Li, Manohar Paluri, James M. Rehg, Piotr Dollár

Semi-Weakly Supervised Object Kinematic Motion Prediction

Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters. Due to the large variations in both topological structure and geometric details of 3D objects, this remains a…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu

Using Motion Cues to Supervise Single-Frame Body Pose and Shape Estimation in Low Data Regimes

When enough annotated training data is available, supervised deep-learning algorithms excel at estimating human body pose and shape using a single camera. The effects of too little such data being available can be mitigated by using other…

Computer Vision and Pattern Recognition · Computer Science 2024-02-06 Andrey Davydov, Alexey Sidnev, Artsiom Sanakoyeu, Yuhua Chen, Mathieu Salzmann, Pascal Fua

Unsupervised learning of object frames by dense equivariant image labelling

One of the key challenges of visual perception is to extract abstract models of 3D objects and object categories from visual measurements, which are affected by complex nuisance factors such as viewpoint, occlusion, motion, and…

Computer Vision and Pattern Recognition · Computer Science 2017-11-21 James Thewlis, Hakan Bilen, Andrea Vedaldi

Predicting Long-horizon Futures by Conditioning on Geometry and Time

Our work explores the task of generating future sensor observations conditioned on the past. We are motivated by 'predictive coding' concepts from neuroscience as well as robotic applications such as self-driving vehicles. Predictive video…

Computer Vision and Pattern Recognition · Computer Science 2024-04-18 Tarasha Khurana, Deva Ramanan

Self-supervised classification of dynamic obstacles using the temporal information provided by videos

Nowadays, autonomous driving systems can detect, segment, and classify the surrounding obstacles using a monocular camera. However, state-of-the-art methods solving these tasks generally perform a fully supervised learning process and…

Computer Vision and Pattern Recognition · Computer Science 2020-06-09 Sid Ali Hamideche, Florent Chiaroni, Mohamed-Cherif Rahal

Self-Supervised Visual Learning by Variable Playback Speeds Prediction of a Video

We propose a self-supervised visual learning method by predicting the variable playback speeds of a video. Without semantic labels, we learn the spatio-temporal visual representation of the video by leveraging the variations in the visual…

Computer Vision and Pattern Recognition · Computer Science 2021-06-02 Hyeon Cho, Taehoon Kim, Hyung Jin Chang, Wonjun Hwang

Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images

We address the problem of localisation of objects as bounding boxes in images and videos with weak labels. This weakly supervised object localisation problem has been tackled in the past using discriminative models where each object class…

Computer Vision and Pattern Recognition · Computer Science 2017-06-20 Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang