Related papers: Unsupervised Keypoint Learning for Guiding Class-C…

Video (language) modeling: a baseline for generative models of natural videos

We propose a strong baseline model for unsupervised feature learning using video data. By learning to predict missing frames or extrapolate future frames from an input video sequence, the model discovers both spatial and temporal…

Machine Learning · Computer Science 2016-05-05 Marc'Aurelio Ranzato, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, Sumit Chopra

Efficient Unsupervised Video Object Segmentation Network Based on Motion Guidance

Performance constraints limit the large-scale application of unsupervised video object detection. To address this problem, we propose an alternative method. By…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Chao Hu, Liqiang Zhu

Learning to Predict Robot Keypoints Using Artificially Generated Images

This work considers robot keypoint estimation on color images as a supervised machine learning task. We propose the use of probabilistically created renderings to overcome the lack of labeled real images. Rather than sampling from…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Christoph Heindl, Sebastian Zambal, Josef Scharinger

Adversarial Framework for Unsupervised Learning of Motion Dynamics in Videos

Human behavior understanding in videos is a complex, still unsolved problem that requires accurately modeling motion at both the local (pixel-wise dense prediction) and global (aggregation of motion cues) levels. Current approaches based on…

Computer Vision and Pattern Recognition · Computer Science 2019-09-19 C. Spampinato, S. Palazzo, P. D'Oro, D. Giordano, M. Shah

Shape and Viewpoint without Keypoints

We present a learning framework that learns to recover the 3D shape, pose and texture from a single image, trained on an image collection without any ground truth 3D shape, multi-view, camera viewpoints or keypoint supervision. We approach…

Computer Vision and Pattern Recognition · Computer Science 2020-07-22 Shubham Goel, Angjoo Kanazawa, Jitendra Malik

Object-Centric Representation Learning from Unlabeled Videos

Supervised (pre-)training currently yields state-of-the-art performance for representation learning for visual recognition, yet it comes at the cost of (1) intensive manual annotations and (2) an inherent restriction in the scope of data…

Computer Vision and Pattern Recognition · Computer Science 2016-12-05 Ruohan Gao, Dinesh Jayaraman, Kristen Grauman

Learning Velocity and Acceleration: Self-Supervised Motion Consistency for Pedestrian Trajectory Prediction

Understanding human motion is crucial for accurate pedestrian trajectory prediction. Conventional methods typically rely on supervised learning, where ground-truth labels are directly optimized against predicted trajectories. This amplifies…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Yizhou Huang, Yihua Cheng, Kezhi Wang

Learning to Predict Gradients for Semi-Supervised Continual Learning

A key challenge for machine intelligence is to learn new visual concepts without forgetting the previously acquired knowledge. Continual learning is aimed towards addressing this challenge. However, there is a gap between existing…

Machine Learning · Computer Science 2024-02-01 Yan Luo, Yongkang Wong, Mohan Kankanhalli, Qi Zhao

Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction

The success of deep neural networks generally requires a vast amount of training data to be labeled, which is expensive and unfeasible in scale, especially for video collections. To alleviate this problem, in this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2019-04-05 Longlong Jing, Xiaodong Yang, Jingen Liu, Yingli Tian

Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining

Deep visuomotor policy learning, which aims to map raw visual observation to action, achieves promising results in control tasks such as robotic manipulation and autonomous driving. However, it requires a huge number of online interactions…

Computer Vision and Pattern Recognition · Computer Science 2022-07-19 Qihang Zhang, Zhenghao Peng, Bolei Zhou

Video Prediction via Example Guidance

In video prediction tasks, one major challenge is to capture the multi-modal nature of future contents and dynamics. In this work, we propose a simple yet effective framework that can efficiently predict plausible future states. The key…

Computer Vision and Pattern Recognition · Computer Science 2020-07-06 Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell

Unsupervised Representation Learning by Predicting Random Distances

Deep neural networks have gained tremendous success in a broad range of machine learning tasks due to their remarkable capability to learn semantic-rich features from high-dimensional data. However, they often require large-scale labelled…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma

Unsupervised Learning from Continuous Video in a Scalable Predictive Recurrent Network

Understanding visual reality involves acquiring common-sense knowledge about countless regularities in the visual world, e.g., how illumination alters the appearance of objects in a scene, and how motion changes their apparent spatial…

Computer Vision and Pattern Recognition · Computer Science 2016-10-03 Filip Piekniewski, Patryk Laurent, Csaba Petre, Micah Richert, Dimitry Fisher, Todd Hylton

Objects do not disappear: Video object detection by single-frame object location anticipation

Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by…

Computer Vision and Pattern Recognition · Computer Science 2023-08-10 Xin Liu, Fatemeh Karimi Nejadasl, Jan C. van Gemert, Olaf Booij, Silvia L. Pintea

Shuffle and Learn: Unsupervised Learning using Temporal Order Verification

In this paper, we present an approach for learning a visual representation from the raw spatiotemporal signals in videos. Our representation is learned without supervision from semantic labels. We formulate our method as an unsupervised…

Computer Vision and Pattern Recognition · Computer Science 2016-07-27 Ishan Misra, C. Lawrence Zitnick, Martial Hebert

Self-supervised Learning of Motion Capture

Current state-of-the-art solutions for motion capture from a single camera are optimization driven: they optimize the parameters of a 3D human model so that its re-projection matches measurements in the video (e.g. person segmentation,…

Computer Vision and Pattern Recognition · Computer Science 2017-12-06 Hsiao-Yu Fish Tung, Hsiao-Wei Tung, Ersin Yumer, Katerina Fragkiadaki

Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos

Wearable cameras stand out as one of the most promising devices for the upcoming years, and as a consequence, the demand for computer algorithms to automatically understand the videos recorded with them is increasing quickly. An automatic…

Computer Vision and Pattern Recognition · Computer Science 2017-03-29 Alejandro Betancourt, Natalia Díaz-Rodríguez, Emilia Barakova, Lucio Marcenaro, Matthias Rauterberg, Carlo Regazzoni

Unsupervised Keypoints from Pretrained Diffusion Models

Unsupervised learning of keypoints and landmarks has seen significant progress with the help of modern neural network architectures, but performance is yet to match the supervised counterpart, making their practicability questionable. We…

Computer Vision and Pattern Recognition · Computer Science 2024-05-24 Eric Hedlin, Gopal Sharma, Shweta Mahajan, Xingzhe He, Hossam Isack, Abhishek Kar, Helge Rhodin, Andrea Tagliasacchi, Kwang Moo Yi

Keypoint Abstraction using Large Models for Object-Relative Imitation Learning

Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics. Keypoint-based representations have been proven effective as a succinct representation for capturing…

Robotics · Computer Science 2024-10-31 Xiaolin Fang, Bo-Ruei Huang, Jiayuan Mao, Jasmine Shone, Joshua B. Tenenbaum, Tomás Lozano-Pérez, Leslie Pack Kaelbling

MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation

Unsupervised domain adaptation (UDA) has been a potent technique to handle the lack of annotations in the target domain, particularly in the semantic segmentation task. This study introduces a different UDA scenario where the target domain…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Fei Pan, Xu Yin, Seokju Lee, Axi Niu, Sungeui Yoon, In So Kweon