Related papers: Video Representation Learning by Recognizing Tempo…

200 papers

In this paper, we present an approach for learning a visual representation from the raw spatiotemporal signals in videos. Our representation is learned without supervision from semantic labels. We formulate our method as an unsupervised…

Computer Vision and Pattern Recognition · Computer Science 2016-07-27 Ishan Misra, C. Lawrence Zitnick, Martial Hebert

We propose a self-supervised visual learning method by predicting the variable playback speeds of a video. Without semantic labels, we learn the spatio-temporal visual representation of the video by leveraging the variations in the visual…

Computer Vision and Pattern Recognition · Computer Science 2021-06-02 Hyeon Cho, Taehoon Kim, Hyung Jin Chang, Wonjun Hwang

Self-supervised tasks have been utilized to build useful representations that can be used in downstream tasks when the annotation is unavailable. In this paper, we introduce a self-supervised video representation learning method based on…

Computer Vision and Pattern Recognition · Computer Science 2021-02-23 Duc Quang Vu, Ngan T. H. Le, Jia-Ching Wang

We address the problem of video representation learning without human-annotated labels. While previous efforts address the problem by designing novel self-supervised tasks using video data, the learned features are merely on a…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Yunhui Liu, Wei Liu

We propose a self-supervised approach for learning representations and robotic behaviors entirely from unlabeled videos recorded from multiple viewpoints, and study how this representation can be used in two robotic imitation settings:…

Computer Vision and Pattern Recognition · Computer Science 2018-03-21 Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine

In this paper, a novel video classification method is presented that aims to recognize different categories of third-person videos efficiently. Our motivation is to achieve a light model that could be trained with insufficient training…

Computer Vision and Pattern Recognition · Computer Science 2022-03-29 Ali Javidani, Ahmad Mahmoudi-Aznaveh

We introduce a novel self-supervised contrastive learning method to learn representations from unlabelled videos. Existing approaches ignore the specifics of input distortions, e.g., by learning invariance to temporal transformations.…

Computer Vision and Pattern Recognition · Computer Science 2021-12-08 Simon Jenni, Hailin Jin

The success of deep neural networks generally requires a vast amount of training data to be labeled, which is expensive and unfeasible in scale, especially for video collections. To alleviate this problem, in this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2019-04-05 Longlong Jing, Xiaodong Yang, Jingen Liu, Yingli Tian

Recent advances in deep learning have achieved promising performance for medical image analysis, while in most cases ground-truth annotations from human experts are necessary to train the deep model. In practice, such annotations are…

Computer Vision and Pattern Recognition · Computer Science 2020-03-03 Jianbo Jiao, Richard Droste, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble

Recent single image unsupervised representation learning techniques show remarkable success on a variety of tasks. The basic principle in these works is instance discrimination: learning to differentiate between two augmented versions of…

Computer Vision and Pattern Recognition · Computer Science 2020-05-08 Daniel Gordon, Kiana Ehsani, Dieter Fox, Ali Farhadi

This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired by the human visual system, we explore whether low-level motion-based grouping cues can be used to learn an effective visual representation.…

Computer Vision and Pattern Recognition · Computer Science 2017-04-13 Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan

Video motion magnification techniques allow us to see small motions previously invisible to the naked eyes, such as those of vibrating airplane wings, or swaying buildings under the influence of the wind. Because the motion is small, the…

Computer Vision and Pattern Recognition · Computer Science 2019-02-18 Tae-Hyun Oh, Ronnachai Jaroensri, Changil Kim, Mohamed Elgharib, Frédo Durand, William T. Freeman, Wojciech Matusik

Video representation learning has seen tremendous progress in recent years. This has been driven by many factors, including the scale of training and the success of visual models trained contrastively with language. While these factors have…

Computer Vision and Pattern Recognition · Computer Science 2026-05-25 Mantas Skackauskas, Xinyue Hao, Laura Sevilla-Lara

Action recognition in videos has attracted a lot of attention in the past decade. In order to learn robust models, previous methods usually assume videos are trimmed as short sequences and require ground-truth annotations of each video…

Computer Vision and Pattern Recognition · Computer Science 2019-02-21 Xiao-Yu Zhang, Haichao Shi, Changsheng Li, Kai Zheng, Xiaobin Zhu, Lixin Duan

The recent success in human action recognition with deep learning methods mostly adopts the supervised learning paradigm, which requires a significant amount of manually labeled data to achieve good performance. However, label collection is an…

Computer Vision and Pattern Recognition · Computer Science 2018-09-07 Junnan Li, Yongkang Wong, Qi Zhao, Mohan S. Kankanhalli

Unsupervised representation learning aims at finding methods that learn representations from data without annotation-based signals. Abstaining from annotations not only leads to economic benefits but may - and to some extent already does -…

Computer Vision and Pattern Recognition · Computer Science 2023-12-04 Bonifaz Stuhr

The objective of this paper is self-supervised learning of spatio-temporal embeddings from video, suitable for human action recognition. We make three contributions: First, we introduce the Dense Predictive Coding (DPC) framework for…

Computer Vision and Pattern Recognition · Computer Science 2019-09-30 Tengda Han, Weidi Xie, Andrew Zisserman

We show that useful video representations can be learned from synthetic videos and natural images, without incorporating natural videos in the training. We propose a progression of video datasets synthesized by simple generative processes,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-20 Xueyang Yu, Xinlei Chen, Yossi Gandelsman

We propose a self-supervised method for learning motion-focused video representations. Existing approaches minimize distances between temporally augmented videos, which maintain high spatial similarity. We instead propose to learn…

Computer Vision and Pattern Recognition · Computer Science 2023-09-29 Fida Mohammad Thoker, Hazel Doughty, Cees Snoek

Well structured visual representations can make robot learning faster and can improve generalization. In this paper, we study how we can acquire effective object-centric representations for robotic manipulation tasks without human labeling…

Robotics · Computer Science 2018-11-20 Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine