Related papers: Learning Robust Video Synchronization without Anno…

Learning Blind Video Temporal Consistency

Applying image processing algorithms independently to each frame of a video often leads to undesired inconsistent results over time. Developing temporally consistent video-based extensions, however, requires domain knowledge for individual…

Computer Vision and Pattern Recognition · Computer Science 2018-08-02 Wei-Sheng Lai, Jia-Bin Huang, Oliver Wang, Eli Shechtman, Ersin Yumer, Ming-Hsuan Yang

Revisiting Temporal Alignment for Video Restoration

Long-range temporal alignment is critical yet challenging for video restoration tasks. Recently, some works attempt to divide the long-range alignment into several sub-alignments and handle them progressively. Although this operation is…

Computer Vision and Pattern Recognition · Computer Science 2021-12-02 Kun Zhou, Wenbo Li, Liying Lu, Xiaoguang Han, Jiangbo Lu

Learning by Aligning Videos in Time

We present a self-supervised approach for learning video representations using temporal video alignment as a pretext task, while exploiting both frame-level and video-level information. We leverage a novel combination of temporal alignment…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Sanjay Haresh, Sateesh Kumar, Huseyin Coskun, Shahram Najam Syed, Andrey Konin, Muhammad Zeeshan Zia, Quoc-Huy Tran

Video alignment using unsupervised learning of local and global features

In this paper, we tackle the problem of video alignment, the process of matching the frames of a pair of videos containing similar actions. The main challenge in video alignment is that accurate correspondence should be established despite…

Computer Vision and Pattern Recognition · Computer Science 2024-09-09 Niloufar Fakhfour, Mohammad ShahverdiKondori, Sajjad Hashembeiki, Mohammadjavad Norouzi, Hoda Mohammadzade

Video Representation Learning by Recognizing Temporal Transformations

We introduce a novel self-supervised learning approach to learn representations of videos that are responsive to changes in the motion dynamics. Our representations can be learned from data without human annotation and provide a substantial…

Computer Vision and Pattern Recognition · Computer Science 2020-07-22 Simon Jenni, Givi Meishvili, Paolo Favaro

Temporally stable video segmentation without video annotations

Temporally consistent dense video annotations are scarce and hard to collect. In contrast, image segmentation datasets (and pre-trained models) are ubiquitous, and easier to label for any novel task. In this paper, we introduce a method to…

Computer Vision and Pattern Recognition · Computer Science 2022-03-18 Aharon Azulay, Tavi Halperin, Orestis Vantzos, Nadav Borenstein, Ofir Bibi

Video Understanding: Through A Temporal Lens

This thesis explores the central question of how to leverage temporal relations among video elements to advance video understanding. Addressing the limitations of existing methods, the work presents a five-fold contribution: (1) an…

Computer Vision and Pattern Recognition · Computer Science 2026-04-06 Thong Thanh Nguyen

Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos

Sequential video understanding, as an emerging video understanding task, has driven lots of researchers' attention because of its goal-oriented nature. This paper studies weakly supervised sequential video understanding where the accurate…

Computer Vision and Pattern Recognition · Computer Science 2023-03-29 Sixun Dong, Huazhang Hu, Dongze Lian, Weixin Luo, Yicheng Qian, Shenghua Gao

Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets

Temporal video alignment aims to synchronize the key events like object interactions or action phase transitions in two videos. Such methods could benefit various video editing, processing, and understanding tasks. However, existing…

Computer Vision and Pattern Recognition · Computer Science 2024-09-04 Ishan Rajendrakumar Dave, Fabian Caba Heilbron, Mubarak Shah, Simon Jenni

Temporal Alignment Networks for Long-term Video

The objective of this paper is a temporal alignment network that ingests long term video sequences, and associated text sentences, in order to: (1) determine if a sentence is alignable with the video; and (2) if it is alignable, then…

Computer Vision and Pattern Recognition · Computer Science 2022-04-07 Tengda Han, Weidi Xie, Andrew Zisserman

Shuffle and Learn: Unsupervised Learning using Temporal Order Verification

In this paper, we present an approach for learning a visual representation from the raw spatiotemporal signals in videos. Our representation is learned without supervision from semantic labels. We formulate our method as an unsupervised…

Computer Vision and Pattern Recognition · Computer Science 2016-07-27 Ishan Misra, C. Lawrence Zitnick, Martial Hebert

Unsupervised Representation Learning by Sorting Sequences

We present an unsupervised representation learning approach using videos without semantic labels. We leverage the temporal coherence as a supervisory signal by formulating representation learning as a sequence sorting task. We take…

Computer Vision and Pattern Recognition · Computer Science 2017-08-04 Hsin-Ying Lee, Jia-Bin Huang, Maneesh Singh, Ming-Hsuan Yang

Learning Temporal Regularity in Video Sequences

Perceiving meaningful activities in a long video sequence is a challenging problem due to ambiguous definition of 'meaningfulness' as well as clutters in the scene. We approach this problem by learning a generative model for regular motion…

Computer Vision and Pattern Recognition · Computer Science 2016-04-18 Mahmudul Hasan, Jonghyun Choi, Jan Neumann, Amit K. Roy-Chowdhury, Larry S. Davis

Learning Implicit Temporal Alignment for Few-shot Video Classification

Few-shot video classification aims to learn new video categories with only a few labeled examples, alleviating the burden of costly annotation in real-world applications. However, it is particularly challenging to learn a class-invariant…

Computer Vision and Pattern Recognition · Computer Science 2021-05-12 Songyang Zhang, Jiale Zhou, Xuming He

Learning Knowledge-Rich Sequential Model for Planar Homography Estimation in Aerial Video

This paper presents an unsupervised approach that leverages raw aerial videos to learn to estimate planar homographic transformation between consecutive video frames. Previous learning-based estimators work on pairs of images to estimate…

Computer Vision and Pattern Recognition · Computer Science 2023-04-07 Pu Li, Xiaobai Liu

Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics

We address the problem of video representation learning without human-annotated labels. While previous efforts address the problem by designing novel self-supervised tasks using video data, the learned features are merely on a…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Yunhui Liu, Wei Liu

Unaligned Image-to-Sequence Transformation with Loop Consistency

We tackle the problem of modeling sequential visual phenomena. Given examples of a phenomena that can be divided into discrete time steps, we aim to take an input from any such time and realize this input at all other time steps in the…

Computer Vision and Pattern Recognition · Computer Science 2019-10-10 Siyang Wang, Justin Lazarow, Kwonjoon Lee, Zhuowen Tu

Learning Temporal Embeddings for Complex Video Analysis

In this paper, we propose to learn temporal embeddings of video frames for complex video analysis. Large quantities of unlabeled video data can be easily obtained from the Internet. These videos possess the implicit weak label that they are…

Computer Vision and Pattern Recognition · Computer Science 2015-05-05 Vignesh Ramanathan, Kevin Tang, Greg Mori, Li Fei-Fei

Weakly-Supervised Alignment of Video With Text

Suppose that we are given a set of videos, along with natural language descriptions in the form of multiple sentences (e.g., manual annotations, movie scripts, sport summaries etc.), and that these sentences appear in the same temporal…

Computer Vision and Pattern Recognition · Computer Science 2015-12-22 Piotr Bojanowski, Rémi Lajugie, Edouard Grave, Francis Bach, Ivan Laptev, Jean Ponce, Cordelia Schmid

Sequential Modeling Enables Scalable Learning for Large Vision Models

We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data. To do this, we define a common format, "visual sentences", in which we can represent raw images…

Computer Vision and Pattern Recognition · Computer Science 2023-12-04 Yutong Bai, Xinyang Geng, Karttikeya Mangalam, Amir Bar, Alan Yuille, Trevor Darrell, Jitendra Malik, Alexei A. Efros