Related papers: Learning to encode motion using spatio-temporal sy…

Collaborative Spatio-temporal Feature Learning for Video Action Recognition

Spatio-temporal feature learning is of central importance for action recognition in videos. Existing deep neural network models either learn spatial and temporal features independently (C2D) or jointly with unconstrained parameters (C3D).…

Computer Vision and Pattern Recognition · Computer Science 2019-03-05 Chao Li, Qiaoyong Zhong, Di Xie, Shiliang Pu

Self-supervised Motion Learning from Static Images

Motions are reflected in videos as the movement of pixels, and actions are essentially patterns of inconsistent motions between the foreground and the background. To well distinguish the actions, especially those with complicated…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Ziyuan Huang, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Rong Jin, Marcelo Ang

Modeling sequential data using higher-order relational features and predictive training

Bi-linear feature learning models, like the gated autoencoder, were proposed as a way to model relationships between frames in a video. By minimizing reconstruction error of one frame, given the previous frame, these models learn "mapping…

Machine Learning · Computer Science 2014-02-12 Vincent Michalski, Roland Memisevic, Kishore Konda

Learning Geo-Temporal Image Features

We propose to implicitly learn to extract geo-temporal image features, which are mid-level features related to when and where an image was captured, by explicitly optimizing for a set of location and time estimation tasks. To train our…

Computer Vision and Pattern Recognition · Computer Science 2019-09-18 Menghua Zhai, Tawfiq Salem, Connor Greenwell, Scott Workman, Robert Pless, Nathan Jacobs

Learning a Generative Motion Model from Image Sequences based on a Latent Motion Matrix

We propose to learn a probabilistic motion model from a sequence of images for spatio-temporal registration. Our model encodes motion in a low-dimensional probabilistic space - the motion matrix - which enables various motion analysis tasks…

Computer Vision and Pattern Recognition · Computer Science 2021-02-02 Julian Krebs, Hervé Delingette, Nicholas Ayache, Tommaso Mansi

Learning Long-term Motion Embeddings for Efficient Kinematics Generation

Understanding and predicting motion is a fundamental component of visual intelligence. Although modern video models exhibit strong comprehension of scene dynamics, exploring multiple possible futures through full video synthesis remains…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Nick Stracke, Kolja Bauer, Stefan Andreas Baumann, Miguel Angel Bautista, Josh Susskind, Björn Ommer

Using Motion and Internal Supervision in Object Recognition

In this thesis we address two related aspects of visual object recognition: the use of motion information, and the use of internal supervision, to help unsupervised learning. These two aspects are inter-related in the current study, since…

Computer Vision and Pattern Recognition · Computer Science 2018-12-14 Daniel Harari

STM: SpatioTemporal and Motion Encoding for Action Recognition

Spatiotemporal and motion features are two complementary and crucial information for video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn spatiotemporal features and another flow stream to learn motion…

Computer Vision and Pattern Recognition · Computer Science 2019-08-19 Boyuan Jiang, Mengmeng Wang, Weihao Gan, Wei Wu, Junjie Yan

Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection

Fine-grained action detection is an important task with numerous applications in robotics and human-computer interaction. Existing methods typically utilize a two-stage approach including extraction of local spatio-temporal features…

Computer Vision and Pattern Recognition · Computer Science 2019-11-11 Khoi-Nguyen C. Mac, Dhiraj Joshi, Raymond A. Yeh, Jinjun Xiong, Rogerio S. Feris, Minh N. Do

Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition

Spatio-temporal convolution often fails to learn motion dynamics in videos and thus an effective motion representation is required for video understanding in the wild. In this paper, we propose a rich and robust motion representation based…

Computer Vision and Pattern Recognition · Computer Science 2021-11-03 Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho

Spatio-temporal Action Recognition: A Survey

The task of action recognition or action detection involves analyzing videos and determining what action or motion is being performed. The primary subject of these videos are predominantly humans performing some action. However, this…

Computer Vision and Pattern Recognition · Computer Science 2019-01-29 Amlaan Bhoi

Learning Energy-based Spatial-Temporal Generative ConvNets for Dynamic Patterns

Video sequences contain rich dynamic patterns, such as dynamic texture patterns that exhibit stationarity in the temporal domain, and action patterns that are non-stationary in either spatial or temporal domain. We show that an energy-based…

Computer Vision and Pattern Recognition · Computer Science 2019-09-27 Jianwen Xie, Song-Chun Zhu, Ying Nian Wu

Seeing Fast and Slow: Learning the Flow of Time in Videos

How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and…

Computer Vision and Pattern Recognition · Computer Science 2026-04-24 Yen-Siang Wu, Rundong Luo, Jingsen Zhu, Tao Tu, Ali Farhadi, Matthew Wallingford, Yu-Chiang Frank Wang, Steve Marschner, Wei-Chiu Ma

Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet

Video sequences contain rich dynamic patterns, such as dynamic texture patterns that exhibit stationarity in the temporal domain, and action patterns that are non-stationary in either spatial or temporal domain. We show that a…

Machine Learning · Statistics 2017-05-31 Jianwen Xie, Song-Chun Zhu, Ying Nian Wu

On multi-view feature learning

Sparse coding is a common approach to learning local features for object recognition. Recently, there has been an increasing interest in learning features from spatio-temporal, binocular, or other multi-observation data, where the goal is…

Computer Vision and Pattern Recognition · Computer Science 2012-06-22 Roland Memisevic

Spatio-temporal prediction in video coding by best approximation

Within the scope of this contribution we propose a novel efficient spatio-temporal prediction algorithm for video coding. The algorithm operates in two stages. First, motion compensation is performed on the block to be predicted in order to…

Image and Video Processing · Electrical Eng. & Systems 2022-07-21 Jürgen Seiler, Haricharan Lakshman, André Kaup

Learning Image and Video Compression through Spatial-Temporal Energy Compaction

Compression has been an important research topic for many decades, to produce a significant impact on data transmission and storage. Recent advances have shown a great potential of learning image and video compression. Inspired from related…

Image and Video Processing · Electrical Eng. & Systems 2019-07-01 Zhengxue Cheng, Heming Sun, Masaru Takeuchi, Jiro Katto

Multiple Selection Approximation for Improved Spatio-Temporal Prediction in Video Coding

In this contribution, a novel spatio-temporal prediction algorithm for video coding is introduced. This algorithm exploits temporal as well as spatial redundancies for effectively predicting the signal to be encoded. To achieve this, the…

Image and Video Processing · Electrical Eng. & Systems 2022-07-05 Jürgen Seiler, André Kaup

Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction

Ever-increasing smartphone-generated video content demands intelligent techniques to edit and enhance videos on power-constrained devices. Most of the best performing algorithms for video understanding tasks like action recognition,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-05 Rishubh Parihar, Gaurav Ramola, Ranajit Saha, Ravi Kini, Aniket Rege, Sudha Velusamy

Learning Temporal Regularity in Video Sequences

Perceiving meaningful activities in a long video sequence is a challenging problem due to ambiguous definition of 'meaningfulness' as well as clutters in the scene. We approach this problem by learning a generative model for regular motion…

Computer Vision and Pattern Recognition · Computer Science 2016-04-18 Mahmudul Hasan, Jonghyun Choi, Jan Neumann, Amit K. Roy-Chowdhury, Larry S. Davis