Related papers: Synthesizing Dynamic Patterns by Spatial-Temporal …

Learning Energy-based Spatial-Temporal Generative ConvNets for Dynamic Patterns

Video sequences contain rich dynamic patterns, such as dynamic texture patterns that exhibit stationarity in the temporal domain, and action patterns that are non-stationary in either spatial or temporal domain. We show that an energy-based…

Computer Vision and Pattern Recognition · Computer Science 2019-09-27 Jianwen Xie, Song-Chun Zhu, Ying Nian Wu

Consistent Generative Query Networks

Stochastic video prediction models take in a sequence of image frames, and generate a sequence of consecutive future image frames. These models typically generate future frames in an autoregressive fashion, which is slow and requires the…

Computer Vision and Pattern Recognition · Computer Science 2019-04-23 Ananya Kumar, S. M. Ali Eslami, Danilo J. Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan

StNet: Local and Global Spatial-Temporal Modeling for Action Recognition

Despite the success of deep learning for static image understanding, it remains unclear what are the most effective network architectures for the spatial-temporal modeling in videos. In this paper, in contrast to the existing CNN+RNN or…

Computer Vision and Pattern Recognition · Computer Science 2018-12-12 Dongliang He, Zhichao Zhou, Chuang Gan, Fu Li, Xiao Liu, Yandong Li, Limin Wang, Shilei Wen

Learning a Generative Motion Model from Image Sequences based on a Latent Motion Matrix

We propose to learn a probabilistic motion model from a sequence of images for spatio-temporal registration. Our model encodes motion in a low-dimensional probabilistic space - the motion matrix - which enables various motion analysis tasks…

Computer Vision and Pattern Recognition · Computer Science 2021-02-02 Julian Krebs, Hervé Delingette, Nicholas Ayache, Tommaso Mansi

Synthesising Dynamic Textures using Convolutional Neural Networks

Here we present a parametric model for dynamic textures. The model is based on spatiotemporal summary statistics computed from the feature representations of a Convolutional Neural Network (CNN) trained on object recognition. We demonstrate…

Computer Vision and Pattern Recognition · Computer Science 2017-02-24 Christina M. Funke, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge

Learning Dynamic Generator Model by Alternating Back-Propagation Through Time

This paper studies the dynamic generator model for spatial-temporal processes such as dynamic textures and action sequences in video data. In this model, each time frame of the video sequence is generated by a generator model, which is a…

Machine Learning · Statistics 2018-12-31 Jianwen Xie, Ruiqi Gao, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu

Generating Long Videos of Dynamic Scenes

We present a video generation model that accurately reproduces object motion, changes in camera viewpoint, and new content that arises over time. Existing video generation methods often fail to produce new content as a function of time…

Computer Vision and Pattern Recognition · Computer Science 2022-06-10 Tim Brooks, Janne Hellsten, Miika Aittala, Ting-Chun Wang, Timo Aila, Jaakko Lehtinen, Ming-Yu Liu, Alexei A. Efros, Tero Karras

Learning Sequential Latent Variable Models from Multimodal Time Series Data

Sequential modelling of high-dimensional data is an important problem that appears in many domains including model-based reinforcement learning and dynamics identification for control. Latent variable models applied to sequential data…

Machine Learning · Computer Science 2023-01-23 Oliver Limoyo, Trevor Ablett, Jonathan Kelly

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles…

Computer Vision and Pattern Recognition · Computer Science 2016-08-03 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool

Spatiotemporal Residual Networks for Video Action Recognition

Two-stream Convolutional Networks (ConvNets) have shown strong performance for human action recognition in videos. Recently, Residual Networks (ResNets) have arisen as a new technique to train extremely deep architectures. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2016-11-08 Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes

Comparison of Spatiotemporal Networks for Learning Video Related Tasks

Many methods for learning from video sequences involve temporally processing 2D CNN features from the individual frames or directly utilizing 3D convolutions within high-performing 2D CNN architectures. The focus typically remains on how to…

Computer Vision and Pattern Recognition · Computer Science 2020-09-17 Logan Courtney, Ramavarapu Sreenivas

StyleVideoGAN: A Temporal Generative Model using a Pretrained StyleGAN

Generative adversarial models (GANs) continue to produce advances in terms of the visual quality of still images, as well as the learning of temporal correlations. However, few works manage to combine these two interesting capabilities for…

Computer Vision and Pattern Recognition · Computer Science 2021-12-01 Gereon Fox, Ayush Tewari, Mohamed Elgharib, Christian Theobalt

DTSGAN: Learning Dynamic Textures via Spatiotemporal Generative Adversarial Network

Dynamic texture synthesis aims to generate sequences that are visually similar to a reference video texture and exhibit specific stationary properties in time. In this paper, we introduce a spatiotemporal generative adversarial network…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Xiangtian Li, Xiaobo Wang, Zhen Qi, Han Cao, Zhaoyang Zhang, Ao Xiang

Generating Videos with Scene Dynamics

We capitalize on large amounts of unlabeled video in order to learn a model of scene dynamics for both video recognition tasks (e.g. action classification) and video generation tasks (e.g. future prediction). We propose a generative…

Computer Vision and Pattern Recognition · Computer Science 2016-10-27 Carl Vondrick, Hamed Pirsiavash, Antonio Torralba

Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks

We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods that have tackled this problem in a deterministic or non-parametric way, we propose to model future frames…

Computer Vision and Pattern Recognition · Computer Science 2019-08-13 Tianfan Xue, Jiajun Wu, Katherine L. Bouman, William T. Freeman

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. However, a central challenge in video prediction is that the future is…

Computer Vision and Pattern Recognition · Computer Science 2020-02-13 Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma

DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data

Despite recent progress, video diffusion models still struggle to synthesize realistic videos involving highly dynamic motions or requiring fine-grained motion controllability. A central limitation lies in the scarcity of such examples in…

Computer Vision and Pattern Recognition · Computer Science 2026-04-03 Wonjoon Jin, Jiyun Won, Janghyeok Han, Qi Dai, Chong Luo, Seung-Hwan Baek, Sunghyun Cho

Motion Selective Prediction for Video Frame Synthesis

Existing conditional video prediction approaches train a network from large databases and generalize to previously unseen data. We take the opposite stance, and introduce a model that learns from the first frames of a given video and…

Computer Vision and Pattern Recognition · Computer Science 2018-12-27 Veronique Prinet

Flow-Grounded Spatial-Temporal Video Prediction from Still Images

Existing video prediction methods mainly rely on observing multiple historical frames or focus on predicting the next one-frame. In this work, we study the problem of generating consecutive multiple future frames by observing one single…

Computer Vision and Pattern Recognition · Computer Science 2018-08-28 Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang

Learning Long-term Motion Embeddings for Efficient Kinematics Generation

Understanding and predicting motion is a fundamental component of visual intelligence. Although modern video models exhibit strong comprehension of scene dynamics, exploring multiple possible futures through full video synthesis remains…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Nick Stracke, Kolja Bauer, Stefan Andreas Baumann, Miguel Angel Bautista, Josh Susskind, Björn Ommer